linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support
@ 2021-03-23 11:33 Andrew-sh.Cheng
  2021-03-23 11:33 ` [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

MT8183 supports both CPU DVFS and CCI DVFS, and the LITTLE CPUs and the CCI share one voltage domain.
This series therefore adds drivers that handle the voltage coupling between CPU DVFS and CCI DVFS.
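
Because both consumers sit on one rail, the rail has to satisfy whichever
of them currently needs the higher voltage. A minimal userspace sketch of
that constraint, assuming each consumer simply publishes the lowest
voltage it can run at. The names rail_vote and rail_target_uv are
hypothetical and are not part of this series; 1150000 in the usage note
is the MAX_VOLT_LIMIT value from patch 4.

```c
#include <stddef.h>

/* Hypothetical model: one vote per consumer of the shared rail. */
struct rail_vote {
	const char *name;	/* e.g. "cpu-little" or "cci" */
	int min_uv;		/* lowest voltage this consumer can run at */
};

/*
 * Return the lowest rail voltage that satisfies every vote, or -1 if
 * that voltage would exceed the rail limit max_uv.
 */
static int rail_target_uv(const struct rail_vote *votes, size_t n, int max_uv)
{
	int target = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (votes[i].min_uv > target)
			target = votes[i].min_uv;

	return target <= max_uv ? target : -1;
}
```

With votes of 800000 uV (LITTLE) and 950000 uV (CCI) and a limit of
1150000 uV, the sketch settles on 950000 uV.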

SVS support additionally needs OPP_EVENT_ADJUST_VOLTAGE, and the drivers must react to that event.

Changes since v7:
	- Split part of "[v7,6/8] cpufreq: mediatek: add opp notification for SVS support" into a separate patch:
		- "cpufreq: mediatek: Add record of previous desired vproc value"
		- This covers the case where Vproc has multiple users.
		- cpufreq records the voltage it desires.
	- For "[v7,8/8] arm64: dts: mediatek: add cpufreq and cci devfreq nodes for mt8183":
		- Add "required-opps" for cci so that the cpufreq-based passive governor works.
	- The patches this series depends on are already in kernel 5.12.

Andrew-sh.Cheng (7):
  cpufreq: mediatek: Enable clock and regulator
  dt-bindings: devfreq: add compatible for mt8183 cci devfreq
  devfreq: add mediatek cci devfreq
  cpufreq: mediatek: Add record of previous desired vproc value
  cpufreq: mediatek: add opp notification for SVS support
  devfreq: mediatek: cci devfreq register opp notification for SVS
    support
  arm64: dts: mediatek: add cpufreq and cci devfreq nodes for mt8183

Saravana Kannan (1):
  PM / devfreq: Add cpu based scaling support to passive_governor

 .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts        |  36 +++
 arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi     |   4 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi           | 277 +++++++++++++++++
 drivers/cpufreq/mediatek-cpufreq.c                 | 122 +++++++-
 drivers/devfreq/Kconfig                            |  12 +
 drivers/devfreq/Makefile                           |   1 +
 drivers/devfreq/governor_passive.c                 | 329 ++++++++++++++++++++-
 drivers/devfreq/mt8183-cci-devfreq.c               | 225 ++++++++++++++
 include/linux/devfreq.h                            |  29 +-
 10 files changed, 1060 insertions(+), 26 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
 create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c

-- 
2.12.5
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
@ 2021-03-23 11:33 ` Andrew-sh.Cheng
  2021-03-25  7:42   ` Chanwoo Choi
  2021-03-25  8:14   ` Chanwoo Choi
  2021-03-23 11:33 ` [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Saravana Kannan, Sibi Sankar,
	Andrew-sh.Cheng

From: Saravana Kannan <skannan@codeaurora.org>

Many CPU architectures have caches that can scale independently of the
CPUs. Frequency scaling of the caches is necessary to make sure that the
cache is not a bottleneck that hurts both performance and power
consumption. The same idea applies to RAM/DDR.

To achieve this, this patch adds support for cpu based scaling to the
passive governor. The governor takes the current frequency of each CPU
frequency domain and adjusts the frequency of the cache (or any devfreq
device) based on the frequency of the CPUs. It listens to CPU frequency
transition notifiers to keep itself up to date on the current CPU
frequency.
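
The notifier-driven bookkeeping can be pictured with a small userspace
model. It is only a sketch: domain_state, on_freq_change and the boolean
return convention are made up for illustration, while the real callback
in the diff below works on struct cpufreq_freqs inside the kernel.

```c
#include <stdbool.h>

/* Hypothetical cached state for one CPU frequency domain. */
struct domain_state {
	unsigned int curr_khz;	/* last frequency reported for the domain */
};

/*
 * Record a completed frequency change. Returns true when the cached
 * value changed, i.e. when the dependent device should be re-evaluated.
 */
static bool on_freq_change(struct domain_state *st, unsigned int new_khz)
{
	if (st->curr_khz == new_khz)
		return false;	/* nothing changed, skip the update */

	st->curr_khz = new_khz;
	return true;
}
```

Skipping unchanged frequencies keeps the dependent device from being
re-evaluated on every notification.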

To decide the frequency of the device, the governor does one of the
following:
* Derives the optimal devfreq device opp from required-opps property of
  the parent cpu opp_table.

* Scales the device frequency in proportion to the CPU frequency. So, if
  the CPUs are running at their max frequency, the device runs at its
  max frequency. If the CPUs are running at their min frequency, the
  device runs at its min frequency. It is interpolated for frequencies
  in between.
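
The two strategies above can be sketched in userspace C as a table
lookup with an interpolation fallback. This is a simplified model under
stated assumptions: freq_map stands in for a required-opps link,
scale_limits and both function names are hypothetical, and the clamping
at the range ends is an addition of the sketch rather than something the
patch does.

```c
#include <stddef.h>

/* Hypothetical stand-in for a required-opps link: CPU kHz -> device Hz. */
struct freq_map {
	unsigned int cpu_khz;
	unsigned long dev_hz;
};

/* Hypothetical container for the CPU and device frequency ranges. */
struct scale_limits {
	unsigned int cpu_min_khz, cpu_max_khz;
	unsigned long dev_min_hz, dev_max_hz;
};

/* Scale the device frequency in proportion to the CPU frequency. */
static unsigned long interpolate_hz(const struct scale_limits *lim,
				    unsigned int cpu_khz)
{
	unsigned long range, percent;

	if (lim->cpu_max_khz <= lim->cpu_min_khz ||
	    lim->dev_max_hz <= lim->dev_min_hz)
		return 0;	/* degenerate range, nothing to scale */
	if (cpu_khz <= lim->cpu_min_khz)
		return lim->dev_min_hz;
	if (cpu_khz >= lim->cpu_max_khz)
		return lim->dev_max_hz;

	/* Note the parentheses around the divisor. */
	percent = (unsigned long)(cpu_khz - lim->cpu_min_khz) * 100 /
		  (lim->cpu_max_khz - lim->cpu_min_khz);
	range = lim->dev_max_hz - lim->dev_min_hz;

	/* Split the multiplication to avoid overflow on 32-bit longs. */
	return lim->dev_min_hz + (range / 100) * percent +
	       (range % 100) * percent / 100;
}

/* Prefer an explicit mapping, fall back to interpolation. */
static unsigned long pick_dev_hz(const struct freq_map *map, size_t n,
				 const struct scale_limits *lim,
				 unsigned int cpu_khz)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (map[i].cpu_khz == cpu_khz)
			return map[i].dev_hz;

	return interpolate_hz(lim, cpu_khz);
}
```

For a CPU range of 1000000 to 2000000 kHz and a device range of
200000000 to 600000000 Hz, a CPU at 1500000 kHz with no explicit mapping
lands halfway, at 400000000 Hz.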

Changes by Andrew-sh.Cheng:
* Follow API changes after kernel 5.7:
  dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp, and
  devfreq->max_freq to
  devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value.
* Don't return -EINVAL in devfreq_passive_event_handler(), since it
  doesn't handle the DEVFREQ_GOV_SUSPEND and DEVFREQ_GOV_RESUME cases.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
[Sibi: Integrated cpu-freqmap governor into passive_governor]
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/devfreq/Kconfig            |   2 +
 drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
 include/linux/devfreq.h            |  29 +++-
 3 files changed, 342 insertions(+), 18 deletions(-)

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index 00704efe6398..f56132b0ae64 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
 	  device. This governor does not change the frequency by itself
 	  through sysfs entries. The passive governor recommends that
 	  devfreq device uses the OPP table to get the frequency/voltage.
+	  Alternatively, the governor can scale the device based on
+	  the current frequency of the online CPUs.
 
 comment "DEVFREQ Drivers"
 
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index b094132bd20b..9cc57b083839 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -8,11 +8,103 @@
  */
 
 #include <linux/module.h>
+#include <linux/cpu.h>
+#include <linux/cpufreq.h>
+#include <linux/cpumask.h>
 #include <linux/device.h>
 #include <linux/devfreq.h>
+#include <linux/slab.h>
 #include "governor.h"
 
-static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
+struct devfreq_cpu_state {
+	unsigned int curr_freq;
+	unsigned int min_freq;
+	unsigned int max_freq;
+	unsigned int first_cpu;
+	struct device *cpu_dev;
+	struct opp_table *opp_table;
+};
+
+static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
+					      unsigned int cpu)
+{
+	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
+	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
+
+	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
+	struct devfreq *devfreq = (struct devfreq *)data->this;
+	unsigned long *dev_freq_table = devfreq->profile->freq_table;
+	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
+	unsigned long cpu_curr_freq, freq;
+
+	if (!cpu_state || cpu_state->first_cpu != cpu ||
+	    !cpu_state->opp_table || !devfreq->opp_table)
+		return 0;
+
+	cpu_curr_freq = cpu_state->curr_freq * 1000;
+	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
+	if (IS_ERR(p_opp))
+		return 0;
+
+	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
+					    devfreq->opp_table, p_opp);
+	dev_pm_opp_put(p_opp);
+
+	if (!IS_ERR(opp)) {
+		freq = dev_pm_opp_get_freq(opp);
+		dev_pm_opp_put(opp);
+		goto out;
+	}
+
+	/* Use Interpolation if required opps is not available */
+	cpu_min_freq = cpu_state->min_freq;
+	cpu_max_freq = cpu_state->max_freq;
+	cpu_curr_freq_khz = cpu_state->curr_freq;
+
+	if (dev_freq_table) {
+		/* Get minimum frequency according to sorting order */
+		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
+		if (dev_freq_table[0] < dev_max_state) {
+			dev_min_freq = dev_freq_table[0];
+			dev_max_freq = dev_max_state;
+		} else {
+			dev_min_freq = dev_max_state;
+			dev_max_freq = dev_freq_table[0];
+		}
+	} else {
+		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
+						     DEV_PM_QOS_MIN_FREQUENCY);
+		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
+						     DEV_PM_QOS_MAX_FREQUENCY);
+
+		if (dev_max_freq <= dev_min_freq)
+			return 0;
+	}
+	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / (cpu_max_freq - cpu_min_freq);
+	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
+
+out:
+	return freq;
+}
+
+static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
+					unsigned long *freq)
+{
+	struct devfreq_passive_data *p_data =
+				(struct devfreq_passive_data *)devfreq->data;
+	unsigned int cpu;
+	unsigned long target_freq = 0;
+
+	for_each_online_cpu(cpu)
+		target_freq = max(target_freq,
+				  xlate_cpufreq_to_devfreq(p_data, cpu));
+
+	*freq = target_freq;
+
+	return 0;
+}
+
+static int get_target_freq_with_devfreq(struct devfreq *devfreq,
 					unsigned long *freq)
 {
 	struct devfreq_passive_data *p_data
@@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	int i, count;
 
 	/*
-	 * If the devfreq device with passive governor has the specific method
-	 * to determine the next frequency, should use the get_target_freq()
-	 * of struct devfreq_passive_data.
-	 */
-	if (p_data->get_target_freq)
-		return p_data->get_target_freq(devfreq, freq);
-
-	/*
 	 * If the parent and passive devfreq device uses the OPP table,
 	 * get the next frequency by using the OPP table.
 	 */
@@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	return 0;
 }
 
+static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
+					   unsigned long *freq)
+{
+	struct devfreq_passive_data *p_data =
+		(struct devfreq_passive_data *)devfreq->data;
+	int ret;
+
+	/*
+	 * If the devfreq device with passive governor has the specific method
+	 * to determine the next frequency, should use the get_target_freq()
+	 * of struct devfreq_passive_data.
+	 */
+	if (p_data->get_target_freq)
+		return p_data->get_target_freq(devfreq, freq);
+
+	switch (p_data->parent_type) {
+	case DEVFREQ_PARENT_DEV:
+		ret = get_target_freq_with_devfreq(devfreq, freq);
+		break;
+	case CPUFREQ_PARENT_DEV:
+		ret = get_target_freq_with_cpufreq(devfreq, freq);
+		break;
+	default:
+		ret = -EINVAL;
+		dev_err(&devfreq->dev, "Invalid parent type\n");
+		break;
+	}
+
+	return ret;
+}
+
 static int devfreq_passive_notifier_call(struct notifier_block *nb,
 				unsigned long event, void *ptr)
 {
@@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+static int cpufreq_passive_notifier_call(struct notifier_block *nb,
+					 unsigned long event, void *ptr)
+{
+	struct devfreq_passive_data *data =
+			container_of(nb, struct devfreq_passive_data, nb);
+	struct devfreq *devfreq = (struct devfreq *)data->this;
+	struct devfreq_cpu_state *cpu_state;
+	struct cpufreq_freqs *cpu_freq = ptr;
+	unsigned int curr_freq;
+	int ret;
+
+	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
+	    !data->cpu_state[cpu_freq->policy->cpu])
+		return 0;
+
+	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
+	if (cpu_state->curr_freq == cpu_freq->new)
+		return 0;
+
+	/* Back up the current freq, then pre-update the cpu state freq */
+	curr_freq = cpu_state->curr_freq;
+	cpu_state->curr_freq = cpu_freq->new;
+
+	mutex_lock(&devfreq->lock);
+	ret = update_devfreq(devfreq);
+	mutex_unlock(&devfreq->lock);
+	if (ret) {
+		cpu_state->curr_freq = curr_freq;
+		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
+{
+	struct devfreq_passive_data *data = *p_data;
+	struct devfreq *devfreq = (struct devfreq *)data->this;
+	struct device *dev = devfreq->dev.parent;
+	struct opp_table *opp_table = NULL;
+	struct devfreq_cpu_state *cpu_state;
+	struct cpufreq_policy *policy;
+	struct device *cpu_dev;
+	unsigned int cpu;
+	int ret;
+
+	get_online_cpus();
+
+	data->nb.notifier_call = cpufreq_passive_notifier_call;
+	ret = cpufreq_register_notifier(&data->nb,
+					CPUFREQ_TRANSITION_NOTIFIER);
+	if (ret) {
+		dev_err(dev, "Couldn't register cpufreq notifier.\n");
+		data->nb.notifier_call = NULL;
+		goto out;
+	}
+
+	/* Populate devfreq_cpu_state */
+	for_each_online_cpu(cpu) {
+		if (data->cpu_state[cpu])
+			continue;
+
+		policy = cpufreq_cpu_get(cpu);
+		if (!policy) {
+			ret = -EINVAL;
+			goto out;
+		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
+			ret = -EPROBE_DEFER;
+			goto out;
+		} else if (IS_ERR(policy)) {
+			ret = PTR_ERR(policy);
+			dev_err(dev, "Couldn't get the cpufreq_policy.\n");
+			goto out;
+		}
+
+		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
+		if (!cpu_state) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		cpu_dev = get_cpu_device(cpu);
+		if (!cpu_dev) {
+			dev_err(dev, "Couldn't get cpu device.\n");
+			ret = -ENODEV;
+			goto out;
+		}
+
+		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
+		if (IS_ERR(opp_table)) {
+			ret = PTR_ERR(opp_table);
+			goto out;
+		}
+
+		cpu_state->cpu_dev = cpu_dev;
+		cpu_state->opp_table = opp_table;
+		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
+		cpu_state->curr_freq = policy->cur;
+		cpu_state->min_freq = policy->cpuinfo.min_freq;
+		cpu_state->max_freq = policy->cpuinfo.max_freq;
+		data->cpu_state[cpu] = cpu_state;
+
+		cpufreq_cpu_put(policy);
+	}
+
+out:
+	put_online_cpus();
+	if (ret)
+		return ret;
+
+	/* Update devfreq */
+	mutex_lock(&devfreq->lock);
+	ret = update_devfreq(devfreq);
+	mutex_unlock(&devfreq->lock);
+	if (ret)
+		dev_err(dev, "Couldn't update the frequency.\n");
+
+	return ret;
+}
+
+static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
+{
+	struct devfreq_passive_data *data = *p_data;
+	struct devfreq_cpu_state *cpu_state;
+	int cpu;
+
+	if (data->nb.notifier_call)
+		cpufreq_unregister_notifier(&data->nb,
+					    CPUFREQ_TRANSITION_NOTIFIER);
+
+	for_each_possible_cpu(cpu) {
+		cpu_state = data->cpu_state[cpu];
+		if (cpu_state) {
+			if (cpu_state->opp_table)
+				dev_pm_opp_put_opp_table(cpu_state->opp_table);
+			kfree(cpu_state);
+			data->cpu_state[cpu] = NULL;
+		}
+	}
+
+	return 0;
+}
+
+int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
+{
+	struct notifier_block *nb = &(*p_data)->nb;
+	int ret = 0;
+
+	switch ((*p_data)->parent_type) {
+	case DEVFREQ_PARENT_DEV:
+		nb->notifier_call = devfreq_passive_notifier_call;
+		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
+						DEVFREQ_TRANSITION_NOTIFIER);
+		break;
+	case CPUFREQ_PARENT_DEV:
+		ret = cpufreq_passive_register(p_data);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
+int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
+{
+	int ret = 0;
+
+	switch ((*p_data)->parent_type) {
+	case DEVFREQ_PARENT_DEV:
+		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
+						    &(*p_data)->nb,
+						    DEVFREQ_TRANSITION_NOTIFIER));
+		break;
+	case CPUFREQ_PARENT_DEV:
+		cpufreq_passive_unregister(p_data);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
 static int devfreq_passive_event_handler(struct devfreq *devfreq,
 				unsigned int event, void *data)
 {
 	struct devfreq_passive_data *p_data
 			= (struct devfreq_passive_data *)devfreq->data;
 	struct devfreq *parent = (struct devfreq *)p_data->parent;
-	struct notifier_block *nb = &p_data->nb;
 	int ret = 0;
 
-	if (!parent)
+	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
 		return -EPROBE_DEFER;
 
 	switch (event) {
@@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
 		if (!p_data->this)
 			p_data->this = devfreq;
 
-		nb->notifier_call = devfreq_passive_notifier_call;
-		ret = devfreq_register_notifier(parent, nb,
-					DEVFREQ_TRANSITION_NOTIFIER);
+		ret = register_parent_dev_notifier(&p_data);
 		break;
+
 	case DEVFREQ_GOV_STOP:
-		WARN_ON(devfreq_unregister_notifier(parent, nb,
-					DEVFREQ_TRANSITION_NOTIFIER));
+		ret = unregister_parent_dev_notifier(&p_data);
 		break;
 	default:
 		break;
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 26ea0850be9b..e0093b7c805c 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
 
 #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
 /**
+ * struct devfreq_cpu_state - holds the per-cpu state
+ * @curr_freq:	the current frequency of the cpu.
+ * @min_freq:	the min frequency of the cpu.
+ * @max_freq:	the max frequency of the cpu.
+ * @first_cpu:	the first cpu of the policy's related cpus.
+ * @cpu_dev:	reference to cpu device.
+ * @opp_table:	reference to cpu opp table.
+ *
+ * This structure stores the required cpu_state of a cpu.
+ * This is auto-populated by the governor.
+ */
+struct devfreq_cpu_state;
+
+enum devfreq_parent_dev_type {
+	DEVFREQ_PARENT_DEV,
+	CPUFREQ_PARENT_DEV,
+};
+
+/**
  * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
  *	and devfreq_add_device
  * @parent:	the devfreq instance of parent device.
@@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
  *			using governors except for passive governor.
  *			If the devfreq device has the specific method to decide
  *			the next frequency, should use this callback.
+ * @parent_type:	parent type of the device
  * @this:	the devfreq instance of own device.
  * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
+ * @cpu_state:		the min/max/current frequency state of all online cpus
  *
  * The devfreq_passive_data have to set the devfreq instance of parent
  * device with governors except for the passive governor. But, don't need to
- * initialize the 'this' and 'nb' field because the devfreq core will handle
- * them.
+ * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
+ * will handle them.
  */
 struct devfreq_passive_data {
 	/* Should set the devfreq instance of parent device */
@@ -305,9 +326,13 @@ struct devfreq_passive_data {
 	/* Optional callback to decide the next frequency of passvice device */
 	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
 
+	/* Should set the type of parent device */
+	enum devfreq_parent_dev_type parent_type;
+
 	/* For passive governor's internal use. Don't need to set them */
 	struct devfreq *this;
 	struct notifier_block nb;
+	struct devfreq_cpu_state *cpu_state[NR_CPUS];
 };
 #endif
 
-- 
2.12.5

* [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
  2021-03-23 11:33 ` [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
@ 2021-03-23 11:33 ` Andrew-sh.Cheng
  2021-03-30  4:36   ` Viresh Kumar
  2021-03-23 11:33 ` [PATCH V8 3/8] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

The regulator needs to be enabled so that the requested min/max
voltage is recorded even if it is not applied right away.

The intermediate clock is not always enabled by the CCF (common clock
framework) on every project, so cpufreq should enable it itself.
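
The diff below extends the error path with labels that undo the clock
enables in reverse order. A minimal userspace sketch of that unwinding
pattern follows. The res type and all function names are hypothetical,
and the sketch assumes that every successful enable, including the
regulator, is undone on failure.

```c
/* Hypothetical resource with an enable count, standing in for a clock. */
struct res {
	int enabled;
	int fail_on_enable;	/* set to 1 to simulate an enable failure */
};

static int res_enable(struct res *r)
{
	if (r->fail_on_enable)
		return -1;
	r->enabled++;
	return 0;
}

static void res_disable(struct res *r)
{
	r->enabled--;
}

/* Enable three resources; on failure, undo the earlier ones in reverse. */
static int init_all(struct res *reg, struct res *cpu_clk,
		    struct res *inter_clk)
{
	int ret;

	ret = res_enable(reg);
	if (ret)
		return ret;

	ret = res_enable(cpu_clk);
	if (ret)
		goto out_disable_reg;

	ret = res_enable(inter_clk);
	if (ret)
		goto out_disable_cpu_clk;

	return 0;

out_disable_cpu_clk:
	res_disable(cpu_clk);
out_disable_reg:
	res_disable(reg);
	return ret;
}
```

Each label undoes exactly one earlier step, so a failure at any point
leaves all enable counts balanced.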

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/cpufreq/mediatek-cpufreq.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index f2e491b25b07..432368707ea6 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -350,6 +350,11 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 		ret = PTR_ERR(proc_reg);
 		goto out_free_resources;
 	}
+	ret = regulator_enable(proc_reg);
+	if (ret) {
+		pr_warn("failed to enable vproc for cpu%d\n", cpu);
+		goto out_free_resources;
+	}
 
 	/* Both presence and absence of sram regulator are valid cases. */
 	sram_reg = regulator_get_exclusive(cpu_dev, "sram");
@@ -368,13 +373,21 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 		goto out_free_resources;
 	}
 
+	ret = clk_prepare_enable(cpu_clk);
+	if (ret)
+		goto out_free_opp_table;
+
+	ret = clk_prepare_enable(inter_clk);
+	if (ret)
+		goto out_disable_mux_clock;
+
 	/* Search a safe voltage for intermediate frequency. */
 	rate = clk_get_rate(inter_clk);
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
 	if (IS_ERR(opp)) {
 		pr_err("failed to get intermediate opp for cpu%d\n", cpu);
 		ret = PTR_ERR(opp);
-		goto out_free_opp_table;
+		goto out_disable_inter_clock;
 	}
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
 	dev_pm_opp_put(opp);
@@ -393,6 +406,12 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 	return 0;
 
+out_disable_inter_clock:
+	clk_disable_unprepare(inter_clk);
+
+out_disable_mux_clock:
+	clk_disable_unprepare(cpu_clk);
+
 out_free_opp_table:
 	dev_pm_opp_of_cpumask_remove_table(&info->cpus);
 
@@ -411,14 +430,20 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 static void mtk_cpu_dvfs_info_release(struct mtk_cpu_dvfs_info *info)
 {
-	if (!IS_ERR(info->proc_reg))
+	if (!IS_ERR(info->proc_reg)) {
+		regulator_disable(info->proc_reg);
 		regulator_put(info->proc_reg);
+	}
 	if (!IS_ERR(info->sram_reg))
 		regulator_put(info->sram_reg);
-	if (!IS_ERR(info->cpu_clk))
+	if (!IS_ERR(info->cpu_clk)) {
+		clk_disable_unprepare(info->cpu_clk);
 		clk_put(info->cpu_clk);
-	if (!IS_ERR(info->inter_clk))
+	}
+	if (!IS_ERR(info->inter_clk)) {
+		clk_disable_unprepare(info->inter_clk);
 		clk_put(info->inter_clk);
+	}
 
 	dev_pm_opp_of_cpumask_remove_table(&info->cpus);
 }
-- 
2.12.5

* [PATCH V8 3/8] dt-bindings: devfreq: add compatible for mt8183 cci devfreq
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
  2021-03-23 11:33 ` [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
  2021-03-23 11:33 ` [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
@ 2021-03-23 11:33 ` Andrew-sh.Cheng
  2021-03-23 11:33 ` [PATCH V8 4/8] devfreq: add mediatek " Andrew-sh.Cheng
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

This adds dt-binding documentation for the CCI devfreq device
on the Mediatek MT8183 SoC platform.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 .../devicetree/bindings/devfreq/mt8183-cci.yaml    | 51 ++++++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml

diff --git a/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
new file mode 100644
index 000000000000..a7341fd94097
--- /dev/null
+++ b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/devfreq/mt8183-cci.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: CCI_DEVFREQ driver for MT8183.
+
+maintainers:
+  - Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
+
+description: |
+  This module is used to create the CCI DEVFREQ device.
+  Performance depends on both the CCI frequency and the CPU frequency.
+  On MT8183, the CCI shares a buck (regulator) with the LITTLE cores.
+  The node contains the CCI opp table for voltage and frequency scaling.
+
+properties:
+  compatible:
+    const: "mediatek,mt8183-cci"
+
+  clocks:
+    maxItems: 1
+
+  clock-names:
+    const: "cci"
+
+  operating-points-v2: true
+  opp-table: true
+
+  proc-supply:
+    description:
+      Phandle of the regulator that provides the supply voltage.
+
+required:
+  - compatible
+  - clocks
+  - clock-names
+  - proc-supply
+
+examples:
+  - |
+    #include <dt-bindings/clock/mt8183-clk.h>
+    cci: cci {
+      compatible = "mediatek,mt8183-cci";
+      clocks = <&apmixedsys CLK_APMIXED_CCIPLL>;
+      clock-names = "cci";
+      operating-points-v2 = <&cci_opp>;
+      proc-supply = <&mt6358_vproc12_reg>;
+    };
+
-- 
2.12.5

* [PATCH V8 4/8] devfreq: add mediatek cci devfreq
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                   ` (2 preceding siblings ...)
  2021-03-23 11:33 ` [PATCH V8 3/8] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
@ 2021-03-23 11:33 ` Andrew-sh.Cheng
  2021-03-25  8:04   ` Chanwoo Choi
  2021-03-23 11:33 ` [PATCH V8 5/8] cpufreq: mediatek: Add record of previous desired vproc value Andrew-sh.Cheng
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
of the Mediatek MT8183.

On the MT8183 the CCI is supplied by the same regulator as the LITTLE
cores. The driver is notified when the regulator voltage changes
(driven by cpufreq) and adjusts the CCI frequency to the maximum
possible value.
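
The target callback in the diff below changes voltage and frequency in
an order that depends on the direction of the change, as its comments
state: voltage first when scaling up, frequency first when scaling down.
A minimal userspace sketch of that ordering, with a hypothetical
dvfs_dev model that records the sequence of operations and omits the
error handling of the real driver:

```c
#include <stddef.h>

enum step { SET_VOLT, SET_FREQ };

/* Hypothetical device model that records the order of operations. */
struct dvfs_dev {
	int uv;			/* current rail voltage in microvolts */
	unsigned long hz;	/* current clock rate in Hz */
	enum step log[2];	/* at most one voltage and one rate change */
	size_t nlog;
};

static void set_volt(struct dvfs_dev *d, int uv)
{
	d->uv = uv;
	d->log[d->nlog++] = SET_VOLT;
}

static void set_freq(struct dvfs_dev *d, unsigned long hz)
{
	d->hz = hz;
	d->log[d->nlog++] = SET_FREQ;
}

/* Switch to a new operating point without undervolting the clock. */
static void dvfs_switch(struct dvfs_dev *d, unsigned long hz, int uv)
{
	int old_uv = d->uv;

	d->nlog = 0;

	/* Scale up: raise the voltage before the frequency. */
	if (uv > old_uv)
		set_volt(d, uv);

	set_freq(d, hz);

	/* Scale down: lower the voltage after the frequency. */
	if (uv < old_uv)
		set_volt(d, uv);
}
```

Either way the rail is never below what the currently running frequency
requires.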

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/devfreq/Kconfig              |  10 ++
 drivers/devfreq/Makefile             |   1 +
 drivers/devfreq/mt8183-cci-devfreq.c | 198 +++++++++++++++++++++++++++++++++++
 3 files changed, 209 insertions(+)
 create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index f56132b0ae64..2538255ac2c1 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -111,6 +111,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
 	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
 	  adjusting DRAM frequency.
 
+config ARM_MT8183_CCI_DEVFREQ
+	tristate "MT8183 CCI DEVFREQ Driver"
+	depends on ARM_MEDIATEK_CPUFREQ
+	help
+		This adds a devfreq driver for the Cache Coherent Interconnect
+		of the Mediatek MT8183, which shares a regulator with the
+		cpu cluster.
+		It tracks the buck voltage and sets a proper CCI frequency.
+		It uses notifications to get the regulator status.
+
 config ARM_TEGRA_DEVFREQ
 	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
 	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
index a16333ea7034..991ef7740759 100644
--- a/drivers/devfreq/Makefile
+++ b/drivers/devfreq/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
 obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
 obj-$(CONFIG_ARM_IMX_BUS_DEVFREQ)	+= imx-bus.o
 obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
+obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
 obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
 obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
 
diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
new file mode 100644
index 000000000000..018543db7bae
--- /dev/null
+++ b/drivers/devfreq/mt8183-cci-devfreq.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ *
+ * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
+ */
+
+#include <linux/clk.h>
+#include <linux/devfreq.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/regulator/consumer.h>
+#include <linux/time.h>
+
+#define MAX_VOLT_LIMIT		(1150000)
+
+struct cci_devfreq {
+	struct devfreq *devfreq;
+	struct regulator *cpu_reg;
+	struct clk *cci_clk;
+	int old_vproc;
+	unsigned long old_freq;
+};
+
+static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
+{
+	int ret;
+
+	ret = regulator_set_voltage(cci_df->cpu_reg, vproc,
+				    MAX_VOLT_LIMIT);
+	if (!ret)
+		cci_df->old_vproc = vproc;
+	return ret;
+}
+
+static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
+				  u32 flags)
+{
+	int ret;
+	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
+	struct dev_pm_opp *opp;
+	unsigned long opp_rate, opp_voltage, old_voltage;
+
+	if (!cci_df)
+		return -EINVAL;
+
+	if (cci_df->old_freq == *freq)
+		return 0;
+
+	opp_rate = *freq;
+	opp = devfreq_recommended_opp(dev, &opp_rate, 1);
+	opp_voltage = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
+
+	old_voltage = cci_df->old_vproc;
+	if (old_voltage == 0)
+		old_voltage = regulator_get_voltage(cci_df->cpu_reg);
+
+	// scale up: set voltage first then freq
+	if (opp_voltage > old_voltage) {
+		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
+		if (ret) {
+			pr_err("cci: failed to scale up voltage\n");
+			return ret;
+		}
+	}
+
+	ret = clk_set_rate(cci_df->cci_clk, *freq);
+	if (ret) {
+		pr_err("%s: failed to set cci rate: %d\n", __func__,
+		       ret);
+		mtk_cci_set_voltage(cci_df, old_voltage);
+		return ret;
+	}
+
+	// scale down: set freq first then voltage
+	if (opp_voltage < old_voltage) {
+		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
+		if (ret) {
+			pr_err("cci: failed to scale down voltage\n");
+			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
+			return ret;
+		}
+	}
+
+	cci_df->old_freq = *freq;
+
+	return 0;
+}
+
+static struct devfreq_dev_profile cci_devfreq_profile = {
+	.target = mtk_cci_devfreq_target,
+};
+
+static int mtk_cci_devfreq_probe(struct platform_device *pdev)
+{
+	struct device *cci_dev = &pdev->dev;
+	struct cci_devfreq *cci_df;
+	struct devfreq_passive_data *passive_data;
+	int ret;
+
+	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
+	if (!cci_df)
+		return -ENOMEM;
+
+	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
+	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
+	if (ret) {
+		if (ret != -EPROBE_DEFER)
+			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
+				ret);
+		return ret;
+	}
+	cci_df->cpu_reg = devm_regulator_get_optional(cci_dev, "proc");
+	ret = PTR_ERR_OR_ZERO(cci_df->cpu_reg);
+	if (ret) {
+		if (ret != -EPROBE_DEFER)
+			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
+				ret);
+		return ret;
+	}
+	ret = regulator_enable(cci_df->cpu_reg);
+	if (ret) {
+		dev_err(cci_dev, "enable buck for cci fail\n");
+		return ret;
+	}
+
+	ret = dev_pm_opp_of_add_table(cci_dev);
+	if (ret) {
+		dev_err(cci_dev, "Fail to get OPP table for CCI: %d\n", ret);
+		return ret;
+	}
+
+	platform_set_drvdata(pdev, cci_df);
+
+	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
+	if (!passive_data) {
+		ret = -ENOMEM;
+		goto err_opp;
+	}
+
+	passive_data->parent_type = CPUFREQ_PARENT_DEV;
+
+	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
+						  &cci_devfreq_profile,
+						  DEVFREQ_GOV_PASSIVE,
+						  passive_data);
+	if (IS_ERR(cci_df->devfreq)) {
+		ret = PTR_ERR(cci_df->devfreq);
+		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
+		goto err_opp;
+	}
+
+	return 0;
+
+err_opp:
+	dev_pm_opp_of_remove_table(cci_dev);
+	return ret;
+}
+
+static int mtk_cci_devfreq_remove(struct platform_device *pdev)
+{
+	struct device *cci_dev = &pdev->dev;
+	struct cci_devfreq *cci_df;
+	struct notifier_block *opp_nb;
+
+	cci_df = platform_get_drvdata(pdev);
+	opp_nb = &cci_df->opp_nb;
+
+	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);
+	dev_pm_opp_of_remove_table(cci_dev);
+	regulator_disable(cci_df->cpu_reg);
+
+	return 0;
+}
+
+static const __maybe_unused struct of_device_id
+	mediatek_cci_of_match[] = {
+	{ .compatible = "mediatek,mt8183-cci" },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, mediatek_cci_of_match);
+
+static struct platform_driver cci_devfreq_driver = {
+	.probe	= mtk_cci_devfreq_probe,
+	.remove	= mtk_cci_devfreq_remove,
+	.driver = {
+		.name = "mediatek-cci-devfreq",
+		.of_match_table = of_match_ptr(mediatek_cci_of_match),
+	},
+};
+
+module_platform_driver(cci_devfreq_driver);
+
+MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
+MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
+MODULE_LICENSE("GPL v2");
-- 
2.12.5
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 31+ messages in thread
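
The ordering rule in mtk_cci_devfreq_target() above (raise the voltage
before the clock, lower it only after the clock) can be modelled in plain
userspace C. This is only a sketch under stated assumptions: struct
fake_rail, set_volt(), set_rate() and dvfs_transition() are hypothetical
stand-ins for the regulator and clk calls, not kernel APIs.

```c
/* Hypothetical stand-in for the shared rail and the CCI clock. */
struct fake_rail {
	int volt_uv;		/* currently requested voltage, microvolts */
	unsigned long rate_hz;	/* currently programmed clock rate, Hz */
	int fail_rate;		/* non-zero makes set_rate() fail */
	char trace[8];		/* order of successful steps: 'V' or 'F' */
	int n;			/* number of entries used in trace[] */
};

static int set_volt(struct fake_rail *r, int uv)
{
	r->volt_uv = uv;
	if (r->n < 7)
		r->trace[r->n++] = 'V';
	return 0;
}

static int set_rate(struct fake_rail *r, unsigned long hz)
{
	if (r->fail_rate)
		return -1;
	r->rate_hz = hz;
	if (r->n < 7)
		r->trace[r->n++] = 'F';
	return 0;
}

static int dvfs_transition(struct fake_rail *r, unsigned long new_hz,
			   int new_uv)
{
	int old_uv = r->volt_uv;
	int ret;

	/* Scale up: the rail must be at the new voltage before the clock rises. */
	if (new_uv > old_uv) {
		ret = set_volt(r, new_uv);
		if (ret)
			return ret;
	}

	ret = set_rate(r, new_hz);
	if (ret) {
		/* Clock change failed: undo the voltage raise, if there was one. */
		if (new_uv > old_uv)
			set_volt(r, old_uv);
		return ret;
	}

	/* Scale down: the clock is already slower, so the voltage may drop now. */
	if (new_uv < old_uv) {
		ret = set_volt(r, new_uv);
		if (ret)
			return ret;
	}

	return 0;
}
```

Starting from a zero-initialised trace, a scale-up records 'V' then 'F' and
a scale-down records 'F' then 'V'. That is the invariant the driver relies
on: the clock never runs faster than the present voltage supports.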

* [PATCH V8 5/8] cpufreq: mediatek: Add record of previous desired vproc value
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                   ` (3 preceding siblings ...)
  2021-03-23 11:33 ` [PATCH V8 4/8] devfreq: add mediatek " Andrew-sh.Cheng
@ 2021-03-23 11:33 ` Andrew-sh.Cheng
  2021-03-23 11:33 ` [PATCH V8 6/8] cpufreq: mediatek: add opp notification for SVS support Andrew-sh.Cheng
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

When the buck is shared with another module, the resulting buck voltage
may not be exactly what we set.
Record the previously desired value instead of reading it back from the
regulator.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/cpufreq/mediatek-cpufreq.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 432368707ea6..2a82c36aec21 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -42,6 +42,7 @@ struct mtk_cpu_dvfs_info {
 	struct list_head list_head;
 	int intermediate_voltage;
 	bool need_voltage_tracking;
+	int old_vproc;
 };
 
 static LIST_HEAD(dvfs_info_list);
@@ -192,11 +193,16 @@ static int mtk_cpufreq_voltage_tracking(struct mtk_cpu_dvfs_info *info,
 
 static int mtk_cpufreq_set_voltage(struct mtk_cpu_dvfs_info *info, int vproc)
 {
+	int ret;
+
 	if (info->need_voltage_tracking)
-		return mtk_cpufreq_voltage_tracking(info, vproc);
+		ret = mtk_cpufreq_voltage_tracking(info, vproc);
 	else
-		return regulator_set_voltage(info->proc_reg, vproc,
-					     vproc + VOLT_TOL);
+		ret = regulator_set_voltage(info->proc_reg, vproc,
+					    MAX_VOLT_LIMIT);
+	if (!ret)
+		info->old_vproc = vproc;
+	return ret;
 }
 
 static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
@@ -214,7 +220,9 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 	inter_vproc = info->intermediate_voltage;
 
 	old_freq_hz = clk_get_rate(cpu_clk);
-	old_vproc = regulator_get_voltage(info->proc_reg);
+	old_vproc = info->old_vproc;
+	if (old_vproc == 0)
+		old_vproc = regulator_get_voltage(info->proc_reg);
 	if (old_vproc < 0) {
 		pr_err("%s: invalid Vproc value: %d\n", __func__, old_vproc);
 		return old_vproc;
-- 
2.12.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread
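
To illustrate why patch 5/8 caches the requested value: on a shared buck,
a read-back can return more than this driver asked for, because another
consumer may be holding the rail higher. Below is a minimal userspace
model. struct shared_buck, struct consumer and their helpers are
hypothetical, not the regulator API. The "rail runs at the highest
request" rule is an assumption made for the sketch.

```c
/* Hypothetical model of one buck shared by two consumers. */
struct shared_buck {
	int req_uv[2];	/* last request from consumer 0 and consumer 1 */
};

/* Assumption: the rail runs at the highest request so both are satisfied. */
static int buck_get_voltage(const struct shared_buck *b)
{
	return b->req_uv[0] > b->req_uv[1] ? b->req_uv[0] : b->req_uv[1];
}

static void buck_set_voltage(struct shared_buck *b, int who, int uv)
{
	b->req_uv[who] = uv;
}

/* Per-consumer state, mirroring the old_vproc field added by the patch. */
struct consumer {
	struct shared_buck *buck;
	int id;
	int old_vproc;	/* 0 means "nothing recorded yet" */
};

static void consumer_set_voltage(struct consumer *c, int uv)
{
	buck_set_voltage(c->buck, c->id, uv);
	c->old_vproc = uv;	/* remember what we asked for */
}

/* Prefer the recorded request; read the rail only before the first set. */
static int consumer_old_voltage(const struct consumer *c)
{
	if (c->old_vproc == 0)
		return buck_get_voltage(c->buck);
	return c->old_vproc;
}
```

In this model the rail can stay at the other consumer's higher level after
a request, so a read-back would overstate this consumer's own previous
request. The cached value does not.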

* [PATCH V8 6/8] cpufreq: mediatek: add opp notification for SVS support
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                   ` (4 preceding siblings ...)
  2021-03-23 11:33 ` [PATCH V8 5/8] cpufreq: mediatek: Add record of previous desired vproc value Andrew-sh.Cheng
@ 2021-03-23 11:33 ` Andrew-sh.Cheng
  2021-03-23 11:34 ` [PATCH V8 7/8] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
  2021-03-23 11:34 ` [PATCH V8 8/8] arm64: dts: mediatek: add cpufreq and cci devfreq nodes for mt8183 Andrew-sh.Cheng
  7 siblings, 0 replies; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:33 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

cpufreq should listen for OPP notifications and take proper action
when it receives disable and voltage adjustment events,
which are triggered when SVS is enabled.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/cpufreq/mediatek-cpufreq.c | 73 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 2a82c36aec21..1747b03e3059 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -43,6 +43,10 @@ struct mtk_cpu_dvfs_info {
 	int intermediate_voltage;
 	bool need_voltage_tracking;
 	int old_vproc;
+	struct mutex lock; /* avoid notify and policy race condition */
+	struct notifier_block opp_nb;
+	int opp_cpu;
+	unsigned long opp_freq;
 };
 
 static LIST_HEAD(dvfs_info_list);
@@ -239,6 +243,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 	vproc = dev_pm_opp_get_voltage(opp);
 	dev_pm_opp_put(opp);
 
+	mutex_lock(&info->lock);
 	/*
 	 * If the new voltage or the intermediate voltage is higher than the
 	 * current voltage, scale up voltage first.
@@ -250,6 +255,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 			pr_err("cpu%d: failed to scale up voltage!\n",
 			       policy->cpu);
 			mtk_cpufreq_set_voltage(info, old_vproc);
+			mutex_unlock(&info->lock);
 			return ret;
 		}
 	}
@@ -261,6 +267,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 		       policy->cpu);
 		mtk_cpufreq_set_voltage(info, old_vproc);
 		WARN_ON(1);
+		mutex_unlock(&info->lock);
 		return ret;
 	}
 
@@ -271,6 +278,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 		       policy->cpu);
 		clk_set_parent(cpu_clk, armpll);
 		mtk_cpufreq_set_voltage(info, old_vproc);
+		mutex_unlock(&info->lock);
 		return ret;
 	}
 
@@ -281,6 +289,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 		       policy->cpu);
 		mtk_cpufreq_set_voltage(info, inter_vproc);
 		WARN_ON(1);
+		mutex_unlock(&info->lock);
 		return ret;
 	}
 
@@ -296,15 +305,69 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 			clk_set_parent(cpu_clk, info->inter_clk);
 			clk_set_rate(armpll, old_freq_hz);
 			clk_set_parent(cpu_clk, armpll);
+			mutex_unlock(&info->lock);
 			return ret;
 		}
 	}
 
+	info->opp_freq = freq_hz;
+	mutex_unlock(&info->lock);
+
 	return 0;
 }
 
 #define DYNAMIC_POWER "dynamic-power-coefficient"
 
+static int mtk_cpufreq_opp_notifier(struct notifier_block *nb,
+				    unsigned long event, void *data)
+{
+	struct dev_pm_opp *opp = data;
+	struct dev_pm_opp *new_opp;
+	struct mtk_cpu_dvfs_info *info;
+	unsigned long freq, volt;
+	struct cpufreq_policy *policy;
+	int ret = 0;
+
+	info = container_of(nb, struct mtk_cpu_dvfs_info, opp_nb);
+
+	if (event == OPP_EVENT_ADJUST_VOLTAGE) {
+		freq = dev_pm_opp_get_freq(opp);
+
+		mutex_lock(&info->lock);
+		if (info->opp_freq == freq) {
+			volt = dev_pm_opp_get_voltage(opp);
+			ret = mtk_cpufreq_set_voltage(info, volt);
+			if (ret)
+				dev_err(info->cpu_dev, "failed to scale voltage: %d\n",
+					ret);
+		}
+		mutex_unlock(&info->lock);
+	} else if (event == OPP_EVENT_DISABLE) {
+		freq = dev_pm_opp_get_freq(opp);
+		/* case of current opp item is disabled */
+		if (info->opp_freq == freq) {
+			freq = 1;
+			new_opp = dev_pm_opp_find_freq_ceil(info->cpu_dev,
+							    &freq);
+			if (!IS_ERR(new_opp)) {
+				dev_pm_opp_put(new_opp);
+				policy = cpufreq_cpu_get(info->opp_cpu);
+				if (policy) {
+					cpufreq_driver_target(policy,
+							      freq / 1000,
+							      CPUFREQ_RELATION_L);
+					cpufreq_cpu_put(policy);
+				}
+			} else {
+				pr_err("%s: all opp items are disabled\n",
+				       __func__);
+			}
+		}
+	}
+
+	return notifier_from_errno(ret);
+}
+
 static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 {
 	struct device *cpu_dev;
@@ -400,11 +463,21 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
 	dev_pm_opp_put(opp);
 
+	info->opp_cpu = cpu;
+	info->opp_nb.notifier_call = mtk_cpufreq_opp_notifier;
+	ret = dev_pm_opp_register_notifier(cpu_dev, &info->opp_nb);
+	if (ret) {
+		pr_warn("cannot register opp notification\n");
+		goto out_disable_inter_clock;
+	}
+
+	mutex_init(&info->lock);
 	info->cpu_dev = cpu_dev;
 	info->proc_reg = proc_reg;
 	info->sram_reg = IS_ERR(sram_reg) ? NULL : sram_reg;
 	info->cpu_clk = cpu_clk;
 	info->inter_clk = inter_clk;
+	info->opp_freq = clk_get_rate(cpu_clk);
 
 	/*
 	 * If SRAM regulator is present, software "voltage tracking" is needed
-- 
2.12.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread
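
The decision logic of mtk_cpufreq_opp_notifier() above can be reduced to a
small userspace sketch. Everything here is hypothetical (the enum, struct
opp, struct dvfs_state, on_opp_event()). It models only the two cases the
patch handles and assumes the table is sorted by ascending frequency.

```c
#include <stddef.h>

enum opp_event { EV_ADJUST_VOLTAGE, EV_DISABLE };	/* hypothetical */

struct opp {
	unsigned long freq_hz;
	int volt_uv;
	int enabled;
};

struct dvfs_state {
	unsigned long cur_freq_hz;	/* OPP currently in use */
	int cur_volt_uv;
};

/* Lowest enabled OPP in an ascending table, or NULL if all are disabled. */
static const struct opp *lowest_enabled(const struct opp *tbl, int n)
{
	for (int i = 0; i < n; i++)
		if (tbl[i].enabled)
			return &tbl[i];
	return NULL;
}

static int on_opp_event(struct dvfs_state *s, enum opp_event ev,
			const struct opp *changed,
			const struct opp *tbl, int n)
{
	const struct opp *fallback;

	/* Events for an OPP that is not in use need no immediate action. */
	if (changed->freq_hz != s->cur_freq_hz)
		return 0;

	switch (ev) {
	case EV_ADJUST_VOLTAGE:
		/* The OPP in use got a new voltage: apply it right away. */
		s->cur_volt_uv = changed->volt_uv;
		return 0;
	case EV_DISABLE:
		/* The OPP in use went away: move to the lowest one left. */
		fallback = lowest_enabled(tbl, n);
		if (!fallback)
			return -1;
		s->cur_freq_hz = fallback->freq_hz;
		s->cur_volt_uv = fallback->volt_uv;
		return 0;
	}

	return 0;
}
```

The real driver additionally takes info->lock around the voltage update
and goes through cpufreq_driver_target() for the frequency change. Neither
is modelled here.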

* [PATCH V8 7/8] devfreq: mediatek: cci devfreq register opp notification for SVS support
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                   ` (5 preceding siblings ...)
  2021-03-23 11:33 ` [PATCH V8 6/8] cpufreq: mediatek: add opp notification for SVS support Andrew-sh.Cheng
@ 2021-03-23 11:34 ` Andrew-sh.Cheng
  2021-03-25  8:11   ` Chanwoo Choi
  2021-03-23 11:34 ` [PATCH V8 8/8] arm64: dts: mediatek: add cpufreq and cci devfreq nodes for mt8183 Andrew-sh.Cheng
  7 siblings, 1 reply; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:34 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

SVS changes the voltage of OPP items.
CCI devfreq needs to react by applying the new voltage when the OPP
currently in use is affected.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/devfreq/mt8183-cci-devfreq.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
index 018543db7bae..6942a48f3f4f 100644
--- a/drivers/devfreq/mt8183-cci-devfreq.c
+++ b/drivers/devfreq/mt8183-cci-devfreq.c
@@ -21,6 +21,7 @@ struct cci_devfreq {
 	struct clk *cci_clk;
 	int old_vproc;
 	unsigned long old_freq;
+	struct notifier_block opp_nb;
 };
 
 static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
@@ -89,6 +90,26 @@ static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
 	return 0;
 }
 
+static int ccidevfreq_opp_notifier(struct notifier_block *nb,
+				   unsigned long event, void *data)
+{
+	struct dev_pm_opp *opp = data;
+	struct cci_devfreq *cci_df = container_of(nb, struct cci_devfreq,
+						  opp_nb);
+	unsigned long freq, volt;
+
+	if (event == OPP_EVENT_ADJUST_VOLTAGE) {
+		freq = dev_pm_opp_get_freq(opp);
+		/* current opp item is changed */
+		if (freq == cci_df->old_freq) {
+			volt = dev_pm_opp_get_voltage(opp);
+			mtk_cci_set_voltage(cci_df, volt);
+		}
+	}
+
+	return 0;
+}
+
 static struct devfreq_dev_profile cci_devfreq_profile = {
 	.target = mtk_cci_devfreq_target,
 };
@@ -98,12 +119,15 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
 	struct device *cci_dev = &pdev->dev;
 	struct cci_devfreq *cci_df;
 	struct devfreq_passive_data *passive_data;
+	struct notifier_block *opp_nb;
 	int ret;
 
 	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
 	if (!cci_df)
 		return -ENOMEM;
 
+	opp_nb = &cci_df->opp_nb;
+
 	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
 	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
 	if (ret) {
@@ -152,6 +176,9 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
 		goto err_opp;
 	}
 
+	opp_nb->notifier_call = ccidevfreq_opp_notifier;
+	dev_pm_opp_register_notifier(cci_dev, opp_nb);
+
 	return 0;
 
 err_opp:
-- 
2.12.5
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V8 8/8] arm64: dts: mediatek: add cpufreq and cci devfreq nodes for mt8183
  2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                   ` (6 preceding siblings ...)
  2021-03-23 11:34 ` [PATCH V8 7/8] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
@ 2021-03-23 11:34 ` Andrew-sh.Cheng
  7 siblings, 0 replies; 31+ messages in thread
From: Andrew-sh.Cheng @ 2021-03-23 11:34 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Andrew-sh.Cheng

From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>

Add cpufreq and cci devfreq nodes for mt8183.

Based on the regulator node patch:
https://patchwork.kernel.org/patch/11500339/
which is now queued for v5.7-next/dts64.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts    |  36 ++++
 arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi |   4 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi       | 277 +++++++++++++++++++++++++
 3 files changed, 317 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
index 3249c959f76f..77a591cc09a6 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
+++ b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
@@ -395,6 +395,42 @@
 
 };
 
+&cci {
+	proc-supply = <&mt6358_vproc12_reg>;
+};
+
+&cpu0 {
+	proc-supply = <&mt6358_vproc12_reg>;
+};
+
+&cpu1 {
+	proc-supply = <&mt6358_vproc12_reg>;
+};
+
+&cpu2 {
+	proc-supply = <&mt6358_vproc12_reg>;
+};
+
+&cpu3 {
+	proc-supply = <&mt6358_vproc12_reg>;
+};
+
+&cpu4 {
+	proc-supply = <&mt6358_vproc11_reg>;
+};
+
+&cpu5 {
+	proc-supply = <&mt6358_vproc11_reg>;
+};
+
+&cpu6 {
+	proc-supply = <&mt6358_vproc11_reg>;
+};
+
+&cpu7 {
+	proc-supply = <&mt6358_vproc11_reg>;
+};
+
 &uart0 {
 	status = "okay";
 };
diff --git a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
index ff56bcfa3370..b1c3b88c4ac4 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
@@ -217,6 +217,10 @@
 	status = "okay";
 };
 
+&cci {
+	proc-supply = <&mt6358_vproc12_reg>;
+};
+
 &cpu0 {
 	proc-supply = <&mt6358_vproc12_reg>;
 };
diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 80519a145f13..c3dc87b01067 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -41,6 +41,251 @@
 		rdma1 = &rdma1;
 	};
 
+	cluster0_opp: opp_table0 {
+		compatible = "operating-points-v2";
+		opp-shared;
+		opp0_00 {
+			opp-hz = /bits/ 64 <793000000>;
+			opp-microvolt = <650000>;
+			required-opps = <&opp2_00>;
+		};
+		opp0_01 {
+			opp-hz = /bits/ 64 <910000000>;
+			opp-microvolt = <687500>;
+			required-opps = <&opp2_01>;
+		};
+		opp0_02 {
+			opp-hz = /bits/ 64 <1014000000>;
+			opp-microvolt = <718750>;
+			required-opps = <&opp2_02>;
+		};
+		opp0_03 {
+			opp-hz = /bits/ 64 <1131000000>;
+			opp-microvolt = <756250>;
+			required-opps = <&opp2_03>;
+		};
+		opp0_04 {
+			opp-hz = /bits/ 64 <1248000000>;
+			opp-microvolt = <800000>;
+			required-opps = <&opp2_04>;
+		};
+		opp0_05 {
+			opp-hz = /bits/ 64 <1326000000>;
+			opp-microvolt = <818750>;
+			required-opps = <&opp2_05>;
+		};
+		opp0_06 {
+			opp-hz = /bits/ 64 <1417000000>;
+			opp-microvolt = <850000>;
+			required-opps = <&opp2_06>;
+		};
+		opp0_07 {
+			opp-hz = /bits/ 64 <1508000000>;
+			opp-microvolt = <868750>;
+			required-opps = <&opp2_07>;
+		};
+		opp0_08 {
+			opp-hz = /bits/ 64 <1586000000>;
+			opp-microvolt = <893750>;
+			required-opps = <&opp2_08>;
+		};
+		opp0_09 {
+			opp-hz = /bits/ 64 <1625000000>;
+			opp-microvolt = <906250>;
+			required-opps = <&opp2_09>;
+		};
+		opp0_10 {
+			opp-hz = /bits/ 64 <1677000000>;
+			opp-microvolt = <931250>;
+			required-opps = <&opp2_10>;
+		};
+		opp0_11 {
+			opp-hz = /bits/ 64 <1716000000>;
+			opp-microvolt = <943750>;
+			required-opps = <&opp2_11>;
+		};
+		opp0_12 {
+			opp-hz = /bits/ 64 <1781000000>;
+			opp-microvolt = <975000>;
+			required-opps = <&opp2_12>;
+		};
+		opp0_13 {
+			opp-hz = /bits/ 64 <1846000000>;
+			opp-microvolt = <1000000>;
+			required-opps = <&opp2_13>;
+		};
+		opp0_14 {
+			opp-hz = /bits/ 64 <1924000000>;
+			opp-microvolt = <1025000>;
+			required-opps = <&opp2_14>;
+		};
+		opp0_15 {
+			opp-hz = /bits/ 64 <1989000000>;
+			opp-microvolt = <1050000>;
+			required-opps = <&opp2_15>;
+		};	};
+
+	cluster1_opp: opp_table1 {
+		compatible = "operating-points-v2";
+		opp-shared;
+		opp1_00 {
+			opp-hz = /bits/ 64 <793000000>;
+			opp-microvolt = <700000>;
+			required-opps = <&opp2_00>;
+		};
+		opp1_01 {
+			opp-hz = /bits/ 64 <910000000>;
+			opp-microvolt = <725000>;
+			required-opps = <&opp2_01>;
+		};
+		opp1_02 {
+			opp-hz = /bits/ 64 <1014000000>;
+			opp-microvolt = <750000>;
+			required-opps = <&opp2_02>;
+		};
+		opp1_03 {
+			opp-hz = /bits/ 64 <1131000000>;
+			opp-microvolt = <775000>;
+			required-opps = <&opp2_03>;
+		};
+		opp1_04 {
+			opp-hz = /bits/ 64 <1248000000>;
+			opp-microvolt = <800000>;
+			required-opps = <&opp2_04>;
+		};
+		opp1_05 {
+			opp-hz = /bits/ 64 <1326000000>;
+			opp-microvolt = <825000>;
+			required-opps = <&opp2_05>;
+		};
+		opp1_06 {
+			opp-hz = /bits/ 64 <1417000000>;
+			opp-microvolt = <850000>;
+			required-opps = <&opp2_06>;
+		};
+		opp1_07 {
+			opp-hz = /bits/ 64 <1508000000>;
+			opp-microvolt = <875000>;
+			required-opps = <&opp2_07>;
+		};
+		opp1_08 {
+			opp-hz = /bits/ 64 <1586000000>;
+			opp-microvolt = <900000>;
+			required-opps = <&opp2_08>;
+		};
+		opp1_09 {
+			opp-hz = /bits/ 64 <1625000000>;
+			opp-microvolt = <912500>;
+			required-opps = <&opp2_09>;
+		};
+		opp1_10 {
+			opp-hz = /bits/ 64 <1677000000>;
+			opp-microvolt = <931250>;
+			required-opps = <&opp2_10>;
+		};
+		opp1_11 {
+			opp-hz = /bits/ 64 <1716000000>;
+			opp-microvolt = <950000>;
+			required-opps = <&opp2_11>;
+		};
+		opp1_12 {
+			opp-hz = /bits/ 64 <1781000000>;
+			opp-microvolt = <975000>;
+			required-opps = <&opp2_12>;
+		};
+		opp1_13 {
+			opp-hz = /bits/ 64 <1846000000>;
+			opp-microvolt = <1000000>;
+			required-opps = <&opp2_13>;
+		};
+		opp1_14 {
+			opp-hz = /bits/ 64 <1924000000>;
+			opp-microvolt = <1025000>;
+			required-opps = <&opp2_14>;
+		};
+		opp1_15 {
+			opp-hz = /bits/ 64 <1989000000>;
+			opp-microvolt = <1050000>;
+			required-opps = <&opp2_15>;
+		};
+	};
+
+	cci_opp: opp_table2 {
+		compatible = "operating-points-v2";
+		opp-shared;
+		opp2_00: opp-273000000 {
+			opp-hz = /bits/ 64 <273000000>;
+			opp-microvolt = <650000>;
+		};
+		opp2_01: opp-338000000 {
+			opp-hz = /bits/ 64 <338000000>;
+			opp-microvolt = <687500>;
+		};
+		opp2_02: opp-403000000 {
+			opp-hz = /bits/ 64 <403000000>;
+			opp-microvolt = <718750>;
+		};
+		opp2_03: opp-463000000 {
+			opp-hz = /bits/ 64 <463000000>;
+			opp-microvolt = <756250>;
+		};
+		opp2_04: opp-546000000 {
+			opp-hz = /bits/ 64 <546000000>;
+			opp-microvolt = <800000>;
+		};
+		opp2_05: opp-624000000 {
+			opp-hz = /bits/ 64 <624000000>;
+			opp-microvolt = <818750>;
+		};
+		opp2_06: opp-689000000 {
+			opp-hz = /bits/ 64 <689000000>;
+			opp-microvolt = <850000>;
+		};
+		opp2_07: opp-767000000 {
+			opp-hz = /bits/ 64 <767000000>;
+			opp-microvolt = <868750>;
+		};
+		opp2_08: opp-845000000 {
+			opp-hz = /bits/ 64 <845000000>;
+			opp-microvolt = <893750>;
+		};
+		opp2_09: opp-871000000 {
+			opp-hz = /bits/ 64 <871000000>;
+			opp-microvolt = <906250>;
+		};
+		opp2_10: opp-923000000 {
+			opp-hz = /bits/ 64 <923000000>;
+			opp-microvolt = <931250>;
+		};
+		opp2_11: opp-962000000 {
+			opp-hz = /bits/ 64 <962000000>;
+			opp-microvolt = <943750>;
+		};
+		opp2_12: opp-1027000000 {
+			opp-hz = /bits/ 64 <1027000000>;
+			opp-microvolt = <975000>;
+		};
+		opp2_13: opp-1092000000 {
+			opp-hz = /bits/ 64 <1092000000>;
+			opp-microvolt = <1000000>;
+		};
+		opp2_14: opp-1144000000 {
+			opp-hz = /bits/ 64 <1144000000>;
+			opp-microvolt = <1025000>;
+		};
+		opp2_15: opp-1196000000 {
+			opp-hz = /bits/ 64 <1196000000>;
+			opp-microvolt = <1050000>;
+		};
+	};
+
+	cci: cci {
+		compatible = "mediatek,mt8183-cci";
+		clocks = <&apmixedsys CLK_APMIXED_CCIPLL>;
+		clock-names = "cci_clock";
+		operating-points-v2 = <&cci_opp>;
+	};
+
 	cpus {
 		#address-cells = <1>;
 		#size-cells = <0>;
@@ -84,6 +329,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <741>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP0>;
+			clocks = <&mcucfg CLK_MCU_MP0_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster0_opp>;
 			dynamic-power-coefficient = <84>;
 			#cooling-cells = <2>;
 		};
@@ -95,6 +344,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <741>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP0>;
+			clocks = <&mcucfg CLK_MCU_MP0_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster0_opp>;
 			dynamic-power-coefficient = <84>;
 			#cooling-cells = <2>;
 		};
@@ -106,6 +359,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <741>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP0>;
+			clocks = <&mcucfg CLK_MCU_MP0_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster0_opp>;
 			dynamic-power-coefficient = <84>;
 			#cooling-cells = <2>;
 		};
@@ -117,6 +374,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <741>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP0>;
+			clocks = <&mcucfg CLK_MCU_MP0_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster0_opp>;
 			dynamic-power-coefficient = <84>;
 			#cooling-cells = <2>;
 		};
@@ -128,6 +389,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <1024>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP1>;
+			clocks = <&mcucfg CLK_MCU_MP2_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster1_opp>;
 			dynamic-power-coefficient = <211>;
 			#cooling-cells = <2>;
 		};
@@ -139,6 +404,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <1024>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP1>;
+			clocks = <&mcucfg CLK_MCU_MP2_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster1_opp>;
 			dynamic-power-coefficient = <211>;
 			#cooling-cells = <2>;
 		};
@@ -150,6 +419,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <1024>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP1>;
+			clocks = <&mcucfg CLK_MCU_MP2_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster1_opp>;
 			dynamic-power-coefficient = <211>;
 			#cooling-cells = <2>;
 		};
@@ -161,6 +434,10 @@
 			enable-method = "psci";
 			capacity-dmips-mhz = <1024>;
 			cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP1>;
+			clocks = <&mcucfg CLK_MCU_MP2_SEL>,
+				 <&topckgen CLK_TOP_ARMPLL_DIV_PLL1>;
+			clock-names = "cpu", "intermediate";
+			operating-points-v2 = <&cluster1_opp>;
 			dynamic-power-coefficient = <211>;
 			#cooling-cells = <2>;
 		};
-- 
2.12.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread
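
The required-opps links in the tables above pair each CPU OPP with the CCI
OPP of the same index (opp0_00 and opp1_00 point at opp2_00, and so on).
Below is a rough userspace sketch of such a lookup, using the first four
cluster0 entries from the patch. cci_freq_for_cpu() is a hypothetical
helper, and matching on "lowest CPU OPP at or above the given frequency"
is an assumption made for the sketch, not a description of the governor.

```c
#include <stddef.h>

/* First four cluster0 OPP frequencies from the DTS above, in Hz. */
static const unsigned long cpu_hz[] = {
	793000000UL, 910000000UL, 1014000000UL, 1131000000UL,
};

/* CCI OPPs that the required-opps of those entries point at, in Hz. */
static const unsigned long cci_hz[] = {
	273000000UL, 338000000UL, 403000000UL, 463000000UL,
};

/*
 * Return the CCI frequency linked to the lowest CPU OPP that is at or
 * above cpu_freq, or 0 if cpu_freq is above every entry in the table.
 */
static unsigned long cci_freq_for_cpu(unsigned long cpu_freq)
{
	size_t i;

	for (i = 0; i < sizeof(cpu_hz) / sizeof(cpu_hz[0]); i++)
		if (cpu_hz[i] >= cpu_freq)
			return cci_hz[i];

	return 0;
}
```

With this table a LITTLE cluster running at 910 MHz maps to a 338 MHz CCI
clock. Both entries carry 687500 uV in the DTS, which is consistent with
the cover letter's statement that LITTLE cpus and CCI share a voltage domain.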

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-23 11:33 ` [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
@ 2021-03-25  7:42   ` Chanwoo Choi
  2021-03-25  8:14   ` Chanwoo Choi
  1 sibling, 0 replies; 31+ messages in thread
From: Chanwoo Choi @ 2021-03-25  7:42 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Saravana Kannan, Sibi Sankar

Hi,

You missed adding the linux-pm mailing list for these patches.
Please send them to the linux-pm ML.

Also, before I received this series, I had tried to clean up these patches
on a testing branch[1], so my comments below are based on that cleanup.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov


On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <skannan@codeaurora.org>
> 
> Many CPU architectures have caches that can scale independent of the
> CPUs. Frequency scaling of the caches is necessary to make sure that the
> cache is not a performance bottleneck that leads to poor performance and
> power. The same idea applies for RAM/DDR.
> 
> To achieve this, this patch adds support for cpu based scaling to the
> passive governor. This is accomplished by taking the current frequency
> of each CPU frequency domain and then adjust the frequency of the cache
> (or any devfreq device) based on the frequency of the CPUs. It listens
> to CPU frequency transition notifiers to keep itself up to date on the
> current CPU frequency.
> 
> To decide the frequency of the device, the governor does one of the
> following:
> * Derives the optimal devfreq device opp from required-opps property of
>   the parent cpu opp_table.
> 
> * Scales the device frequency in proportion to the CPU frequency. So, if
>   the CPUs are running at their max frequency, the device runs at its
>   max frequency. If the CPUs are running at their min frequency, the
>   device runs at its min frequency. It is interpolated for frequencies
>   in between.
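
The proportional mapping described in the quoted commit message can be
sketched as plain linear interpolation. This is not the patch's own
arithmetic, only a minimal illustration. interp_dev_freq() is a
hypothetical helper, and it assumes dev_max >= dev_min and that all
values use the same unit.

```c
/*
 * Map cpu_freq from [cpu_min, cpu_max] linearly onto [dev_min, dev_max].
 * Inputs outside the CPU range are clamped to the nearest end.
 */
static unsigned long interp_dev_freq(unsigned long cpu_freq,
				     unsigned long cpu_min,
				     unsigned long cpu_max,
				     unsigned long dev_min,
				     unsigned long dev_max)
{
	unsigned long long span;

	if (cpu_max <= cpu_min || cpu_freq <= cpu_min)
		return dev_min;
	if (cpu_freq >= cpu_max)
		return dev_max;

	/* Use a 64-bit intermediate so the product cannot overflow. */
	span = (unsigned long long)(cpu_freq - cpu_min) * (dev_max - dev_min);

	return dev_min + (unsigned long)(span / (cpu_max - cpu_min));
}
```

At the CPU minimum the device gets its minimum, at the CPU maximum its
maximum, and a CPU halfway through its range puts the device halfway
through its own.
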
> 
> Andrew-sh.Cheng change
> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> after kernel-5.7
> Don't return -EINVAL in devfreq_passive_event_handler()
> since it doesn't handle DEVFREQ_GOV_SUSPEND DEVFREQ_GOV_RESUME cases.
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig            |   2 +
>  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
>  include/linux/devfreq.h            |  29 +++-
>  3 files changed, 342 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 00704efe6398..f56132b0ae64 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>  	  device. This governor does not change the frequency by itself
>  	  through sysfs entries. The passive governor recommends that
>  	  devfreq device uses the OPP table to get the frequency/voltage.
> +	  Alternatively the governor can also be chosen to scale based on
> +	  the online CPUs current frequency.
>  
>  comment "DEVFREQ Drivers"
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index b094132bd20b..9cc57b083839 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -8,11 +8,103 @@
>   */
>  
>  #include <linux/module.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
>  #include <linux/device.h>
>  #include <linux/devfreq.h>
> +#include <linux/slab.h>
>  #include "governor.h"
>  
> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +struct devfreq_cpu_state {
> +	unsigned int curr_freq;
> +	unsigned int min_freq;
> +	unsigned int max_freq;
> +	unsigned int first_cpu;
> +	struct device *cpu_dev;
> +	struct opp_table *opp_table;
> +};

As far as I know, the previous version had a description of this structure.
I want to add the description as shown below.

If you have no objection, I'd like you to order the variables as follows
and use 'dev' instead of 'cpu_dev', because this patch uses
'cpu_state->cpu_dev' at multiple points.
I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
Also, I prefer 'cur_freq' to 'curr_freq', because the devfreq subsystem
uses 'cur_freq' to express the current frequency.

/**
 * struct devfreq_cpu_state - Hold the per-cpu data
 * @dev:	reference to cpu device.
 * @first_cpu:	the cpumask of the first cpu of a policy.
 * @opp_table:	reference to cpu opp table.
 * @cur_freq:	the current frequency of the cpu.
 * @min_freq:	the min frequency of the cpu.
 * @max_freq:	the max frequency of the cpu.
 *
 * This structure stores the required cpu_data of a cpu.
 * This is auto-populated by the governor.
 */
struct devfreq_cpu_state {
	struct device *dev;
	unsigned int first_cpu;

	struct opp_table *opp_table;
	unsigned int cur_freq;
	unsigned int min_freq;
	unsigned int max_freq;
};


> +
> +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> +					      unsigned int cpu)
> +{
> +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
> +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
> +
> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
> +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> +	unsigned long cpu_curr_freq, freq;
> +
> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> +	    !cpu_state->opp_table || !devfreq->opp_table)
> +		return 0;
> +
> +	cpu_curr_freq = cpu_state->curr_freq * 1000;
> +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
> +	if (IS_ERR(p_opp))
> +		return 0;
> +
> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> +					    devfreq->opp_table, p_opp);
> +	dev_pm_opp_put(p_opp);
> +
> +	if (!IS_ERR(opp)) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		dev_pm_opp_put(opp);
> +		goto out;
> +	}
> +
> +	/* Use Interpolation if required opps is not available */
> +	cpu_min_freq = cpu_state->min_freq;
> +	cpu_max_freq = cpu_state->max_freq;
> +	cpu_curr_freq_khz = cpu_state->curr_freq;
> +
> +	if (dev_freq_table) {
> +		/* Get minimum frequency according to sorting order */
> +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
> +		if (dev_freq_table[0] < dev_max_state) {
> +			dev_min_freq = dev_freq_table[0];
> +			dev_max_freq = dev_max_state;
> +		} else {
> +			dev_min_freq = dev_max_state;
> +			dev_max_freq = dev_freq_table[0];
> +		}
> +	} else {
> +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> +						     DEV_PM_QOS_MIN_FREQUENCY);
> +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> +						     DEV_PM_QOS_MAX_FREQUENCY);
> +
> +		if (dev_max_freq <= dev_min_freq)
> +			return 0;
> +	}
> +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / cpu_max_freq - cpu_min_freq;
> +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
> +
> +out:
> +	return freq;
> +}
> +
> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> +					unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	unsigned int cpu;
> +	unsigned long target_freq = 0;
> +
> +	for_each_online_cpu(cpu)
> +		target_freq = max(target_freq,
> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> +
> +	*freq = target_freq;
> +
> +	return 0;
> +}

As you know, governor_passive.c already uses
both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
to get the target frequency from the OPP table. So, I want to make a common
function like 'get_taget_freq_by_required_opp' as shown below.
If 'get_taget_freq_by_required_opp' is defined this way,
it can also be used by get_target_freq_with_devfreq().
After finishing the review of this patch, I'll send the patch[2].
[2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf

For example (this code is not tested):
static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
						struct opp_table *p_opp_table,
						struct opp_table *opp_table,
						unsigned long freq)
{
	struct dev_pm_opp *opp = NULL, *p_opp = NULL;

	if (!p_dev || !p_opp_table || !opp_table || !freq)
		return 0;

	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
	if (IS_ERR(p_opp))
		return 0;

	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
	dev_pm_opp_put(p_opp);

	if (IS_ERR(opp))
		return 0;

	freq = dev_pm_opp_get_freq(opp);
	dev_pm_opp_put(opp);

	return freq;
}

static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
					unsigned long *target_freq)
{
	struct devfreq_passive_data *p_data =
				(struct devfreq_passive_data *)devfreq->data;
	struct devfreq_cpu_data *cpu_data;
	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
	unsigned long dev_min, dev_max;
	unsigned long freq = 0;

	for_each_online_cpu(cpu) {
		cpu_data = p_data->cpu_data[cpu];
		if (!cpu_data || cpu_data->first_cpu != cpu)
			continue;

		/* Get target freq via required opps */
		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
		freq = get_taget_freq_by_required_opp(cpu_data->dev,
					cpu_data->opp_table,
					devfreq->opp_table, cpu_cur);
		if (freq) {
			*target_freq = max(freq, *target_freq);
			continue;
		}

		/* Use Interpolation if required opps is not available */
		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);

		cpu_min = cpu_data->min_freq;
		cpu_max = cpu_data->max_freq;
		cpu_cur = cpu_data->cur_freq;

		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);

		*target_freq = max(freq, *target_freq);
	}

	return 0;
}
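
To double-check the interpolation arithmetic, here is a small userspace
sketch. It is not kernel code. The helper names 'mult_frac_ul' and
'interp_dev_freq', the clamping and the degenerate-range fallback are only
assumptions for illustration; 'mult_frac_ul' mimics what the kernel's
mult_frac() macro computes. Note that the divisor has to be the whole
(cpu_max - cpu_min) range:

```c
/*
 * Userspace stand-in for the kernel's mult_frac(): computes
 * x * numer / denom without overflowing the intermediate product.
 */
static unsigned long mult_frac_ul(unsigned long x, unsigned long numer,
				  unsigned long denom)
{
	unsigned long quot = x / denom;
	unsigned long rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}

/* Map a CPU frequency onto the device frequency range in proportion. */
static unsigned long interp_dev_freq(unsigned long cpu_cur,
				     unsigned long cpu_min,
				     unsigned long cpu_max,
				     unsigned long dev_min,
				     unsigned long dev_max)
{
	unsigned long cpu_percent;

	/* Assumption: fall back to dev_min on a degenerate range. */
	if (cpu_max <= cpu_min || dev_max <= dev_min)
		return dev_min;

	/* Clamp so that the unsigned subtraction below cannot wrap around. */
	if (cpu_cur < cpu_min)
		cpu_cur = cpu_min;
	if (cpu_cur > cpu_max)
		cpu_cur = cpu_max;

	/* Position of the CPU inside its own range, in percent. */
	cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);

	return dev_min + mult_frac_ul(dev_max - dev_min, cpu_percent, 100);
}
```

With made-up numbers, a CPU range of 1000000-2000000 kHz and a device range
of 200000000-800000000 Hz, a CPU running at 1500000 kHz gives
cpu_percent = 50 and a device frequency of 500000000 Hz.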

> +
> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>  					unsigned long *freq)
>  {
>  	struct devfreq_passive_data *p_data
> @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	int i, count;
>  
>  	/*
> -	 * If the devfreq device with passive governor has the specific method
> -	 * to determine the next frequency, should use the get_target_freq()
> -	 * of struct devfreq_passive_data.
> -	 */
> -	if (p_data->get_target_freq)
> -		return p_data->get_target_freq(devfreq, freq);
> -
> -	/*
>  	 * If the parent and passive devfreq device uses the OPP table,
>  	 * get the next frequency by using the OPP table.
>  	 */
> @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	return 0;
>  }
>  
> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +					   unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +		(struct devfreq_passive_data *)devfreq->data;
> +	int ret;
> +
> +	/*
> +	 * If the devfreq device with passive governor has the specific method
> +	 * to determine the next frequency, should use the get_target_freq()
> +	 * of struct devfreq_passive_data.
> +	 */
> +	if (p_data->get_target_freq)
> +		return p_data->get_target_freq(devfreq, freq);
> +
> +	switch (p_data->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  				unsigned long event, void *ptr)
>  {
> @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> +					 unsigned long event, void *ptr)
> +{
> +	struct devfreq_passive_data *data =
> +			container_of(nb, struct devfreq_passive_data, nb);
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_freqs *cpu_freq = ptr;

Please use 'freqs' as the variable name. I prefer to use the same variable
name for both devfreq_freqs and cpufreq_freqs instances.

> +	unsigned int curr_freq;

As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
unless there is a special reason.

> +	int ret;
> +
> +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
> +	    !data->cpu_state[cpu_freq->policy->cpu])
> +		return 0;
> +
> +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
> +	if (cpu_state->curr_freq == cpu_freq->new)
> +		return 0;
> +
> +	/* Backup current freq and pre-update cpu state freq*/

I think that this comment is not critical. So, please drop it.

> +	curr_freq = cpu_state->curr_freq;
> +	cpu_state->curr_freq = cpu_freq->new;
> +
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);

I recommend using 'devfreq_update_target' instead of 'update_devfreq'
as follows:
	devfreq_update_target(devfreq, freqs->new);

> +	mutex_unlock(&devfreq->lock);
> +	if (ret) {
> +		cpu_state->curr_freq = curr_freq;
> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)

In order to keep the function naming style consistent,
please change the name as follows, because devfreq names
its function 'devfreq_register_notifier':
- cpufreq_passive_register -> cpufreq_passive_register_notifier

> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct device *dev = devfreq->dev.parent;
> +	struct opp_table *opp_table = NULL;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_policy *policy;
> +	struct device *cpu_dev;
> +	unsigned int cpu;
> +	int ret;
> +
> +	get_online_cpus();
> +
> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> +	ret = cpufreq_register_notifier(&data->nb,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret) {
> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> +		data->nb.notifier_call = NULL;
> +		goto out;
> +	}
> +
> +	/* Populate devfreq_cpu_state */

Don't need this comment. Please drop it.

> +	for_each_online_cpu(cpu) {
> +		if (data->cpu_state[cpu])
> +			continue;
> +
> +		policy = cpufreq_cpu_get(cpu);
> +		if (!policy) {
> +			ret = -EINVAL;
> +			goto out;
> +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> +			ret = -EPROBE_DEFER;
> +			goto out;
> +		} else if (IS_ERR(policy)) {
> +			ret = PTR_ERR(policy);
> +			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> +			goto out;
> +		}

Use the dev_err_probe() function to handle the EPROBE_DEFER case.
It makes the code simpler.

> +
> +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
> +		if (!cpu_state) {
> +			ret = -ENOMEM;
> +			goto out;
> +		}
> +
> +		cpu_dev = get_cpu_device(cpu);
> +		if (!cpu_dev) {
> +			dev_err(dev, "Couldn't get cpu device.\n");
> +			ret = -ENODEV;
> +			goto out;
> +		}
> +
> +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> +		if (IS_ERR(devfreq->opp_table)) {
> +			ret = PTR_ERR(opp_table);
> +			goto out;
> +		}
> +
> +		cpu_state->cpu_dev = cpu_dev;
> +		cpu_state->opp_table = opp_table;
> +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
> +		cpu_state->curr_freq = policy->cur;
> +		cpu_state->min_freq = policy->cpuinfo.min_freq;
> +		cpu_state->max_freq = policy->cpuinfo.max_freq;
> +		data->cpu_state[cpu] = cpu_state;
> +
> +		cpufreq_cpu_put(policy);
> +	}
> +
> +out:
> +	put_online_cpus();
> +	if (ret)
> +		return ret;
> +
> +	/* Update devfreq */
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);

> +	mutex_unlock(&devfreq->lock);
> +	if (ret)
> +		dev_err(dev, "Couldn't update the frequency.\n");
> +
> +	return ret;
> +}
> +
> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)

As I commented above, please change the name as follows:
- cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier

> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq_cpu_state *cpu_state;
> +	int cpu;
> +
> +	if (data->nb.notifier_call)
> +		cpufreq_unregister_notifier(&data->nb,
> +					    CPUFREQ_TRANSITION_NOTIFIER);
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_state = data->cpu_state[cpu];
> +		if (cpu_state) {
> +			if (cpu_state->opp_table)
> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> +			kfree(cpu_state);
> +			cpu_state = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> +{
> +	struct notifier_block *nb = &(*p_data)->nb;
> +	int ret = 0;
> +
> +	switch ((*p_data)->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		nb->notifier_call = devfreq_passive_notifier_call;
> +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = cpufreq_passive_register(p_data);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +	return ret;
> +}
> +
> +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> +{
> +	int ret = 0;
> +
> +	switch ((*p_data)->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
> +						    &(*p_data)->nb,
> +						    DEVFREQ_TRANSITION_NOTIFIER));
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		cpufreq_passive_unregister(p_data);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +	return ret;
> +}

I think that you don't need to define register_parent_dev_notifier
and unregister_parent_dev_notifier as separate functions.

Instead, just add their code directly
into devfreq_passive_event_handler.


> +
>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  				unsigned int event, void *data)
>  {
>  	struct devfreq_passive_data *p_data
>  			= (struct devfreq_passive_data *)devfreq->data;
>  	struct devfreq *parent = (struct devfreq *)p_data->parent;
> -	struct notifier_block *nb = &p_data->nb;
>  	int ret = 0;
>  
> -	if (!parent)
> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>  		return -EPROBE_DEFER;
>  
>  	switch (event) {
> @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  		if (!p_data->this)
>  			p_data->this = devfreq;
>  
> -		nb->notifier_call = devfreq_passive_notifier_call;
> -		ret = devfreq_register_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER);
> +		ret = register_parent_dev_notifier(&p_data);
>  		break;
> +
>  	case DEVFREQ_GOV_STOP:
> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER));
> +		ret = unregister_parent_dev_notifier(&p_data);
>  		break;
>  	default:
>  		break;
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index 26ea0850be9b..e0093b7c805c 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
>  
>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>  /**
> + * struct devfreq_cpu_state - holds the per-cpu state
> + * @freq:	the current frequency of the cpu.
> + * @min_freq:	the min frequency of the cpu.
> + * @max_freq:	the max frequency of the cpu.
> + * @first_cpu:	the cpumask of the first cpu of a policy.
> + * @dev:	reference to cpu device.
> + * @opp_table:	reference to cpu opp table.
> + *
> + * This structure stores the required cpu_state of a cpu.
> + * This is auto-populated by the governor.
> + */
> +struct devfreq_cpu_state;
> +
> +enum devfreq_parent_dev_type {
> +	DEVFREQ_PARENT_DEV,
> +	CPUFREQ_PARENT_DEV,
> +};
> +
> +/**
>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>   *	and devfreq_add_device
>   * @parent:	the devfreq instance of parent device.
> @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
>   *			using governors except for passive governor.
>   *			If the devfreq device has the specific method to decide
>   *			the next frequency, should use this callback.
> + * @parent_type:	parent type of the device
>   * @this:	the devfreq instance of own device.
>   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>   *
>   * The devfreq_passive_data have to set the devfreq instance of parent
>   * device with governors except for the passive governor. But, don't need to
> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> - * them.
> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> + * will handle them.
>   */
>  struct devfreq_passive_data {
>  	/* Should set the devfreq instance of parent device */
> @@ -305,9 +326,13 @@ struct devfreq_passive_data {
>  	/* Optional callback to decide the next frequency of passvice device */
>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>  
> +	/* Should set the type of parent device */
> +	enum devfreq_parent_dev_type parent_type;
> +
>  	/* For passive governor's internal use. Don't need to set them */
>  	struct devfreq *this;
>  	struct notifier_block nb;
> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>  };
>  #endif
>  
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH V8 4/8] devfreq: add mediatek cci devfreq
  2021-03-23 11:33 ` [PATCH V8 4/8] devfreq: add mediatek " Andrew-sh.Cheng
@ 2021-03-25  8:04   ` Chanwoo Choi
  2021-03-31  6:21     ` andrew-sh.cheng
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-03-25  8:04 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi,

On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>
> 
> This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
> of the Mediatek MT8183.
> 
> On the MT8183 the CCI is supplied by the same regulator as the LITTLE
> cores. The driver is notified when the regulator voltage changes
> (driven by cpufreq) and adjusts the CCI frequency to the maximum
> possible value.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig              |  10 ++
>  drivers/devfreq/Makefile             |   1 +
>  drivers/devfreq/mt8183-cci-devfreq.c | 198 +++++++++++++++++++++++++++++++++++
>  3 files changed, 209 insertions(+)
>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index f56132b0ae64..2538255ac2c1 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -111,6 +111,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
>  	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
>  	  adjusting DRAM frequency.
>  
> +config ARM_MT8183_CCI_DEVFREQ
> +	tristate "MT8183 CCI DEVFREQ Driver"
> +	depends on ARM_MEDIATEK_CPUFREQ
> +	help
> +		This adds a devfreq driver for Cache Coherent Interconnect
> +		of Mediatek MT8183, which is shared the same regulator
> +		with cpu cluster.
> +		It can track buck voltage and update a proper CCI frequency.
> +		Use notification to get regulator status.
> +
>  config ARM_TEGRA_DEVFREQ
>  	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
>  	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> index a16333ea7034..991ef7740759 100644
> --- a/drivers/devfreq/Makefile
> +++ b/drivers/devfreq/Makefile
> @@ -11,6 +11,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
>  obj-$(CONFIG_ARM_IMX_BUS_DEVFREQ)	+= imx-bus.o
>  obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
> +obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
>  
> diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
> new file mode 100644
> index 000000000000..018543db7bae
> --- /dev/null
> +++ b/drivers/devfreq/mt8183-cci-devfreq.c
> @@ -0,0 +1,198 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> +
> + * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/devfreq.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/regulator/consumer.h>
> +#include <linux/time.h>
> +
> +#define MAX_VOLT_LIMIT		(1150000)
> +
> +struct cci_devfreq {
> +	struct devfreq *devfreq;
> +	struct regulator *cpu_reg;
> +	struct clk *cci_clk;
> +	int old_vproc;

Nitpick: how about using 'old_voltage'?
'vproc' is not easy to understand.

> +	unsigned long old_freq;
> +};
> +
> +static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)

nitpick: how about changing 'vproc -> voltage'?

> +{
> +	int ret;
> +
> +	ret = regulator_set_voltage(cci_df->cpu_reg, vproc,
> +				    MAX_VOLT_LIMIT);
> +	if (!ret)
> +		cci_df->old_vproc = vproc;
> +	return ret;
> +}
> +
> +static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
> +				  u32 flags)
> +{
> +	int ret;
> +	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
> +	struct dev_pm_opp *opp;
> +	unsigned long opp_rate, opp_voltage, old_voltage;
> +
> +	if (!cci_df)
> +		return -EINVAL;
> +
> +	if (cci_df->old_freq == *freq)
> +		return 0;
> +
> +	opp_rate = *freq;
> +	opp = devfreq_recommended_opp(dev, &opp_rate, 1);
> +	opp_voltage = dev_pm_opp_get_voltage(opp);
> +	dev_pm_opp_put(opp);
> +
> +	old_voltage = cci_df->old_vproc;
> +	if (old_voltage == 0)
> +		old_voltage = regulator_get_voltage(cci_df->cpu_reg);
> +
> +	// scale up: set voltage first then freq
> +	if (opp_voltage > old_voltage) {
> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> +		if (ret) {
> +			pr_err("cci: failed to scale up voltage\n");
> +			return ret;
> +		}
> +	}
> +
> +	ret = clk_set_rate(cci_df->cci_clk, *freq);
> +	if (ret) {
> +		pr_err("%s: failed cci to set rate: %d\n", __func__,
> +		       ret);
> +		mtk_cci_set_voltage(cci_df, old_voltage);
> +		return ret;
> +	}
> +
> +	// scale down: set freq first then voltage
> +	if (opp_voltage < old_voltage) {
> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> +		if (ret) {
> +			pr_err("cci: failed to scale down voltage\n");
> +			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
> +			return ret;
> +		}
> +	}
> +
> +	cci_df->old_freq = *freq;
> +
> +	return 0;
> +}
> +
> +static struct devfreq_dev_profile cci_devfreq_profile = {
> +	.target = mtk_cci_devfreq_target,
> +};
> +
> +static int mtk_cci_devfreq_probe(struct platform_device *pdev)
> +{
> +	struct device *cci_dev = &pdev->dev;
> +	struct cci_devfreq *cci_df;
> +	struct devfreq_passive_data *passive_data;
> +	int ret;
> +
> +	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
> +	if (!cci_df)
> +		return -ENOMEM;
> +
> +	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
> +	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
> +				ret);

Use dev_err_probe() to handle the EPROBE_DEFER case. It makes the code simpler.

> +		return ret;
> +	}
> +	cci_df->cpu_reg = devm_regulator_get_optional(cci_dev, "proc");
> +	ret = PTR_ERR_OR_ZERO(cci_df->cpu_reg);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> +				ret);

ditto. Use dev_err_probe()

> +		return ret;
> +	}
> +	ret = regulator_enable(cci_df->cpu_reg);
> +	if (ret) {
> +		dev_err(cci_dev, "enable buck for cci fail\n");
> +		return ret;
> +	}
> +
> +	ret = dev_pm_opp_of_add_table(cci_dev);
> +	if (ret) {
> +		dev_err(cci_dev, "Fail to get OPP table for CCI: %d\n", ret);
> +		return ret;
> +	}
> +
> +	platform_set_drvdata(pdev, cci_df);
> +
> +	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
> +	if (!passive_data) {
> +		ret = -ENOMEM;
> +		goto err_opp;
> +	}
> +
> +	passive_data->parent_type = CPUFREQ_PARENT_DEV;
> +
> +	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
> +						  &cci_devfreq_profile,
> +						  DEVFREQ_GOV_PASSIVE,
> +						  passive_data);
> +	if (IS_ERR(cci_df->devfreq)) {
> +		ret = PTR_ERR(cci_df->devfreq);
> +		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
> +		goto err_opp;
> +	}
> +
> +	return 0;
> +
> +err_opp:
> +	dev_pm_opp_of_remove_table(cci_dev);
> +	return ret;
> +}
> +
> +static int mtk_cci_devfreq_remove(struct platform_device *pdev)
> +{
> +	struct device *cci_dev = &pdev->dev;
> +	struct cci_devfreq *cci_df;
> +	struct notifier_block *opp_nb;
> +
> +	cci_df = platform_get_drvdata(pdev);
> +	opp_nb = &cci_df->opp_nb;
> +
> +	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);

Why do you call this function without registration?
If you want to catch the OPP changes of devfreq,
you can use devfreq_register_opp_notifier/devfreq_unregister_opp_notifier
functions.

> +	dev_pm_opp_of_remove_table(cci_dev);
> +	regulator_disable(cci_df->cpu_reg);
> +
> +	return 0;
> +}
> +
> +static const __maybe_unused struct of_device_id
> +	mediatek_cci_of_match[] = {

This needs to be on a single line, as follows:
static const __maybe_unused struct of_device_id mediatek_cci_of_match[] = {


> +	{ .compatible = "mediatek,mt8183-cci" },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(of, mediatek_cci_of_match);
> +
> +static struct platform_driver cci_devfreq_driver = {
> +	.probe	= mtk_cci_devfreq_probe,
> +	.remove	= mtk_cci_devfreq_remove,
> +	.driver = {
> +		.name = "mediatek-cci-devfreq",
> +		.of_match_table = of_match_ptr(mediatek_cci_of_match),
> +	},
> +};
> +
> +module_platform_driver(cci_devfreq_driver);
> +
> +MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
> +MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
> +MODULE_LICENSE("GPL v2");
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


* Re: [PATCH V8 7/8] devfreq: mediatek: cci devfreq register opp notification for SVS support
  2021-03-23 11:34 ` [PATCH V8 7/8] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
@ 2021-03-25  8:11   ` Chanwoo Choi
  2021-03-31  7:53     ` andrew-sh.cheng
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-03-25  8:11 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

Hi,

I think that you can squash this patch into patch 4.

On 3/23/21 8:34 PM, Andrew-sh.Cheng wrote:
> From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>
> 
> SVS will change the voltage of opp item.

What is the full name of SVS?

> CCI devfreq need to react to change frequency.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/mt8183-cci-devfreq.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
> index 018543db7bae..6942a48f3f4f 100644
> --- a/drivers/devfreq/mt8183-cci-devfreq.c
> +++ b/drivers/devfreq/mt8183-cci-devfreq.c
> @@ -21,6 +21,7 @@ struct cci_devfreq {
>  	struct clk *cci_clk;
>  	int old_vproc;
>  	unsigned long old_freq;
> +	struct notifier_block opp_nb;
>  };
>  
>  static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
> @@ -89,6 +90,26 @@ static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
>  	return 0;
>  }
>  
> +static int ccidevfreq_opp_notifier(struct notifier_block *nb,

I think that you had better change the function name as follows:
ccidevfreq_opp_notifier -> mtk_cci_devfreq_opp_notifier

> +				   unsigned long event, void *data)
> +{
> +	struct dev_pm_opp *opp = data;
> +	struct cci_devfreq *cci_df = container_of(nb, struct cci_devfreq,
> +						  opp_nb);
> +	unsigned long	freq, volt;
> +
> +	if (event == OPP_EVENT_ADJUST_VOLTAGE) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		/* current opp item is changed */
> +		if (freq == cci_df->old_freq) {
> +			volt = dev_pm_opp_get_voltage(opp);
> +			mtk_cci_set_voltage(cci_df, volt);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  static struct devfreq_dev_profile cci_devfreq_profile = {
>  	.target = mtk_cci_devfreq_target,
>  };
> @@ -98,12 +119,15 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
>  	struct device *cci_dev = &pdev->dev;
>  	struct cci_devfreq *cci_df;
>  	struct devfreq_passive_data *passive_data;
> +	struct notifier_block *opp_nb;
>  	int ret;
>  
>  	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
>  	if (!cci_df)
>  		return -ENOMEM;
>  
> +	opp_nb = &cci_df->opp_nb;

Just move this code next to the 'opp_nb->notifier_call' initialization code.

> +
>  	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
>  	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
>  	if (ret) {
> @@ -152,6 +176,9 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
>  		goto err_opp;
>  	}
>  
> +	opp_nb->notifier_call = ccidevfreq_opp_notifier;
> +	dev_pm_opp_register_notifier(cci_dev, opp_nb);

The return value needs to be checked here.

> +
>  	return 0;
>  
>  err_opp:
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-23 11:33 ` [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
  2021-03-25  7:42   ` Chanwoo Choi
@ 2021-03-25  8:14   ` Chanwoo Choi
  2021-03-31  8:03     ` andrew-sh.cheng
  2021-03-31 10:46     ` Hsin-Yi Wang
  1 sibling, 2 replies; 31+ messages in thread
From: Chanwoo Choi @ 2021-03-25  8:14 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

Hi,

You did not add these patches to the linux-pm mailing list.
Please send them to the linux-pm ML.

Also, before receiving this series, I tried to clean up these patches
on my testing branch[1], so my comments below are based on that clean-up.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov

And 'Saravana Kannan <skannan@codeaurora.org>' is a wrong email address.
Please update the address or drop it.


On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <skannan@codeaurora.org>
>
> Many CPU architectures have caches that can scale independent of the
> CPUs. Frequency scaling of the caches is necessary to make sure that the
> cache is not a performance bottleneck that leads to poor performance and
> power. The same idea applies for RAM/DDR.
>
> To achieve this, this patch adds support for cpu based scaling to the
> passive governor. This is accomplished by taking the current frequency
> of each CPU frequency domain and then adjust the frequency of the cache
> (or any devfreq device) based on the frequency of the CPUs. It listens
> to CPU frequency transition notifiers to keep itself up to date on the
> current CPU frequency.
>
> To decide the frequency of the device, the governor does one of the
> following:
> * Derives the optimal devfreq device opp from required-opps property of
>   the parent cpu opp_table.
>
> * Scales the device frequency in proportion to the CPU frequency. So, if
>   the CPUs are running at their max frequency, the device runs at its
>   max frequency. If the CPUs are running at their min frequency, the
>   device runs at its min frequency. It is interpolated for frequencies
>   in between.
>
> Andrew-sh.Cheng change
> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> after kernel-5.7
> Don't return -EINVAL in devfreq_passive_event_handler()
> since it doesn't handle DEVFREQ_GOV_SUSPEND DEVFREQ_GOV_RESUME cases.
>
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig            |   2 +
>  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
>  include/linux/devfreq.h            |  29 +++-
>  3 files changed, 342 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 00704efe6398..f56132b0ae64 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>  	  device. This governor does not change the frequency by itself
>  	  through sysfs entries. The passive governor recommends that
>  	  devfreq device uses the OPP table to get the frequency/voltage.
> +	  Alternatively the governor can also be chosen to scale based on
> +	  the online CPUs current frequency.
>  
>  comment "DEVFREQ Drivers"
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index b094132bd20b..9cc57b083839 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -8,11 +8,103 @@
>   */
>  
>  #include <linux/module.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
>  #include <linux/device.h>
>  #include <linux/devfreq.h>
> +#include <linux/slab.h>
>  #include "governor.h"
>  
> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +struct devfreq_cpu_state {
> +	unsigned int curr_freq;
> +	unsigned int min_freq;
> +	unsigned int max_freq;
> +	unsigned int first_cpu;
> +	struct device *cpu_dev;
> +	struct opp_table *opp_table;
> +};

As far as I know, the previous version had a description of this structure.
I want to add the description as shown below.

If you have no objection, I'd like you to order the variables
as follows and use 'dev' instead of 'cpu_dev', because this patch
uses 'cpu_state->cpu_dev' at multiple points.
I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
Also, I prefer 'cur_freq' over 'curr_freq' because the devfreq
subsystem uses 'cur_freq' to express the current frequency.

/**                                                                             
 * struct devfreq_cpu_state - Hold the per-cpu data                              
 * @dev:        reference to cpu device.                                        
 * @first_cpu:  the cpumask of the first cpu of a policy.                       
 * @opp_table:  reference to cpu opp table.                                     
 * @cur_freq:   the current frequency of the cpu.                               
 * @min_freq:   the min frequency of the cpu.                                   
 * @max_freq:   the max frequency of the cpu.                                   
 *                                                                              
 * This structure stores the required cpu_data of a cpu.                        
 * This is auto-populated by the governor.                                      
 */                                                                             
struct devfreq_cpu_state {                                                       
         struct device *dev;                                                     
         unsigned int first_cpu;                                                 

         struct opp_table *opp_table;                                            
         unsigned int cur_freq;                                                  
         unsigned int min_freq;                                                  
         unsigned int max_freq;                                                  
};               


> +
> +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> +					      unsigned int cpu)
> +{
> +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
> +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
> +
> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
> +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> +	unsigned long cpu_curr_freq, freq;
> +
> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> +	    !cpu_state->opp_table || !devfreq->opp_table)
> +		return 0;
> +
> +	cpu_curr_freq = cpu_state->curr_freq * 1000;
> +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
> +	if (IS_ERR(p_opp))
> +		return 0;
> +
> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> +					    devfreq->opp_table, p_opp);
> +	dev_pm_opp_put(p_opp);
> +
> +	if (!IS_ERR(opp)) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		dev_pm_opp_put(opp);
> +		goto out;
> +	}
> +
> +	/* Use Interpolation if required opps is not available */
> +	cpu_min_freq = cpu_state->min_freq;
> +	cpu_max_freq = cpu_state->max_freq;
> +	cpu_curr_freq_khz = cpu_state->curr_freq;
> +
> +	if (dev_freq_table) {
> +		/* Get minimum frequency according to sorting order */
> +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
> +		if (dev_freq_table[0] < dev_max_state) {
> +			dev_min_freq = dev_freq_table[0];
> +			dev_max_freq = dev_max_state;
> +		} else {
> +			dev_min_freq = dev_max_state;
> +			dev_max_freq = dev_freq_table[0];
> +		}
> +	} else {
> +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> +						     DEV_PM_QOS_MIN_FREQUENCY);
> +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> +						     DEV_PM_QOS_MAX_FREQUENCY);
> +
> +		if (dev_max_freq <= dev_min_freq)
> +			return 0;
> +	}
> +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / cpu_max_freq - cpu_min_freq;
> +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
> +
> +out:
> +	return freq;
> +}
> +
> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> +					unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	unsigned int cpu;
> +	unsigned long target_freq = 0;
> +
> +	for_each_online_cpu(cpu)
> +		target_freq = max(target_freq,
> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> +
> +	*freq = target_freq;
> +
> +	return 0;
> +}

As you know, governor_passive.c already uses
both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
to get the target from the OPP table. So, I want to make a common function
like 'get_taget_freq_by_required_opp', as shown below.
If 'get_taget_freq_by_required_opp' is defined this way,
it can also be used for get_target_freq_with_devfreq().
After the review of this patch is finished, I'll send the patch[2].
[2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf

For example (this code is not tested):
static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
						struct opp_table *p_opp_table,
						struct opp_table *opp_table,
						unsigned long freq)
{
	struct dev_pm_opp *opp = NULL, *p_opp = NULL;

	if (!p_dev || !p_opp_table || !opp_table || !freq)
		return 0;

	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
	if (IS_ERR(p_opp))
		return 0;

	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
	dev_pm_opp_put(p_opp);

	if (IS_ERR(opp))
		return 0;

	freq = dev_pm_opp_get_freq(opp);
	dev_pm_opp_put(opp);

	return freq;
}

static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
					unsigned long *target_freq)
{
	struct devfreq_passive_data *p_data =
				(struct devfreq_passive_data *)devfreq->data;
	struct devfreq_cpu_data *cpu_data;
	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
	unsigned long dev_min, dev_max;
	unsigned long freq = 0;

	for_each_online_cpu(cpu) {
		cpu_data = p_data->cpu_data[cpu];
		if (!cpu_data || cpu_data->first_cpu != cpu)
			continue;

		/* Get target freq via required opps */
		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
		freq = get_taget_freq_by_required_opp(cpu_data->dev,
					cpu_data->opp_table,
					devfreq->opp_table, cpu_cur);
		if (freq) {
			*target_freq = max(freq, *target_freq);
			continue;
		}

		/* Use Interpolation if required opps is not available */
		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);

		cpu_min = cpu_data->min_freq;
		cpu_max = cpu_data->max_freq;
		cpu_cur = cpu_data->cur_freq;

		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);

		*target_freq = max(freq, *target_freq);
	}

	return 0;
}
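
To make the interpolation fallback concrete, here is a minimal userspace
sketch. It is not kernel code and not part of the patch. The names
'policy_state', 'interpolate_dev_freq' and 'pick_target_freq' are
hypothetical, and the sketch assumes that the intent is to divide by the
whole CPU frequency range (cpu_max - cpu_min) and then to take the highest
result across policies, as the loop above does.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-policy CPU state; frequencies in kHz. */
struct policy_state {
	unsigned long cur_freq;
	unsigned long min_freq;
	unsigned long max_freq;
};

/*
 * Map a CPU frequency onto [dev_min, dev_max] by linear interpolation.
 * Returns 0 when either range is empty, so callers can skip the policy.
 */
static unsigned long interpolate_dev_freq(const struct policy_state *p,
					  unsigned long dev_min,
					  unsigned long dev_max)
{
	unsigned long cpu_cur = p->cur_freq;
	unsigned long cpu_percent, range;

	if (p->max_freq <= p->min_freq || dev_max <= dev_min)
		return 0;

	/* Clamp so the unsigned subtraction below cannot wrap. */
	if (cpu_cur < p->min_freq)
		cpu_cur = p->min_freq;
	if (cpu_cur > p->max_freq)
		cpu_cur = p->max_freq;

	/* The divisor is the whole CPU range, hence the parentheses. */
	cpu_percent = ((cpu_cur - p->min_freq) * 100) /
		      (p->max_freq - p->min_freq);

	/* Same arithmetic as the kernel's mult_frac(range, cpu_percent, 100). */
	range = dev_max - dev_min;
	return dev_min + (range / 100) * cpu_percent +
	       ((range % 100) * cpu_percent) / 100;
}

/* Pick the highest device frequency requested by any policy. */
static unsigned long pick_target_freq(const struct policy_state *policies,
				      size_t count,
				      unsigned long dev_min,
				      unsigned long dev_max)
{
	unsigned long freq, target = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		freq = interpolate_dev_freq(&policies[i], dev_min, dev_max);
		if (freq > target)
			target = freq;
	}

	return target;
}
```

With made-up numbers, a CPU range of 500000..2000000 kHz and a device range
of 273000000..640000000 Hz, a CPU running at 1000000 kHz maps to 33 percent,
which gives 394110000 Hz. Without the parentheses around the divisor, the
same inputs would evaluate 25 - 500000, which wraps around for unsigned types.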

> +
> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>  					unsigned long *freq)
>  {
>  	struct devfreq_passive_data *p_data
> @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	int i, count;
>  
>  	/*
> -	 * If the devfreq device with passive governor has the specific method
> -	 * to determine the next frequency, should use the get_target_freq()
> -	 * of struct devfreq_passive_data.
> -	 */
> -	if (p_data->get_target_freq)
> -		return p_data->get_target_freq(devfreq, freq);
> -
> -	/*
>  	 * If the parent and passive devfreq device uses the OPP table,
>  	 * get the next frequency by using the OPP table.
>  	 */
> @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	return 0;
>  }
>  
> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +					   unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +		(struct devfreq_passive_data *)devfreq->data;
> +	int ret;
> +
> +	/*
> +	 * If the devfreq device with passive governor has the specific method
> +	 * to determine the next frequency, should use the get_target_freq()
> +	 * of struct devfreq_passive_data.
> +	 */
> +	if (p_data->get_target_freq)
> +		return p_data->get_target_freq(devfreq, freq);
> +
> +	switch (p_data->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  				unsigned long event, void *ptr)
>  {
> @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> +					 unsigned long event, void *ptr)
> +{
> +	struct devfreq_passive_data *data =
> +			container_of(nb, struct devfreq_passive_data, nb);
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_freqs *cpu_freq = ptr;

Please use 'freqs' as the variable name. I prefer to use the same variable
name for both devfreq_freqs and cpufreq_freqs instances.

> +	unsigned int curr_freq;

As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
unless there is a special reason.

> +	int ret;
> +
> +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
> +	    !data->cpu_state[cpu_freq->policy->cpu])
> +		return 0;
> +
> +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
> +	if (cpu_state->curr_freq == cpu_freq->new)
> +		return 0;
> +
> +	/* Backup current freq and pre-update cpu state freq*/

I think that this comment is not critical, so please drop it.

> +	curr_freq = cpu_state->curr_freq;
> +	cpu_state->curr_freq = cpu_freq->new;
> +
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);

I recommend using 'devfreq_update_target' instead of 'update_devfreq',
as follows:
	devfreq_update_target(devfreq, freqs->new);

> +	mutex_unlock(&devfreq->lock);
> +	if (ret) {
> +		cpu_state->curr_freq = curr_freq;
> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)

In order to keep the function naming style consistent,
please change the name as follows, because devfreq names
its function 'devfreq_register_notifier':
- cpufreq_passive_register -> cpufreq_passive_register_notifier

> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct device *dev = devfreq->dev.parent;
> +	struct opp_table *opp_table = NULL;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_policy *policy;
> +	struct device *cpu_dev;
> +	unsigned int cpu;
> +	int ret;
> +
> +	get_online_cpus();
> +
> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> +	ret = cpufreq_register_notifier(&data->nb,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret) {
> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> +		data->nb.notifier_call = NULL;
> +		goto out;
> +	}
> +
> +	/* Populate devfreq_cpu_state */

Don't need this comment. Please drop it.

> +	for_each_online_cpu(cpu) {
> +		if (data->cpu_state[cpu])
> +			continue;
> +
> +		policy = cpufreq_cpu_get(cpu);
> +		if (!policy) {
> +			ret = -EINVAL;
> +			goto out;
> +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> +			ret = -EPROBE_DEFER;
> +			goto out;
> +		} else if (IS_ERR(policy)) {
> +			ret = PTR_ERR(policy);
> +			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> +			goto out;
> +		}

Use the dev_err_probe() function to handle EPROBE_DEFER.
It makes the code simpler.

> +
> +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
> +		if (!cpu_state) {
> +			ret = -ENOMEM;
> +			goto out;
> +		}
> +
> +		cpu_dev = get_cpu_device(cpu);
> +		if (!cpu_dev) {
> +			dev_err(dev, "Couldn't get cpu device.\n");
> +			ret = -ENODEV;
> +			goto out;
> +		}
> +
> +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> +		if (IS_ERR(devfreq->opp_table)) {
> +			ret = PTR_ERR(opp_table);
> +			goto out;
> +		}
> +
> +		cpu_state->cpu_dev = cpu_dev;
> +		cpu_state->opp_table = opp_table;
> +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
> +		cpu_state->curr_freq = policy->cur;
> +		cpu_state->min_freq = policy->cpuinfo.min_freq;
> +		cpu_state->max_freq = policy->cpuinfo.max_freq;
> +		data->cpu_state[cpu] = cpu_state;
> +
> +		cpufreq_cpu_put(policy);
> +	}
> +
> +out:
> +	put_online_cpus();
> +	if (ret)
> +		return ret;
> +
> +	/* Update devfreq */
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);

> +	mutex_unlock(&devfreq->lock);
> +	if (ret)
> +		dev_err(dev, "Couldn't update the frequency.\n");
> +
> +	return ret;
> +}
> +
> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)

As I commented above, please change the name as following:
- cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier

> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq_cpu_state *cpu_state;
> +	int cpu;
> +
> +	if (data->nb.notifier_call)
> +		cpufreq_unregister_notifier(&data->nb,
> +					    CPUFREQ_TRANSITION_NOTIFIER);
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_state = data->cpu_state[cpu];
> +		if (cpu_state) {
> +			if (cpu_state->opp_table)
> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> +			kfree(cpu_state);
> +			cpu_state = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> +{
> +	struct notifier_block *nb = &(*p_data)->nb;
> +	int ret = 0;
> +
> +	switch ((*p_data)->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		nb->notifier_call = devfreq_passive_notifier_call;
> +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = cpufreq_passive_register(p_data);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +	return ret;
> +}
> +
> +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> +{
> +	int ret = 0;
> +
> +	switch ((*p_data)->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
> +						    &(*p_data)->nb,
> +						    DEVFREQ_TRANSITION_NOTIFIER));
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		cpufreq_passive_unregister(p_data);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +	return ret;
> +}

I think that you don't need to define register_parent_dev_notifier
and unregister_parent_dev_notifier as the separate functions.

Instead of the separate functions, just add the code
into devfreq_passive_event_handler.


> +
>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  				unsigned int event, void *data)
>  {
>  	struct devfreq_passive_data *p_data
>  			= (struct devfreq_passive_data *)devfreq->data;
>  	struct devfreq *parent = (struct devfreq *)p_data->parent;
> -	struct notifier_block *nb = &p_data->nb;
>  	int ret = 0;
>  
> -	if (!parent)
> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>  		return -EPROBE_DEFER;
>  
>  	switch (event) {
> @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  		if (!p_data->this)
>  			p_data->this = devfreq;
>  
> -		nb->notifier_call = devfreq_passive_notifier_call;
> -		ret = devfreq_register_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER);
> +		ret = register_parent_dev_notifier(&p_data);
>  		break;
> +
>  	case DEVFREQ_GOV_STOP:
> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER));
> +		ret = unregister_parent_dev_notifier(&p_data);
>  		break;
>  	default:
>  		break;
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index 26ea0850be9b..e0093b7c805c 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
>  
>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>  /**
> + * struct devfreq_cpu_state - holds the per-cpu state
> + * @freq:	the current frequency of the cpu.
> + * @min_freq:	the min frequency of the cpu.
> + * @max_freq:	the max frequency of the cpu.
> + * @first_cpu:	the cpumask of the first cpu of a policy.
> + * @dev:	reference to cpu device.
> + * @opp_table:	reference to cpu opp table.
> + *
> + * This structure stores the required cpu_state of a cpu.
> + * This is auto-populated by the governor.
> + */
> +struct devfreq_cpu_state;
> +
> +enum devfreq_parent_dev_type {
> +	DEVFREQ_PARENT_DEV,
> +	CPUFREQ_PARENT_DEV,
> +};
> +
> +/**
>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>   *	and devfreq_add_device
>   * @parent:	the devfreq instance of parent device.
> @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
>   *			using governors except for passive governor.
>   *			If the devfreq device has the specific method to decide
>   *			the next frequency, should use this callback.
> + * @parent_type:	parent type of the device
>   * @this:	the devfreq instance of own device.
>   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>   *
>   * The devfreq_passive_data have to set the devfreq instance of parent
>   * device with governors except for the passive governor. But, don't need to
> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> - * them.
> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> + * will handle them.
>   */
>  struct devfreq_passive_data {
>  	/* Should set the devfreq instance of parent device */
> @@ -305,9 +326,13 @@ struct devfreq_passive_data {
>  	/* Optional callback to decide the next frequency of passvice device */
>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>  
> +	/* Should set the type of parent device */
> +	enum devfreq_parent_dev_type parent_type;
> +
>  	/* For passive governor's internal use. Don't need to set them */
>  	struct devfreq *this;
>  	struct notifier_block nb;
> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>  };
>  #endif
>  
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator
  2021-03-23 11:33 ` [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
@ 2021-03-30  4:36   ` Viresh Kumar
  2021-03-31  5:21     ` andrew-sh.cheng
  0 siblings, 1 reply; 31+ messages in thread
From: Viresh Kumar @ 2021-03-30  4:36 UTC (permalink / raw)
  To: Andrew-sh.Cheng
  Cc: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

On 23-03-21, 19:33, Andrew-sh.Cheng wrote:
> From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>
> 
> Need to enable regulator,
> so that the max/min requested value will be recorded
> even it is not applied right away.
> 
> Intermediate clock is not always enabled by ccf in different projects,
> so cpufreq should enable it by itself.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/cpufreq/mediatek-cpufreq.c | 33 +++++++++++++++++++++++++++++----
>  1 file changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
> index f2e491b25b07..432368707ea6 100644
> --- a/drivers/cpufreq/mediatek-cpufreq.c
> +++ b/drivers/cpufreq/mediatek-cpufreq.c
> @@ -350,6 +350,11 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
>  		ret = PTR_ERR(proc_reg);
>  		goto out_free_resources;
>  	}
> +	ret = regulator_enable(proc_reg);
> +	if (ret) {
> +		pr_warn("enable vproc for cpu%d fail\n", cpu);
> +		goto out_free_resources;
> +	}

Regulators are enabled by OPP core as well now, you sure this is
required ?

-- 
viresh

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator
  2021-03-30  4:36   ` Viresh Kumar
@ 2021-03-31  5:21     ` andrew-sh.cheng
  2021-03-31  6:17       ` Viresh Kumar
  0 siblings, 1 reply; 31+ messages in thread
From: andrew-sh.cheng @ 2021-03-31  5:21 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

On Tue, 2021-03-30 at 10:06 +0530, Viresh Kumar wrote:
> On 23-03-21, 19:33, Andrew-sh.Cheng wrote:
> > From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>
> > 
> > Need to enable regulator,
> > so that the max/min requested value will be recorded
> > even it is not applied right away.
> > 
> > Intermediate clock is not always enabled by ccf in different projects,
> > so cpufreq should enable it by itself.
> > 
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  drivers/cpufreq/mediatek-cpufreq.c | 33 +++++++++++++++++++++++++++++----
> >  1 file changed, 29 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
> > index f2e491b25b07..432368707ea6 100644
> > --- a/drivers/cpufreq/mediatek-cpufreq.c
> > +++ b/drivers/cpufreq/mediatek-cpufreq.c
> > @@ -350,6 +350,11 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
> >  		ret = PTR_ERR(proc_reg);
> >  		goto out_free_resources;
> >  	}
> > +	ret = regulator_enable(proc_reg);
> > +	if (ret) {
> > +		pr_warn("enable vproc for cpu%d fail\n", cpu);
> > +		goto out_free_resources;
> > +	}
> 
> Regulators are enabled by OPP core as well now, you sure this is
> required ?
> 
Hi Viresh,
Yes.
As you mentioned, it will be enabled by the OPP core.

After discussing with the hotplug owner and the regulator owner,
they suggested that "users should not assume that another module will
enable regulators for them".
They suggested adding a regulator_enable() call here.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator
  2021-03-31  5:21     ` andrew-sh.cheng
@ 2021-03-31  6:17       ` Viresh Kumar
  0 siblings, 0 replies; 31+ messages in thread
From: Viresh Kumar @ 2021-03-31  6:17 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

On 31-03-21, 13:21, andrew-sh.cheng wrote:
> Hi Viresh,
> Yes.
> As you mentioned, it will be enabled by the OPP core.
> 
> After discussing with the hotplug owner and the regulator owner,
> they suggested that "users should not assume that another module will
> enable regulators for them".
> They suggested adding a regulator_enable() call here.

Which is fine if the modules in question aren't closely related to each other,
but OPP core and cpufreq are too closely bound to each other. So much that the
cpufreq driver can depend on the OPP core for doing it.

I won't NACK a patch just for that, though; it was just a suggestion.

-- 
viresh

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 4/8] devfreq: add mediatek cci devfreq
  2021-03-25  8:04   ` Chanwoo Choi
@ 2021-03-31  6:21     ` andrew-sh.cheng
  0 siblings, 0 replies; 31+ messages in thread
From: andrew-sh.cheng @ 2021-03-31  6:21 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

On Thu, 2021-03-25 at 16:04 +0800, Chanwoo Choi wrote:
> Hi,
> 
> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> > From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>
> > 
> > This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
> > of the Mediatek MT8183.
> > 
> > On the MT8183 the CCI is supplied by the same regulator as the LITTLE
> > cores. The driver is notified when the regulator voltage changes
> > (driven by cpufreq) and adjusts the CCI frequency to the maximum
> > possible value.
> > 
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  drivers/devfreq/Kconfig              |  10 ++
> >  drivers/devfreq/Makefile             |   1 +
> >  drivers/devfreq/mt8183-cci-devfreq.c | 198 +++++++++++++++++++++++++++++++++++
> >  3 files changed, 209 insertions(+)
> >  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> > 
> > diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> > index f56132b0ae64..2538255ac2c1 100644
> > --- a/drivers/devfreq/Kconfig
> > +++ b/drivers/devfreq/Kconfig
> > @@ -111,6 +111,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
> >  	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
> >  	  adjusting DRAM frequency.
> >  
> > +config ARM_MT8183_CCI_DEVFREQ
> > +	tristate "MT8183 CCI DEVFREQ Driver"
> > +	depends on ARM_MEDIATEK_CPUFREQ
> > +	help
> > +		This adds a devfreq driver for Cache Coherent Interconnect
> > +		of Mediatek MT8183, which is shared the same regulator
> > +		with cpu cluster.
> > +		It can track buck voltage and update a proper CCI frequency.
> > +		Use notification to get regulator status.
> > +
> >  config ARM_TEGRA_DEVFREQ
> >  	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
> >  	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
> > diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> > index a16333ea7034..991ef7740759 100644
> > --- a/drivers/devfreq/Makefile
> > +++ b/drivers/devfreq/Makefile
> > @@ -11,6 +11,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
> >  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
> >  obj-$(CONFIG_ARM_IMX_BUS_DEVFREQ)	+= imx-bus.o
> >  obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
> > +obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
> >  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
> >  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
> >  
> > diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
> > new file mode 100644
> > index 000000000000..018543db7bae
> > --- /dev/null
> > +++ b/drivers/devfreq/mt8183-cci-devfreq.c
> > @@ -0,0 +1,198 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2021 MediaTek Inc.
> > +
> > + * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > + */
> > +
> > +#include <linux/clk.h>
> > +#include <linux/devfreq.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/regulator/consumer.h>
> > +#include <linux/time.h>
> > +
> > +#define MAX_VOLT_LIMIT		(1150000)
> > +
> > +struct cci_devfreq {
> > +	struct devfreq *devfreq;
> > +	struct regulator *cpu_reg;
> > +	struct clk *cci_clk;
> > +	int old_vproc;
> 
> nitpick. how about using 'old_voltage'?
> because 'vproc' is not easy for understanding.
I will change this in the next patch version.

> 
> > +	unsigned long old_freq;
> > +};
> > +
> > +static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
> 
> nitpick: how about changing 'vproc -> voltage'?
I will change this in the next patch version.

> 
> > +{
> > +	int ret;
> > +
> > +	ret = regulator_set_voltage(cci_df->cpu_reg, vproc,
> > +				    MAX_VOLT_LIMIT);
> > +	if (!ret)
> > +		cci_df->old_vproc = vproc;
> > +	return ret;
> > +}
> > +
> > +static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
> > +				  u32 flags)
> > +{
> > +	int ret;
> > +	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
> > +	struct dev_pm_opp *opp;
> > +	unsigned long opp_rate, opp_voltage, old_voltage;
> > +
> > +	if (!cci_df)
> > +		return -EINVAL;
> > +
> > +	if (cci_df->old_freq == *freq)
> > +		return 0;
> > +
> > +	opp_rate = *freq;
> > +	opp = devfreq_recommended_opp(dev, &opp_rate, 1);
> > +	opp_voltage = dev_pm_opp_get_voltage(opp);
> > +	dev_pm_opp_put(opp);
> > +
> > +	old_voltage = cci_df->old_vproc;
> > +	if (old_voltage == 0)
> > +		old_voltage = regulator_get_voltage(cci_df->cpu_reg);
> > +
> > +	// scale up: set voltage first then freq
> > +	if (opp_voltage > old_voltage) {
> > +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> > +		if (ret) {
> > +			pr_err("cci: failed to scale up voltage\n");
> > +			return ret;
> > +		}
> > +	}
> > +
> > +	ret = clk_set_rate(cci_df->cci_clk, *freq);
> > +	if (ret) {
> > +		pr_err("%s: failed cci to set rate: %d\n", __func__,
> > +		       ret);
> > +		mtk_cci_set_voltage(cci_df, old_voltage);
> > +		return ret;
> > +	}
> > +
> > +	// scale down: set freq first then voltage
> > +	if (opp_voltage < old_voltage) {
> > +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> > +		if (ret) {
> > +			pr_err("cci: failed to scale down voltage\n");
> > +			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
> > +			return ret;
> > +		}
> > +	}
> > +
> > +	cci_df->old_freq = *freq;
> > +
> > +	return 0;
> > +}
> > +
> > +static struct devfreq_dev_profile cci_devfreq_profile = {
> > +	.target = mtk_cci_devfreq_target,
> > +};
> > +
> > +static int mtk_cci_devfreq_probe(struct platform_device *pdev)
> > +{
> > +	struct device *cci_dev = &pdev->dev;
> > +	struct cci_devfreq *cci_df;
> > +	struct devfreq_passive_data *passive_data;
> > +	int ret;
> > +
> > +	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
> > +	if (!cci_df)
> > +		return -ENOMEM;
> > +
> > +	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
> > +	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
> > +	if (ret) {
> > +		if (ret != -EPROBE_DEFER)
> > +			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
> > +				ret);
> 
> Use dev_err_probe() to handle the EPROBE_DEFER case. It makes the code simpler.
I will change this in the next patch version.
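
For reference, a minimal userspace sketch of the behaviour that lets
dev_err_probe() replace the open-coded check: it hands back the error it was
given and keeps -EPROBE_DEFER out of the error log. The names
model_err_probe(), model_probe() and error_logs are hypothetical stand-ins
for this sketch and are not kernel API; EPROBE_DEFER is defined locally
because it is a kernel-internal errno value.

```c
#include <stdio.h>

/* Kernel-internal errno value; userspace <errno.h> does not provide it. */
#define EPROBE_DEFER 517

/* Counts messages emitted at error level, so the effect is observable. */
static int error_logs;

/*
 * Hypothetical model of dev_err_probe(): log at error level unless the
 * probe is only being deferred, and always return the error code so the
 * caller can write "return model_err_probe(ret, ...)".
 */
static int model_err_probe(int err, const char *msg)
{
	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "error %d: %s\n", err, msg);
		error_logs++;
	}
	return err;
}

/* Stand-in for the clock lookup in probe(); clk_err is the simulated result. */
static int model_probe(int clk_err)
{
	if (clk_err)
		return model_err_probe(clk_err, "failed to get clock for CCI");
	return 0;
}
```

With this shape, the "if (ret != -EPROBE_DEFER)" test disappears from the
caller and each error path becomes a single return statement.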

> 
> > +		return ret;
> > +	}
> > +	cci_df->cpu_reg = devm_regulator_get_optional(cci_dev, "proc");
> > +	ret = PTR_ERR_OR_ZERO(cci_df->cpu_reg);
> > +	if (ret) {
> > +		if (ret != -EPROBE_DEFER)
> > +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> > +				ret);
> 
> ditto. Use dev_err_probe()
I will change this in the next patch version.

> 
> > +		return ret;
> > +	}
> > +	ret = regulator_enable(cci_df->cpu_reg);
> > +	if (ret) {
> > +		dev_err(cci_dev, "enable buck for cci fail\n");
> > +		return ret;
> > +	}
> > +
> > +	ret = dev_pm_opp_of_add_table(cci_dev);
> > +	if (ret) {
> > +		dev_err(cci_dev, "Fail to get OPP table for CCI: %d\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	platform_set_drvdata(pdev, cci_df);
> > +
> > +	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
> > +	if (!passive_data) {
> > +		ret = -ENOMEM;
> > +		goto err_opp;
> > +	}
> > +
> > +	passive_data->parent_type = CPUFREQ_PARENT_DEV;
> > +
> > +	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
> > +						  &cci_devfreq_profile,
> > +						  DEVFREQ_GOV_PASSIVE,
> > +						  passive_data);
> > +	if (IS_ERR(cci_df->devfreq)) {
> > +		ret = PTR_ERR(cci_df->devfreq);
> > +		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
> > +		goto err_opp;
> > +	}
> > +
> > +	return 0;
> > +
> > +err_opp:
> > +	dev_pm_opp_of_remove_table(cci_dev);
> > +	return ret;
> > +}
> > +
> > +static int mtk_cci_devfreq_remove(struct platform_device *pdev)
> > +{
> > +	struct device *cci_dev = &pdev->dev;
> > +	struct cci_devfreq *cci_df;
> > +	struct notifier_block *opp_nb;
> > +
> > +	cci_df = platform_get_drvdata(pdev);
> > +	opp_nb = &cci_df->opp_nb;
> > +
> > +	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);
> 
> Why do you call this function without registration?
> If you want to catch the OPP changes of devfreq,
> you can use devfreq_register_opp_notifier/devfreq_unregister_opp_notifier
> functions.
Yes, I will move it to
"[V8,7/8] devfreq: mediatek: cci devfreq register opp notification for
SVS support".

> 
> > +	dev_pm_opp_of_remove_table(cci_dev);
> > +	regulator_disable(cci_df->cpu_reg);
> > +
> > +	return 0;
> > +}
> > +
> > +static const __maybe_unused struct of_device_id
> > +	mediatek_cci_of_match[] = {
> 
> Please put this on a single line, as follows:
> static const __maybe_unused struct of_device_id mediatek_cci_of_match[] = {

Hi Chanwoo,
I don't quite understand when to use __maybe_unused.
It was suggested in the review of patch version 2:
https://patchwork.kernel.org/patch/10876449/
Could you please give me some advice?
Thank you.
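
As background to the question, here is a small userspace model of how
of_match_ptr() and __maybe_unused interact. It is only a sketch:
MODEL_CONFIG_OF is a hypothetical switch standing in for CONFIG_OF,
model_match_table() is a hypothetical helper, and the macros are re-declared
locally so that the example compiles outside the kernel.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#define __maybe_unused __attribute__((unused))

/*
 * Model of of_match_ptr(): the table is passed through when device-tree
 * support is enabled and replaced by NULL otherwise. MODEL_CONFIG_OF is
 * an assumption of this sketch, standing in for CONFIG_OF.
 */
#ifdef MODEL_CONFIG_OF
#define of_match_ptr(ptr) (ptr)
#else
#define of_match_ptr(ptr) NULL
#endif

struct of_device_id {
	const char *compatible;
};

/*
 * When of_match_ptr() expands to NULL, nothing references this table any
 * more, and the compiler may warn about an unused variable. The
 * annotation tells it that this is expected.
 */
static const __maybe_unused struct of_device_id mediatek_cci_of_match[] = {
	{ .compatible = "mediatek,mt8183-cci" },
	{ NULL },
};

/* What the driver's .of_match_table would be initialised with. */
static const struct of_device_id *model_match_table(void)
{
	return of_match_ptr(mediatek_cci_of_match);
}
```

In this model the annotation matters only because the table is referenced
through of_match_ptr(); a table that is referenced unconditionally would not
need it.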


> 
> 
> > +	{ .compatible = "mediatek,mt8183-cci" },
> > +	{ },
> > +};
> > +MODULE_DEVICE_TABLE(of, mediatek_cci_of_match);
> > +
> > +static struct platform_driver cci_devfreq_driver = {
> > +	.probe	= mtk_cci_devfreq_probe,
> > +	.remove	= mtk_cci_devfreq_remove,
> > +	.driver = {
> > +		.name = "mediatek-cci-devfreq",
> > +		.of_match_table = of_match_ptr(mediatek_cci_of_match),
> > +	},
> > +};
> > +
> > +module_platform_driver(cci_devfreq_driver);
> > +
> > +MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
> > +MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
> > +MODULE_LICENSE("GPL v2");
> > 
> 
> 
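
The ordering used by mtk_cci_devfreq_target() in the patch quoted above
(raise the voltage before the frequency, lower it after the frequency) can
be modelled in plain userspace C. This is a sketch under stated assumptions
only: struct model_cci, model_target() and the step log are hypothetical and
replace the regulator and clock calls with simple assignments.

```c
/* The two kinds of hardware update, recorded in the order they happen. */
enum model_step { STEP_VOLT, STEP_FREQ };

struct model_cci {
	unsigned long volt_uv;		/* current supply voltage, microvolts */
	unsigned long freq_hz;		/* current clock rate, hertz */
	enum model_step steps[2];	/* at most one voltage and one clock step */
	int nr_steps;
};

static void model_set_volt(struct model_cci *c, unsigned long uv)
{
	c->volt_uv = uv;
	c->steps[c->nr_steps++] = STEP_VOLT;
}

static void model_set_freq(struct model_cci *c, unsigned long hz)
{
	c->freq_hz = hz;
	c->steps[c->nr_steps++] = STEP_FREQ;
}

/*
 * Move to a new operating point without ever running the clock at a rate
 * the present voltage cannot sustain.
 */
static void model_target(struct model_cci *c, unsigned long freq_hz,
			 unsigned long volt_uv)
{
	unsigned long old_volt = c->volt_uv;

	c->nr_steps = 0;

	/* Scale up: voltage first, then frequency. */
	if (volt_uv > old_volt)
		model_set_volt(c, volt_uv);

	model_set_freq(c, freq_hz);

	/* Scale down: frequency first, then voltage. */
	if (volt_uv < old_volt)
		model_set_volt(c, volt_uv);
}
```

The model leaves out the error handling of the real driver, which restores
the previous voltage or clock rate when one of the two steps fails.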

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH V8 7/8] devfreq: mediatek: cci devfreq register opp notification for SVS support
  2021-03-25  8:11   ` Chanwoo Choi
@ 2021-03-31  7:53     ` andrew-sh.cheng
  0 siblings, 0 replies; 31+ messages in thread
From: andrew-sh.cheng @ 2021-03-31  7:53 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream

On Thu, 2021-03-25 at 17:11 +0900, Chanwoo Choi wrote:
> Hi,
> 
> I think that you can squash this patch to patch4.

> On 3/23/21 8:34 PM, Andrew-sh.Cheng wrote:
> > From: "Andrew-sh.Cheng" <andrew-sh.cheng@mediatek.com>
> > 
> > SVS will change the voltage of opp item.
> 
> What is the full name of SVS?

Because the content of this patch is specific to SVS, I separated it from
patch 4.
SVS stands for Smart-Voltage-Scaling.
It checks the IC quality and then modifies the voltage field of the OPP
table.
The required voltage will be lower than the original signed-off voltage in
the OPP table.
This voltage changes when the temperature changes.
The CCI devfreq driver needs to raise the voltage when the required voltage
rises.
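
The reaction described above can be sketched as a small userspace model. The
names here (model_opp, model_cci_state, model_opp_notifier and the event
enum) are hypothetical and only mirror the idea of the notifier in this
patch: when the voltage of an OPP entry is adjusted and that entry is the
one currently in use, the rail is reprogrammed; otherwise nothing happens.

```c
/* Hypothetical event codes; only the voltage adjustment matters here. */
enum model_opp_event {
	MODEL_OPP_EVENT_ADD,
	MODEL_OPP_EVENT_ADJUST_VOLTAGE,
};

/* One entry of the OPP table after SVS has rewritten its voltage. */
struct model_opp {
	unsigned long freq_hz;
	unsigned long volt_uv;
};

/* What the driver remembers about the operating point in use. */
struct model_cci_state {
	unsigned long cur_freq_hz;
	unsigned long cur_volt_uv;
};

/*
 * Returns 1 when the live voltage was reprogrammed and 0 when the event
 * did not concern the operating point currently in use.
 */
static int model_opp_notifier(struct model_cci_state *cci,
			      enum model_opp_event event,
			      const struct model_opp *opp)
{
	if (event != MODEL_OPP_EVENT_ADJUST_VOLTAGE)
		return 0;

	/* Entries for other frequencies take effect on the next transition. */
	if (opp->freq_hz != cci->cur_freq_hz)
		return 0;

	cci->cur_volt_uv = opp->volt_uv;
	return 1;
}
```

Entries for other frequencies need no immediate action in this model,
because their new voltage is read from the table the next time the
frequency changes.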

> 
> > CCI devfreq need to react to change frequency.
> > 
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  drivers/devfreq/mt8183-cci-devfreq.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
> > index 018543db7bae..6942a48f3f4f 100644
> > --- a/drivers/devfreq/mt8183-cci-devfreq.c
> > +++ b/drivers/devfreq/mt8183-cci-devfreq.c
> > @@ -21,6 +21,7 @@ struct cci_devfreq {
> >  	struct clk *cci_clk;
> >  	int old_vproc;
> >  	unsigned long old_freq;
> > +	struct notifier_block opp_nb;
> >  };
> >  
> >  static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
> > @@ -89,6 +90,26 @@ static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
> >  	return 0;
> >  }
> >  
> > +static int ccidevfreq_opp_notifier(struct notifier_block *nb,
> 
> I think you had better change the function name as follows:
> ccidevfreq_opp_notifier -> mtk_cci_devfreq_opp_notifier
I will change it in the next patch version.

> 
> > +				   unsigned long event, void *data)
> > +{
> > +	struct dev_pm_opp *opp = data;
> > +	struct cci_devfreq *cci_df = container_of(nb, struct cci_devfreq,
> > +						  opp_nb);
> > +	unsigned long	freq, volt;
> > +
> > +	if (event == OPP_EVENT_ADJUST_VOLTAGE) {
> > +		freq = dev_pm_opp_get_freq(opp);
> > +		/* current opp item is changed */
> > +		if (freq == cci_df->old_freq) {
> > +			volt = dev_pm_opp_get_voltage(opp);
> > +			mtk_cci_set_voltage(cci_df, volt);
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static struct devfreq_dev_profile cci_devfreq_profile = {
> >  	.target = mtk_cci_devfreq_target,
> >  };
> > @@ -98,12 +119,15 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
> >  	struct device *cci_dev = &pdev->dev;
> >  	struct cci_devfreq *cci_df;
> >  	struct devfreq_passive_data *passive_data;
> > +	struct notifier_block *opp_nb;
> >  	int ret;
> >  
> >  	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
> >  	if (!cci_df)
> >  		return -ENOMEM;
> >  
> > +	opp_nb = &cci_df->opp_nb;
> 
> Just move this code next to the 'opp_nb->notifier_call' initialization code.
I will change this in the next patch version.

> 
> > +
> >  	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
> >  	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
> >  	if (ret) {
> > @@ -152,6 +176,9 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
> >  		goto err_opp;
> >  	}
> >  
> > +	opp_nb->notifier_call = ccidevfreq_opp_notifier;
> > +	dev_pm_opp_register_notifier(cci_dev, opp_nb);
> 
> Need to check whether the return value is valid or not.
I will add the check in the next patch version.


> 
> > +
> >  	return 0;
> >  
> >  err_opp:
> > 
> 
> 


* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-25  8:14   ` Chanwoo Choi
@ 2021-03-31  8:03     ` andrew-sh.cheng
  2021-03-31  8:27       ` Chanwoo Choi
  2021-03-31 10:46     ` Hsin-Yi Wang
  1 sibling, 1 reply; 31+ messages in thread
From: andrew-sh.cheng @ 2021-03-31  8:03 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
> Hi,
> 
> You forgot to add these patches to the linux-pm mailing list.
> Please send them to the linux-pm ML.
> 
> Also, before I received this series, I had tried to clean up these patches
> on a testing branch [1], so I am adding my comments along with my clean-up version.
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
> 
> And 'Saravana Kannan <skannan@codeaurora.org>' is the wrong email address.
> Please update the address or drop it.

Hi Chanwoo,

Thank you for the advice.
I will resend this as patch v9 (adding the linux-pm ML), remove this patch,
and note that my patch set is based on
https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov


> 
> 
> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> > From: Saravana Kannan <skannan@codeaurora.org>
> >
> > Many CPU architectures have caches that can scale independent of the
> > CPUs. Frequency scaling of the caches is necessary to make sure that the
> > cache is not a performance bottleneck that leads to poor performance and
> > power. The same idea applies for RAM/DDR.
> >
> > To achieve this, this patch adds support for cpu based scaling to the
> > passive governor. This is accomplished by taking the current frequency
> > of each CPU frequency domain and then adjust the frequency of the cache
> > (or any devfreq device) based on the frequency of the CPUs. It listens
> > to CPU frequency transition notifiers to keep itself up to date on the
> > current CPU frequency.
> >
> > To decide the frequency of the device, the governor does one of the
> > following:
> > * Derives the optimal devfreq device opp from required-opps property of
> >   the parent cpu opp_table.
> >
> > * Scales the device frequency in proportion to the CPU frequency. So, if
> >   the CPUs are running at their max frequency, the device runs at its
> >   max frequency. If the CPUs are running at their min frequency, the
> >   device runs at its min frequency. It is interpolated for frequencies
> >   in between.
> >
> > Changes by Andrew-sh.Cheng:
> > * Change dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp.
> > * Change devfreq->max_freq to
> >   devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> >   after kernel-5.7.
> > * Don't return -EINVAL in devfreq_passive_event_handler(),
> >   since it doesn't handle the DEVFREQ_GOV_SUSPEND and DEVFREQ_GOV_RESUME
> >   cases.
> >
> > Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> > [Sibi: Integrated cpu-freqmap governor into passive_governor]
> > Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  drivers/devfreq/Kconfig            |   2 +
> >  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
> >  include/linux/devfreq.h            |  29 +++-
> >  3 files changed, 342 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> > index 00704efe6398..f56132b0ae64 100644
> > --- a/drivers/devfreq/Kconfig
> > +++ b/drivers/devfreq/Kconfig
> > @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
> >  	  device. This governor does not change the frequency by itself
> >  	  through sysfs entries. The passive governor recommends that
> >  	  devfreq device uses the OPP table to get the frequency/voltage.
> > +	  Alternatively the governor can also be chosen to scale based on
> > +	  the online CPUs current frequency.
> >  
> >  comment "DEVFREQ Drivers"
> >  
> > diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> > index b094132bd20b..9cc57b083839 100644
> > --- a/drivers/devfreq/governor_passive.c
> > +++ b/drivers/devfreq/governor_passive.c
> > @@ -8,11 +8,103 @@
> >   */
> >  
> >  #include <linux/module.h>
> > +#include <linux/cpu.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/cpumask.h>
> >  #include <linux/device.h>
> >  #include <linux/devfreq.h>
> > +#include <linux/slab.h>
> >  #include "governor.h"
> >  
> > -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> > +struct devfreq_cpu_state {
> > +	unsigned int curr_freq;
> > +	unsigned int min_freq;
> > +	unsigned int max_freq;
> > +	unsigned int first_cpu;
> > +	struct device *cpu_dev;
> > +	struct opp_table *opp_table;
> > +};
> 
> As far as I know, the previous version had a description of the structure,
> and I want to add a description like the one below.
> 
> Also, if you have no objection, I'd like you to order
> the variables as follows and use 'dev' instead of 'cpu_dev',
> because this patch uses 'cpu_state->cpu_dev' at multiple points.
> I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
> Also, I prefer 'cur_freq' to 'curr_freq'
> because the devfreq subsystem uses 'cur_freq' for the 'current frequency'.
> 
> /**
>  * struct devfreq_cpu_state - Hold the per-cpu data
>  * @dev:        reference to cpu device.
>  * @first_cpu:  the cpumask of the first cpu of a policy.
>  * @opp_table:  reference to cpu opp table.
>  * @cur_freq:   the current frequency of the cpu.
>  * @min_freq:   the min frequency of the cpu.
>  * @max_freq:   the max frequency of the cpu.
>  *
>  * This structure stores the required cpu_data of a cpu.
>  * This is auto-populated by the governor.
>  */
> struct devfreq_cpu_state {
>          struct device *dev;
>          unsigned int first_cpu;
> 
>          struct opp_table *opp_table;
>          unsigned int cur_freq;
>          unsigned int min_freq;
>          unsigned int max_freq;
> };
> 
> 
> > +
> > +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> > +					      unsigned int cpu)
> > +{
> > +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
> > +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
> > +
> > +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> > +	struct devfreq *devfreq = (struct devfreq *)data->this;
> > +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
> > +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> > +	unsigned long cpu_curr_freq, freq;
> > +
> > +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> > +	    !cpu_state->opp_table || !devfreq->opp_table)
> > +		return 0;
> > +
> > +	cpu_curr_freq = cpu_state->curr_freq * 1000;
> > +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
> > +	if (IS_ERR(p_opp))
> > +		return 0;
> > +
> > +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> > +					    devfreq->opp_table, p_opp);
> > +	dev_pm_opp_put(p_opp);
> > +
> > +	if (!IS_ERR(opp)) {
> > +		freq = dev_pm_opp_get_freq(opp);
> > +		dev_pm_opp_put(opp);
> > +		goto out;
> > +	}
> > +
> > +	/* Use Interpolation if required opps is not available */
> > +	cpu_min_freq = cpu_state->min_freq;
> > +	cpu_max_freq = cpu_state->max_freq;
> > +	cpu_curr_freq_khz = cpu_state->curr_freq;
> > +
> > +	if (dev_freq_table) {
> > +		/* Get minimum frequency according to sorting order */
> > +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
> > +		if (dev_freq_table[0] < dev_max_state) {
> > +			dev_min_freq = dev_freq_table[0];
> > +			dev_max_freq = dev_max_state;
> > +		} else {
> > +			dev_min_freq = dev_max_state;
> > +			dev_max_freq = dev_freq_table[0];
> > +		}
> > +	} else {
> > +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> > +						     DEV_PM_QOS_MIN_FREQUENCY);
> > +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> > +						     DEV_PM_QOS_MAX_FREQUENCY);
> > +
> > +		if (dev_max_freq <= dev_min_freq)
> > +			return 0;
> > +	}
> > +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / (cpu_max_freq - cpu_min_freq);
> > +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
> > +
> > +out:
> > +	return freq;
> > +}
> > +
> > +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> > +					unsigned long *freq)
> > +{
> > +	struct devfreq_passive_data *p_data =
> > +				(struct devfreq_passive_data *)devfreq->data;
> > +	unsigned int cpu;
> > +	unsigned long target_freq = 0;
> > +
> > +	for_each_online_cpu(cpu)
> > +		target_freq = max(target_freq,
> > +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> > +
> > +	*freq = target_freq;
> > +
> > +	return 0;
> > +}
> 
> As you know, governor_passive.c already uses
> both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
> to get the target from the OPP. So, I want to make a common function
> like 'get_taget_freq_by_required_opp', as shown below.
> If 'get_taget_freq_by_required_opp' is defined this way,
> it can also be used for get_target_freq_with_devfreq().
> After finishing the review of this patch, I'll send the patch [2].
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf
> 
> For example (this code is not tested):
> static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
> 						struct opp_table *p_opp_table,
> 						struct opp_table *opp_table,
> 						unsigned long freq)
> {
> 	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> 
> 	if (!p_dev || !p_opp_table || !opp_table || !freq)
> 		return 0;
> 
> 	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
> 	if (IS_ERR(p_opp))
> 		return 0;
> 
> 	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
> 	dev_pm_opp_put(p_opp);
> 
> 	if (IS_ERR(opp))
> 		return 0;
> 
> 	freq = dev_pm_opp_get_freq(opp);
> 	dev_pm_opp_put(opp);
> 
> 	return freq;
> }
> 
> static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> 					unsigned long *target_freq)
> {
> 	struct devfreq_passive_data *p_data =
> 				(struct devfreq_passive_data *)devfreq->data;
> 	struct devfreq_cpu_data *cpu_data;
> 	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
> 	unsigned long dev_min, dev_max;
> 	unsigned long freq = 0;
> 
> 	for_each_online_cpu(cpu) {
> 		cpu_data = p_data->cpu_data[cpu];
> 		if (!cpu_data || cpu_data->first_cpu != cpu)
> 			continue;
> 
> 		/* Get target freq via required opps */
> 		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
> 		freq = get_taget_freq_by_required_opp(cpu_data->dev,
> 					cpu_data->opp_table,
> 					devfreq->opp_table, cpu_cur);
> 		if (freq) {
> 			*target_freq = max(freq, *target_freq);
> 			continue;
> 		}
> 
> 		/* Use Interpolation if required opps is not available */
> 		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);
> 
> 		cpu_min = cpu_data->min_freq;
> 		cpu_max = cpu_data->max_freq;
> 		cpu_cur = cpu_data->cur_freq;
> 
> 		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
> 		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> 
> 		*target_freq = max(freq, *target_freq);
> 	}
> 
> 	return 0;
> }
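
For illustration, the interpolation fallback discussed above can be modelled
in plain userspace C. This is a sketch under stated assumptions, not the
governor's code: model_mult_frac() and model_interpolate() are hypothetical
names, CPU frequencies are taken in kHz and device frequencies in Hz, and
the position of the CPU is divided by the full range (cpu_max - cpu_min).

```c
/*
 * Compute x * num / den without letting the intermediate product overflow,
 * in the spirit of the kernel's mult_frac() helper.
 */
static unsigned long model_mult_frac(unsigned long x, unsigned long num,
				     unsigned long den)
{
	unsigned long quot = x / den;
	unsigned long rem = x % den;

	return quot * num + (rem * num) / den;
}

/*
 * Map the position of cpu_cur inside [cpu_min, cpu_max] onto the same
 * relative position inside [dev_min, dev_max].
 */
static unsigned long model_interpolate(unsigned long cpu_cur,
				       unsigned long cpu_min,
				       unsigned long cpu_max,
				       unsigned long dev_min,
				       unsigned long dev_max)
{
	unsigned long cpu_percent;

	/* Degenerate or out-of-range inputs clamp to the ends of the range. */
	if (cpu_max <= cpu_min || cpu_cur <= cpu_min)
		return dev_min;
	if (cpu_cur >= cpu_max)
		return dev_max;

	cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);

	return dev_min + model_mult_frac(dev_max - dev_min, cpu_percent, 100);
}
```

The clamping at both ends is an addition of this sketch; it keeps the
unsigned subtraction from wrapping when the cached CPU frequency lies
outside the range recorded for its policy.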
> 
> > +
> > +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
> >  					unsigned long *freq)
> >  {
> >  	struct devfreq_passive_data *p_data
> > @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >  	int i, count;
> >  
> >  	/*
> > -	 * If the devfreq device with passive governor has the specific method
> > -	 * to determine the next frequency, should use the get_target_freq()
> > -	 * of struct devfreq_passive_data.
> > -	 */
> > -	if (p_data->get_target_freq)
> > -		return p_data->get_target_freq(devfreq, freq);
> > -
> > -	/*
> >  	 * If the parent and passive devfreq device uses the OPP table,
> >  	 * get the next frequency by using the OPP table.
> >  	 */
> > @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >  	return 0;
> >  }
> >  
> > +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> > +					   unsigned long *freq)
> > +{
> > +	struct devfreq_passive_data *p_data =
> > +		(struct devfreq_passive_data *)devfreq->data;
> > +	int ret;
> > +
> > +	/*
> > +	 * If the devfreq device with passive governor has the specific method
> > +	 * to determine the next frequency, should use the get_target_freq()
> > +	 * of struct devfreq_passive_data.
> > +	 */
> > +	if (p_data->get_target_freq)
> > +		return p_data->get_target_freq(devfreq, freq);
> > +
> > +	switch (p_data->parent_type) {
> > +	case DEVFREQ_PARENT_DEV:
> > +		ret = get_target_freq_with_devfreq(devfreq, freq);
> > +		break;
> > +	case CPUFREQ_PARENT_DEV:
> > +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> > +		break;
> > +	default:
> > +		ret = -EINVAL;
> > +		dev_err(&devfreq->dev, "Invalid parent type\n");
> > +		break;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> >  static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >  				unsigned long event, void *ptr)
> >  {
> > @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >  	return NOTIFY_DONE;
> >  }
> >  
> > +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> > +					 unsigned long event, void *ptr)
> > +{
> > +	struct devfreq_passive_data *data =
> > +			container_of(nb, struct devfreq_passive_data, nb);
> > +	struct devfreq *devfreq = (struct devfreq *)data->this;
> > +	struct devfreq_cpu_state *cpu_state;
> > +	struct cpufreq_freqs *cpu_freq = ptr;
> 
> Use 'freqs' variable name.  I prefer to use the same variable name
> for both devfreq_freqs and cpufreq_freqs instance.
> 
> > +	unsigned int curr_freq;
> 
> As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
> unless there is a special reason.
> 
> > +	int ret;
> > +
> > +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
> > +	    !data->cpu_state[cpu_freq->policy->cpu])
> > +		return 0;
> > +
> > +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
> > +	if (cpu_state->curr_freq == cpu_freq->new)
> > +		return 0;
> > +
> > +	/* Backup current freq and pre-update cpu state freq*/
> 
> I think that this comment is not critical. So, please drop this comment.
> 
> > +	curr_freq = cpu_state->curr_freq;
> > +	cpu_state->curr_freq = cpu_freq->new;
> > +
> > +	mutex_lock(&devfreq->lock);
> > +	ret = update_devfreq(devfreq);
> 
> I recommend to use 'devfreq_update_target' instead of 'update_devfreq'
> as following:
> 	devfreq_update_target(devfreq, freqs->new);
> 
> > +	mutex_unlock(&devfreq->lock);
> > +	if (ret) {
> > +		cpu_state->curr_freq = curr_freq;
> > +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> > +		return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> 
> To keep the function naming style consistent,
> please change the name as follows, because devfreq names
> its function 'devfreq_register_notifier':
> - cpufreq_passive_register -> cpufreq_passive_register_notifier
> 
> > +{
> > +	struct devfreq_passive_data *data = *p_data;
> > +	struct devfreq *devfreq = (struct devfreq *)data->this;
> > +	struct device *dev = devfreq->dev.parent;
> > +	struct opp_table *opp_table = NULL;
> > +	struct devfreq_cpu_state *cpu_state;
> > +	struct cpufreq_policy *policy;
> > +	struct device *cpu_dev;
> > +	unsigned int cpu;
> > +	int ret;
> > +
> > +	get_online_cpus();
> > +
> > +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> > +	ret = cpufreq_register_notifier(&data->nb,
> > +					CPUFREQ_TRANSITION_NOTIFIER);
> > +	if (ret) {
> > +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> > +		data->nb.notifier_call = NULL;
> > +		goto out;
> > +	}
> > +
> > +	/* Populate devfreq_cpu_state */
> 
> Don't need this comment. Please drop it.
> 
> > +	for_each_online_cpu(cpu) {
> > +		if (data->cpu_state[cpu])
> > +			continue;
> > +
> > +		policy = cpufreq_cpu_get(cpu);
> > +		if (!policy) {
> > +			ret = -EINVAL;
> > +			goto out;
> > +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> > +			ret = -EPROBE_DEFER;
> > +			goto out;
> > +		} else if (IS_ERR(policy)) {
> > +			ret = PTR_ERR(policy);
> > +			dev_err(dev, "Couldn't get the cpufreq_policy.\n");
> > +			goto out;
> > +		}
> 
> Use the dev_err_probe() function to handle the EPROBE_DEFER case.
> It makes the code simpler.
> 
> > +
> > +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
> > +		if (!cpu_state) {
> > +			ret = -ENOMEM;
> > +			goto out;
> > +		}
> > +
> > +		cpu_dev = get_cpu_device(cpu);
> > +		if (!cpu_dev) {
> > +			dev_err(dev, "Couldn't get cpu device.\n");
> > +			ret = -ENODEV;
> > +			goto out;
> > +		}
> > +
> > +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> > +		if (IS_ERR(opp_table)) {
> > +			ret = PTR_ERR(opp_table);
> > +			goto out;
> > +		}
> > +
> > +		cpu_state->cpu_dev = cpu_dev;
> > +		cpu_state->opp_table = opp_table;
> > +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
> > +		cpu_state->curr_freq = policy->cur;
> > +		cpu_state->min_freq = policy->cpuinfo.min_freq;
> > +		cpu_state->max_freq = policy->cpuinfo.max_freq;
> > +		data->cpu_state[cpu] = cpu_state;
> > +
> > +		cpufreq_cpu_put(policy);
> > +	}
> > +
> > +out:
> > +	put_online_cpus();
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Update devfreq */
> > +	mutex_lock(&devfreq->lock);
> > +	ret = update_devfreq(devfreq);
> 
> > +	mutex_unlock(&devfreq->lock);
> > +	if (ret)
> > +		dev_err(dev, "Couldn't update the frequency.\n");
> > +
> > +	return ret;
> > +}
> > +
> > +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> 
> As I commented above, please change the name as following:
> - cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier
> 
> > +{
> > +	struct devfreq_passive_data *data = *p_data;
> > +	struct devfreq_cpu_state *cpu_state;
> > +	int cpu;
> > +
> > +	if (data->nb.notifier_call)
> > +		cpufreq_unregister_notifier(&data->nb,
> > +					    CPUFREQ_TRANSITION_NOTIFIER);
> > +
> > +	for_each_possible_cpu(cpu) {
> > +		cpu_state = data->cpu_state[cpu];
> > +		if (cpu_state) {
> > +			if (cpu_state->opp_table)
> > +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> > +			kfree(cpu_state);
> > +			cpu_state = NULL;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > +{
> > +	struct notifier_block *nb = &(*p_data)->nb;
> > +	int ret = 0;
> > +
> > +	switch ((*p_data)->parent_type) {
> > +	case DEVFREQ_PARENT_DEV:
> > +		nb->notifier_call = devfreq_passive_notifier_call;
> > +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
> > +						DEVFREQ_TRANSITION_NOTIFIER);
> > +		break;
> > +	case CPUFREQ_PARENT_DEV:
> > +		ret = cpufreq_passive_register(p_data);
> > +		break;
> > +	default:
> > +		ret = -EINVAL;
> > +		break;
> > +	}
> > +	return ret;
> > +}
> > +
> > +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > +{
> > +	int ret = 0;
> > +
> > +	switch ((*p_data)->parent_type) {
> > +	case DEVFREQ_PARENT_DEV:
> > +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
> > +						    &(*p_data)->nb,
> > +						    DEVFREQ_TRANSITION_NOTIFIER));
> > +		break;
> > +	case CPUFREQ_PARENT_DEV:
> > +		cpufreq_passive_unregister(p_data);
> > +		break;
> > +	default:
> > +		ret = -EINVAL;
> > +		break;
> > +	}
> > +	return ret;
> > +}
> 
> I think that you don't need to define register_parent_dev_notifier
> and unregister_parent_dev_notifier as the separate functions.
> 
> Instead of the separate functions, just add the code
> into devfreq_passive_event_handler.
> 
> 
> > +
> >  static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >  				unsigned int event, void *data)
> >  {
> >  	struct devfreq_passive_data *p_data
> >  			= (struct devfreq_passive_data *)devfreq->data;
> >  	struct devfreq *parent = (struct devfreq *)p_data->parent;
> > -	struct notifier_block *nb = &p_data->nb;
> >  	int ret = 0;
> >  
> > -	if (!parent)
> > +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
> >  		return -EPROBE_DEFER;
> >  
> >  	switch (event) {
> > @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >  		if (!p_data->this)
> >  			p_data->this = devfreq;
> >  
> > -		nb->notifier_call = devfreq_passive_notifier_call;
> > -		ret = devfreq_register_notifier(parent, nb,
> > -					DEVFREQ_TRANSITION_NOTIFIER);
> > +		ret = register_parent_dev_notifier(&p_data);
> >  		break;
> > +
> >  	case DEVFREQ_GOV_STOP:
> > -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> > -					DEVFREQ_TRANSITION_NOTIFIER));
> > +		ret = unregister_parent_dev_notifier(&p_data);
> >  		break;
> >  	default:
> >  		break;
> > diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> > index 26ea0850be9b..e0093b7c805c 100644
> > --- a/include/linux/devfreq.h
> > +++ b/include/linux/devfreq.h
> > @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
> >  
> >  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
> >  /**
> > + * struct devfreq_cpu_state - holds the per-cpu state
> > + * @freq:	the current frequency of the cpu.
> > + * @min_freq:	the min frequency of the cpu.
> > + * @max_freq:	the max frequency of the cpu.
> > + * @first_cpu:	the cpumask of the first cpu of a policy.
> > + * @dev:	reference to cpu device.
> > + * @opp_table:	reference to cpu opp table.
> > + *
> > + * This structure stores the required cpu_state of a cpu.
> > + * This is auto-populated by the governor.
> > + */
> > +struct devfreq_cpu_state;
> > +
> > +enum devfreq_parent_dev_type {
> > +	DEVFREQ_PARENT_DEV,
> > +	CPUFREQ_PARENT_DEV,
> > +};
> > +
> > +/**
> >   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
> >   *	and devfreq_add_device
> >   * @parent:	the devfreq instance of parent device.
> > @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
> >   *			using governors except for passive governor.
> >   *			If the devfreq device has the specific method to decide
> >   *			the next frequency, should use this callback.
> > + * @parent_type:	parent type of the device
> >   * @this:	the devfreq instance of own device.
> >   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> > + * @cpu_state:		the state min/max/current frequency of all online cpu's
> >   *
> >   * The devfreq_passive_data have to set the devfreq instance of parent
> >   * device with governors except for the passive governor. But, don't need to
> > - * initialize the 'this' and 'nb' field because the devfreq core will handle
> > - * them.
> > + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> > + * will handle them.
> >   */
> >  struct devfreq_passive_data {
> >  	/* Should set the devfreq instance of parent device */
> > @@ -305,9 +326,13 @@ struct devfreq_passive_data {
> >  	/* Optional callback to decide the next frequency of passvice device */
> >  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
> >  
> > +	/* Should set the type of parent device */
> > +	enum devfreq_parent_dev_type parent_type;
> > +
> >  	/* For passive governor's internal use. Don't need to set them */
> >  	struct devfreq *this;
> >  	struct notifier_block nb;
> > +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
> >  };
> >  #endif
> >  
> >
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-31  8:03     ` andrew-sh.cheng
@ 2021-03-31  8:27       ` Chanwoo Choi
  2021-03-31  8:35         ` Chanwoo Choi
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-03-31  8:27 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

Hi,

On 3/31/21 5:03 PM, andrew-sh.cheng wrote:
> On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
>> Hi,
>>
>> These patches were not sent to the linux-pm mailing list.
>> Please send them to the linux-pm ML as well.
>>
>> Also, before I received this series, I had tried to clean up these patches
>> on a testing branch[1], so my comments below refer to that cleaned-up version.
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>
>> And 'Saravana Kannan <skannan@codeaurora.org>' is the wrong email address.
>> Please update the address or drop it.
> 
> Hi Chanwoo,
> 
> Thank you for the advice.
> I will resend as v9 (adding the linux-pm ML), drop this patch, and note
> that my patch set is based on
> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov

I have not yet tested this patch[1] on the devfreq-testing-passive-gov branch.
So, if possible, I'd like you to test your patches together with this patch[1],
and if there is no problem, could you send the next version with patch[1] included?

[1]https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=39c80d11a8f42dd63ecea1e0df595a0ceb83b454

> 
> 
>>
>>
>> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
>>> From: Saravana Kannan <skannan@codeaurora.org>
>>>
>>> Many CPU architectures have caches that can scale independent of the
>>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>>> cache is not a performance bottleneck that leads to poor performance and
>>> power. The same idea applies for RAM/DDR.
>>>
>>> To achieve this, this patch adds support for cpu based scaling to the
>>> passive governor. This is accomplished by taking the current frequency
>>> of each CPU frequency domain and then adjust the frequency of the cache
>>> (or any devfreq device) based on the frequency of the CPUs. It listens
>>> to CPU frequency transition notifiers to keep itself up to date on the
>>> current CPU frequency.
>>>
>>> To decide the frequency of the device, the governor does one of the
>>> following:
>>> * Derives the optimal devfreq device opp from required-opps property of
>>>   the parent cpu opp_table.
>>>
>>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>>   the CPUs are running at their max frequency, the device runs at its
>>>   max frequency. If the CPUs are running at their min frequency, the
>>>   device runs at its min frequency. It is interpolated for frequencies
>>>   in between.
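To make the interpolation case concrete, here is a minimal userspace sketch, not the kernel code itself. The function name interpolate_dev_freq() and its parameters are assumptions made for illustration, and the split multiply only mirrors the idea behind the kernel's mult_frac() macro. Note the parentheses around (cpu_max - cpu_min): without them the division binds before the subtraction.

```c
/*
 * Hypothetical helper (illustration only): map a CPU frequency onto a
 * device frequency range in proportion to where the CPU frequency sits
 * between its own minimum and maximum.
 */
unsigned long interpolate_dev_freq(unsigned long cpu_cur,
				   unsigned long cpu_min,
				   unsigned long cpu_max,
				   unsigned long dev_min,
				   unsigned long dev_max)
{
	unsigned long percent, range;

	/* Reject empty or inverted ranges instead of dividing by zero. */
	if (cpu_max <= cpu_min || dev_max <= dev_min)
		return 0;

	/* Clamp so the subtraction below cannot wrap around. */
	if (cpu_cur < cpu_min)
		cpu_cur = cpu_min;
	if (cpu_cur > cpu_max)
		cpu_cur = cpu_max;

	percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);

	/* Same idea as mult_frac(range, percent, 100): divide first. */
	range = dev_max - dev_min;
	return dev_min + (range / 100) * percent +
	       ((range % 100) * percent) / 100;
}
```

With a CPU range of 500000..2000000 kHz and a device range of 200000000..800000000 Hz, a CPU running at 1250000 kHz sits at 50 percent, so the sketch returns 500000000 Hz.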
>>>
>>> Andrew-sh.Cheng changes:
>>> * dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp
>>> * devfreq->max_freq to
>>>   devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>>>   after kernel-5.7
>>> * Don't return -EINVAL in devfreq_passive_event_handler()
>>>   since it doesn't handle the DEVFREQ_GOV_SUSPEND and DEVFREQ_GOV_RESUME cases.
>>>
>>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>>> ---
>>>  drivers/devfreq/Kconfig            |   2 +
>>>  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
>>>  include/linux/devfreq.h            |  29 +++-
>>>  3 files changed, 342 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>> index 00704efe6398..f56132b0ae64 100644
>>> --- a/drivers/devfreq/Kconfig
>>> +++ b/drivers/devfreq/Kconfig
>>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>>  	  device. This governor does not change the frequency by itself
>>>  	  through sysfs entries. The passive governor recommends that
>>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>>> +	  Alternatively the governor can also be chosen to scale based on
>>> +	  the online CPUs current frequency.
>>>  
>>>  comment "DEVFREQ Drivers"
>>>  
>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>>> index b094132bd20b..9cc57b083839 100644
>>> --- a/drivers/devfreq/governor_passive.c
>>> +++ b/drivers/devfreq/governor_passive.c
>>> @@ -8,11 +8,103 @@
>>>   */
>>>  
>>>  #include <linux/module.h>
>>> +#include <linux/cpu.h>
>>> +#include <linux/cpufreq.h>
>>> +#include <linux/cpumask.h>
>>>  #include <linux/device.h>
>>>  #include <linux/devfreq.h>
>>> +#include <linux/slab.h>
>>>  #include "governor.h"
>>>  
>>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>> +struct devfreq_cpu_state {
>>> +	unsigned int curr_freq;
>>> +	unsigned int min_freq;
>>> +	unsigned int max_freq;
>>> +	unsigned int first_cpu;
>>> +	struct device *cpu_dev;
>>> +	struct opp_table *opp_table;
>>> +};
>>
>> As far as I know, the previous version had a description of this structure.
>> I want to add a description like the one below.
>>
>> And if you have no objection, I'd like you to order the variables
>> as follows and use 'dev' instead of 'cpu_dev',
>> because this patch uses 'cpu_state->cpu_dev' at multiple points.
>> I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
>> Also, I prefer 'cur_freq' to 'curr_freq'
>> because the devfreq subsystem uses 'cur_freq' for the current frequency.
>>
>> /**                                                                             
>>  * struct devfreq_cpu_state - Hold the per-cpu data                              
>>  * @dev:        reference to cpu device.                                        
>>  * @first_cpu:  the cpumask of the first cpu of a policy.                       
>>  * @opp_table:  reference to cpu opp table.                                     
>>  * @cur_freq:   the current frequency of the cpu.                               
>>  * @min_freq:   the min frequency of the cpu.                                   
>>  * @max_freq:   the max frequency of the cpu.                                   
>>  *                                                                              
>>  * This structure stores the required cpu_data of a cpu.                        
>>  * This is auto-populated by the governor.                                      
>>  */                                                                             
>> struct devfreq_cpu_state {                                                       
>>          struct device *dev;                                                     
>>          unsigned int first_cpu;                                                 
>>
>>          struct opp_table *opp_table;                                            
>>          unsigned int cur_freq;                                                  
>>          unsigned int min_freq;                                                  
>>          unsigned int max_freq;                                                  
>> };               
>>
>>
>>> +
>>> +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
>>> +					      unsigned int cpu)
>>> +{
>>> +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
>>> +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
>>> +
>>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
>>> +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>>> +	unsigned long cpu_curr_freq, freq;
>>> +
>>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>>> +		return 0;
>>> +
>>> +	cpu_curr_freq = cpu_state->curr_freq * 1000;
>>> +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
>>> +	if (IS_ERR(p_opp))
>>> +		return 0;
>>> +
>>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>>> +					    devfreq->opp_table, p_opp);
>>> +	dev_pm_opp_put(p_opp);
>>> +
>>> +	if (!IS_ERR(opp)) {
>>> +		freq = dev_pm_opp_get_freq(opp);
>>> +		dev_pm_opp_put(opp);
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* Use Interpolation if required opps is not available */
>>> +	cpu_min_freq = cpu_state->min_freq;
>>> +	cpu_max_freq = cpu_state->max_freq;
>>> +	cpu_curr_freq_khz = cpu_state->curr_freq;
>>> +
>>> +	if (dev_freq_table) {
>>> +		/* Get minimum frequency according to sorting order */
>>> +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
>>> +		if (dev_freq_table[0] < dev_max_state) {
>>> +			dev_min_freq = dev_freq_table[0];
>>> +			dev_max_freq = dev_max_state;
>>> +		} else {
>>> +			dev_min_freq = dev_max_state;
>>> +			dev_max_freq = dev_freq_table[0];
>>> +		}
>>> +	} else {
>>> +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
>>> +						     DEV_PM_QOS_MIN_FREQUENCY);
>>> +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
>>> +						     DEV_PM_QOS_MAX_FREQUENCY);
>>> +
>>> +		if (dev_max_freq <= dev_min_freq)
>>> +			return 0;
>>> +	}
>>> +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) /
>>> +		      (cpu_max_freq - cpu_min_freq);
>>> +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
>>> +
>>> +out:
>>> +	return freq;
>>> +}
>>> +
>>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>> +					unsigned long *freq)
>>> +{
>>> +	struct devfreq_passive_data *p_data =
>>> +				(struct devfreq_passive_data *)devfreq->data;
>>> +	unsigned int cpu;
>>> +	unsigned long target_freq = 0;
>>> +
>>> +	for_each_online_cpu(cpu)
>>> +		target_freq = max(target_freq,
>>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>>> +
>>> +	*freq = target_freq;
>>> +
>>> +	return 0;
>>> +}
>>
>> As you know, governor_passive.c already uses
>> both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
>> to get the target frequency from the OPP table. So, I want to add a common
>> function like 'get_taget_freq_by_required_opp', as shown below.
>> If 'get_taget_freq_by_required_opp' is defined this way,
>> it can also be used by get_target_freq_with_devfreq().
>> After finishing the review of this patch, I'll send the patch[2].
>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf
>>
>> For example (this code is not tested):
>> static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
>> 						struct opp_table *p_opp_table,
>> 						struct opp_table *opp_table,
>> 						unsigned long freq)
>> {
>> 	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>>
>> 	if (!p_dev || !p_opp_table || !opp_table || !freq)
>> 		return 0;
>>
>> 	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
>> 	if (IS_ERR(p_opp))
>> 		return 0;
>>
>> 	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
>> 	dev_pm_opp_put(p_opp);
>>
>> 	if (IS_ERR(opp))
>> 		return 0;
>>
>> 	freq = dev_pm_opp_get_freq(opp);
>> 	dev_pm_opp_put(opp);
>>
>> 	return freq;
>> }
>>
>> static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>> 					unsigned long *target_freq)
>> {
>> 	struct devfreq_passive_data *p_data =
>> 				(struct devfreq_passive_data *)devfreq->data;
>> 	struct devfreq_cpu_data *cpu_data;
>> 	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
>> 	unsigned long dev_min, dev_max;
>> 	unsigned long freq = 0;
>>
>> 	for_each_online_cpu(cpu) {
>> 		cpu_data = p_data->cpu_data[cpu];
>> 		if (!cpu_data || cpu_data->first_cpu != cpu)
>> 			continue;
>>
>> 		/* Get target freq via required opps */
>> 		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
>> 		freq = get_taget_freq_by_required_opp(cpu_data->dev,
>> 					cpu_data->opp_table,
>> 					devfreq->opp_table, cpu_cur);
>> 		if (freq) {
>> 			*target_freq = max(freq, *target_freq);
>> 			continue;
>> 		}
>>
>> 		/* Use Interpolation if required opps is not available */
>> 		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);
>>
>> 		cpu_min = cpu_data->min_freq;
>> 		cpu_max = cpu_data->max_freq;
>> 		cpu_cur = cpu_data->cur_freq;
>>
>> 		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
>> 		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>>
>> 		*target_freq = max(freq, *target_freq);
>> 	}
>>
>> 	return 0;
>> }
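The loop in the suggestion above visits each online CPU, skips every CPU that is not the first CPU of its policy, and keeps the highest device frequency requested by any policy. A small userspace sketch of just that selection step follows. struct cpu_sample and pick_target_freq() are hypothetical names used only for illustration, and the per-policy device frequency is assumed to have been computed already.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu state kept by the governor. */
struct cpu_sample {
	unsigned int first_cpu;	/* first CPU of the policy this CPU belongs to */
	unsigned long dev_freq;	/* device frequency this policy asks for, in Hz */
};

/*
 * Return the highest device frequency requested by any policy.
 * Only the first CPU of each policy is considered, so a policy that
 * spans several CPUs is counted once.
 */
unsigned long pick_target_freq(const struct cpu_sample *cpus, size_t ncpus)
{
	unsigned long target = 0;	/* start from 0 so max() is well defined */
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (cpus[cpu].first_cpu != cpu)
			continue;
		if (cpus[cpu].dev_freq > target)
			target = cpus[cpu].dev_freq;
	}

	return target;
}
```

Starting the accumulator at zero matters here: a running maximum is only meaningful if the value it is compared against has been initialized before the loop.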
>>
>>> +
>>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>>  					unsigned long *freq)
>>>  {
>>>  	struct devfreq_passive_data *p_data
>>> @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>  	int i, count;
>>>  
>>>  	/*
>>> -	 * If the devfreq device with passive governor has the specific method
>>> -	 * to determine the next frequency, should use the get_target_freq()
>>> -	 * of struct devfreq_passive_data.
>>> -	 */
>>> -	if (p_data->get_target_freq)
>>> -		return p_data->get_target_freq(devfreq, freq);
>>> -
>>> -	/*
>>>  	 * If the parent and passive devfreq device uses the OPP table,
>>>  	 * get the next frequency by using the OPP table.
>>>  	 */
>>> @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>  	return 0;
>>>  }
>>>  
>>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>> +					   unsigned long *freq)
>>> +{
>>> +	struct devfreq_passive_data *p_data =
>>> +		(struct devfreq_passive_data *)devfreq->data;
>>> +	int ret;
>>> +
>>> +	/*
>>> +	 * If the devfreq device with passive governor has the specific method
>>> +	 * to determine the next frequency, should use the get_target_freq()
>>> +	 * of struct devfreq_passive_data.
>>> +	 */
>>> +	if (p_data->get_target_freq)
>>> +		return p_data->get_target_freq(devfreq, freq);
>>> +
>>> +	switch (p_data->parent_type) {
>>> +	case DEVFREQ_PARENT_DEV:
>>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>>> +		break;
>>> +	case CPUFREQ_PARENT_DEV:
>>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>>> +		break;
>>> +	default:
>>> +		ret = -EINVAL;
>>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>>> +		break;
>>> +	}
>>> +
>>> +	return ret;
>>> +}
>>> +
>>>  static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>  				unsigned long event, void *ptr)
>>>  {
>>> @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>  	return NOTIFY_DONE;
>>>  }
>>>  
>>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>>> +					 unsigned long event, void *ptr)
>>> +{
>>> +	struct devfreq_passive_data *data =
>>> +			container_of(nb, struct devfreq_passive_data, nb);
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	struct cpufreq_freqs *cpu_freq = ptr;
>>
>> Use 'freqs' variable name.  I prefer to use the same variable name
>> for both devfreq_freqs and cpufreq_freqs instance.
>>
>>> +	unsigned int curr_freq;
>>
>> As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
>> unless there is a special reason.
>>
>>> +	int ret;
>>> +
>>> +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
>>> +	    !data->cpu_state[cpu_freq->policy->cpu])
>>> +		return 0;
>>> +
>>> +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
>>> +	if (cpu_state->curr_freq == cpu_freq->new)
>>> +		return 0;
>>> +
>>> +	/* Backup current freq and pre-update cpu state freq*/
>>
>> I think that this comment is not critical. So, please drop this comment.
>>
>>> +	curr_freq = cpu_state->curr_freq;
>>> +	cpu_state->curr_freq = cpu_freq->new;
>>> +
>>> +	mutex_lock(&devfreq->lock);
>>> +	ret = update_devfreq(devfreq);
>>
>> I recommend using 'devfreq_update_target' instead of 'update_devfreq'
>> as follows:
>> 	devfreq_update_target(devfreq, freqs->new);
>>
>>> +	mutex_unlock(&devfreq->lock);
>>> +	if (ret) {
>>> +		cpu_state->curr_freq = curr_freq;
>>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>>
>> In order to keep the function names consistent,
>> please rename it as follows, because devfreq names
>> its function 'devfreq_register_notifier':
>> - cpufreq_passive_register -> cpufreq_passive_register_notifier
>>
>>> +{
>>> +	struct devfreq_passive_data *data = *p_data;
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	struct device *dev = devfreq->dev.parent;
>>> +	struct opp_table *opp_table = NULL;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	struct cpufreq_policy *policy;
>>> +	struct device *cpu_dev;
>>> +	unsigned int cpu;
>>> +	int ret;
>>> +
>>> +	get_online_cpus();
>>> +
>>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>>> +	ret = cpufreq_register_notifier(&data->nb,
>>> +					CPUFREQ_TRANSITION_NOTIFIER);
>>> +	if (ret) {
>>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>>> +		data->nb.notifier_call = NULL;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* Populate devfreq_cpu_state */
>>
>> Don't need this comment. Please drop it.
>>
>>> +	for_each_online_cpu(cpu) {
>>> +		if (data->cpu_state[cpu])
>>> +			continue;
>>> +
>>> +		policy = cpufreq_cpu_get(cpu);
>>> +		if (!policy) {
>>> +			ret = -EINVAL;
>>> +			goto out;
>>> +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
>>> +			ret = -EPROBE_DEFER;
>>> +			goto out;
>>> +		} else if (IS_ERR(policy)) {
>>> +			ret = PTR_ERR(policy);
>>> +			dev_err(dev, "Couldn't get the cpufreq_policy.\n");
>>> +			goto out;
>>> +		}
>>
>> Use the dev_err_probe() function to handle the EPROBE_DEFER case.
>> It makes the code simpler.
>>
>>> +
>>> +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
>>> +		if (!cpu_state) {
>>> +			ret = -ENOMEM;
>>> +			goto out;
>>> +		}
>>> +
>>> +		cpu_dev = get_cpu_device(cpu);
>>> +		if (!cpu_dev) {
>>> +			dev_err(dev, "Couldn't get cpu device.\n");
>>> +			ret = -ENODEV;
>>> +			goto out;
>>> +		}
>>> +
>>> +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>>> +		if (IS_ERR(opp_table)) {
>>> +			ret = PTR_ERR(opp_table);
>>> +			goto out;
>>> +		}
>>> +
>>> +		cpu_state->cpu_dev = cpu_dev;
>>> +		cpu_state->opp_table = opp_table;
>>> +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
>>> +		cpu_state->curr_freq = policy->cur;
>>> +		cpu_state->min_freq = policy->cpuinfo.min_freq;
>>> +		cpu_state->max_freq = policy->cpuinfo.max_freq;
>>> +		data->cpu_state[cpu] = cpu_state;
>>> +
>>> +		cpufreq_cpu_put(policy);
>>> +	}
>>> +
>>> +out:
>>> +	put_online_cpus();
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/* Update devfreq */
>>> +	mutex_lock(&devfreq->lock);
>>> +	ret = update_devfreq(devfreq);
>>
>>> +	mutex_unlock(&devfreq->lock);
>>> +	if (ret)
>>> +		dev_err(dev, "Couldn't update the frequency.\n");
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>>
>> As I commented above, please change the name as following:
>> - cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier
>>
>>> +{
>>> +	struct devfreq_passive_data *data = *p_data;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	int cpu;
>>> +
>>> +	if (data->nb.notifier_call)
>>> +		cpufreq_unregister_notifier(&data->nb,
>>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>>> +
>>> +	for_each_possible_cpu(cpu) {
>>> +		cpu_state = data->cpu_state[cpu];
>>> +		if (cpu_state) {
>>> +			if (cpu_state->opp_table)
>>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>>> +			kfree(cpu_state);
>>> +			data->cpu_state[cpu] = NULL;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
>>> +{
>>> +	struct notifier_block *nb = &(*p_data)->nb;
>>> +	int ret = 0;
>>> +
>>> +	switch ((*p_data)->parent_type) {
>>> +	case DEVFREQ_PARENT_DEV:
>>> +		nb->notifier_call = devfreq_passive_notifier_call;
>>> +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
>>> +						DEVFREQ_TRANSITION_NOTIFIER);
>>> +		break;
>>> +	case CPUFREQ_PARENT_DEV:
>>> +		ret = cpufreq_passive_register(p_data);
>>> +		break;
>>> +	default:
>>> +		ret = -EINVAL;
>>> +		break;
>>> +	}
>>> +	return ret;
>>> +}
>>> +
>>> +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
>>> +{
>>> +	int ret = 0;
>>> +
>>> +	switch ((*p_data)->parent_type) {
>>> +	case DEVFREQ_PARENT_DEV:
>>> +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
>>> +						    &(*p_data)->nb,
>>> +						    DEVFREQ_TRANSITION_NOTIFIER));
>>> +		break;
>>> +	case CPUFREQ_PARENT_DEV:
>>> +		cpufreq_passive_unregister(p_data);
>>> +		break;
>>> +	default:
>>> +		ret = -EINVAL;
>>> +		break;
>>> +	}
>>> +	return ret;
>>> +}
>>
>> I think that you don't need to define register_parent_dev_notifier
>> and unregister_parent_dev_notifier as separate functions.
>>
>> Instead, just add their code directly
>> into devfreq_passive_event_handler.
>>
>>
>>> +
>>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  				unsigned int event, void *data)
>>>  {
>>>  	struct devfreq_passive_data *p_data
>>>  			= (struct devfreq_passive_data *)devfreq->data;
>>>  	struct devfreq *parent = (struct devfreq *)p_data->parent;
>>> -	struct notifier_block *nb = &p_data->nb;
>>>  	int ret = 0;
>>>  
>>> -	if (!parent)
>>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>>  		return -EPROBE_DEFER;
>>>  
>>>  	switch (event) {
>>> @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  		if (!p_data->this)
>>>  			p_data->this = devfreq;
>>>  
>>> -		nb->notifier_call = devfreq_passive_notifier_call;
>>> -		ret = devfreq_register_notifier(parent, nb,
>>> -					DEVFREQ_TRANSITION_NOTIFIER);
>>> +		ret = register_parent_dev_notifier(&p_data);
>>>  		break;
>>> +
>>>  	case DEVFREQ_GOV_STOP:
>>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>>> -					DEVFREQ_TRANSITION_NOTIFIER));
>>> +		ret = unregister_parent_dev_notifier(&p_data);
>>>  		break;
>>>  	default:
>>>  		break;
>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>> index 26ea0850be9b..e0093b7c805c 100644
>>> --- a/include/linux/devfreq.h
>>> +++ b/include/linux/devfreq.h
>>> @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
>>>  
>>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>>  /**
>>> + * struct devfreq_cpu_state - holds the per-cpu state
>>> + * @freq:	the current frequency of the cpu.
>>> + * @min_freq:	the min frequency of the cpu.
>>> + * @max_freq:	the max frequency of the cpu.
>>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>>> + * @dev:	reference to cpu device.
>>> + * @opp_table:	reference to cpu opp table.
>>> + *
>>> + * This structure stores the required cpu_state of a cpu.
>>> + * This is auto-populated by the governor.
>>> + */
>>> +struct devfreq_cpu_state;
>>> +
>>> +enum devfreq_parent_dev_type {
>>> +	DEVFREQ_PARENT_DEV,
>>> +	CPUFREQ_PARENT_DEV,
>>> +};
>>> +
>>> +/**
>>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>>   *	and devfreq_add_device
>>>   * @parent:	the devfreq instance of parent device.
>>> @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
>>>   *			using governors except for passive governor.
>>>   *			If the devfreq device has the specific method to decide
>>>   *			the next frequency, should use this callback.
>>> + * @parent_type:	parent type of the device
>>>   * @this:	the devfreq instance of own device.
>>>   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>>   *
>>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>>   * device with governors except for the passive governor. But, don't need to
>>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>>> - * them.
>>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>>> + * will handle them.
>>>   */
>>>  struct devfreq_passive_data {
>>>  	/* Should set the devfreq instance of parent device */
>>> @@ -305,9 +326,13 @@ struct devfreq_passive_data {
>>>  	/* Optional callback to decide the next frequency of passvice device */
>>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>>  
>>> +	/* Should set the type of parent device */
>>> +	enum devfreq_parent_dev_type parent_type;
>>> +
>>>  	/* For passive governor's internal use. Don't need to set them */
>>>  	struct devfreq *this;
>>>  	struct notifier_block nb;
>>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>>  };
>>>  #endif
>>>  
>>>
>>
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-31  8:27       ` Chanwoo Choi
@ 2021-03-31  8:35         ` Chanwoo Choi
  2021-03-31 13:03           ` andrew-sh.cheng
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-03-31  8:35 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On 3/31/21 5:27 PM, Chanwoo Choi wrote:
> Hi,
> 
> On 3/31/21 5:03 PM, andrew-sh.cheng wrote:
>> On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
>>> Hi,
>>>
>>> These patches were not sent to the linux-pm mailing list.
>>> Please send them to the linux-pm ML as well.
>>>
>>> Also, before I received this series, I had tried to clean up these patches
>>> on a testing branch[1], so my comments below refer to that cleaned-up version.
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>>
>>> And 'Saravana Kannan <skannan@codeaurora.org>' is the wrong email address.
>>> Please update the address or drop it.
>>
>> Hi Chanwoo,
>>
>> Thank you for the advice.
>> I will resend as v9 (adding the linux-pm ML), drop this patch, and note
>> that my patch set is based on
>> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
> 
> I have not yet tested this patch[1] on the devfreq-testing-passive-gov branch.
> So, if possible, I'd like you to test your patches together with this patch[1],
> and if there is no problem, could you send the next version with patch[1] included?
> 
> [1]https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=39c80d11a8f42dd63ecea1e0df595a0ceb83b454


Sorry for the confusion. I made the devfreq-testing-passive-gov[1]
branch based on the latest devfreq-next branch.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov

First of all, if possible, I'd like them[1] to be tested together with your patches in this series.
If there is no problem, please let me know. After your confirmation,
I'll send the patches from the devfreq-testing-passive-gov[1] branch.
How about that?


> 
>>
>>
>>>
>>>
>>> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
>>>> From: Saravana Kannan <skannan@codeaurora.org>
>>>>
>>>> Many CPU architectures have caches that can scale independent of the
>>>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>>>> cache is not a performance bottleneck that leads to poor performance and
>>>> power. The same idea applies for RAM/DDR.
>>>>
>>>> To achieve this, this patch adds support for cpu based scaling to the
>>>> passive governor. This is accomplished by taking the current frequency
>>>> of each CPU frequency domain and then adjust the frequency of the cache
>>>> (or any devfreq device) based on the frequency of the CPUs. It listens
>>>> to CPU frequency transition notifiers to keep itself up to date on the
>>>> current CPU frequency.
>>>>
>>>> To decide the frequency of the device, the governor does one of the
>>>> following:
>>>> * Derives the optimal devfreq device opp from required-opps property of
>>>>   the parent cpu opp_table.
>>>>
>>>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>>>   the CPUs are running at their max frequency, the device runs at its
>>>>   max frequency. If the CPUs are running at their min frequency, the
>>>>   device runs at its min frequency. It is interpolated for frequencies
>>>>   in between.
>>>>
>>>> Andrew-sh.Cheng change
>>>> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
>>>> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>>>> after kernel-5.7
>>>> Don't return -EINVAL in devfreq_passive_event_handler()
>>>> since it doesn't handle DEVFREQ_GOV_SUSPEND DEVFREQ_GOV_RESUME cases.
>>>>
>>>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>>>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>>>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>>>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>>>> ---
>>>>  drivers/devfreq/Kconfig            |   2 +
>>>>  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
>>>>  include/linux/devfreq.h            |  29 +++-
>>>>  3 files changed, 342 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>>> index 00704efe6398..f56132b0ae64 100644
>>>> --- a/drivers/devfreq/Kconfig
>>>> +++ b/drivers/devfreq/Kconfig
>>>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>>>  	  device. This governor does not change the frequency by itself
>>>>  	  through sysfs entries. The passive governor recommends that
>>>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>>>> +	  Alternatively the governor can also be chosen to scale based on
>>>> +	  the online CPUs current frequency.
>>>>  
>>>>  comment "DEVFREQ Drivers"
>>>>  
>>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>>>> index b094132bd20b..9cc57b083839 100644
>>>> --- a/drivers/devfreq/governor_passive.c
>>>> +++ b/drivers/devfreq/governor_passive.c
>>>> @@ -8,11 +8,103 @@
>>>>   */
>>>>  
>>>>  #include <linux/module.h>
>>>> +#include <linux/cpu.h>
>>>> +#include <linux/cpufreq.h>
>>>> +#include <linux/cpumask.h>
>>>>  #include <linux/device.h>
>>>>  #include <linux/devfreq.h>
>>>> +#include <linux/slab.h>
>>>>  #include "governor.h"
>>>>  
>>>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>> +struct devfreq_cpu_state {
>>>> +	unsigned int curr_freq;
>>>> +	unsigned int min_freq;
>>>> +	unsigned int max_freq;
>>>> +	unsigned int first_cpu;
>>>> +	struct device *cpu_dev;
>>>> +	struct opp_table *opp_table;
>>>> +};
>>>
>>> As far as I know, the previous version had a description of the structure.
>>> I want to add a description like the one below.
>>>
>>> And if you have no objection, I'd like you to order
>>> the variables as follows and use 'dev' instead of 'cpu_dev',
>>> because this patch uses 'cpu_state->cpu_dev' at multiple points.
>>> I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
>>> Also, I prefer 'cur_freq' over 'curr_freq'
>>> because the devfreq subsystem uses 'cur_freq' for the 'current frequency'.
>>>
>>> /**
>>>  * struct devfreq_cpu_state - Hold the per-cpu data
>>>  * @dev:        reference to cpu device.
>>>  * @first_cpu:  the cpumask of the first cpu of a policy.
>>>  * @opp_table:  reference to cpu opp table.
>>>  * @cur_freq:   the current frequency of the cpu.
>>>  * @min_freq:   the min frequency of the cpu.
>>>  * @max_freq:   the max frequency of the cpu.
>>>  *
>>>  * This structure stores the required cpu_data of a cpu.
>>>  * This is auto-populated by the governor.
>>>  */
>>> struct devfreq_cpu_state {
>>>          struct device *dev;
>>>          unsigned int first_cpu;
>>>
>>>          struct opp_table *opp_table;
>>>          unsigned int cur_freq;
>>>          unsigned int min_freq;
>>>          unsigned int max_freq;
>>> };
>>>
>>>
>>>> +
>>>> +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
>>>> +					      unsigned int cpu)
>>>> +{
>>>> +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
>>>> +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
>>>> +
>>>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>>> +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
>>>> +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>>>> +	unsigned long cpu_curr_freq, freq;
>>>> +
>>>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>>>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>>>> +		return 0;
>>>> +
>>>> +	cpu_curr_freq = cpu_state->curr_freq * 1000;
>>>> +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
>>>> +	if (IS_ERR(p_opp))
>>>> +		return 0;
>>>> +
>>>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>>>> +					    devfreq->opp_table, p_opp);
>>>> +	dev_pm_opp_put(p_opp);
>>>> +
>>>> +	if (!IS_ERR(opp)) {
>>>> +		freq = dev_pm_opp_get_freq(opp);
>>>> +		dev_pm_opp_put(opp);
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	/* Use Interpolation if required opps is not available */
>>>> +	cpu_min_freq = cpu_state->min_freq;
>>>> +	cpu_max_freq = cpu_state->max_freq;
>>>> +	cpu_curr_freq_khz = cpu_state->curr_freq;
>>>> +
>>>> +	if (dev_freq_table) {
>>>> +		/* Get minimum frequency according to sorting order */
>>>> +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
>>>> +		if (dev_freq_table[0] < dev_max_state) {
>>>> +			dev_min_freq = dev_freq_table[0];
>>>> +			dev_max_freq = dev_max_state;
>>>> +		} else {
>>>> +			dev_min_freq = dev_max_state;
>>>> +			dev_max_freq = dev_freq_table[0];
>>>> +		}
>>>> +	} else {
>>>> +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
>>>> +						     DEV_PM_QOS_MIN_FREQUENCY);
>>>> +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
>>>> +						     DEV_PM_QOS_MAX_FREQUENCY);
>>>> +
>>>> +		if (dev_max_freq <= dev_min_freq)
>>>> +			return 0;
>>>> +	}
>>>> +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / cpu_max_freq - cpu_min_freq;
>>>> +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
>>>> +
>>>> +out:
>>>> +	return freq;
>>>> +}
>>>> +
>>>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>>> +					unsigned long *freq)
>>>> +{
>>>> +	struct devfreq_passive_data *p_data =
>>>> +				(struct devfreq_passive_data *)devfreq->data;
>>>> +	unsigned int cpu;
>>>> +	unsigned long target_freq = 0;
>>>> +
>>>> +	for_each_online_cpu(cpu)
>>>> +		target_freq = max(target_freq,
>>>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>>>> +
>>>> +	*freq = target_freq;
>>>> +
>>>> +	return 0;
>>>> +}
>>>
>>> As you know, governor_passive.c already uses
>>> both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
>>> to get the target from the OPP. So, I want to make a common function
>>> like 'get_taget_freq_by_required_opp', as shown below.
>>> If 'get_taget_freq_by_required_opp' is defined this way,
>>> it can also be used by get_target_freq_with_devfreq().
>>> After the review of this patch is finished, I'll send the patch[2].
>>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf
>>>
>>> For example (this code is not tested):
>>> static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
>>> 						struct opp_table *p_opp_table,
>>> 						struct opp_table *opp_table,
>>> 						unsigned long freq)
>>> {
>>> 	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>>>
>>> 	if (!p_dev || !p_opp_table || !opp_table || !freq)
>>> 		return 0;
>>>
>>> 	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
>>> 	if (IS_ERR(p_opp))
>>> 		return 0;
>>>
>>> 	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
>>> 	dev_pm_opp_put(p_opp);
>>>
>>> 	if (IS_ERR(opp))
>>> 		return 0;
>>>
>>> 	freq = dev_pm_opp_get_freq(opp);
>>> 	dev_pm_opp_put(opp);
>>>
>>> 	return freq;
>>> }
>>>
>>> static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>> 					unsigned long *target_freq)
>>> {
>>> 	struct devfreq_passive_data *p_data =
>>> 				(struct devfreq_passive_data *)devfreq->data;
>>> 	struct devfreq_cpu_data *cpu_data;
>>> 	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
>>> 	unsigned long dev_min, dev_max;
>>> 	unsigned long freq = 0;
>>>
>>> 	for_each_online_cpu(cpu) {
>>> 		cpu_data = p_data->cpu_data[cpu];
>>> 		if (!cpu_data || cpu_data->first_cpu != cpu)
>>> 			continue;
>>>
>>> 		/* Get target freq via required opps */
>>> 		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
>>> 		freq = get_taget_freq_by_required_opp(cpu_data->dev,
>>> 					cpu_data->opp_table,
>>> 					devfreq->opp_table, cpu_cur);
>>> 		if (freq) {
>>> 			*target_freq = max(freq, *target_freq);
>>> 			continue;
>>> 		}
>>>
>>> 		/* Use Interpolation if required opps is not available */
>>> 		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);
>>>
>>> 		cpu_min = cpu_data->min_freq;
>>> 		cpu_max = cpu_data->max_freq;
>>> 		cpu_cur = cpu_data->cur_freq;
>>>
>>> 		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
>>> 		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>>>
>>> 		*target_freq = max(freq, *target_freq);
>>> 	}
>>>
>>> 	return 0;
>>> }
>>>
>>>> +
>>>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>>>  					unsigned long *freq)
>>>>  {
>>>>  	struct devfreq_passive_data *p_data
>>>> @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>>  	int i, count;
>>>>  
>>>>  	/*
>>>> -	 * If the devfreq device with passive governor has the specific method
>>>> -	 * to determine the next frequency, should use the get_target_freq()
>>>> -	 * of struct devfreq_passive_data.
>>>> -	 */
>>>> -	if (p_data->get_target_freq)
>>>> -		return p_data->get_target_freq(devfreq, freq);
>>>> -
>>>> -	/*
>>>>  	 * If the parent and passive devfreq device uses the OPP table,
>>>>  	 * get the next frequency by using the OPP table.
>>>>  	 */
>>>> @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>> +					   unsigned long *freq)
>>>> +{
>>>> +	struct devfreq_passive_data *p_data =
>>>> +		(struct devfreq_passive_data *)devfreq->data;
>>>> +	int ret;
>>>> +
>>>> +	/*
>>>> +	 * If the devfreq device with passive governor has the specific method
>>>> +	 * to determine the next frequency, should use the get_target_freq()
>>>> +	 * of struct devfreq_passive_data.
>>>> +	 */
>>>> +	if (p_data->get_target_freq)
>>>> +		return p_data->get_target_freq(devfreq, freq);
>>>> +
>>>> +	switch (p_data->parent_type) {
>>>> +	case DEVFREQ_PARENT_DEV:
>>>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>>>> +		break;
>>>> +	case CPUFREQ_PARENT_DEV:
>>>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>>>> +		break;
>>>> +	default:
>>>> +		ret = -EINVAL;
>>>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>>>> +		break;
>>>> +	}
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>>  static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>>  				unsigned long event, void *ptr)
>>>>  {
>>>> @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>>  	return NOTIFY_DONE;
>>>>  }
>>>>  
>>>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>>>> +					 unsigned long event, void *ptr)
>>>> +{
>>>> +	struct devfreq_passive_data *data =
>>>> +			container_of(nb, struct devfreq_passive_data, nb);
>>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>>> +	struct devfreq_cpu_state *cpu_state;
>>>> +	struct cpufreq_freqs *cpu_freq = ptr;
>>>
>>> Use 'freqs' variable name.  I prefer to use the same variable name
>>> for both devfreq_freqs and cpufreq_freqs instance.
>>>
>>>> +	unsigned int curr_freq;
>>>
>>> As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
>>> unless there is a special reason.
>>>
>>>> +	int ret;
>>>> +
>>>> +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
>>>> +	    !data->cpu_state[cpu_freq->policy->cpu])
>>>> +		return 0;
>>>> +
>>>> +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
>>>> +	if (cpu_state->curr_freq == cpu_freq->new)
>>>> +		return 0;
>>>> +
>>>> +	/* Backup current freq and pre-update cpu state freq*/
>>>
>>> I think that this comment is not critical. So, please drop it.
>>>
>>>> +	curr_freq = cpu_state->curr_freq;
>>>> +	cpu_state->curr_freq = cpu_freq->new;
>>>> +
>>>> +	mutex_lock(&devfreq->lock);
>>>> +	ret = update_devfreq(devfreq);
>>>
>>> I recommend using 'devfreq_update_target' instead of 'update_devfreq',
>>> as follows:
>>> 	devfreq_update_target(devfreq, freqs->new);
>>>
>>>> +	mutex_unlock(&devfreq->lock);
>>>> +	if (ret) {
>>>> +		cpu_state->curr_freq = curr_freq;
>>>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>>>> +		return ret;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>>>
>>> To keep the function naming style consistent,
>>> please change the name as follows, because devfreq defines
>>> its function as 'devfreq_register_notifier':
>>> - cpufreq_passive_register -> cpufreq_passive_register_notifier
>>>
>>>> +{
>>>> +	struct devfreq_passive_data *data = *p_data;
>>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>>> +	struct device *dev = devfreq->dev.parent;
>>>> +	struct opp_table *opp_table = NULL;
>>>> +	struct devfreq_cpu_state *cpu_state;
>>>> +	struct cpufreq_policy *policy;
>>>> +	struct device *cpu_dev;
>>>> +	unsigned int cpu;
>>>> +	int ret;
>>>> +
>>>> +	get_online_cpus();
>>>> +
>>>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>>>> +	ret = cpufreq_register_notifier(&data->nb,
>>>> +					CPUFREQ_TRANSITION_NOTIFIER);
>>>> +	if (ret) {
>>>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>>>> +		data->nb.notifier_call = NULL;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	/* Populate devfreq_cpu_state */
>>>
>>> Don't need this comment. Please drop it.
>>>
>>>> +	for_each_online_cpu(cpu) {
>>>> +		if (data->cpu_state[cpu])
>>>> +			continue;
>>>> +
>>>> +		policy = cpufreq_cpu_get(cpu);
>>>> +		if (!policy) {
>>>> +			ret = -EINVAL;
>>>> +			goto out;
>>>> +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
>>>> +			ret = -EPROBE_DEFER;
>>>> +			goto out;
>>>> +		} else if (IS_ERR(policy)) {
>>>> +			ret = PTR_ERR(policy);
>>>> +			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
>>>> +			goto out;
>>>> +		}
>>>
>>> Use the dev_err_probe() function to handle EPROBE_DEFER.
>>> It makes the code simpler.
>>>
>>>> +
>>>> +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
>>>> +		if (!cpu_state) {
>>>> +			ret = -ENOMEM;
>>>> +			goto out;
>>>> +		}
>>>> +
>>>> +		cpu_dev = get_cpu_device(cpu);
>>>> +		if (!cpu_dev) {
>>>> +			dev_err(dev, "Couldn't get cpu device.\n");
>>>> +			ret = -ENODEV;
>>>> +			goto out;
>>>> +		}
>>>> +
>>>> +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>>>> +		if (IS_ERR(devfreq->opp_table)) {
>>>> +			ret = PTR_ERR(opp_table);
>>>> +			goto out;
>>>> +		}
>>>> +
>>>> +		cpu_state->cpu_dev = cpu_dev;
>>>> +		cpu_state->opp_table = opp_table;
>>>> +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
>>>> +		cpu_state->curr_freq = policy->cur;
>>>> +		cpu_state->min_freq = policy->cpuinfo.min_freq;
>>>> +		cpu_state->max_freq = policy->cpuinfo.max_freq;
>>>> +		data->cpu_state[cpu] = cpu_state;
>>>> +
>>>> +		cpufreq_cpu_put(policy);
>>>> +	}
>>>> +
>>>> +out:
>>>> +	put_online_cpus();
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>> +	/* Update devfreq */
>>>> +	mutex_lock(&devfreq->lock);
>>>> +	ret = update_devfreq(devfreq);
>>>
>>>> +	mutex_unlock(&devfreq->lock);
>>>> +	if (ret)
>>>> +		dev_err(dev, "Couldn't update the frequency.\n");
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>>>
>>> As I commented above, please change the name as following:
>>> - cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier
>>>
>>>> +{
>>>> +	struct devfreq_passive_data *data = *p_data;
>>>> +	struct devfreq_cpu_state *cpu_state;
>>>> +	int cpu;
>>>> +
>>>> +	if (data->nb.notifier_call)
>>>> +		cpufreq_unregister_notifier(&data->nb,
>>>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>>>> +
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		cpu_state = data->cpu_state[cpu];
>>>> +		if (cpu_state) {
>>>> +			if (cpu_state->opp_table)
>>>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>>>> +			kfree(cpu_state);
>>>> +			cpu_state = NULL;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
>>>> +{
>>>> +	struct notifier_block *nb = &(*p_data)->nb;
>>>> +	int ret = 0;
>>>> +
>>>> +	switch ((*p_data)->parent_type) {
>>>> +	case DEVFREQ_PARENT_DEV:
>>>> +		nb->notifier_call = devfreq_passive_notifier_call;
>>>> +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
>>>> +						DEVFREQ_TRANSITION_NOTIFIER);
>>>> +		break;
>>>> +	case CPUFREQ_PARENT_DEV:
>>>> +		ret = cpufreq_passive_register(p_data);
>>>> +		break;
>>>> +	default:
>>>> +		ret = -EINVAL;
>>>> +		break;
>>>> +	}
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
>>>> +{
>>>> +	int ret = 0;
>>>> +
>>>> +	switch ((*p_data)->parent_type) {
>>>> +	case DEVFREQ_PARENT_DEV:
>>>> +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
>>>> +						    &(*p_data)->nb,
>>>> +						    DEVFREQ_TRANSITION_NOTIFIER));
>>>> +		break;
>>>> +	case CPUFREQ_PARENT_DEV:
>>>> +		cpufreq_passive_unregister(p_data);
>>>> +		break;
>>>> +	default:
>>>> +		ret = -EINVAL;
>>>> +		break;
>>>> +	}
>>>> +	return ret;
>>>> +}
>>>
>>> I think that you don't need to define register_parent_dev_notifier
>>> and unregister_parent_dev_notifier as the separate functions.
>>>
>>> Instead of the separate functions, just add the code
>>> into devfreq_passive_event_handler.
>>>
>>>
>>>> +
>>>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>>  				unsigned int event, void *data)
>>>>  {
>>>>  	struct devfreq_passive_data *p_data
>>>>  			= (struct devfreq_passive_data *)devfreq->data;
>>>>  	struct devfreq *parent = (struct devfreq *)p_data->parent;
>>>> -	struct notifier_block *nb = &p_data->nb;
>>>>  	int ret = 0;
>>>>  
>>>> -	if (!parent)
>>>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>>>  		return -EPROBE_DEFER;
>>>>  
>>>>  	switch (event) {
>>>> @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>>  		if (!p_data->this)
>>>>  			p_data->this = devfreq;
>>>>  
>>>> -		nb->notifier_call = devfreq_passive_notifier_call;
>>>> -		ret = devfreq_register_notifier(parent, nb,
>>>> -					DEVFREQ_TRANSITION_NOTIFIER);
>>>> +		ret = register_parent_dev_notifier(&p_data);
>>>>  		break;
>>>> +
>>>>  	case DEVFREQ_GOV_STOP:
>>>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>>>> -					DEVFREQ_TRANSITION_NOTIFIER));
>>>> +		ret = unregister_parent_dev_notifier(&p_data);
>>>>  		break;
>>>>  	default:
>>>>  		break;
>>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>>> index 26ea0850be9b..e0093b7c805c 100644
>>>> --- a/include/linux/devfreq.h
>>>> +++ b/include/linux/devfreq.h
>>>> @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
>>>>  
>>>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>>>  /**
>>>> + * struct devfreq_cpu_state - holds the per-cpu state
>>>> + * @freq:	the current frequency of the cpu.
>>>> + * @min_freq:	the min frequency of the cpu.
>>>> + * @max_freq:	the max frequency of the cpu.
>>>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>>>> + * @dev:	reference to cpu device.
>>>> + * @opp_table:	reference to cpu opp table.
>>>> + *
>>>> + * This structure stores the required cpu_state of a cpu.
>>>> + * This is auto-populated by the governor.
>>>> + */
>>>> +struct devfreq_cpu_state;
>>>> +
>>>> +enum devfreq_parent_dev_type {
>>>> +	DEVFREQ_PARENT_DEV,
>>>> +	CPUFREQ_PARENT_DEV,
>>>> +};
>>>> +
>>>> +/**
>>>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>>>   *	and devfreq_add_device
>>>>   * @parent:	the devfreq instance of parent device.
>>>> @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
>>>>   *			using governors except for passive governor.
>>>>   *			If the devfreq device has the specific method to decide
>>>>   *			the next frequency, should use this callback.
>>>> + * @parent_type:	parent type of the device
>>>>   * @this:	the devfreq instance of own device.
>>>>   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>>>   *
>>>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>>>   * device with governors except for the passive governor. But, don't need to
>>>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>>>> - * them.
>>>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>>>> + * will handle them.
>>>>   */
>>>>  struct devfreq_passive_data {
>>>>  	/* Should set the devfreq instance of parent device */
>>>> @@ -305,9 +326,13 @@ struct devfreq_passive_data {
>>>>  	/* Optional callback to decide the next frequency of passvice device */
>>>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>>>  
>>>> +	/* Should set the type of parent device */
>>>> +	enum devfreq_parent_dev_type parent_type;
>>>> +
>>>>  	/* For passive governor's internal use. Don't need to set them */
>>>>  	struct devfreq *this;
>>>>  	struct notifier_block nb;
>>>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>>>  };
>>>>  #endif
>>>>  
>>>>
>>>
>>
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-25  8:14   ` Chanwoo Choi
  2021-03-31  8:03     ` andrew-sh.cheng
@ 2021-03-31 10:46     ` Hsin-Yi Wang
  1 sibling, 0 replies; 31+ messages in thread
From: Hsin-Yi Wang @ 2021-03-31 10:46 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	Linux PM, Devicetree List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	moderated list:ARM/Mediatek SoC support, lkml, srv_heupstream,
	Sibi Sankar

On Thu, Mar 25, 2021 at 3:58 PM Chanwoo Choi <cw00.choi@samsung.com> wrote:
>
> Hi,
>
> You forgot to send these patches to the linux-pm mailing list.
> Please send them to the linux-pm ML as well.
>
> Also, before I received this series, I had tried to clean up these patches
> on a testing branch[1], so my comments below include my cleaned-up versions.
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>
> And 'Saravana Kannan <skannan@codeaurora.org>' is a wrong email address.
> Please update the address or drop it.
>
>
> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> > From: Saravana Kannan <skannan@codeaurora.org>
> >
> > Many CPU architectures have caches that can scale independent of the
> > CPUs. Frequency scaling of the caches is necessary to make sure that the
> > cache is not a performance bottleneck that leads to poor performance and
> > power. The same idea applies for RAM/DDR.
> >
> > To achieve this, this patch adds support for cpu based scaling to the
> > passive governor. This is accomplished by taking the current frequency
> > of each CPU frequency domain and then adjust the frequency of the cache
> > (or any devfreq device) based on the frequency of the CPUs. It listens
> > to CPU frequency transition notifiers to keep itself up to date on the
> > current CPU frequency.
> >
> > To decide the frequency of the device, the governor does one of the
> > following:
> > * Derives the optimal devfreq device opp from required-opps property of
> >   the parent cpu opp_table.
> >
> > * Scales the device frequency in proportion to the CPU frequency. So, if
> >   the CPUs are running at their max frequency, the device runs at its
> >   max frequency. If the CPUs are running at their min frequency, the
> >   device runs at its min frequency. It is interpolated for frequencies
> >   in between.
> >
> > Andrew-sh.Cheng change
> > dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> > to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> > after kernel-5.7
> > Don't return -EINVAL in devfreq_passive_event_handler()
> > since it doesn't handle DEVFREQ_GOV_SUSPEND DEVFREQ_GOV_RESUME cases.
> >
> > Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> > [Sibi: Integrated cpu-freqmap governor into passive_governor]
> > Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  drivers/devfreq/Kconfig            |   2 +
> >  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
> >  include/linux/devfreq.h            |  29 +++-
> >  3 files changed, 342 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> > index 00704efe6398..f56132b0ae64 100644
> > --- a/drivers/devfreq/Kconfig
> > +++ b/drivers/devfreq/Kconfig
> > @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
> >         device. This governor does not change the frequency by itself
> >         through sysfs entries. The passive governor recommends that
> >         devfreq device uses the OPP table to get the frequency/voltage.
> > +       Alternatively the governor can also be chosen to scale based on
> > +       the online CPUs current frequency.
> >
> >  comment "DEVFREQ Drivers"
> >
> > diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> > index b094132bd20b..9cc57b083839 100644
> > --- a/drivers/devfreq/governor_passive.c
> > +++ b/drivers/devfreq/governor_passive.c
> > @@ -8,11 +8,103 @@
> >   */
> >
> >  #include <linux/module.h>
> > +#include <linux/cpu.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/cpumask.h>
> >  #include <linux/device.h>
> >  #include <linux/devfreq.h>
> > +#include <linux/slab.h>
> >  #include "governor.h"
> >
> > -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> > +struct devfreq_cpu_state {
> > +     unsigned int curr_freq;
> > +     unsigned int min_freq;
> > +     unsigned int max_freq;
> > +     unsigned int first_cpu;
> > +     struct device *cpu_dev;
> > +     struct opp_table *opp_table;
> > +};
>
> As far as I know, the previous version had a description of the structure.
> I want to add a description like the one below.
>
> And if you have no objection, I'd like you to order
> the variables as follows and use 'dev' instead of 'cpu_dev',
> because this patch uses 'cpu_state->cpu_dev' at multiple points.
> I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
> Also, I prefer 'cur_freq' over 'curr_freq'
> because the devfreq subsystem uses 'cur_freq' for the 'current frequency'.
>
> /**
>  * struct devfreq_cpu_state - Hold the per-cpu data
>  * @dev:        reference to cpu device.
>  * @first_cpu:  the cpumask of the first cpu of a policy.
>  * @opp_table:  reference to cpu opp table.
>  * @cur_freq:   the current frequency of the cpu.
>  * @min_freq:   the min frequency of the cpu.
>  * @max_freq:   the max frequency of the cpu.
>  *
>  * This structure stores the required cpu_data of a cpu.
>  * This is auto-populated by the governor.
>  */
> struct devfreq_cpu_state {
>          struct device *dev;
>          unsigned int first_cpu;
>
>          struct opp_table *opp_table;
>          unsigned int cur_freq;
>          unsigned int min_freq;
>          unsigned int max_freq;
> };
>
>
> > +
> > +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> > +                                           unsigned int cpu)
> > +{
> > +     unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
> > +     unsigned long dev_min_freq, dev_max_freq, dev_max_state;
> > +
> > +     struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> > +     struct devfreq *devfreq = (struct devfreq *)data->this;
> > +     unsigned long *dev_freq_table = devfreq->profile->freq_table;
> > +     struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> > +     unsigned long cpu_curr_freq, freq;
> > +
> > +     if (!cpu_state || cpu_state->first_cpu != cpu ||
> > +         !cpu_state->opp_table || !devfreq->opp_table)
> > +             return 0;
> > +
> > +     cpu_curr_freq = cpu_state->curr_freq * 1000;
> > +     p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
> > +     if (IS_ERR(p_opp))
> > +             return 0;
> > +
> > +     opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> > +                                         devfreq->opp_table, p_opp);
> > +     dev_pm_opp_put(p_opp);
> > +
> > +     if (!IS_ERR(opp)) {
> > +             freq = dev_pm_opp_get_freq(opp);
> > +             dev_pm_opp_put(opp);
> > +             goto out;
> > +     }
> > +
> > +     /* Use Interpolation if required opps is not available */
> > +     cpu_min_freq = cpu_state->min_freq;
> > +     cpu_max_freq = cpu_state->max_freq;
> > +     cpu_curr_freq_khz = cpu_state->curr_freq;
> > +
> > +     if (dev_freq_table) {
> > +             /* Get minimum frequency according to sorting order */
> > +             dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
> > +             if (dev_freq_table[0] < dev_max_state) {
> > +                     dev_min_freq = dev_freq_table[0];
> > +                     dev_max_freq = dev_max_state;
> > +             } else {
> > +                     dev_min_freq = dev_max_state;
> > +                     dev_max_freq = dev_freq_table[0];
> > +             }
> > +     } else {
> > +             dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> > +                                                  DEV_PM_QOS_MIN_FREQUENCY);
> > +             dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> > +                                                  DEV_PM_QOS_MAX_FREQUENCY);
> > +
> > +             if (dev_max_freq <= dev_min_freq)
> > +                     return 0;
> > +     }
> > +     cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / cpu_max_freq - cpu_min_freq;

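Aren't the parentheses around the denominator missing? As written, the
division is evaluated before the subtraction. It should be:
cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) /
              (cpu_max_freq - cpu_min_freq);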
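
To make the effect concrete, here is a small userspace sketch (not kernel code and not part of the patch). The function names are made up for illustration and the assert() guard is my own addition; only the two expressions come from the lines above.

```c
#include <assert.h>

/*
 * Expression as posted: '/' binds tighter than '-', so this evaluates
 * (((cur - min) * 100) / max) - min. With unsigned types the result
 * wraps around whenever the quotient is smaller than min.
 */
static unsigned int cpu_percent_as_posted(unsigned int cur, unsigned int min,
					  unsigned int max)
{
	return ((cur - min) * 100) / max - min;
}

/* Same expression with the denominator parenthesized. */
static unsigned int cpu_percent_fixed(unsigned int cur, unsigned int min,
				      unsigned int max)
{
	assert(max > min);	/* hypothetical guard against dividing by zero */
	return ((cur - min) * 100) / (max - min);
}
```

With made-up values cur = 1250000, min = 500000 and max = 2000000 (kHz), cpu_percent_fixed() gives 50, while cpu_percent_as_posted() computes 37 - 500000 and wraps around to a huge unsigned value.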


> > +     freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
> > +
> > +out:
> > +     return freq;
> > +}
> > +
> > +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> > +                                     unsigned long *freq)
> > +{
> > +     struct devfreq_passive_data *p_data =
> > +                             (struct devfreq_passive_data *)devfreq->data;
> > +     unsigned int cpu;
> > +     unsigned long target_freq = 0;
> > +
> > +     for_each_online_cpu(cpu)
> > +             target_freq = max(target_freq,
> > +                               xlate_cpufreq_to_devfreq(p_data, cpu));
> > +
> > +     *freq = target_freq;
> > +
> > +     return 0;
> > +}
>
> As you know, governor_passive.c already uses both
> 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
> to get the target frequency from the OPP table. So I want to add
> a common function, 'get_taget_freq_by_required_opp', as shown below.
> Once it is defined, it can also be used by
> get_target_freq_with_devfreq().
> After the review of this patch is finished, I'll send the patch[2].
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf
>
> For example (this code is not tested):
> static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
>                                                 struct opp_table *p_opp_table,
>                                                 struct opp_table *opp_table,
>                                                 unsigned long freq)
> {
>         struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>
>         if (!p_dev || !p_opp_table || !opp_table || !freq)
>                 return 0;
>
>         p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
>         if (IS_ERR(p_opp))
>                 return 0;
>
>         opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
>         dev_pm_opp_put(p_opp);
>
>         if (IS_ERR(opp))
>                 return 0;
>
>         freq = dev_pm_opp_get_freq(opp);
>         dev_pm_opp_put(opp);
>
>         return freq;
> }
>
> static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>                                         unsigned long *target_freq)
> {
>         struct devfreq_passive_data *p_data =
>                                 (struct devfreq_passive_data *)devfreq->data;
>         struct devfreq_cpu_data *cpu_data;
>         unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
>         unsigned long dev_min, dev_max;
>         unsigned long freq = 0;
>
>         for_each_online_cpu(cpu) {
>                 cpu_data = p_data->cpu_data[cpu];
>                 if (!cpu_data || cpu_data->first_cpu != cpu)
>                         continue;
>
>                 /* Get target freq via required opps */
>                 cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
>                 freq = get_taget_freq_by_required_opp(cpu_data->dev,
>                                         cpu_data->opp_table,
>                                         devfreq->opp_table, cpu_cur);
>                 if (freq) {
>                         *target_freq = max(freq, *target_freq);
>                         continue;
>                 }
>
>                 /* Use Interpolation if required opps is not available */
>                 devfreq_get_freq_range(devfreq, &dev_min, &dev_max);
>
>                 cpu_min = cpu_data->min_freq;
>                 cpu_max = cpu_data->max_freq;
>                 cpu_cur = cpu_data->cur_freq;
>
>                 cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
>                 freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>
>                 *target_freq = max(freq, *target_freq);
>         }
>
>         return 0;
> }
>
> > +
> > +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
> >                                       unsigned long *freq)
> >  {
> >       struct devfreq_passive_data *p_data
> > @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >       int i, count;
> >
> >       /*
> > -      * If the devfreq device with passive governor has the specific method
> > -      * to determine the next frequency, should use the get_target_freq()
> > -      * of struct devfreq_passive_data.
> > -      */
> > -     if (p_data->get_target_freq)
> > -             return p_data->get_target_freq(devfreq, freq);
> > -
> > -     /*
> >        * If the parent and passive devfreq device uses the OPP table,
> >        * get the next frequency by using the OPP table.
> >        */
> > @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >       return 0;
> >  }
> >
> > +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> > +                                        unsigned long *freq)
> > +{
> > +     struct devfreq_passive_data *p_data =
> > +             (struct devfreq_passive_data *)devfreq->data;
> > +     int ret;
> > +
> > +     /*
> > +      * If the devfreq device with passive governor has the specific method
> > +      * to determine the next frequency, should use the get_target_freq()
> > +      * of struct devfreq_passive_data.
> > +      */
> > +     if (p_data->get_target_freq)
> > +             return p_data->get_target_freq(devfreq, freq);
> > +
> > +     switch (p_data->parent_type) {
> > +     case DEVFREQ_PARENT_DEV:
> > +             ret = get_target_freq_with_devfreq(devfreq, freq);
> > +             break;
> > +     case CPUFREQ_PARENT_DEV:
> > +             ret = get_target_freq_with_cpufreq(devfreq, freq);
> > +             break;
> > +     default:
> > +             ret = -EINVAL;
> > +             dev_err(&devfreq->dev, "Invalid parent type\n");
> > +             break;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> >  static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >                               unsigned long event, void *ptr)
> >  {
> > @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >       return NOTIFY_DONE;
> >  }
> >
> > +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> > +                                      unsigned long event, void *ptr)
> > +{
> > +     struct devfreq_passive_data *data =
> > +                     container_of(nb, struct devfreq_passive_data, nb);
> > +     struct devfreq *devfreq = (struct devfreq *)data->this;
> > +     struct devfreq_cpu_state *cpu_state;
> > +     struct cpufreq_freqs *cpu_freq = ptr;
>
> Please use 'freqs' as the variable name. I prefer to use the same variable
> name for both devfreq_freqs and cpufreq_freqs instances.
>
> > +     unsigned int curr_freq;
>
> As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
> unless there is a special reason.
>
> > +     int ret;
> > +
> > +     if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
> > +         !data->cpu_state[cpu_freq->policy->cpu])
> > +             return 0;
> > +
> > +     cpu_state = data->cpu_state[cpu_freq->policy->cpu];
> > +     if (cpu_state->curr_freq == cpu_freq->new)
> > +             return 0;
> > +
> > +     /* Backup current freq and pre-update cpu state freq*/
>
> I think this comment is not critical, so please drop it.
>
> > +     curr_freq = cpu_state->curr_freq;
> > +     cpu_state->curr_freq = cpu_freq->new;
> > +
> > +     mutex_lock(&devfreq->lock);
> > +     ret = update_devfreq(devfreq);
>
> I recommend using 'devfreq_update_target' instead of 'update_devfreq',
> as follows:
>         devfreq_update_target(devfreq, freqs->new);
>
> > +     mutex_unlock(&devfreq->lock);
> > +     if (ret) {
> > +             cpu_state->curr_freq = curr_freq;
> > +             dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> > +             return ret;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>
> To keep the function naming style consistent,
> please rename the function as follows, because devfreq names
> its own function 'devfreq_register_notifier':
> - cpufreq_passive_register -> cpufreq_passive_register_notifier
>
> > +{
> > +     struct devfreq_passive_data *data = *p_data;
> > +     struct devfreq *devfreq = (struct devfreq *)data->this;
> > +     struct device *dev = devfreq->dev.parent;
> > +     struct opp_table *opp_table = NULL;
> > +     struct devfreq_cpu_state *cpu_state;
> > +     struct cpufreq_policy *policy;
> > +     struct device *cpu_dev;
> > +     unsigned int cpu;
> > +     int ret;
> > +
> > +     get_online_cpus();
> > +
> > +     data->nb.notifier_call = cpufreq_passive_notifier_call;
> > +     ret = cpufreq_register_notifier(&data->nb,
> > +                                     CPUFREQ_TRANSITION_NOTIFIER);
> > +     if (ret) {
> > +             dev_err(dev, "Couldn't register cpufreq notifier.\n");
> > +             data->nb.notifier_call = NULL;
> > +             goto out;
> > +     }
> > +
> > +     /* Populate devfreq_cpu_state */
>
> Don't need this comment. Please drop it.
>
> > +     for_each_online_cpu(cpu) {
> > +             if (data->cpu_state[cpu])
> > +                     continue;
> > +
> > +             policy = cpufreq_cpu_get(cpu);
> > +             if (!policy) {
> > +                     ret = -EINVAL;
> > +                     goto out;
> > +             } else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> > +                     ret = -EPROBE_DEFER;
> > +                     goto out;
> > +             } else if (IS_ERR(policy)) {
> > +                     ret = PTR_ERR(policy);
> > +                     dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> > +                     goto out;
> > +             }
>
> Use the dev_err_probe() function to handle EPROBE_DEFER.
> It makes the code simpler.
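
For reference, a userspace model of what dev_err_probe() does with its error code, as far as I understand it: it returns the error it was given, and it only prints at error level when the error is not -EPROBE_DEFER. Note that model_dev_err_probe(), the device name string and the fallback EPROBE_DEFER define are stand-ins invented for this sketch, not kernel API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno; absent from userspace headers */
#endif

/*
 * Hypothetical stand-in for the kernel's dev_err_probe(dev, err, fmt, ...).
 * Print the message unless probing is merely deferred (the kernel records
 * the reason for later debugging instead; this model simply stays quiet),
 * then hand back err so the caller can "return model_dev_err_probe(...)".
 */
static int model_dev_err_probe(const char *dev_name, int err,
			       const char *fmt, ...)
{
	va_list args;

	assert(fmt != NULL);

	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "%s: error %d: ", dev_name, err);
		va_start(args, fmt);
		vfprintf(stderr, fmt, args);
		va_end(args);
	}

	return err;
}
```

The point of the helper is that the separate "is it -EPROBE_DEFER?" branch and the generic error branch collapse into a single return statement at the call site.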
>
> > +
> > +             cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
> > +             if (!cpu_state) {
> > +                     ret = -ENOMEM;
> > +                     goto out;
> > +             }
> > +
> > +             cpu_dev = get_cpu_device(cpu);
> > +             if (!cpu_dev) {
> > +                     dev_err(dev, "Couldn't get cpu device.\n");
> > +                     ret = -ENODEV;
> > +                     goto out;
> > +             }
> > +
> > +             opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> > +             if (IS_ERR(devfreq->opp_table)) {
> > +                     ret = PTR_ERR(opp_table);
> > +                     goto out;
> > +             }
> > +
> > +             cpu_state->cpu_dev = cpu_dev;
> > +             cpu_state->opp_table = opp_table;
> > +             cpu_state->first_cpu = cpumask_first(policy->related_cpus);
> > +             cpu_state->curr_freq = policy->cur;
> > +             cpu_state->min_freq = policy->cpuinfo.min_freq;
> > +             cpu_state->max_freq = policy->cpuinfo.max_freq;
> > +             data->cpu_state[cpu] = cpu_state;
> > +
> > +             cpufreq_cpu_put(policy);
> > +     }
> > +
> > +out:
> > +     put_online_cpus();
> > +     if (ret)
> > +             return ret;
> > +
> > +     /* Update devfreq */
> > +     mutex_lock(&devfreq->lock);
> > +     ret = update_devfreq(devfreq);
>
> > +     mutex_unlock(&devfreq->lock);
> > +     if (ret)
> > +             dev_err(dev, "Couldn't update the frequency.\n");
> > +
> > +     return ret;
> > +}
> > +
> > +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>
> As I commented above, please change the name as following:
> - cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier
>
> > +{
> > +     struct devfreq_passive_data *data = *p_data;
> > +     struct devfreq_cpu_state *cpu_state;
> > +     int cpu;
> > +
> > +     if (data->nb.notifier_call)
> > +             cpufreq_unregister_notifier(&data->nb,
> > +                                         CPUFREQ_TRANSITION_NOTIFIER);
> > +
> > +     for_each_possible_cpu(cpu) {
> > +             cpu_state = data->cpu_state[cpu];
> > +             if (cpu_state) {
> > +                     if (cpu_state->opp_table)
> > +                             dev_pm_opp_put_opp_table(cpu_state->opp_table);
> > +                     kfree(cpu_state);
> > +                     cpu_state = NULL;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > +{
> > +     struct notifier_block *nb = &(*p_data)->nb;
> > +     int ret = 0;
> > +
> > +     switch ((*p_data)->parent_type) {
> > +     case DEVFREQ_PARENT_DEV:
> > +             nb->notifier_call = devfreq_passive_notifier_call;
> > +             ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
> > +                                             DEVFREQ_TRANSITION_NOTIFIER);
> > +             break;
> > +     case CPUFREQ_PARENT_DEV:
> > +             ret = cpufreq_passive_register(p_data);
> > +             break;
> > +     default:
> > +             ret = -EINVAL;
> > +             break;
> > +     }
> > +     return ret;
> > +}
> > +
> > +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > +{
> > +     int ret = 0;
> > +
> > +     switch ((*p_data)->parent_type) {
> > +     case DEVFREQ_PARENT_DEV:
> > +             WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
> > +                                                 &(*p_data)->nb,
> > +                                                 DEVFREQ_TRANSITION_NOTIFIER));
> > +             break;
> > +     case CPUFREQ_PARENT_DEV:
> > +             cpufreq_passive_unregister(p_data);
> > +             break;
> > +     default:
> > +             ret = -EINVAL;
> > +             break;
> > +     }
> > +     return ret;
> > +}
>
> I think you don't need to define register_parent_dev_notifier
> and unregister_parent_dev_notifier as separate functions.
>
> Instead of separate functions, just add the code
> into devfreq_passive_event_handler.
>
>
> > +
> >  static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >                               unsigned int event, void *data)
> >  {
> >       struct devfreq_passive_data *p_data
> >                       = (struct devfreq_passive_data *)devfreq->data;
> >       struct devfreq *parent = (struct devfreq *)p_data->parent;
> > -     struct notifier_block *nb = &p_data->nb;
> >       int ret = 0;
> >
> > -     if (!parent)
> > +     if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
> >               return -EPROBE_DEFER;
> >
> >       switch (event) {
> > @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >               if (!p_data->this)
> >                       p_data->this = devfreq;
> >
> > -             nb->notifier_call = devfreq_passive_notifier_call;
> > -             ret = devfreq_register_notifier(parent, nb,
> > -                                     DEVFREQ_TRANSITION_NOTIFIER);
> > +             ret = register_parent_dev_notifier(&p_data);
> >               break;
> > +
> >       case DEVFREQ_GOV_STOP:
> > -             WARN_ON(devfreq_unregister_notifier(parent, nb,
> > -                                     DEVFREQ_TRANSITION_NOTIFIER));
> > +             ret = unregister_parent_dev_notifier(&p_data);
> >               break;
> >       default:
> >               break;
> > diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> > index 26ea0850be9b..e0093b7c805c 100644
> > --- a/include/linux/devfreq.h
> > +++ b/include/linux/devfreq.h
> > @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
> >
> >  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
> >  /**
> > + * struct devfreq_cpu_state - holds the per-cpu state
> > + * @freq:    the current frequency of the cpu.
> > + * @min_freq:        the min frequency of the cpu.
> > + * @max_freq:        the max frequency of the cpu.
> > + * @first_cpu:       the cpumask of the first cpu of a policy.
> > + * @dev:     reference to cpu device.
> > + * @opp_table:       reference to cpu opp table.
> > + *
> > + * This structure stores the required cpu_state of a cpu.
> > + * This is auto-populated by the governor.
> > + */
> > +struct devfreq_cpu_state;
> > +
> > +enum devfreq_parent_dev_type {
> > +     DEVFREQ_PARENT_DEV,
> > +     CPUFREQ_PARENT_DEV,
> > +};
> > +
> > +/**
> >   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
> >   *   and devfreq_add_device
> >   * @parent:  the devfreq instance of parent device.
> > @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
> >   *                   using governors except for passive governor.
> >   *                   If the devfreq device has the specific method to decide
> >   *                   the next frequency, should use this callback.
> > + * @parent_type:     parent type of the device
> >   * @this:    the devfreq instance of own device.
> >   * @nb:              the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> > + * @cpu_state:               the state min/max/current frequency of all online cpu's
> >   *
> >   * The devfreq_passive_data have to set the devfreq instance of parent
> >   * device with governors except for the passive governor. But, don't need to
> > - * initialize the 'this' and 'nb' field because the devfreq core will handle
> > - * them.
> > + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> > + * will handle them.
> >   */
> >  struct devfreq_passive_data {
> >       /* Should set the devfreq instance of parent device */
> > @@ -305,9 +326,13 @@ struct devfreq_passive_data {
> >       /* Optional callback to decide the next frequency of passvice device */
> >       int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
> >
> > +     /* Should set the type of parent device */
> > +     enum devfreq_parent_dev_type parent_type;
> > +
> >       /* For passive governor's internal use. Don't need to set them */
> >       struct devfreq *this;
> >       struct notifier_block nb;
> > +     struct devfreq_cpu_state *cpu_state[NR_CPUS];
> >  };
> >  #endif
> >
> >
>

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-31  8:35         ` Chanwoo Choi
@ 2021-03-31 13:03           ` andrew-sh.cheng
  2021-04-01  0:16             ` Chanwoo Choi
  0 siblings, 1 reply; 31+ messages in thread
From: andrew-sh.cheng @ 2021-03-31 13:03 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On Wed, 2021-03-31 at 17:35 +0900, Chanwoo Choi wrote:
> On 3/31/21 5:27 PM, Chanwoo Choi wrote:
> > Hi,
> > 
> > On 3/31/21 5:03 PM, andrew-sh.cheng wrote:
> >> On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
> >>> Hi,
> >>>
> >>> You forgot to send these patches to the linux-pm mailing list.
> >>> Please send them to the linux-pm ML as well.
> >>>
> >>> Also, before I received this series, I had tried to clean up these patches
> >>> on a testing branch[1], so my comments below include my cleaned-up version.
> >>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
> >>>
> >>> And 'Saravana Kannan <skannan@codeaurora.org>' is the wrong email address.
> >>> Please update the address or drop it.
> >>
> >> Hi Chanwoo,
> >>
> >> Thank you for the advice.
> >> I will resend the series as v9 (adding the linux-pm ML), remove this patch,
> >> and note that my patch set is based on
> >> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
> > 
> > I have not yet tested this patch[1] on the devfreq-testing-passive-gov branch.
> > So, if possible, I'd like you to test your patches together with this patch[1],
> > and if there is no problem, could you send the next version along with patch[1]?
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=39c80d11a8f42dd63ecea1e0df595a0ceb83b454
> 
> 
> Sorry for the confusion. I made the devfreq-testing-passive-gov[1]
> branch based on the latest devfreq-next branch.
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
> 
> First of all, if possible, I want them[1] to be tested with your patches in this series.
> If there are no problems, please let me know. After your confirmation,
> I'll send the patches from the devfreq-testing-passive-gov[1] branch.
> How about that?
> 
Hi Chanwoo,

We will use this in the Google Chrome project.
Hsin-Yi from Google has tested your patch together with my patch set v8 [2~8]:

    - made sure cci devfreq runs with cpufreq
    - suspend/resume
    - speedometer2 benchmark

The results are okay.

Please send the patches from the devfreq-testing-passive-gov[1] branch.

I will send patch v9 based on yours later.


> 
> > 
> >>
> >>
> >>>
> >>>
> >>> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
> >>>> From: Saravana Kannan <skannan@codeaurora.org>
> >>>>
> >>>> Many CPU architectures have caches that can scale independent of the
> >>>> CPUs. Frequency scaling of the caches is necessary to make sure that the
> >>>> cache is not a performance bottleneck that leads to poor performance and
> >>>> power. The same idea applies for RAM/DDR.
> >>>>
> >>>> To achieve this, this patch adds support for cpu based scaling to the
> >>>> passive governor. This is accomplished by taking the current frequency
> >>>> of each CPU frequency domain and then adjust the frequency of the cache
> >>>> (or any devfreq device) based on the frequency of the CPUs. It listens
> >>>> to CPU frequency transition notifiers to keep itself up to date on the
> >>>> current CPU frequency.
> >>>>
> >>>> To decide the frequency of the device, the governor does one of the
> >>>> following:
> >>>> * Derives the optimal devfreq device opp from required-opps property of
> >>>>   the parent cpu opp_table.
> >>>>
> >>>> * Scales the device frequency in proportion to the CPU frequency. So, if
> >>>>   the CPUs are running at their max frequency, the device runs at its
> >>>>   max frequency. If the CPUs are running at their min frequency, the
> >>>>   device runs at its min frequency. It is interpolated for frequencies
> >>>>   in between.
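
The proportional scaling described in the second bullet can be sketched in userspace as follows. This is only an illustration under assumptions: scale_dev_freq() and mult_frac_ul() are names invented here (the latter mirrors the kernel's mult_frac() macro), the assert() checks are my own, and the frequencies in the note below are made up.

```c
#include <assert.h>

/*
 * Userspace equivalent of the kernel's mult_frac(x, numer, denom):
 * computes x * numer / denom while reducing the risk of overflow in
 * the intermediate product.
 */
static unsigned long mult_frac_ul(unsigned long x, unsigned long numer,
				  unsigned long denom)
{
	unsigned long quot = x / denom;
	unsigned long rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}

/* Map a CPU frequency (kHz) linearly onto the device range (Hz). */
static unsigned long scale_dev_freq(unsigned long cpu_cur, unsigned long cpu_min,
				    unsigned long cpu_max, unsigned long dev_min,
				    unsigned long dev_max)
{
	unsigned long cpu_percent;

	assert(cpu_max > cpu_min && cpu_cur >= cpu_min && cpu_cur <= cpu_max);
	assert(dev_max >= dev_min);

	/* Position of the CPU inside its own range, in percent. */
	cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);

	/* Same position inside the device range. */
	return dev_min + mult_frac_ul(dev_max - dev_min, cpu_percent, 100);
}
```

For a CPU range of 500000..2000000 kHz and a hypothetical device range of 273000000..780000000 Hz, a CPU at its minimum maps to 273000000 Hz, a CPU at its maximum maps to 780000000 Hz, and a CPU at 1250000 kHz (50 percent) maps to 526500000 Hz.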
> >>>>
> >>>> Changes by Andrew-sh.Cheng:
> >>>> * Change dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp, and
> >>>>   devfreq->max_freq to
> >>>>   devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value,
> >>>>   after kernel-5.7.
> >>>> * Don't return -EINVAL in devfreq_passive_event_handler(), since it
> >>>>   doesn't handle the DEVFREQ_GOV_SUSPEND and DEVFREQ_GOV_RESUME cases.
> >>>>
> >>>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> >>>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> >>>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> >>>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> >>>> ---
> >>>>  drivers/devfreq/Kconfig            |   2 +
> >>>>  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
> >>>>  include/linux/devfreq.h            |  29 +++-
> >>>>  3 files changed, 342 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> >>>> index 00704efe6398..f56132b0ae64 100644
> >>>> --- a/drivers/devfreq/Kconfig
> >>>> +++ b/drivers/devfreq/Kconfig
> >>>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
> >>>>  	  device. This governor does not change the frequency by itself
> >>>>  	  through sysfs entries. The passive governor recommends that
> >>>>  	  devfreq device uses the OPP table to get the frequency/voltage.
> >>>> +	  Alternatively the governor can also be chosen to scale based on
> >>>> +	  the online CPUs current frequency.
> >>>>  
> >>>>  comment "DEVFREQ Drivers"
> >>>>  
> >>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> >>>> index b094132bd20b..9cc57b083839 100644
> >>>> --- a/drivers/devfreq/governor_passive.c
> >>>> +++ b/drivers/devfreq/governor_passive.c
> >>>> @@ -8,11 +8,103 @@
> >>>>   */
> >>>>  
> >>>>  #include <linux/module.h>
> >>>> +#include <linux/cpu.h>
> >>>> +#include <linux/cpufreq.h>
> >>>> +#include <linux/cpumask.h>
> >>>>  #include <linux/device.h>
> >>>>  #include <linux/devfreq.h>
> >>>> +#include <linux/slab.h>
> >>>>  #include "governor.h"
> >>>>  
> >>>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>>> +struct devfreq_cpu_state {
> >>>> +	unsigned int curr_freq;
> >>>> +	unsigned int min_freq;
> >>>> +	unsigned int max_freq;
> >>>> +	unsigned int first_cpu;
> >>>> +	struct device *cpu_dev;
> >>>> +	struct opp_table *opp_table;
> >>>> +};
> >>>
> >>> As I recall, the previous version had a description of the structure.
> >>> I want to add a description like the one below.
> >>>
> >>> And if you have no objection, I'd like you to order
> >>> the variables as follows and use 'dev' instead of 'cpu_dev',
> >>> because this patch uses 'cpu_state->cpu_dev' at multiple points.
> >>> I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
> >>> Also, I prefer 'cur_freq' to 'curr_freq'
> >>> because the devfreq subsystem uses 'cur_freq' to express the 'current frequency'.
> >>>
> >>> /**
> >>>  * struct devfreq_cpu_state - Hold the per-cpu data
> >>>  * @dev:        reference to cpu device.
> >>>  * @first_cpu:  the cpumask of the first cpu of a policy.
> >>>  * @opp_table:  reference to cpu opp table.
> >>>  * @cur_freq:   the current frequency of the cpu.
> >>>  * @min_freq:   the min frequency of the cpu.
> >>>  * @max_freq:   the max frequency of the cpu.
> >>>  *
> >>>  * This structure stores the required cpu_data of a cpu.
> >>>  * This is auto-populated by the governor.
> >>>  */
> >>> struct devfreq_cpu_state {
> >>>          struct device *dev;
> >>>          unsigned int first_cpu;
> >>>
> >>>          struct opp_table *opp_table;
> >>>          unsigned int cur_freq;
> >>>          unsigned int min_freq;
> >>>          unsigned int max_freq;
> >>> };
> >>>
> >>>
> >>>> +
> >>>> +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> >>>> +					      unsigned int cpu)
> >>>> +{
> >>>> +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
> >>>> +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
> >>>> +
> >>>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> >>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >>>> +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
> >>>> +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> >>>> +	unsigned long cpu_curr_freq, freq;
> >>>> +
> >>>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> >>>> +	    !cpu_state->opp_table || !devfreq->opp_table)
> >>>> +		return 0;
> >>>> +
> >>>> +	cpu_curr_freq = cpu_state->curr_freq * 1000;
> >>>> +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
> >>>> +	if (IS_ERR(p_opp))
> >>>> +		return 0;
> >>>> +
> >>>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> >>>> +					    devfreq->opp_table, p_opp);
> >>>> +	dev_pm_opp_put(p_opp);
> >>>> +
> >>>> +	if (!IS_ERR(opp)) {
> >>>> +		freq = dev_pm_opp_get_freq(opp);
> >>>> +		dev_pm_opp_put(opp);
> >>>> +		goto out;
> >>>> +	}
> >>>> +
> >>>> +	/* Use Interpolation if required opps is not available */
> >>>> +	cpu_min_freq = cpu_state->min_freq;
> >>>> +	cpu_max_freq = cpu_state->max_freq;
> >>>> +	cpu_curr_freq_khz = cpu_state->curr_freq;
> >>>> +
> >>>> +	if (dev_freq_table) {
> >>>> +		/* Get minimum frequency according to sorting order */
> >>>> +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
> >>>> +		if (dev_freq_table[0] < dev_max_state) {
> >>>> +			dev_min_freq = dev_freq_table[0];
> >>>> +			dev_max_freq = dev_max_state;
> >>>> +		} else {
> >>>> +			dev_min_freq = dev_max_state;
> >>>> +			dev_max_freq = dev_freq_table[0];
> >>>> +		}
> >>>> +	} else {
> >>>> +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> >>>> +						     DEV_PM_QOS_MIN_FREQUENCY);
> >>>> +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
> >>>> +						     DEV_PM_QOS_MAX_FREQUENCY);
> >>>> +
> >>>> +		if (dev_max_freq <= dev_min_freq)
> >>>> +			return 0;
> >>>> +	}
> >>>> +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / cpu_max_freq - cpu_min_freq;
> >>>> +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
> >>>> +
> >>>> +out:
> >>>> +	return freq;
> >>>> +}
> >>>> +
> >>>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> >>>> +					unsigned long *freq)
> >>>> +{
> >>>> +	struct devfreq_passive_data *p_data =
> >>>> +				(struct devfreq_passive_data *)devfreq->data;
> >>>> +	unsigned int cpu;
> >>>> +	unsigned long target_freq = 0;
> >>>> +
> >>>> +	for_each_online_cpu(cpu)
> >>>> +		target_freq = max(target_freq,
> >>>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> >>>> +
> >>>> +	*freq = target_freq;
> >>>> +
> >>>> +	return 0;
> >>>> +}
> >>>
> >>> As you know, governor_passive.c already uses
> >>> both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
> >>> to get the target from the OPP. So, I want to make a common function
> >>> like 'get_taget_freq_by_required_opp', as shown below.
> >>> If 'get_taget_freq_by_required_opp' is defined this way,
> >>> it can also be used for get_target_freq_with_devfreq().
> >>> After finishing the review of this patch, I'll send the patch[2].
> >>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf
> >>>
> >>> For example (this code is not tested):
> >>> static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
> >>> 						struct opp_table *p_opp_table,
> >>> 						struct opp_table *opp_table,
> >>> 						unsigned long freq)
> >>> {
> >>> 	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
> >>>
> >>> 	if (!p_dev || !p_opp_table || !opp_table || !freq)
> >>> 		return 0;
> >>>
> >>> 	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
> >>> 	if (IS_ERR(p_opp))
> >>> 		return 0;
> >>>
> >>> 	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
> >>> 	dev_pm_opp_put(p_opp);
> >>>
> >>> 	if (IS_ERR(opp))
> >>> 		return 0;
> >>>
> >>> 	freq = dev_pm_opp_get_freq(opp);
> >>> 	dev_pm_opp_put(opp);
> >>>
> >>> 	return freq;
> >>> }
> >>>
> >>> static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> >>> 					unsigned long *target_freq)
> >>> {
> >>> 	struct devfreq_passive_data *p_data =
> >>> 				(struct devfreq_passive_data *)devfreq->data;
> >>> 	struct devfreq_cpu_data *cpu_data;
> >>> 	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
> >>> 	unsigned long dev_min, dev_max;
> >>> 	unsigned long freq = 0;
> >>>
> >>> 	for_each_online_cpu(cpu) {
> >>> 		cpu_data = p_data->cpu_data[cpu];
> >>> 		if (!cpu_data || cpu_data->first_cpu != cpu)
> >>> 			continue;
> >>>
> >>> 		/* Get target freq via required opps */
> >>> 		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
> >>> 		freq = get_taget_freq_by_required_opp(cpu_data->dev,
> >>> 					cpu_data->opp_table,
> >>> 					devfreq->opp_table, cpu_cur);
> >>> 		if (freq) {
> >>> 			*target_freq = max(freq, *target_freq);
> >>> 			continue;
> >>> 		}
> >>>
> >>> 		/* Use Interpolation if required opps is not available */
> >>> 		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);
> >>>
> >>> 		cpu_min = cpu_data->min_freq;
> >>> 		cpu_max = cpu_data->max_freq;
> >>> 		cpu_cur = cpu_data->cur_freq;
> >>>
> >>> 		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
> >>> 		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> >>>
> >>> 		*target_freq = max(freq, *target_freq);
> >>> 	}
> >>>
> >>> 	return 0;
> >>> }
> >>>
> >>>> +
> >>>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
> >>>>  					unsigned long *freq)
> >>>>  {
> >>>>  	struct devfreq_passive_data *p_data
> >>>> @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>>>  	int i, count;
> >>>>  
> >>>>  	/*
> >>>> -	 * If the devfreq device with passive governor has the specific method
> >>>> -	 * to determine the next frequency, should use the get_target_freq()
> >>>> -	 * of struct devfreq_passive_data.
> >>>> -	 */
> >>>> -	if (p_data->get_target_freq)
> >>>> -		return p_data->get_target_freq(devfreq, freq);
> >>>> -
> >>>> -	/*
> >>>>  	 * If the parent and passive devfreq device uses the OPP table,
> >>>>  	 * get the next frequency by using the OPP table.
> >>>>  	 */
> >>>> @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>>>  	return 0;
> >>>>  }
> >>>>  
> >>>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>>> +					   unsigned long *freq)
> >>>> +{
> >>>> +	struct devfreq_passive_data *p_data =
> >>>> +		(struct devfreq_passive_data *)devfreq->data;
> >>>> +	int ret;
> >>>> +
> >>>> +	/*
> >>>> +	 * If the devfreq device with passive governor has the specific method
> >>>> +	 * to determine the next frequency, should use the get_target_freq()
> >>>> +	 * of struct devfreq_passive_data.
> >>>> +	 */
> >>>> +	if (p_data->get_target_freq)
> >>>> +		return p_data->get_target_freq(devfreq, freq);
> >>>> +
> >>>> +	switch (p_data->parent_type) {
> >>>> +	case DEVFREQ_PARENT_DEV:
> >>>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> >>>> +		break;
> >>>> +	case CPUFREQ_PARENT_DEV:
> >>>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> >>>> +		break;
> >>>> +	default:
> >>>> +		ret = -EINVAL;
> >>>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> >>>> +		break;
> >>>> +	}
> >>>> +
> >>>> +	return ret;
> >>>> +}
> >>>> +
> >>>>  static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >>>>  				unsigned long event, void *ptr)
> >>>>  {
> >>>> @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >>>>  	return NOTIFY_DONE;
> >>>>  }
> >>>>  
> >>>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> >>>> +					 unsigned long event, void *ptr)
> >>>> +{
> >>>> +	struct devfreq_passive_data *data =
> >>>> +			container_of(nb, struct devfreq_passive_data, nb);
> >>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >>>> +	struct devfreq_cpu_state *cpu_state;
> >>>> +	struct cpufreq_freqs *cpu_freq = ptr;
> >>>
> >>> Use 'freqs' variable name.  I prefer to use the same variable name
> >>> for both devfreq_freqs and cpufreq_freqs instance.
> >>>
> >>>> +	unsigned int curr_freq;
> >>>
> >>> As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
> >>> unless there is a special reason.
> >>>
> >>>> +	int ret;
> >>>> +
> >>>> +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
> >>>> +	    !data->cpu_state[cpu_freq->policy->cpu])
> >>>> +		return 0;
> >>>> +
> >>>> +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
> >>>> +	if (cpu_state->curr_freq == cpu_freq->new)
> >>>> +		return 0;
> >>>> +
> >>>> +	/* Backup current freq and pre-update cpu state freq*/
> >>>
> >>> I think that this comment is not critical. So, please drop this comment.
> >>>
> >>>> +	curr_freq = cpu_state->curr_freq;
> >>>> +	cpu_state->curr_freq = cpu_freq->new;
> >>>> +
> >>>> +	mutex_lock(&devfreq->lock);
> >>>> +	ret = update_devfreq(devfreq);
> >>>
> >>> I recommend to use 'devfreq_update_target' instead of 'update_devfreq'
> >>> as following:
> >>> 	devfreq_update_target(devfreq, freqs->new);
> >>>
> >>>> +	mutex_unlock(&devfreq->lock);
> >>>> +	if (ret) {
> >>>> +		cpu_state->curr_freq = curr_freq;
> >>>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> >>>> +		return ret;
> >>>> +	}
> >>>> +
> >>>> +	return 0;
> >>>> +}
> >>>> +
> >>>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> >>>
> >>> To keep the function naming style consistent,
> >>> please change the name as follows, because devfreq names
> >>> its function 'devfreq_register_notifier':
> >>> - cpufreq_passive_register -> cpufreq_passive_register_notifier
> >>>
> >>>> +{
> >>>> +	struct devfreq_passive_data *data = *p_data;
> >>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >>>> +	struct device *dev = devfreq->dev.parent;
> >>>> +	struct opp_table *opp_table = NULL;
> >>>> +	struct devfreq_cpu_state *cpu_state;
> >>>> +	struct cpufreq_policy *policy;
> >>>> +	struct device *cpu_dev;
> >>>> +	unsigned int cpu;
> >>>> +	int ret;
> >>>> +
> >>>> +	get_online_cpus();
> >>>> +
> >>>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> >>>> +	ret = cpufreq_register_notifier(&data->nb,
> >>>> +					CPUFREQ_TRANSITION_NOTIFIER);
> >>>> +	if (ret) {
> >>>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> >>>> +		data->nb.notifier_call = NULL;
> >>>> +		goto out;
> >>>> +	}
> >>>> +
> >>>> +	/* Populate devfreq_cpu_state */
> >>>
> >>> Don't need this comment. Please drop it.
> >>>
> >>>> +	for_each_online_cpu(cpu) {
> >>>> +		if (data->cpu_state[cpu])
> >>>> +			continue;
> >>>> +
> >>>> +		policy = cpufreq_cpu_get(cpu);
> >>>> +		if (!policy) {
> >>>> +			ret = -EINVAL;
> >>>> +			goto out;
> >>>> +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> >>>> +			ret = -EPROBE_DEFER;
> >>>> +			goto out;
> >>>> +		} else if (IS_ERR(policy)) {
> >>>> +			ret = PTR_ERR(policy);
> >>>> +			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> >>>> +			goto out;
> >>>> +		}
> >>>
> >>> Use the dev_err_probe() function to handle the EPROBE_DEFER case.
> >>> It makes the code simpler.
> >>>
> >>>> +
> >>>> +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
> >>>> +		if (!cpu_state) {
> >>>> +			ret = -ENOMEM;
> >>>> +			goto out;
> >>>> +		}
> >>>> +
> >>>> +		cpu_dev = get_cpu_device(cpu);
> >>>> +		if (!cpu_dev) {
> >>>> +			dev_err(dev, "Couldn't get cpu device.\n");
> >>>> +			ret = -ENODEV;
> >>>> +			goto out;
> >>>> +		}
> >>>> +
> >>>> +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> >>>> +		if (IS_ERR(opp_table)) {
> >>>> +			ret = PTR_ERR(opp_table);
> >>>> +			goto out;
> >>>> +		}
> >>>> +
> >>>> +		cpu_state->cpu_dev = cpu_dev;
> >>>> +		cpu_state->opp_table = opp_table;
> >>>> +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
> >>>> +		cpu_state->curr_freq = policy->cur;
> >>>> +		cpu_state->min_freq = policy->cpuinfo.min_freq;
> >>>> +		cpu_state->max_freq = policy->cpuinfo.max_freq;
> >>>> +		data->cpu_state[cpu] = cpu_state;
> >>>> +
> >>>> +		cpufreq_cpu_put(policy);
> >>>> +	}
> >>>> +
> >>>> +out:
> >>>> +	put_online_cpus();
> >>>> +	if (ret)
> >>>> +		return ret;
> >>>> +
> >>>> +	/* Update devfreq */
> >>>> +	mutex_lock(&devfreq->lock);
> >>>> +	ret = update_devfreq(devfreq);
> >>>
> >>>> +	mutex_unlock(&devfreq->lock);
> >>>> +	if (ret)
> >>>> +		dev_err(dev, "Couldn't update the frequency.\n");
> >>>> +
> >>>> +	return ret;
> >>>> +}
> >>>> +
> >>>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> >>>
> >>> As I commented above, please change the name as following:
> >>> - cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier
> >>>
> >>>> +{
> >>>> +	struct devfreq_passive_data *data = *p_data;
> >>>> +	struct devfreq_cpu_state *cpu_state;
> >>>> +	int cpu;
> >>>> +
> >>>> +	if (data->nb.notifier_call)
> >>>> +		cpufreq_unregister_notifier(&data->nb,
> >>>> +					    CPUFREQ_TRANSITION_NOTIFIER);
> >>>> +
> >>>> +	for_each_possible_cpu(cpu) {
> >>>> +		cpu_state = data->cpu_state[cpu];
> >>>> +		if (cpu_state) {
> >>>> +			if (cpu_state->opp_table)
> >>>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> >>>> +			kfree(cpu_state);
> >>>> +			data->cpu_state[cpu] = NULL;
> >>>> +		}
> >>>> +	}
> >>>> +
> >>>> +	return 0;
> >>>> +}
> >>>> +
> >>>> +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> >>>> +{
> >>>> +	struct notifier_block *nb = &(*p_data)->nb;
> >>>> +	int ret = 0;
> >>>> +
> >>>> +	switch ((*p_data)->parent_type) {
> >>>> +	case DEVFREQ_PARENT_DEV:
> >>>> +		nb->notifier_call = devfreq_passive_notifier_call;
> >>>> +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
> >>>> +						DEVFREQ_TRANSITION_NOTIFIER);
> >>>> +		break;
> >>>> +	case CPUFREQ_PARENT_DEV:
> >>>> +		ret = cpufreq_passive_register(p_data);
> >>>> +		break;
> >>>> +	default:
> >>>> +		ret = -EINVAL;
> >>>> +		break;
> >>>> +	}
> >>>> +	return ret;
> >>>> +}
> >>>> +
> >>>> +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> >>>> +{
> >>>> +	int ret = 0;
> >>>> +
> >>>> +	switch ((*p_data)->parent_type) {
> >>>> +	case DEVFREQ_PARENT_DEV:
> >>>> +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
> >>>> +						    &(*p_data)->nb,
> >>>> +						    DEVFREQ_TRANSITION_NOTIFIER));
> >>>> +		break;
> >>>> +	case CPUFREQ_PARENT_DEV:
> >>>> +		cpufreq_passive_unregister(p_data);
> >>>> +		break;
> >>>> +	default:
> >>>> +		ret = -EINVAL;
> >>>> +		break;
> >>>> +	}
> >>>> +	return ret;
> >>>> +}
> >>>
> >>> I think that you don't need to define register_parent_dev_notifier
> >>> and unregister_parent_dev_notifier as the separate functions.
> >>>
> >>> Instead of the separate functions, just add the code
> >>> into devfreq_passive_event_handler.
> >>>
> >>>
> >>>> +
> >>>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>>>  				unsigned int event, void *data)
> >>>>  {
> >>>>  	struct devfreq_passive_data *p_data
> >>>>  			= (struct devfreq_passive_data *)devfreq->data;
> >>>>  	struct devfreq *parent = (struct devfreq *)p_data->parent;
> >>>> -	struct notifier_block *nb = &p_data->nb;
> >>>>  	int ret = 0;
> >>>>  
> >>>> -	if (!parent)
> >>>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
> >>>>  		return -EPROBE_DEFER;
> >>>>  
> >>>>  	switch (event) {
> >>>> @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>>>  		if (!p_data->this)
> >>>>  			p_data->this = devfreq;
> >>>>  
> >>>> -		nb->notifier_call = devfreq_passive_notifier_call;
> >>>> -		ret = devfreq_register_notifier(parent, nb,
> >>>> -					DEVFREQ_TRANSITION_NOTIFIER);
> >>>> +		ret = register_parent_dev_notifier(&p_data);
> >>>>  		break;
> >>>> +
> >>>>  	case DEVFREQ_GOV_STOP:
> >>>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> >>>> -					DEVFREQ_TRANSITION_NOTIFIER));
> >>>> +		ret = unregister_parent_dev_notifier(&p_data);
> >>>>  		break;
> >>>>  	default:
> >>>>  		break;
> >>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> >>>> index 26ea0850be9b..e0093b7c805c 100644
> >>>> --- a/include/linux/devfreq.h
> >>>> +++ b/include/linux/devfreq.h
> >>>> @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
> >>>>  
> >>>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
> >>>>  /**
> >>>> + * struct devfreq_cpu_state - holds the per-cpu state
> >>>> + * @freq:	the current frequency of the cpu.
> >>>> + * @min_freq:	the min frequency of the cpu.
> >>>> + * @max_freq:	the max frequency of the cpu.
> >>>> + * @first_cpu:	the cpumask of the first cpu of a policy.
> >>>> + * @dev:	reference to cpu device.
> >>>> + * @opp_table:	reference to cpu opp table.
> >>>> + *
> >>>> + * This structure stores the required cpu_state of a cpu.
> >>>> + * This is auto-populated by the governor.
> >>>> + */
> >>>> +struct devfreq_cpu_state;
> >>>> +
> >>>> +enum devfreq_parent_dev_type {
> >>>> +	DEVFREQ_PARENT_DEV,
> >>>> +	CPUFREQ_PARENT_DEV,
> >>>> +};
> >>>> +
> >>>> +/**
> >>>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
> >>>>   *	and devfreq_add_device
> >>>>   * @parent:	the devfreq instance of parent device.
> >>>> @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
> >>>>   *			using governors except for passive governor.
> >>>>   *			If the devfreq device has the specific method to decide
> >>>>   *			the next frequency, should use this callback.
> >>>> + * @parent_type:	parent type of the device
> >>>>   * @this:	the devfreq instance of own device.
> >>>>   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> >>>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
> >>>>   *
> >>>>   * The devfreq_passive_data have to set the devfreq instance of parent
> >>>>   * device with governors except for the passive governor. But, don't need to
> >>>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> >>>> - * them.
> >>>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> >>>> + * will handle them.
> >>>>   */
> >>>>  struct devfreq_passive_data {
> >>>>  	/* Should set the devfreq instance of parent device */
> >>>> @@ -305,9 +326,13 @@ struct devfreq_passive_data {
> >>>>  	/* Optional callback to decide the next frequency of passvice device */
> >>>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
> >>>>  
> >>>> +	/* Should set the type of parent device */
> >>>> +	enum devfreq_parent_dev_type parent_type;
> >>>> +
> >>>>  	/* For passive governor's internal use. Don't need to set them */
> >>>>  	struct devfreq *this;
> >>>>  	struct notifier_block nb;
> >>>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
> >>>>  };
> >>>>  #endif
> >>>>  
> >>>>
> >>>
> >>
> > 
> > 
> 
> 
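
For reference, the interpolation fallback discussed in the quoted review can
be sketched in user-space C. This is a minimal sketch, not the governor's
code: interpolate_dev_freq() and its parameters are hypothetical names, the
clamping and the zero return for a degenerate range are assumptions, and the
kernel's mult_frac() is replaced by plain 64-bit arithmetic. Note the
parentheses around (cpu_max - cpu_min); without them the division binds
first and the subtraction can wrap in unsigned arithmetic.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): map a CPU frequency onto the
 * device frequency range by linear interpolation.
 * Returns 0 if either range is empty or inverted (assumed convention).
 */
static unsigned long interpolate_dev_freq(unsigned long cpu_cur,
					  unsigned long cpu_min,
					  unsigned long cpu_max,
					  unsigned long dev_min,
					  unsigned long dev_max)
{
	unsigned long cpu_percent;
	unsigned long long span;

	if (cpu_max <= cpu_min || dev_max <= dev_min)
		return 0;

	/* Clamp so that the unsigned subtraction below cannot wrap. */
	if (cpu_cur < cpu_min)
		cpu_cur = cpu_min;
	if (cpu_cur > cpu_max)
		cpu_cur = cpu_max;

	/* Position of cpu_cur inside [cpu_min, cpu_max], in percent. */
	cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);

	/* Same position inside [dev_min, dev_max]; 64-bit avoids overflow. */
	span = (unsigned long long)(dev_max - dev_min) * cpu_percent / 100;

	return dev_min + (unsigned long)span;
}
```

With a CPU range of 500000..2000000 kHz and a device range of
200000000..600000000 Hz, a CPU at 1250000 kHz sits at 50 percent of its
range and maps to 400000000 Hz.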

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-03-31 13:03           ` andrew-sh.cheng
@ 2021-04-01  0:16             ` Chanwoo Choi
  2021-04-08  2:47               ` Chanwoo Choi
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-04-01  0:16 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On 3/31/21 10:03 PM, andrew-sh.cheng wrote:
> On Wed, 2021-03-31 at 17:35 +0900, Chanwoo Choi wrote:
>> On 3/31/21 5:27 PM, Chanwoo Choi wrote:
>>> Hi,
>>>
>>> On 3/31/21 5:03 PM, andrew-sh.cheng wrote:
>>>> On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
>>>>> Hi,
>>>>>
>>>>> You did not send these patches to the linux-pm mailing list.
>>>>> They need to be sent to the linux-pm ML.
>>>>>
>>>>> Also, before I received this series, I tried to clean up these patches
>>>>> on a testing branch[1], so I add my comments based on my clean-up.
>>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>>>>
>>>>> And 'Saravana Kannan <skannan@codeaurora.org>' is a wrong email address.
>>>>> Please update the email or drop it.
>>>>
>>>> Hi Chanwoo,
>>>>
>>>> Thank you for the advice.
>>>> I will resend as patch v9 (adding the linux-pm ML), remove this patch, and note
>>>> that my patch set is based on
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>>
>>> I have not yet tested this patch[1] on the devfreq-testing-passive-gov branch.
>>> So, if possible, I'd like you to test your patches with this patch[1],
>>> and then if there is no problem, could you send the next patches with patch[1]?
>>>
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=39c80d11a8f42dd63ecea1e0df595a0ceb83b454
>>
>>
>> Sorry for the confusion. I made the devfreq-testing-passive-gov[1]
>> branch based on the latest devfreq-next branch.
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>
>> First of all, if possible, I'd like them[1] to be tested with your patches in this series.
>> If there is no problem, please let me know. After your confirmation,
>> I'll send the patches of the devfreq-testing-passive-gov[1] branch.
>> How about that?
>>
> Hi Chanwoo~
> 
> We will use this on the Google Chrome project.
> Hsin-Yi from Google has tested your patch + my patch set v8 [2~8]:
> 
>     make sure cci devfreq runs with cpufreq
>     suspend/resume
>     speedometer2 benchmark
> It is okay.
> 
> Please send the patches of the devfreq-testing-passive-gov[1] branch.
> 
> I will send patch v9 based on yours later.

Thanks for your test. I'll send the patches today.

> 
> 
>>
>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>> On 3/23/21 8:33 PM, Andrew-sh.Cheng wrote:
>>>>>> From: Saravana Kannan <skannan@codeaurora.org>
>>>>>>
>>>>>> Many CPU architectures have caches that can scale independent of the
>>>>>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>>>>>> cache is not a performance bottleneck that leads to poor performance and
>>>>>> power. The same idea applies for RAM/DDR.
>>>>>>
>>>>>> To achieve this, this patch adds support for cpu based scaling to the
>>>>>> passive governor. This is accomplished by taking the current frequency
>>>>>> of each CPU frequency domain and then adjust the frequency of the cache
>>>>>> (or any devfreq device) based on the frequency of the CPUs. It listens
>>>>>> to CPU frequency transition notifiers to keep itself up to date on the
>>>>>> current CPU frequency.
>>>>>>
>>>>>> To decide the frequency of the device, the governor does one of the
>>>>>> following:
>>>>>> * Derives the optimal devfreq device opp from required-opps property of
>>>>>>   the parent cpu opp_table.
>>>>>>
>>>>>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>>>>>   the CPUs are running at their max frequency, the device runs at its
>>>>>>   max frequency. If the CPUs are running at their min frequency, the
>>>>>>   device runs at its min frequency. It is interpolated for frequencies
>>>>>>   in between.
>>>>>>
>>>>>> Andrew-sh.Cheng changes:
>>>>>> * Change dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp.
>>>>>> * Change devfreq->max_freq to
>>>>>>   devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>>>>>>   after kernel-5.7.
>>>>>> * Don't return -EINVAL in devfreq_passive_event_handler(),
>>>>>>   since it doesn't handle the DEVFREQ_GOV_SUSPEND and
>>>>>>   DEVFREQ_GOV_RESUME cases.
>>>>>>
>>>>>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>>>>>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>>>>>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>>>>>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>>>>>> ---
>>>>>>  drivers/devfreq/Kconfig            |   2 +
>>>>>>  drivers/devfreq/governor_passive.c | 329 +++++++++++++++++++++++++++++++++++--
>>>>>>  include/linux/devfreq.h            |  29 +++-
>>>>>>  3 files changed, 342 insertions(+), 18 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>>>>> index 00704efe6398..f56132b0ae64 100644
>>>>>> --- a/drivers/devfreq/Kconfig
>>>>>> +++ b/drivers/devfreq/Kconfig
>>>>>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>>>>>  	  device. This governor does not change the frequency by itself
>>>>>>  	  through sysfs entries. The passive governor recommends that
>>>>>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>>>>>> +	  Alternatively the governor can also be chosen to scale based on
>>>>>> +	  the online CPUs current frequency.
>>>>>>  
>>>>>>  comment "DEVFREQ Drivers"
>>>>>>  
>>>>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>>>>>> index b094132bd20b..9cc57b083839 100644
>>>>>> --- a/drivers/devfreq/governor_passive.c
>>>>>> +++ b/drivers/devfreq/governor_passive.c
>>>>>> @@ -8,11 +8,103 @@
>>>>>>   */
>>>>>>  
>>>>>>  #include <linux/module.h>
>>>>>> +#include <linux/cpu.h>
>>>>>> +#include <linux/cpufreq.h>
>>>>>> +#include <linux/cpumask.h>
>>>>>>  #include <linux/device.h>
>>>>>>  #include <linux/devfreq.h>
>>>>>> +#include <linux/slab.h>
>>>>>>  #include "governor.h"
>>>>>>  
>>>>>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>>>> +struct devfreq_cpu_state {
>>>>>> +	unsigned int curr_freq;
>>>>>> +	unsigned int min_freq;
>>>>>> +	unsigned int max_freq;
>>>>>> +	unsigned int first_cpu;
>>>>>> +	struct device *cpu_dev;
>>>>>> +	struct opp_table *opp_table;
>>>>>> +};
>>>>>
>>>>> As I recall, the previous version had a description of the structure.
>>>>> I want to add a description like the one below.
>>>>>
>>>>> And if you have no objection, I'd like you to order
>>>>> the variables as follows and use 'dev' instead of 'cpu_dev'
>>>>> because this patch uses 'cpu_state->cpu_dev' at multiple points.
>>>>> I think that 'cpu_state->dev' is better than 'cpu_state->cpu_dev'.
>>>>> Also, I prefer to use 'cur_freq' instead of 'curr_freq'
>>>>> because the devfreq subsystem uses 'cur_freq' for expressing the 'current frequency'.
>>>>>
>>>>> /**                                                                             
>>>>>  * struct devfreq_cpu_state - Hold the per-cpu data                              
>>>>>  * @dev:        reference to cpu device.                                        
>>>>>  * @first_cpu:  the cpumask of the first cpu of a policy.                       
>>>>>  * @opp_table:  reference to cpu opp table.                                     
>>>>>  * @cur_freq:   the current frequency of the cpu.                               
>>>>>  * @min_freq:   the min frequency of the cpu.                                   
>>>>>  * @max_freq:   the max frequency of the cpu.                                   
>>>>>  *                                                                              
>>>>>  * This structure stores the required cpu_data of a cpu.                        
>>>>>  * This is auto-populated by the governor.                                      
>>>>>  */                                                                             
>>>>> struct devfreq_cpu_state {                                                       
>>>>>          struct device *dev;                                                     
>>>>>          unsigned int first_cpu;                                                 
>>>>>
>>>>>          struct opp_table *opp_table;                                            
>>>>>          unsigned int cur_freq;                                                  
>>>>>          unsigned int min_freq;                                                  
>>>>>          unsigned int max_freq;                                                  
>>>>> };               
>>>>>
>>>>>
>>>>>> +
>>>>>> +static unsigned long xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
>>>>>> +					      unsigned int cpu)
>>>>>> +{
>>>>>> +	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq_khz, cpu_percent;
>>>>>> +	unsigned long dev_min_freq, dev_max_freq, dev_max_state;
>>>>>> +
>>>>>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>>>>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>>>>> +	unsigned long *dev_freq_table = devfreq->profile->freq_table;
>>>>>> +	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>>>>>> +	unsigned long cpu_curr_freq, freq;
>>>>>> +
>>>>>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>>>>>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>>>>>> +		return 0;
>>>>>> +
>>>>>> +	cpu_curr_freq = cpu_state->curr_freq * 1000;
>>>>>> +	p_opp = devfreq_recommended_opp(cpu_state->cpu_dev, &cpu_curr_freq, 0);
>>>>>> +	if (IS_ERR(p_opp))
>>>>>> +		return 0;
>>>>>> +
>>>>>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>>>>>> +					    devfreq->opp_table, p_opp);
>>>>>> +	dev_pm_opp_put(p_opp);
>>>>>> +
>>>>>> +	if (!IS_ERR(opp)) {
>>>>>> +		freq = dev_pm_opp_get_freq(opp);
>>>>>> +		dev_pm_opp_put(opp);
>>>>>> +		goto out;
>>>>>> +	}
>>>>>> +
>>>>>> +	/* Use Interpolation if required opps is not available */
>>>>>> +	cpu_min_freq = cpu_state->min_freq;
>>>>>> +	cpu_max_freq = cpu_state->max_freq;
>>>>>> +	cpu_curr_freq_khz = cpu_state->curr_freq;
>>>>>> +
>>>>>> +	if (dev_freq_table) {
>>>>>> +		/* Get minimum frequency according to sorting order */
>>>>>> +		dev_max_state = dev_freq_table[devfreq->profile->max_state - 1];
>>>>>> +		if (dev_freq_table[0] < dev_max_state) {
>>>>>> +			dev_min_freq = dev_freq_table[0];
>>>>>> +			dev_max_freq = dev_max_state;
>>>>>> +		} else {
>>>>>> +			dev_min_freq = dev_max_state;
>>>>>> +			dev_max_freq = dev_freq_table[0];
>>>>>> +		}
>>>>>> +	} else {
>>>>>> +		dev_min_freq = dev_pm_qos_read_value(devfreq->dev.parent,
>>>>>> +						     DEV_PM_QOS_MIN_FREQUENCY);
>>>>>> +		dev_max_freq = dev_pm_qos_read_value(devfreq->dev.parent,
>>>>>> +						     DEV_PM_QOS_MAX_FREQUENCY);
>>>>>> +
>>>>>> +		if (dev_max_freq <= dev_min_freq)
>>>>>> +			return 0;
>>>>>> +	}
>>>>>> +	cpu_percent = ((cpu_curr_freq_khz - cpu_min_freq) * 100) / (cpu_max_freq - cpu_min_freq);
>>>>>> +	freq = dev_min_freq + mult_frac(dev_max_freq - dev_min_freq, cpu_percent, 100);
>>>>>> +
>>>>>> +out:
>>>>>> +	return freq;
>>>>>> +}
>>>>>> +
>>>>>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>>>>> +					unsigned long *freq)
>>>>>> +{
>>>>>> +	struct devfreq_passive_data *p_data =
>>>>>> +				(struct devfreq_passive_data *)devfreq->data;
>>>>>> +	unsigned int cpu;
>>>>>> +	unsigned long target_freq = 0;
>>>>>> +
>>>>>> +	for_each_online_cpu(cpu)
>>>>>> +		target_freq = max(target_freq,
>>>>>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>>>>>> +
>>>>>> +	*freq = target_freq;
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>
>>>>> As you know, governor_passive.c already uses
>>>>> both 'dev_pm_opp_xlate_required_opp' and 'devfreq_recommended_opp'
>>>>> to get the target from the OPP. So, I want to make a common function
>>>>> like 'get_taget_freq_by_required_opp', as shown below.
>>>>> If 'get_taget_freq_by_required_opp' is defined this way,
>>>>> it can also be used for get_target_freq_with_devfreq().
>>>>> After finishing the review of this patch, I'll send the patch[2].
>>>>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=101c5a087586ab2b5cf3370166a7e39227ca83cf
>>>>>
>>>>> For example (this code is not tested):
>>>>> static unsigned long get_taget_freq_by_required_opp(struct device *p_dev,
>>>>> 						struct opp_table *p_opp_table,
>>>>> 						struct opp_table *opp_table,
>>>>> 						unsigned long freq)
>>>>> {
>>>>> 	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
>>>>>
>>>>> 	if (!p_dev || !p_opp_table || !opp_table || !freq)
>>>>> 		return 0;
>>>>>
>>>>> 	p_opp = devfreq_recommended_opp(p_dev, &freq, 0);
>>>>> 	if (IS_ERR(p_opp))
>>>>> 		return 0;
>>>>>
>>>>> 	opp = dev_pm_opp_xlate_required_opp(p_opp_table, opp_table, p_opp);
>>>>> 	dev_pm_opp_put(p_opp);
>>>>>
>>>>> 	if (IS_ERR(opp))
>>>>> 		return 0;
>>>>>
>>>>> 	freq = dev_pm_opp_get_freq(opp);
>>>>> 	dev_pm_opp_put(opp);
>>>>>
>>>>> 	return freq;
>>>>> }
>>>>>
>>>>> static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>>>> 					unsigned long *target_freq)
>>>>> {
>>>>> 	struct devfreq_passive_data *p_data =
>>>>> 				(struct devfreq_passive_data *)devfreq->data;
>>>>> 	struct devfreq_cpu_data *cpu_data;
>>>>> 	unsigned long cpu, cpu_cur, cpu_min, cpu_max, cpu_percent;
>>>>> 	unsigned long dev_min, dev_max;
>>>>> 	unsigned long freq = 0;
>>>>>
>>>>> 	for_each_online_cpu(cpu) {
>>>>> 		cpu_data = p_data->cpu_data[cpu];
>>>>> 		if (!cpu_data || cpu_data->first_cpu != cpu)
>>>>> 			continue;
>>>>>
>>>>> 		/* Get target freq via required opps */
>>>>> 		cpu_cur = cpu_data->cur_freq * HZ_PER_KHZ;
>>>>> 		freq = get_taget_freq_by_required_opp(cpu_data->dev,
>>>>> 					cpu_data->opp_table,
>>>>> 					devfreq->opp_table, cpu_cur);
>>>>> 		if (freq) {
>>>>> 			*target_freq = max(freq, *target_freq);
>>>>> 			continue;
>>>>> 		}
>>>>>
>>>>> 		/* Use Interpolation if required opps is not available */
>>>>> 		devfreq_get_freq_range(devfreq, &dev_min, &dev_max);
>>>>>
>>>>> 		cpu_min = cpu_data->min_freq;
>>>>> 		cpu_max = cpu_data->max_freq;
>>>>> 		cpu_cur = cpu_data->cur_freq;
>>>>>
>>>>> 		cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);
>>>>> 		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>>>>>
>>>>> 		*target_freq = max(freq, *target_freq);
>>>>> 	}
>>>>>
>>>>> 	return 0;
>>>>> }
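The interpolation fallback above can be tried outside the kernel. The
following is only a minimal userspace sketch: interp_dev_freq() is a
hypothetical helper name, the range guards are an added assumption, and
plain multiplication stands in for the kernel's mult_frac() (which also
protects against overflow).

```c
/*
 * Hypothetical standalone helper (not kernel API): map a CPU frequency
 * onto a device frequency range by linear interpolation.
 */
static unsigned long interp_dev_freq(unsigned long cpu_cur,
				     unsigned long cpu_min,
				     unsigned long cpu_max,
				     unsigned long dev_min,
				     unsigned long dev_max)
{
	unsigned long cpu_percent;

	/* Assumed guards: avoid division by zero and unsigned underflow. */
	if (cpu_max <= cpu_min || cpu_cur <= cpu_min)
		return dev_min;
	if (cpu_cur >= cpu_max)
		return dev_max;

	/* The divisor is the whole CPU range, hence the parentheses. */
	cpu_percent = ((cpu_cur - cpu_min) * 100) / (cpu_max - cpu_min);

	/* Simplified stand-in for mult_frac(dev_max - dev_min, cpu_percent, 100). */
	return dev_min + (dev_max - dev_min) * cpu_percent / 100;
}
```

For a CPU at the midpoint of its range the helper returns the midpoint of
the device range. If the divisor is not parenthesized, the expression
divides by cpu_max and then subtracts cpu_min, which underflows for
unsigned values.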
>>>>>
>>>>>> +
>>>>>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>>>>>  					unsigned long *freq)
>>>>>>  {
>>>>>>  	struct devfreq_passive_data *p_data
>>>>>> @@ -23,14 +115,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>>>>  	int i, count;
>>>>>>  
>>>>>>  	/*
>>>>>> -	 * If the devfreq device with passive governor has the specific method
>>>>>> -	 * to determine the next frequency, should use the get_target_freq()
>>>>>> -	 * of struct devfreq_passive_data.
>>>>>> -	 */
>>>>>> -	if (p_data->get_target_freq)
>>>>>> -		return p_data->get_target_freq(devfreq, freq);
>>>>>> -
>>>>>> -	/*
>>>>>>  	 * If the parent and passive devfreq device uses the OPP table,
>>>>>>  	 * get the next frequency by using the OPP table.
>>>>>>  	 */
>>>>>> @@ -98,6 +182,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>>>>  	return 0;
>>>>>>  }
>>>>>>  
>>>>>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>>>> +					   unsigned long *freq)
>>>>>> +{
>>>>>> +	struct devfreq_passive_data *p_data =
>>>>>> +		(struct devfreq_passive_data *)devfreq->data;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * If the devfreq device with passive governor has the specific method
>>>>>> +	 * to determine the next frequency, should use the get_target_freq()
>>>>>> +	 * of struct devfreq_passive_data.
>>>>>> +	 */
>>>>>> +	if (p_data->get_target_freq)
>>>>>> +		return p_data->get_target_freq(devfreq, freq);
>>>>>> +
>>>>>> +	switch (p_data->parent_type) {
>>>>>> +	case DEVFREQ_PARENT_DEV:
>>>>>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>>>>>> +		break;
>>>>>> +	case CPUFREQ_PARENT_DEV:
>>>>>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>>>>>> +		break;
>>>>>> +	default:
>>>>>> +		ret = -EINVAL;
>>>>>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>>>>>> +		break;
>>>>>> +	}
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>>  static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>>>>  				unsigned long event, void *ptr)
>>>>>>  {
>>>>>> @@ -130,16 +245,200 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>>>>  	return NOTIFY_DONE;
>>>>>>  }
>>>>>>  
>>>>>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>>>>>> +					 unsigned long event, void *ptr)
>>>>>> +{
>>>>>> +	struct devfreq_passive_data *data =
>>>>>> +			container_of(nb, struct devfreq_passive_data, nb);
>>>>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>>>>> +	struct devfreq_cpu_state *cpu_state;
>>>>>> +	struct cpufreq_freqs *cpu_freq = ptr;
>>>>>
>>>>> Use 'freqs' as the variable name. I prefer to use the same variable name
>>>>> for both devfreq_freqs and cpufreq_freqs instances.
>>>>>
>>>>>> +	unsigned int curr_freq;
>>>>>
>>>>> As I commented above, it is better to use 'cur_freq' instead of 'curr_freq'
>>>>> unless there is a special reason.
>>>>>
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	if (event != CPUFREQ_POSTCHANGE || !cpu_freq ||
>>>>>> +	    !data->cpu_state[cpu_freq->policy->cpu])
>>>>>> +		return 0;
>>>>>> +
>>>>>> +	cpu_state = data->cpu_state[cpu_freq->policy->cpu];
>>>>>> +	if (cpu_state->curr_freq == cpu_freq->new)
>>>>>> +		return 0;
>>>>>> +
>>>>>> +	/* Backup current freq and pre-update cpu state freq*/
>>>>>
>>>>> I think that this comment is not critical, so please drop it.
>>>>>
>>>>>> +	curr_freq = cpu_state->curr_freq;
>>>>>> +	cpu_state->curr_freq = cpu_freq->new;
>>>>>> +
>>>>>> +	mutex_lock(&devfreq->lock);
>>>>>> +	ret = update_devfreq(devfreq);
>>>>>
>>>>> I recommend using 'devfreq_update_target' instead of 'update_devfreq',
>>>>> as follows:
>>>>> 	devfreq_update_target(devfreq, freqs->new);
>>>>>
>>>>>> +	mutex_unlock(&devfreq->lock);
>>>>>> +	if (ret) {
>>>>>> +		cpu_state->curr_freq = curr_freq;
>>>>>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>>>>>> +		return ret;
>>>>>> +	}
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>>>>>
>>>>> To keep the function naming style consistent,
>>>>> please change the name as follows, because devfreq names
>>>>> its function 'devfreq_register_notifier':
>>>>> - cpufreq_passive_register -> cpufreq_passive_register_notifier
>>>>>
>>>>>> +{
>>>>>> +	struct devfreq_passive_data *data = *p_data;
>>>>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>>>>> +	struct device *dev = devfreq->dev.parent;
>>>>>> +	struct opp_table *opp_table = NULL;
>>>>>> +	struct devfreq_cpu_state *cpu_state;
>>>>>> +	struct cpufreq_policy *policy;
>>>>>> +	struct device *cpu_dev;
>>>>>> +	unsigned int cpu;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	get_online_cpus();
>>>>>> +
>>>>>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>>>>>> +	ret = cpufreq_register_notifier(&data->nb,
>>>>>> +					CPUFREQ_TRANSITION_NOTIFIER);
>>>>>> +	if (ret) {
>>>>>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>>>>>> +		data->nb.notifier_call = NULL;
>>>>>> +		goto out;
>>>>>> +	}
>>>>>> +
>>>>>> +	/* Populate devfreq_cpu_state */
>>>>>
>>>>> Don't need this comment. Please drop it.
>>>>>
>>>>>> +	for_each_online_cpu(cpu) {
>>>>>> +		if (data->cpu_state[cpu])
>>>>>> +			continue;
>>>>>> +
>>>>>> +		policy = cpufreq_cpu_get(cpu);
>>>>>> +		if (!policy) {
>>>>>> +			ret = -EINVAL;
>>>>>> +			goto out;
>>>>>> +		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
>>>>>> +			ret = -EPROBE_DEFER;
>>>>>> +			goto out;
>>>>>> +		} else if (IS_ERR(policy)) {
>>>>>> +			ret = PTR_ERR(policy);
>>>>>> +			dev_err(dev, "Couldn't get the cpufreq_policy.\n");
>>>>>> +			goto out;
>>>>>> +		}
>>>>>
>>>>> Use the dev_err_probe() function to handle the EPROBE_DEFER case.
>>>>> It makes the code simpler.
>>>>>
>>>>>> +
>>>>>> +		cpu_state = kzalloc(sizeof(*cpu_state), GFP_KERNEL);
>>>>>> +		if (!cpu_state) {
>>>>>> +			ret = -ENOMEM;
>>>>>> +			goto out;
>>>>>> +		}
>>>>>> +
>>>>>> +		cpu_dev = get_cpu_device(cpu);
>>>>>> +		if (!cpu_dev) {
>>>>>> +			dev_err(dev, "Couldn't get cpu device.\n");
>>>>>> +			ret = -ENODEV;
>>>>>> +			goto out;
>>>>>> +		}
>>>>>> +
>>>>>> +		opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>>>>>> +		if (IS_ERR(opp_table)) {
>>>>>> +			ret = PTR_ERR(opp_table);
>>>>>> +			goto out;
>>>>>> +		}
>>>>>> +
>>>>>> +		cpu_state->cpu_dev = cpu_dev;
>>>>>> +		cpu_state->opp_table = opp_table;
>>>>>> +		cpu_state->first_cpu = cpumask_first(policy->related_cpus);
>>>>>> +		cpu_state->curr_freq = policy->cur;
>>>>>> +		cpu_state->min_freq = policy->cpuinfo.min_freq;
>>>>>> +		cpu_state->max_freq = policy->cpuinfo.max_freq;
>>>>>> +		data->cpu_state[cpu] = cpu_state;
>>>>>> +
>>>>>> +		cpufreq_cpu_put(policy);
>>>>>> +	}
>>>>>> +
>>>>>> +out:
>>>>>> +	put_online_cpus();
>>>>>> +	if (ret)
>>>>>> +		return ret;
>>>>>> +
>>>>>> +	/* Update devfreq */
>>>>>> +	mutex_lock(&devfreq->lock);
>>>>>> +	ret = update_devfreq(devfreq);
>>>>>
>>>>>> +	mutex_unlock(&devfreq->lock);
>>>>>> +	if (ret)
>>>>>> +		dev_err(dev, "Couldn't update the frequency.\n");
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>>>>>
>>>>> As I commented above, please change the name as follows:
>>>>> - cpufreq_passive_unregister -> cpufreq_passive_unregister_notifier
>>>>>
>>>>>> +{
>>>>>> +	struct devfreq_passive_data *data = *p_data;
>>>>>> +	struct devfreq_cpu_state *cpu_state;
>>>>>> +	int cpu;
>>>>>> +
>>>>>> +	if (data->nb.notifier_call)
>>>>>> +		cpufreq_unregister_notifier(&data->nb,
>>>>>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>>>>>> +
>>>>>> +	for_each_possible_cpu(cpu) {
>>>>>> +		cpu_state = data->cpu_state[cpu];
>>>>>> +		if (cpu_state) {
>>>>>> +			if (cpu_state->opp_table)
>>>>>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>>>>>> +			kfree(cpu_state);
>>>>>> +			data->cpu_state[cpu] = NULL;
>>>>>> +		}
>>>>>> +	}
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> +int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
>>>>>> +{
>>>>>> +	struct notifier_block *nb = &(*p_data)->nb;
>>>>>> +	int ret = 0;
>>>>>> +
>>>>>> +	switch ((*p_data)->parent_type) {
>>>>>> +	case DEVFREQ_PARENT_DEV:
>>>>>> +		nb->notifier_call = devfreq_passive_notifier_call;
>>>>>> +		ret = devfreq_register_notifier((struct devfreq *)(*p_data)->parent, nb,
>>>>>> +						DEVFREQ_TRANSITION_NOTIFIER);
>>>>>> +		break;
>>>>>> +	case CPUFREQ_PARENT_DEV:
>>>>>> +		ret = cpufreq_passive_register(p_data);
>>>>>> +		break;
>>>>>> +	default:
>>>>>> +		ret = -EINVAL;
>>>>>> +		break;
>>>>>> +	}
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +int unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
>>>>>> +{
>>>>>> +	int ret = 0;
>>>>>> +
>>>>>> +	switch ((*p_data)->parent_type) {
>>>>>> +	case DEVFREQ_PARENT_DEV:
>>>>>> +		WARN_ON(devfreq_unregister_notifier((struct devfreq *)(*p_data)->parent,
>>>>>> +						    &(*p_data)->nb,
>>>>>> +						    DEVFREQ_TRANSITION_NOTIFIER));
>>>>>> +		break;
>>>>>> +	case CPUFREQ_PARENT_DEV:
>>>>>> +		cpufreq_passive_unregister(p_data);
>>>>>> +		break;
>>>>>> +	default:
>>>>>> +		ret = -EINVAL;
>>>>>> +		break;
>>>>>> +	}
>>>>>> +	return ret;
>>>>>> +}
>>>>>
>>>>> I think that you don't need to define register_parent_dev_notifier
>>>>> and unregister_parent_dev_notifier as separate functions.
>>>>>
>>>>> Instead of separate functions, just add the code
>>>>> to devfreq_passive_event_handler.
>>>>>
>>>>>
>>>>>> +
>>>>>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>>>>  				unsigned int event, void *data)
>>>>>>  {
>>>>>>  	struct devfreq_passive_data *p_data
>>>>>>  			= (struct devfreq_passive_data *)devfreq->data;
>>>>>>  	struct devfreq *parent = (struct devfreq *)p_data->parent;
>>>>>> -	struct notifier_block *nb = &p_data->nb;
>>>>>>  	int ret = 0;
>>>>>>  
>>>>>> -	if (!parent)
>>>>>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>>>>>  		return -EPROBE_DEFER;
>>>>>>  
>>>>>>  	switch (event) {
>>>>>> @@ -147,13 +446,11 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>>>>  		if (!p_data->this)
>>>>>>  			p_data->this = devfreq;
>>>>>>  
>>>>>> -		nb->notifier_call = devfreq_passive_notifier_call;
>>>>>> -		ret = devfreq_register_notifier(parent, nb,
>>>>>> -					DEVFREQ_TRANSITION_NOTIFIER);
>>>>>> +		ret = register_parent_dev_notifier(&p_data);
>>>>>>  		break;
>>>>>> +
>>>>>>  	case DEVFREQ_GOV_STOP:
>>>>>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>>>>>> -					DEVFREQ_TRANSITION_NOTIFIER));
>>>>>> +		ret = unregister_parent_dev_notifier(&p_data);
>>>>>>  		break;
>>>>>>  	default:
>>>>>>  		break;
>>>>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>>>>> index 26ea0850be9b..e0093b7c805c 100644
>>>>>> --- a/include/linux/devfreq.h
>>>>>> +++ b/include/linux/devfreq.h
>>>>>> @@ -280,6 +280,25 @@ struct devfreq_simple_ondemand_data {
>>>>>>  
>>>>>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>>>>>  /**
>>>>>> + * struct devfreq_cpu_state - holds the per-cpu state
>>>>>> + * @freq:	the current frequency of the cpu.
>>>>>> + * @min_freq:	the min frequency of the cpu.
>>>>>> + * @max_freq:	the max frequency of the cpu.
>>>>>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>>>>>> + * @dev:	reference to cpu device.
>>>>>> + * @opp_table:	reference to cpu opp table.
>>>>>> + *
>>>>>> + * This structure stores the required cpu_state of a cpu.
>>>>>> + * This is auto-populated by the governor.
>>>>>> + */
>>>>>> +struct devfreq_cpu_state;
>>>>>> +
>>>>>> +enum devfreq_parent_dev_type {
>>>>>> +	DEVFREQ_PARENT_DEV,
>>>>>> +	CPUFREQ_PARENT_DEV,
>>>>>> +};
>>>>>> +
>>>>>> +/**
>>>>>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>>>>>   *	and devfreq_add_device
>>>>>>   * @parent:	the devfreq instance of parent device.
>>>>>> @@ -290,13 +309,15 @@ struct devfreq_simple_ondemand_data {
>>>>>>   *			using governors except for passive governor.
>>>>>>   *			If the devfreq device has the specific method to decide
>>>>>>   *			the next frequency, should use this callback.
>>>>>> + * @parent_type:	parent type of the device
>>>>>>   * @this:	the devfreq instance of own device.
>>>>>>   * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>>>>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>>>>>   *
>>>>>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>>>>>   * device with governors except for the passive governor. But, don't need to
>>>>>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>>>>>> - * them.
>>>>>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>>>>>> + * will handle them.
>>>>>>   */
>>>>>>  struct devfreq_passive_data {
>>>>>>  	/* Should set the devfreq instance of parent device */
>>>>>> @@ -305,9 +326,13 @@ struct devfreq_passive_data {
>>>>>>  	/* Optional callback to decide the next frequency of passvice device */
>>>>>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>>>>>  
>>>>>> +	/* Should set the type of parent device */
>>>>>> +	enum devfreq_parent_dev_type parent_type;
>>>>>> +
>>>>>>  	/* For passive governor's internal use. Don't need to set them */
>>>>>>  	struct devfreq *this;
>>>>>>  	struct notifier_block nb;
>>>>>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>>>>>  };
>>>>>>  #endif
>>>>>>  


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-04-01  0:16             ` Chanwoo Choi
@ 2021-04-08  2:47               ` Chanwoo Choi
  2021-04-22 13:34                 ` andrew-sh.cheng
  2021-05-26  2:22                 ` andrew-sh.cheng
  0 siblings, 2 replies; 31+ messages in thread
From: Chanwoo Choi @ 2021-04-08  2:47 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On 4/1/21 9:16 AM, Chanwoo Choi wrote:
> On 3/31/21 10:03 PM, andrew-sh.cheng wrote:
>> On Wed, 2021-03-31 at 17:35 +0900, Chanwoo Choi wrote:
>>> On 3/31/21 5:27 PM, Chanwoo Choi wrote:
>>>> Hi,
>>>>
>>>> On 3/31/21 5:03 PM, andrew-sh.cheng wrote:
>>>>> On Thu, 2021-03-25 at 17:14 +0900, Chanwoo Choi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> You did not send these patches to the linux-pm mailing list.
>>>>>> Please send them to the linux-pm ML.
>>>>>>
>>>>>> Also, before I received this series, I tried to clean up these patches
>>>>>> on a testing branch[1], so I am adding my comments based on that clean-up.
>>>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>>>>>
>>>>>> And 'Saravana Kannan <skannan@codeaurora.org>' is a wrong email address.
>>>>>> Please update the address or drop it.
>>>>>
>>>>> Hi Chanwoo,
>>>>>
>>>>> Thank you for the advice.
>>>>> I will resend as patch v9 (adding the linux-pm ML), remove this patch, and note
>>>>> that my patch set is based on
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>>>
>>>> I have not yet tested this patch[1] on the devfreq-testing-passive-gov branch.
>>>> So, if possible, I'd like you to test your patches with this patch[1]
>>>> and then, if there is no problem, could you send the next patches with patch[1]?
>>>>
>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-testing-passive-gov&id=39c80d11a8f42dd63ecea1e0df595a0ceb83b454
>>>
>>>
>>> Sorry for the confusion. I made the devfreq-testing-passive-gov[1]
>>> branch based on the latest devfreq-next branch.
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/log/?h=devfreq-testing-passive-gov
>>>
>>> First of all, if possible, I want to test them[1] with your patches in this series.
>>> If there is no problem, please let me know. After your confirmation,
>>> I'll send the patches of the devfreq-testing-passive-gov[1] branch.
>>> How about that?
>>>
>> Hi Chanwoo~
>>
>> We will use this in the Google Chrome project.
>> Hsin-Yi from Google has tested your patch + my patch set v8 [2~8]:
>>
>>     cci devfreq runs with cpufreq
>>     suspend/resume
>>     speedometer2 benchmark
>> The results are okay.
>>
>> Please send the patches of the devfreq-testing-passive-gov[1] branch.
>>
>> I will send patch v9 based on yours later.
> 
> Thanks for your test. I'll send the patches today.

I'm sorry for the delay. When I tested the patches
for the devfreq parent type on Odroid-xu3, there were some problems
related to lazy linking of OPP, so I'm trying to analyze them.
Unfortunately, we need to postpone these patches to the next Linux
version.


[snip]

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-04-08  2:47               ` Chanwoo Choi
@ 2021-04-22 13:34                 ` andrew-sh.cheng
  2021-05-26  2:22                 ` andrew-sh.cheng
  1 sibling, 0 replies; 31+ messages in thread
From: andrew-sh.cheng @ 2021-04-22 13:34 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On Thu, 2021-04-08 at 11:47 +0900, Chanwoo Choi wrote:
> I'm sorry for the delay. When I tested the patches
> for the devfreq parent type on Odroid-xu3, there were some problems
> related to lazy linking of OPP, so I'm trying to analyze them.
> Unfortunately, we need to postpone these patches to the next Linux
> version.
> 
> 
Hi Chanwoo,
Sorry to bother you.
Are you working on this patch now?
Is there anything that we can do?


> [snip]
> 


* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-04-08  2:47               ` Chanwoo Choi
  2021-04-22 13:34                 ` andrew-sh.cheng
@ 2021-05-26  2:22                 ` andrew-sh.cheng
  2021-05-26  3:08                   ` Chanwoo Choi
  1 sibling, 1 reply; 31+ messages in thread
From: andrew-sh.cheng @ 2021-05-26  2:22 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On Thu, 2021-04-08 at 11:47 +0900, Chanwoo Choi wrote:
> I'm sorry for the delay. When I tested the patches
> for the devfreq parent type on Odroid-xu3, there were some problems
> related to lazy linking of OPP, so I'm trying to analyze them.
> Unfortunately, we need to postpone these patches to the next Linux
> version.
> 
Hi Chanwoo Choi~

I heard that you have been busy with another task recently.
May I know your plan for this patch?
Thank you.

> 
> [snip]
> 


* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-05-26  2:22                 ` andrew-sh.cheng
@ 2021-05-26  3:08                   ` Chanwoo Choi
  2021-05-31  3:22                     ` andrew-sh.cheng
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-05-26  3:08 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

Hi,
On 5/26/21 11:22 AM, andrew-sh.cheng wrote:
> On Thu, 2021-04-08 at 11:47 +0900, Chanwoo Choi wrote:
>> I'm sorry for the delay. When I tested the patches
>> for the devfreq parent type on Odroid-xu3, there were some problems
>> related to lazy linking of OPP, so I'm trying to analyze them.
>> Unfortunately, we need to postpone these patches to the next Linux
>> version.
>>
> Hi Chanwoo Choi~
> 
> I heard that you have been busy with another task recently.
> May I know your plan for this patch?
> Thank you.

Sorry for the late reply. I have a question.
When I tested exynos-bus.c with the 'required-opps' property added
on the odroid-xu3 board, I got a failure:

When _set_required_opps() is called, _set_required_opp() always returns
-EBUSY because of the following lazy linking check[1].

[1] https://elixir.bootlin.com/linux/v5.13-rc3/source/drivers/opp/core.c#L896

/* required-opps not fully initialized yet */
if (lazy_linking_pending(opp_table))
	return -EBUSY;  


dev_pm_opp_of_add_table() calls lazy_link_required_opp_table(), but there
is a constraint[2]: if is_genpd of the opp_table is false,
drivers/opp/of.c cannot resolve the lazy linking.

[2]  https://elixir.bootlin.com/linux/v5.13-rc3/source/drivers/opp/of.c#L386

/* Link required OPPs for all OPPs of the newly added OPP table */
static void lazy_link_required_opp_table(struct opp_table *new_table)
{
	struct opp_table *opp_table, *temp, **required_opp_tables;
	struct device_node *required_np, *opp_np, *required_table_np;
	struct dev_pm_opp *opp;
	int i, ret;

	/*
	 * We only support genpd's OPPs in the "required-opps" for now,
	 * as we don't know much about other cases.
	 */
	if (!new_table->is_genpd)
		return;

Even in this case, is there no problem in your test case?
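
For reference, the two checks quoted in [1] and [2] can be modelled in user space. This is only a minimal sketch, not kernel code: `struct model_opp_table`, `model_lazy_link()` and `model_set_required_opps()` are hypothetical stand-ins, and the `supplier` pointer is an assumed simplification of the device-tree match done in drivers/opp/of.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct opp_table. */
struct model_opp_table {
	bool is_genpd;                    /* table belongs to a genpd */
	struct model_opp_table *supplier; /* table named by "required-opps" */
	bool lazy_pending;                /* link to supplier not resolved yet */
};

/*
 * Model of the constraint quoted in [2]: pending links are resolved only
 * when the newly added table is a genpd table.
 */
void model_lazy_link(struct model_opp_table *new_table,
		     struct model_opp_table *consumers[], size_t n)
{
	size_t i;

	assert(new_table != NULL);

	if (!new_table->is_genpd)
		return;

	for (i = 0; i < n; i++) {
		if (consumers[i]->lazy_pending &&
		    consumers[i]->supplier == new_table)
			consumers[i]->lazy_pending = false;
	}
}

/* Model of the check quoted in [1]. */
int model_set_required_opps(const struct model_opp_table *table)
{
	assert(table != NULL);

	/* required-opps not fully initialized yet */
	if (table->lazy_pending)
		return -EBUSY;

	return 0;
}
```

In this model, a consumer whose supplier table has `is_genpd == false` stays pending forever, so `model_set_required_opps()` keeps returning -EBUSY, which is the behaviour reported above.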

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-05-26  3:08                   ` Chanwoo Choi
@ 2021-05-31  3:22                     ` andrew-sh.cheng
  2021-05-31  7:56                       ` Chanwoo Choi
  0 siblings, 1 reply; 31+ messages in thread
From: andrew-sh.cheng @ 2021-05-31  3:22 UTC (permalink / raw)
  To: Chanwoo Choi, Hsin-Yi Wang
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On Wed, 2021-05-26 at 12:08 +0900, Chanwoo Choi wrote:
> Hi,
> Sorry for late work. I have a question.
> When I tested exynos-bus.c with adding the 'required-opp' property
> on odroid-xu3 board. I got some fail about 
> 
> When calling _set_required_opps(), always _set_required_opp() returns
> -EBUSY error because of following lazy linking case[1].
> 
> [1] https://elixir.bootlin.com/linux/v5.13-rc3/source/drivers/opp/core.c#L896
> 
> /* required-opps not fully initialized yet */
> if (lazy_linking_pending(opp_table))
> 	return -EBUSY;  
> 
> 
> For calling dev_pm_opp_of_add_table(), lazy_link_required_opp_table() function
> will be called. But, there is constraint[2]. If is_genpd of opp_table is false,
> driver/opp/of.c cannot resolve the lazy linking issue.
> 
> [2]  https://elixir.bootlin.com/linux/v5.13-rc3/source/drivers/opp/of.c#L386
> 
> /* Link required OPPs for all OPPs of the newly added OPP table */
> static void lazy_link_required_opp_table(struct opp_table *new_table)
> {
> 	struct opp_table *opp_table, *temp, **required_opp_tables;
> 	struct device_node *required_np, *opp_np, *required_table_np;
> 	struct dev_pm_opp *opp;
> 	int i, ret;
> 
> 	/*
> 	 * We only support genpd's OPPs in the "required-opps" for now,
> 	 * as we don't know much about other cases.
> 	 */
> 	if (!new_table->is_genpd)
> 		return;
> 
> Even if this case, there are no problem on your test case?
> 

Hi Chanwoo~
Sorry for the late reply.
Yes, we hit a similar issue.
Hsin-Yi from Google helped deal with this issue in the Chrome project.

Patch segment:
@ /drivers/opp/of.c

/* Link required OPPs for all OPPs of the newly added OPP table */
static void lazy_link_required_opp_table(struct opp_table *new_table)
{
	struct opp_table *opp_table, *temp, **required_opp_tables;
	struct device_node *required_np, *opp_np, *required_table_np;
	struct dev_pm_opp *opp;
	int i, ret;

-	/*
-	 * We only support genpd's OPPs in the "required-opps" for now,
-	 * as we don't know much about other cases.
-	 */
-	if (!new_table->is_genpd)
-		return;


Hsin-Yi replied about this issue in the original lazy-link
thread:
https://patchwork.kernel.org/project/linux-pm/patch/20190717222340.137578-4-saravanak@google.com/#23932203

I have looped in Hsin-Yi here.
You can discuss with her if you need more detail.

Thank you both.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
  2021-05-31  3:22                     ` andrew-sh.cheng
@ 2021-05-31  7:56                       ` Chanwoo Choi
       [not found]                         ` <CACb=7PUkpMkDOJ6dDHXhJ5ep4e9u8ZVYM8M2iC-iwHXn13t3DQ@mail.gmail.com>
  0 siblings, 1 reply; 31+ messages in thread
From: Chanwoo Choi @ 2021-05-31  7:56 UTC (permalink / raw)
  To: andrew-sh.cheng, Hsin-Yi Wang
  Cc: MyungJoo Ham, Kyungmin Park, Rob Herring, Mark Rutland,
	Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

Hi,

On 5/31/21 12:22 PM, andrew-sh.cheng wrote:
> Hi Chanwoo~
> Sorry for late reply.
> Yes, we meet similar issue.
> Google member Hsin-Yi had helped deal with this issue on Chrome project.
> 
> Patch segment:
> @ /drivers/opp/of.c
> 
> /* Link required OPPs for all OPPs of the newly added OPP table */
> static void lazy_link_required_opp_table(struct opp_table *new_table)
> {
> 	struct opp_table *opp_table, *temp, **required_opp_tables;
> 	struct device_node *required_np, *opp_np, *required_table_np;
> 	struct dev_pm_opp *opp;
> 	int i, ret;
> 
> -	/*
> -	 * We only support genpd's OPPs in the "required-opps" for now,
> -	 * as we don't know much about other cases.
> -	 */
> -	if (!new_table->is_genpd)
> -		return;
> 
> 
> Hsin-Yi replied this issue in the discussion list in the original lazy
> link thread:
> https://patchwork.kernel.org/project/linux-pm/patch/20190717222340.137578-4-saravanak@google.com/#23932203
> 
> Loop Hsin-YI here.
> You can discuss with her if needing more detail.
> 
> Thank you both.
> 

Thanks. First of all, we need to discuss and resolve this issue.


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor
       [not found]                         ` <CACb=7PUkpMkDOJ6dDHXhJ5ep4e9u8ZVYM8M2iC-iwHXn13t3DQ@mail.gmail.com>
@ 2021-05-31  8:13                           ` Chanwoo Choi
  0 siblings, 0 replies; 31+ messages in thread
From: Chanwoo Choi @ 2021-05-31  8:13 UTC (permalink / raw)
  To: Hsin-Yi Wang
  Cc: andrew-sh.cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J. Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown,
	linux-pm, devicetree, linux-arm-kernel, linux-mediatek,
	linux-kernel, srv_heupstream, Sibi Sankar

On 5/31/21 4:42 PM, Hsin-Yi Wang wrote:
> 
> 
> On Mon, May 31, 2021 at 3:37 PM Chanwoo Choi <cw00.choi@samsung.com> wrote:
> 
>     Hi,
> 
>     Thanks. First of all, we need to resolve and discuss this issue.
> 
> 
> Hi Chanwoo, 
> 
> We think removing the genpd check is sufficient for our use case since we only use the lazy link for opp table translation.

Hi Hsin-Yi,

IMHO, the 'is_genpd' checks should be removed so that devices other than
genpd can use required-opps, as follows:

diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index c582a9ca397b..b54d3a985515 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -201,17 +201,6 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
                        lazy = true;
                        continue;
                }
-
-               /*
-                * We only support genpd's OPPs in the "required-opps" for now,
-                * as we don't know how much about other cases. Error out if the
-                * required OPP doesn't belong to a genpd.
-                */
-               if (!required_opp_tables[i]->is_genpd) {
-                       dev_err(dev, "required-opp doesn't belong to genpd: %pOF\n",
-                               required_np);
-                       goto free_required_tables;
-               }
        }
 
        /* Let's do the linking later on */
@@ -379,13 +368,6 @@ static void lazy_link_required_opp_table(struct opp_table *new_table)
        struct dev_pm_opp *opp;
        int i, ret;
 
-       /*
-        * We only support genpd's OPPs in the "required-opps" for now,
-        * as we don't know much about other cases.
-        */
-       if (!new_table->is_genpd)
-               return;
-
        mutex_lock(&opp_table_lock);
 
        list_for_each_entry_safe(opp_table, temp, &lazy_opp_tables, lazy) {
@@ -874,7 +856,7 @@ static struct dev_pm_opp *_opp_add_static_v2(struct opp_table *opp_table,
                return ERR_PTR(-ENOMEM);
 
        ret = _read_opp_key(new_opp, opp_table, np, &rate_not_available);
-       if (ret < 0 && !opp_table->is_genpd) {
+       if (ret < 0) {
                dev_err(dev, "%s: opp key field not found\n", __func__);
                goto free_opp;
        }
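
To illustrate the "opp table translation" use case mentioned in the quoted reply, here is a small user-space sketch of what such a translation amounts to once the tables are linked. It is not the kernel implementation: `struct model_opp`, `model_xlate_required_freq()` and any frequencies passed to it are hypothetical and for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical OPP entry: its own frequency plus the frequency of the
 * entry in the other table that its "required-opps" phandle points to.
 */
struct model_opp {
	unsigned long freq_hz;
	unsigned long required_freq_hz;
};

/*
 * Translate a frequency of the source table into the frequency of the
 * required OPP in the destination table. Returns 0 when the source
 * frequency is not an exact entry of the table.
 */
unsigned long model_xlate_required_freq(const struct model_opp *table,
					size_t n, unsigned long freq_hz)
{
	size_t i;

	assert(table != NULL);

	for (i = 0; i < n; i++) {
		if (table[i].freq_hz == freq_hz)
			return table[i].required_freq_hz;
	}

	return 0;
}
```

A passive governor that follows another device only needs this kind of lookup, so it does not depend on the required table belonging to a genpd.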


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


^ permalink raw reply related	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-05-31  7:57 UTC | newest]

Thread overview: 31+ messages
2021-03-23 11:33 [PATCH V8 0/8] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
2021-03-23 11:33 ` [PATCH V8 1/8] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
2021-03-25  7:42   ` Chanwoo Choi
2021-03-25  8:14   ` Chanwoo Choi
2021-03-31  8:03     ` andrew-sh.cheng
2021-03-31  8:27       ` Chanwoo Choi
2021-03-31  8:35         ` Chanwoo Choi
2021-03-31 13:03           ` andrew-sh.cheng
2021-04-01  0:16             ` Chanwoo Choi
2021-04-08  2:47               ` Chanwoo Choi
2021-04-22 13:34                 ` andrew-sh.cheng
2021-05-26  2:22                 ` andrew-sh.cheng
2021-05-26  3:08                   ` Chanwoo Choi
2021-05-31  3:22                     ` andrew-sh.cheng
2021-05-31  7:56                       ` Chanwoo Choi
     [not found]                         ` <CACb=7PUkpMkDOJ6dDHXhJ5ep4e9u8ZVYM8M2iC-iwHXn13t3DQ@mail.gmail.com>
2021-05-31  8:13                           ` Chanwoo Choi
2021-03-31 10:46     ` Hsin-Yi Wang
2021-03-23 11:33 ` [PATCH V8 2/8] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
2021-03-30  4:36   ` Viresh Kumar
2021-03-31  5:21     ` andrew-sh.cheng
2021-03-31  6:17       ` Viresh Kumar
2021-03-23 11:33 ` [PATCH V8 3/8] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
2021-03-23 11:33 ` [PATCH V8 4/8] devfreq: add mediatek " Andrew-sh.Cheng
2021-03-25  8:04   ` Chanwoo Choi
2021-03-31  6:21     ` andrew-sh.cheng
2021-03-23 11:33 ` [PATCH V8 5/8] cpufreq: mediatek: Add record of previous desired vproc value Andrew-sh.Cheng
2021-03-23 11:33 ` [PATCH V8 6/8] cpufreq: mediatek: add opp notification for SVS support Andrew-sh.Cheng
2021-03-23 11:34 ` [PATCH V8 7/8] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
2021-03-25  8:11   ` Chanwoo Choi
2021-03-31  7:53     ` andrew-sh.cheng
2021-03-23 11:34 ` [PATCH V8 8/8] arm64: dts: mediatek: add cpufreq and cci devfreq nodes for mt8183 Andrew-sh.Cheng
