linux-mediatek.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
@ 2020-05-20  3:42 ` Andrew-sh.Cheng
  2020-05-20  3:42   ` [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains Andrew-sh.Cheng
                     ` (13 more replies)
  0 siblings, 14 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:42 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.

For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.

Change since v5:
	- Changing dt-binding format to yaml.
	- Extending current devfreq passive_governor instead of create a new one.
	- Resend depending patches of Sravana Kannan base on kernel-5.7


Andrew-sh.Cheng (6):
  cpufreq: mediatek: add clock and regulator enable for intermediate
    clock
  dt-bindings: devfreq: add compatible for mt8183 cci devfreq
  devfreq: add mediatek cci devfreq
  opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
    is disabled
  cpufreq: mediatek: add opp notification for SVS support
  devfreq: mediatek: cci devfreq register opp notification for SVS
    support

Saravana Kannan (6):
  OPP: Allow required-opps even if the device doesn't have power-domains
  OPP: Add function to look up required OPP's for a given OPP
  OPP: Improve required-opps linking
  PM / devfreq: Cache OPP table reference in devfreq
  PM / devfreq: Add required OPPs support to passive governor
  PM / devfreq: Add cpu based scaling support to passive_governor

 .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
 drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
 drivers/devfreq/Kconfig                            |  12 +
 drivers/devfreq/Makefile                           |   1 +
 drivers/devfreq/devfreq.c                          |   6 +
 drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
 drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
 drivers/opp/core.c                                 |  85 +++++-
 drivers/opp/of.c                                   | 108 ++++----
 drivers/opp/opp.h                                  |   5 +
 include/linux/devfreq.h                            |  42 ++-
 include/linux/pm_opp.h                             |  11 +
 12 files changed, 874 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
 create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c

-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
@ 2020-05-20  3:42   ` Andrew-sh.Cheng
  2020-05-20 14:54     ` Matthias Brugger
  2020-05-20  3:42   ` [PATCH 02/12] OPP: Add function to look up required OPP's for a given OPP Andrew-sh.Cheng
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:42 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Saravana Kannan, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

From: Saravana Kannan <saravanak@google.com>

A Device-A can have a (minimum) performance requirement on another
Device-B to be able to function correctly. This performance requirement
on Device-B can also change based on the current performance level of
Device-A.

The existing required-opps feature fits well to describe this need. So,
instead of limiting required-opps to point to only PM-domain devices,
allow it to point to any device.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/opp/core.c |  2 +-
 drivers/opp/of.c   | 11 -----------
 2 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index ba43e6a3dc0a..51403c1f2481 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -755,7 +755,7 @@ static int _set_required_opps(struct device *dev,
 		return 0;
 
 	/* Single genpd case */
-	if (!genpd_virt_devs) {
+	if (!genpd_virt_devs && required_opp_tables[0]->is_genpd) {
 		pstate = likely(opp) ? opp->required_opps[0]->pstate : 0;
 		ret = dev_pm_genpd_set_performance_state(dev, pstate);
 		if (ret) {
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 9cd8f0adacae..6d33de668a7b 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -195,17 +195,6 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
 
 		if (IS_ERR(required_opp_tables[i]))
 			goto free_required_tables;
-
-		/*
-		 * We only support genpd's OPPs in the "required-opps" for now,
-		 * as we don't know how much about other cases. Error out if the
-		 * required OPP doesn't belong to a genpd.
-		 */
-		if (!required_opp_tables[i]->is_genpd) {
-			dev_err(dev, "required-opp doesn't belong to genpd: %pOF\n",
-				required_np);
-			goto free_required_tables;
-		}
 	}
 
 	goto put_np;
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 02/12] OPP: Add function to look up required OPP's for a given OPP
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
  2020-05-20  3:42   ` [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains Andrew-sh.Cheng
@ 2020-05-20  3:42   ` Andrew-sh.Cheng
  2020-05-20  3:42   ` [PATCH 03/12] OPP: Improve required-opps linking Andrew-sh.Cheng
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:42 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Saravana Kannan, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

From: Saravana Kannan <saravanak@google.com>

Add a function that allows looking up required OPPs given a source OPP
table, destination OPP table and the source OPP.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/opp/core.c     | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_opp.h | 11 +++++++++++
 2 files changed, 64 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 51403c1f2481..64666d3eaf5b 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1923,6 +1923,59 @@ void dev_pm_opp_detach_genpd(struct opp_table *opp_table)
 EXPORT_SYMBOL_GPL(dev_pm_opp_detach_genpd);
 
 /**
+ * dev_pm_opp_xlate_required_opp() - Find required OPP for @src_table OPP.
+ * @src_table: OPP table which has @dst_table as one of its required OPP table.
+ * @dst_table: Required OPP table of the @src_table.
+ *
+ * This function returns the OPP (present in @dst_table) pointed out by the
+ * "required-opps" property of the OPP (present in @src_table).
+ *
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
+ *
+ * Return: destination table OPP on success, otherwise NULL on errors.
+ */
+struct dev_pm_opp *dev_pm_opp_xlate_required_opp(struct opp_table *src_table,
+						 struct opp_table *dst_table,
+						 struct dev_pm_opp *src_opp)
+{
+	struct dev_pm_opp *opp, *dest_opp = NULL;
+	int i;
+
+	if (!src_table || !dst_table || !src_opp)
+		return NULL;
+
+	for (i = 0; i < src_table->required_opp_count; i++) {
+		if (src_table->required_opp_tables[i]->np == dst_table->np)
+			break;
+	}
+
+	if (unlikely(i == src_table->required_opp_count)) {
+		pr_err("%s: Couldn't find matching OPP table (%p: %p)\n",
+		       __func__, src_table, dst_table);
+		return NULL;
+	}
+
+	mutex_lock(&src_table->lock);
+
+	list_for_each_entry(opp, &src_table->opp_list, node) {
+		if (opp == src_opp) {
+			dest_opp = opp->required_opps[i];
+			dev_pm_opp_get(dest_opp);
+			goto unlock;
+		}
+	}
+
+	pr_err("%s: Couldn't find matching OPP (%p: %p)\n", __func__, src_table,
+	       dst_table);
+
+unlock:
+	mutex_unlock(&src_table->lock);
+
+	return dest_opp;
+}
+
+/**
  * dev_pm_opp_xlate_performance_state() - Find required OPP's pstate for src_table.
  * @src_table: OPP table which has dst_table as one of its required OPP table.
  * @dst_table: Required OPP table of the src_table.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 747861816f4f..909cf7563d35 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -137,6 +137,9 @@ struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*s
 void dev_pm_opp_unregister_set_opp_helper(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_attach_genpd(struct device *dev, const char **names, struct device ***virt_devs);
 void dev_pm_opp_detach_genpd(struct opp_table *opp_table);
+struct dev_pm_opp *dev_pm_opp_xlate_required_opp(struct opp_table *src_table,
+						 struct opp_table *dst_table,
+						 struct dev_pm_opp *src_opp);
 int dev_pm_opp_xlate_performance_state(struct opp_table *src_table, struct opp_table *dst_table, unsigned int pstate);
 int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq);
 int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev, const struct cpumask *cpumask);
@@ -320,6 +323,14 @@ static inline struct opp_table *dev_pm_opp_attach_genpd(struct device *dev, cons
 
 static inline void dev_pm_opp_detach_genpd(struct opp_table *opp_table) {}
 
+static inline struct dev_pm_opp *dev_pm_opp_xlate_required_opp(
+						struct opp_table *src_table,
+						struct opp_table *dst_table,
+						struct dev_pm_opp *src_opp)
+{
+	return NULL;
+}
+
 static inline int dev_pm_opp_xlate_performance_state(struct opp_table *src_table, struct opp_table *dst_table, unsigned int pstate)
 {
 	return -ENOTSUPP;
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 03/12] OPP: Improve required-opps linking
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
  2020-05-20  3:42   ` [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains Andrew-sh.Cheng
  2020-05-20  3:42   ` [PATCH 02/12] OPP: Add function to look up required OPP's for a given OPP Andrew-sh.Cheng
@ 2020-05-20  3:42   ` Andrew-sh.Cheng
  2020-05-20  3:42   ` [PATCH 04/12] PM / devfreq: Cache OPP table reference in devfreq Andrew-sh.Cheng
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:42 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Saravana Kannan, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

From: Saravana Kannan <saravanak@google.com>

Currently, the linking of required-opps fails silently if the
destination OPP table hasn't been added before the source OPP table is
added. This puts an unnecessary requirement that the destination table
be added before the source table is added.

In reality, the destination table is needed only when we try to
translate from source OPP to destination OPP. So, instead of
completely failing, retry linking the tables when the translation is
attempted.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/opp/core.c |  30 +++++++++++-----
 drivers/opp/of.c   | 101 ++++++++++++++++++++++++++++-------------------------
 drivers/opp/opp.h  |   5 +++
 3 files changed, 80 insertions(+), 56 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 64666d3eaf5b..284b01223831 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -754,6 +754,9 @@ static int _set_required_opps(struct device *dev,
 	if (!required_opp_tables)
 		return 0;
 
+	if (!_of_lazy_link_required_tables(opp_table))
+		return -EPROBE_DEFER;
+
 	/* Single genpd case */
 	if (!genpd_virt_devs && required_opp_tables[0]->is_genpd) {
 		pstate = likely(opp) ? opp->required_opps[0]->pstate : 0;
@@ -774,11 +777,16 @@ static int _set_required_opps(struct device *dev,
 	mutex_lock(&opp_table->genpd_virt_dev_lock);
 
 	for (i = 0; i < opp_table->required_opp_count; i++) {
-		pstate = likely(opp) ? opp->required_opps[i]->pstate : 0;
-
 		if (!genpd_virt_devs[i])
 			continue;
 
+		if (!opp->required_opps[i]) {
+			ret = -ENODEV;
+			break;
+		}
+
+		pstate = likely(opp) ? opp->required_opps[i]->pstate : 0;
+
 		ret = dev_pm_genpd_set_performance_state(genpd_virt_devs[i], pstate);
 		if (ret) {
 			dev_err(dev, "Failed to set performance rate of %s: %d (%d)\n",
@@ -1945,8 +1953,11 @@ struct dev_pm_opp *dev_pm_opp_xlate_required_opp(struct opp_table *src_table,
 	if (!src_table || !dst_table || !src_opp)
 		return NULL;
 
+	_of_lazy_link_required_tables(src_table);
+
 	for (i = 0; i < src_table->required_opp_count; i++) {
-		if (src_table->required_opp_tables[i]->np == dst_table->np)
+		if (src_table->required_opp_tables[i]
+		    && src_table->required_opp_tables[i]->np == dst_table->np)
 			break;
 	}
 
@@ -2009,6 +2020,8 @@ int dev_pm_opp_xlate_performance_state(struct opp_table *src_table,
 	if (!src_table->required_opp_count)
 		return pstate;
 
+	_of_lazy_link_required_tables(src_table);
+
 	for (i = 0; i < src_table->required_opp_count; i++) {
 		if (src_table->required_opp_tables[i]->np == dst_table->np)
 			break;
@@ -2024,15 +2037,16 @@ int dev_pm_opp_xlate_performance_state(struct opp_table *src_table,
 
 	list_for_each_entry(opp, &src_table->opp_list, node) {
 		if (opp->pstate == pstate) {
-			dest_pstate = opp->required_opps[i]->pstate;
-			goto unlock;
+			if (opp->required_opps[i])
+				dest_pstate = opp->required_opps[i]->pstate;
+			break;
 		}
 	}
 
-	pr_err("%s: Couldn't find matching OPP (%p: %p)\n", __func__, src_table,
-	       dst_table);
+	if (dest_pstate < 0)
+		pr_err("%s: Couldn't find matching OPP (%p: %p)\n", __func__,
+		       src_table, dst_table);
 
-unlock:
 	mutex_unlock(&src_table->lock);
 
 	return dest_pstate;
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 6d33de668a7b..c6b1c317e4f7 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -143,7 +143,7 @@ static void _opp_table_free_required_tables(struct opp_table *opp_table)
 
 	for (i = 0; i < opp_table->required_opp_count; i++) {
 		if (IS_ERR_OR_NULL(required_opp_tables[i]))
-			break;
+			continue;
 
 		dev_pm_opp_put_opp_table(required_opp_tables[i]);
 	}
@@ -163,8 +163,8 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
 					     struct device_node *opp_np)
 {
 	struct opp_table **required_opp_tables;
-	struct device_node *required_np, *np;
-	int count, i;
+	struct device_node *np;
+	int count;
 
 	/* Traversing the first OPP node is all we need */
 	np = of_get_next_available_child(opp_np, NULL);
@@ -174,35 +174,65 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
 	}
 
 	count = of_count_phandle_with_args(np, "required-opps", NULL);
+	of_node_put(np);
 	if (!count)
-		goto put_np;
+		return;
 
 	required_opp_tables = kcalloc(count, sizeof(*required_opp_tables),
 				      GFP_KERNEL);
 	if (!required_opp_tables)
-		goto put_np;
+		return;
 
 	opp_table->required_opp_tables = required_opp_tables;
 	opp_table->required_opp_count = count;
+}
+
+/*
+ * Try to link all required tables and return true if all of them have been
+ * linked. Otherwise, return false.
+ */
+bool _of_lazy_link_required_tables(struct opp_table *src)
+{
+	struct dev_pm_opp *src_opp, *tmp_opp;
+	struct opp_table *req_table;
+	struct device_node *req_np;
+	int i, num_linked = 0;
 
-	for (i = 0; i < count; i++) {
-		required_np = of_parse_required_opp(np, i);
-		if (!required_np)
-			goto free_required_tables;
+	mutex_lock(&src->lock);
 
-		required_opp_tables[i] = _find_table_of_opp_np(required_np);
-		of_node_put(required_np);
+	if (list_empty(&src->opp_list))
+		goto out;
 
-		if (IS_ERR(required_opp_tables[i]))
-			goto free_required_tables;
-	}
+	src_opp = list_first_entry(&src->opp_list, struct dev_pm_opp, node);
 
-	goto put_np;
+	for (i = 0; i < src->required_opp_count; i++) {
+		if (src->required_opp_tables[i]) {
+			num_linked++;
+			continue;
+		}
 
-free_required_tables:
-	_opp_table_free_required_tables(opp_table);
-put_np:
-	of_node_put(np);
+		req_np = of_parse_required_opp(src_opp->np, i);
+		if (!req_np)
+			continue;
+
+		req_table = _find_table_of_opp_np(req_np);
+		of_node_put(req_np);
+		if (!req_table)
+			continue;
+
+		src->required_opp_tables[i] = req_table;
+		list_for_each_entry(tmp_opp, &src->opp_list, node) {
+			req_np = of_parse_required_opp(tmp_opp->np, i);
+			tmp_opp->required_opps[i] = _find_opp_of_np(req_table,
+								    req_np);
+			of_node_put(req_np);
+		}
+		num_linked++;
+	}
+
+out:
+	mutex_unlock(&src->lock);
+	return num_linked == src->required_opp_count;
 }
 
 void _of_init_opp_table(struct opp_table *opp_table, struct device *dev,
@@ -265,7 +295,7 @@ void _of_opp_free_required_opps(struct opp_table *opp_table,
 
 	for (i = 0; i < opp_table->required_opp_count; i++) {
 		if (!required_opps[i])
-			break;
+			continue;
 
 		/* Put the reference back */
 		dev_pm_opp_put(required_opps[i]);
@@ -280,9 +310,7 @@ static int _of_opp_alloc_required_opps(struct opp_table *opp_table,
 				       struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp **required_opps;
-	struct opp_table *required_table;
-	struct device_node *np;
-	int i, ret, count = opp_table->required_opp_count;
+	int count = opp_table->required_opp_count;
 
 	if (!count)
 		return 0;
@@ -293,32 +321,7 @@ static int _of_opp_alloc_required_opps(struct opp_table *opp_table,
 
 	opp->required_opps = required_opps;
 
-	for (i = 0; i < count; i++) {
-		required_table = opp_table->required_opp_tables[i];
-
-		np = of_parse_required_opp(opp->np, i);
-		if (unlikely(!np)) {
-			ret = -ENODEV;
-			goto free_required_opps;
-		}
-
-		required_opps[i] = _find_opp_of_np(required_table, np);
-		of_node_put(np);
-
-		if (!required_opps[i]) {
-			pr_err("%s: Unable to find required OPP node: %pOF (%d)\n",
-			       __func__, opp->np, i);
-			ret = -ENODEV;
-			goto free_required_opps;
-		}
-	}
-
 	return 0;
-
-free_required_opps:
-	_of_opp_free_required_opps(opp_table, opp);
-
-	return ret;
 }
 
 static bool _opp_is_supported(struct device *dev, struct opp_table *opp_table,
@@ -691,6 +694,8 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
 	if (pstate_count)
 		opp_table->genpd_performance_state = true;
 
+	_of_lazy_link_required_tables(opp_table);
+
 	return 0;
 
 remove_static_opp:
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index d14e27102730..6a679e7f3639 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -221,12 +221,17 @@ void _put_opp_list_kref(struct opp_table *opp_table);
 void _of_init_opp_table(struct opp_table *opp_table, struct device *dev, int index);
 void _of_clear_opp_table(struct opp_table *opp_table);
 struct opp_table *_managed_opp(struct device *dev, int index);
+bool _of_lazy_link_required_tables(struct opp_table *src);
 void _of_opp_free_required_opps(struct opp_table *opp_table,
 				struct dev_pm_opp *opp);
 #else
 static inline void _of_init_opp_table(struct opp_table *opp_table, struct device *dev, int index) {}
 static inline void _of_clear_opp_table(struct opp_table *opp_table) {}
 static inline struct opp_table *_managed_opp(struct device *dev, int index) { return NULL; }
+bool _of_lazy_link_required_tables(struct opp_table *src)
+{
+	return true;
+}
 static inline void _of_opp_free_required_opps(struct opp_table *opp_table,
 					      struct dev_pm_opp *opp) {}
 #endif
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 04/12] PM / devfreq: Cache OPP table reference in devfreq
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (2 preceding siblings ...)
  2020-05-20  3:42   ` [PATCH 03/12] OPP: Improve required-opps linking Andrew-sh.Cheng
@ 2020-05-20  3:42   ` Andrew-sh.Cheng
  2020-05-20  3:43   ` [PATCH 05/12] PM / devfreq: Add required OPPs support to passive governor Andrew-sh.Cheng
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:42 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Saravana Kannan, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

From: Saravana Kannan <saravanak@google.com>

The OPP table can be used often in devfreq. Trying to get it each time can
be expensive, so cache it in the devfreq struct.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
---
 drivers/devfreq/devfreq.c | 6 ++++++
 include/linux/devfreq.h   | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 6fecd11dafdd..1103a3ae5586 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -719,6 +719,8 @@ static void devfreq_dev_release(struct device *dev)
 	if (devfreq->profile->exit)
 		devfreq->profile->exit(devfreq->dev.parent);
 
+	if (devfreq->opp_table)
+		dev_pm_opp_put_opp_table(devfreq->opp_table);
 	mutex_destroy(&devfreq->lock);
 	kfree(devfreq);
 }
@@ -797,6 +799,10 @@ struct devfreq *devfreq_add_device(struct device *dev,
 	}
 
 	devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
+	devfreq->opp_table = dev_pm_opp_get_opp_table(dev);
+	if (IS_ERR(devfreq->opp_table))
+		devfreq->opp_table = NULL;
+
 	atomic_set(&devfreq->suspend_count, 0);
 
 	dev_set_name(&devfreq->dev, "%s", dev_name(dev));
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 57e871a559a9..a4b19d593151 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -131,6 +131,7 @@ struct devfreq_stats {
  * @profile:	device-specific devfreq profile
  * @governor:	method how to choose frequency based on the usage.
  * @governor_name:	devfreq governor name for use with this devfreq
+ * @opp_table:	Reference to OPP table of dev.parent, if one exists.
  * @nb:		notifier block used to notify devfreq object that it should
  *		reevaluate operable frequencies. Devfreq users may use
  *		devfreq.nb to the corresponding register notifier call chain.
@@ -168,6 +169,7 @@ struct devfreq {
 	struct devfreq_dev_profile *profile;
 	const struct devfreq_governor *governor;
 	char governor_name[DEVFREQ_NAME_LEN];
+	struct opp_table *opp_table;
 	struct notifier_block nb;
 	struct delayed_work work;
 
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 05/12] PM / devfreq: Add required OPPs support to passive governor
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (3 preceding siblings ...)
  2020-05-20  3:42   ` [PATCH 04/12] PM / devfreq: Cache OPP table reference in devfreq Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-20  3:43   ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Saravana Kannan, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

From: Saravana Kannan <saravanak@google.com>

Look at the required OPPs of the "parent" device to determine the OPP that
is required from the slave device managed by the passive governor. This
allows having mappings between a parent device and a slave device even when
they don't have the same number of OPPs.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
---
 drivers/devfreq/governor_passive.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index be6eeab9c814..2d67d6c12dce 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -19,7 +19,7 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 			= (struct devfreq_passive_data *)devfreq->data;
 	struct devfreq *parent_devfreq = (struct devfreq *)p_data->parent;
 	unsigned long child_freq = ULONG_MAX;
-	struct dev_pm_opp *opp;
+	struct dev_pm_opp *opp = NULL, *p_opp = NULL;
 	int i, count, ret = 0;
 
 	/*
@@ -56,13 +56,20 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	 * list of parent device. Because in this case, *freq is temporary
 	 * value which is decided by ondemand governor.
 	 */
-	opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
-	if (IS_ERR(opp)) {
-		ret = PTR_ERR(opp);
+	p_opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
+	if (IS_ERR(p_opp)) {
+		ret = PTR_ERR(p_opp);
 		goto out;
 	}
 
-	dev_pm_opp_put(opp);
+	if (devfreq->opp_table && parent_devfreq->opp_table)
+		opp = dev_pm_opp_xlate_required_opp(parent_devfreq->opp_table,
+						    devfreq->opp_table, p_opp);
+	if (opp) {
+		*freq = dev_pm_opp_get_freq(opp);
+		dev_pm_opp_put(opp);
+		goto out;
+	}
 
 	/*
 	 * Get the OPP table's index of decided freqeuncy by governor
@@ -89,6 +96,9 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	*freq = child_freq;
 
 out:
+	if (!IS_ERR_OR_NULL(opp))
+		dev_pm_opp_put(p_opp);
+
 	return ret;
 }
 
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (4 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 05/12] PM / devfreq: Add required OPPs support to passive governor Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-28  5:03     ` Chanwoo Choi
  2020-05-28  6:14     ` Chanwoo Choi
  2020-05-20  3:43   ` [PATCH 07/12] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
                     ` (7 subsequent siblings)
  13 siblings, 2 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh . Cheng, srv_heupstream, linux-pm,
	linux-kernel, Saravana Kannan, linux-mediatek, Sibi Sankar,
	linux-arm-kernel

From: Saravana Kannan <skannan@codeaurora.org>

Many CPU architectures have caches that can scale independent of the
CPUs. Frequency scaling of the caches is necessary to make sure that the
cache is not a performance bottleneck that leads to poor performance and
power. The same idea applies for RAM/DDR.

To achieve this, this patch adds support for cpu based scaling to the
passive governor. This is accomplished by taking the current frequency
of each CPU frequency domain and then adjust the frequency of the cache
(or any devfreq device) based on the frequency of the CPUs. It listens
to CPU frequency transition notifiers to keep itself up to date on the
current CPU frequency.

To decide the frequency of the device, the governor does one of the
following:
* Derives the optimal devfreq device opp from required-opps property of
  the parent cpu opp_table.

* Scales the device frequency in proportion to the CPU frequency. So, if
  the CPUs are running at their max frequency, the device runs at its
  max frequency. If the CPUs are running at their min frequency, the
  device runs at its min frequency. It is interpolated for frequencies
  in between.

Andrew-sh.Cheng change
dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
for kernel-5.7

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
[Sibi: Integrated cpu-freqmap governor into passive_governor]
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/devfreq/Kconfig            |   2 +
 drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
 include/linux/devfreq.h            |  40 +++++-
 3 files changed, 299 insertions(+), 21 deletions(-)

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index 0b1df12e0f21..d9067950af6a 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
 	  device. This governor does not change the frequency by itself
 	  through sysfs entries. The passive governor recommends that
 	  devfreq device uses the OPP table to get the frequency/voltage.
+	  Alternatively the governor can also be chosen to scale based on
+	  the online CPUs current frequency.
 
 comment "DEVFREQ Drivers"
 
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 2d67d6c12dce..7dcda02a5bb7 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -8,11 +8,89 @@
  */
 
 #include <linux/module.h>
+#include <linux/cpu.h>
+#include <linux/cpufreq.h>
+#include <linux/cpumask.h>
 #include <linux/device.h>
 #include <linux/devfreq.h>
+#include <linux/slab.h>
 #include "governor.h"
 
-static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
+static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
+					     unsigned int cpu)
+{
+	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
+	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
+	struct devfreq *devfreq = (struct devfreq *)data->this;
+	unsigned long *freq_table = devfreq->profile->freq_table;
+	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
+	unsigned long cpu_freq, freq;
+
+	if (!cpu_state || cpu_state->first_cpu != cpu ||
+	    !cpu_state->opp_table || !devfreq->opp_table)
+		return 0;
+
+	cpu_freq = cpu_state->freq * 1000;
+	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
+	if (IS_ERR(cpu_opp))
+		return 0;
+
+	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
+					    devfreq->opp_table, cpu_opp);
+	dev_pm_opp_put(cpu_opp);
+
+	if (!IS_ERR(opp)) {
+		freq = dev_pm_opp_get_freq(opp);
+		dev_pm_opp_put(opp);
+	} else {
+		/* Use Interpolation if required opps is not available */
+		cpu_min = cpu_state->min_freq;
+		cpu_max = cpu_state->max_freq;
+		cpu_freq = cpu_state->freq;
+
+		if (freq_table) {
+			/* Get minimum frequency according to sorting order */
+			max_state = freq_table[devfreq->profile->max_state - 1];
+			if (freq_table[0] < max_state) {
+				dev_min = freq_table[0];
+				dev_max = max_state;
+			} else {
+				dev_min = max_state;
+				dev_max = freq_table[0];
+			}
+		} else {
+			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
+			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
+				return 0;
+			dev_min =
+			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
+			dev_max =
+			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
+		}
+		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
+		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
+	}
+
+	return freq;
+}
+
+static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
+					unsigned long *freq)
+{
+	struct devfreq_passive_data *p_data =
+				(struct devfreq_passive_data *)devfreq->data;
+	unsigned int cpu, target_freq = 0;
+
+	for_each_online_cpu(cpu)
+		target_freq = max(target_freq,
+				  xlate_cpufreq_to_devfreq(p_data, cpu));
+
+	*freq = target_freq;
+
+	return 0;
+}
+
+static int get_target_freq_with_devfreq(struct devfreq *devfreq,
 					unsigned long *freq)
 {
 	struct devfreq_passive_data *p_data
@@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	int i, count, ret = 0;
 
 	/*
-	 * If the devfreq device with passive governor has the specific method
-	 * to determine the next frequency, should use the get_target_freq()
-	 * of struct devfreq_passive_data.
-	 */
-	if (p_data->get_target_freq) {
-		ret = p_data->get_target_freq(devfreq, freq);
-		goto out;
-	}
-
-	/*
 	 * If the parent and passive devfreq device uses the OPP table,
 	 * get the next frequency by using the OPP table.
 	 */
@@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	return ret;
 }
 
+static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
+					   unsigned long *freq)
+{
+	struct devfreq_passive_data *p_data =
+				(struct devfreq_passive_data *)devfreq->data;
+	int ret;
+
+	/*
+	 * If the devfreq device with passive governor has the specific method
+	 * to determine the next frequency, should use the get_target_freq()
+	 * of struct devfreq_passive_data.
+	 */
+	if (p_data->get_target_freq)
+		return p_data->get_target_freq(devfreq, freq);
+
+	switch (p_data->parent_type) {
+	case DEVFREQ_PARENT_DEV:
+		ret = get_target_freq_with_devfreq(devfreq, freq);
+		break;
+	case CPUFREQ_PARENT_DEV:
+		ret = get_target_freq_with_cpufreq(devfreq, freq);
+		break;
+	default:
+		ret = -EINVAL;
+		dev_err(&devfreq->dev, "Invalid parent type\n");
+		break;
+	}
+
+	return ret;
+}
+
 static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
 {
 	int ret;
@@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+static int cpufreq_passive_notifier_call(struct notifier_block *nb,
+					 unsigned long event, void *ptr)
+{
+	struct devfreq_passive_data *data =
+			container_of(nb, struct devfreq_passive_data, nb);
+	struct devfreq *devfreq = (struct devfreq *)data->this;
+	struct devfreq_cpu_state *cpu_state;
+	struct cpufreq_freqs *freq = ptr;
+	unsigned int current_freq;
+	int ret;
+
+	if (event != CPUFREQ_POSTCHANGE || !freq ||
+	    !data->cpu_state[freq->policy->cpu])
+		return 0;
+
+	cpu_state = data->cpu_state[freq->policy->cpu];
+	if (cpu_state->freq == freq->new)
+		return 0;
+
+	/* Backup current freq and pre-update cpu state freq*/
+	current_freq = cpu_state->freq;
+	cpu_state->freq = freq->new;
+
+	mutex_lock(&devfreq->lock);
+	ret = update_devfreq(devfreq);
+	mutex_unlock(&devfreq->lock);
+	if (ret) {
+		cpu_state->freq = current_freq;
+		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
+{
+	struct devfreq_passive_data *data = *p_data;
+	struct devfreq *devfreq = (struct devfreq *)data->this;
+	struct device *dev = devfreq->dev.parent;
+	struct opp_table *opp_table = NULL;
+	struct devfreq_cpu_state *state;
+	struct cpufreq_policy *policy;
+	struct device *cpu_dev;
+	unsigned int cpu;
+	int ret;
+
+	get_online_cpus();
+	data->nb.notifier_call = cpufreq_passive_notifier_call;
+	ret = cpufreq_register_notifier(&data->nb,
+					CPUFREQ_TRANSITION_NOTIFIER);
+	if (ret) {
+		dev_err(dev, "Couldn't register cpufreq notifier.\n");
+		data->nb.notifier_call = NULL;
+		goto out;
+	}
+
+	/* Populate devfreq_cpu_state */
+	for_each_online_cpu(cpu) {
+		if (data->cpu_state[cpu])
+			continue;
+
+		policy = cpufreq_cpu_get(cpu);
+		if (policy) {
+			state = kzalloc(sizeof(*state), GFP_KERNEL);
+			if (!state) {
+				ret = -ENOMEM;
+				goto out;
+			}
+
+			cpu_dev = get_cpu_device(cpu);
+			if (!cpu_dev) {
+				dev_err(dev, "Couldn't get cpu device.\n");
+				ret = -ENODEV;
+				goto out;
+			}
+
+			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
+			if (IS_ERR(devfreq->opp_table)) {
+				ret = PTR_ERR(opp_table);
+				goto out;
+			}
+
+			state->dev = cpu_dev;
+			state->opp_table = opp_table;
+			state->first_cpu = cpumask_first(policy->related_cpus);
+			state->freq = policy->cur;
+			state->min_freq = policy->cpuinfo.min_freq;
+			state->max_freq = policy->cpuinfo.max_freq;
+			data->cpu_state[cpu] = state;
+			cpufreq_cpu_put(policy);
+		} else {
+			ret = -EPROBE_DEFER;
+			goto out;
+		}
+	}
+out:
+	put_online_cpus();
+	if (ret)
+		return ret;
+
+	/* Update devfreq */
+	mutex_lock(&devfreq->lock);
+	ret = update_devfreq(devfreq);
+	mutex_unlock(&devfreq->lock);
+	if (ret)
+		dev_err(dev, "Couldn't update the frequency.\n");
+
+	return ret;
+}
+
+static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
+{
+	struct devfreq_passive_data *data = *p_data;
+	struct devfreq_cpu_state *cpu_state;
+	int cpu;
+
+	if (data->nb.notifier_call)
+		cpufreq_unregister_notifier(&data->nb,
+					    CPUFREQ_TRANSITION_NOTIFIER);
+
+	for_each_possible_cpu(cpu) {
+		cpu_state = data->cpu_state[cpu];
+		if (cpu_state) {
+			if (cpu_state->opp_table)
+				dev_pm_opp_put_opp_table(cpu_state->opp_table);
+			kfree(cpu_state);
+			cpu_state = NULL;
+		}
+	}
+
+	return 0;
+}
+
 static int devfreq_passive_event_handler(struct devfreq *devfreq,
 				unsigned int event, void *data)
 {
@@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
 	struct notifier_block *nb = &p_data->nb;
 	int ret = 0;
 
-	if (!parent)
+	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
 		return -EPROBE_DEFER;
 
 	switch (event) {
@@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
 		if (!p_data->this)
 			p_data->this = devfreq;
 
-		nb->notifier_call = devfreq_passive_notifier_call;
-		ret = devfreq_register_notifier(parent, nb,
-					DEVFREQ_TRANSITION_NOTIFIER);
+		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
+			nb->notifier_call = devfreq_passive_notifier_call;
+			ret = devfreq_register_notifier(parent, nb,
+						DEVFREQ_TRANSITION_NOTIFIER);
+		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
+			ret = cpufreq_passive_register(&p_data);
+		} else {
+			ret = -EINVAL;
+		}
 		break;
 	case DEVFREQ_GOV_STOP:
-		WARN_ON(devfreq_unregister_notifier(parent, nb,
-					DEVFREQ_TRANSITION_NOTIFIER));
+		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
+			WARN_ON(devfreq_unregister_notifier(parent, nb,
+						DEVFREQ_TRANSITION_NOTIFIER));
+		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
+			cpufreq_passive_unregister(&p_data);
+		else
+			ret = -EINVAL;
 		break;
 	default:
 		break;
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index a4b19d593151..04ce576fd6f1 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
 
 #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
 /**
+ * struct devfreq_cpu_state - holds the per-cpu state
+ * @freq:	the current frequency of the cpu.
+ * @min_freq:	the min frequency of the cpu.
+ * @max_freq:	the max frequency of the cpu.
+ * @first_cpu:	the cpumask of the first cpu of a policy.
+ * @dev:	reference to cpu device.
+ * @opp_table:	reference to cpu opp table.
+ *
+ * This structure stores the required cpu_state of a cpu.
+ * This is auto-populated by the governor.
+ */
+struct devfreq_cpu_state {
+	unsigned int freq;
+	unsigned int min_freq;
+	unsigned int max_freq;
+	unsigned int first_cpu;
+	struct device *dev;
+	struct opp_table *opp_table;
+};
+
+enum devfreq_parent_dev_type {
+	DEVFREQ_PARENT_DEV,
+	CPUFREQ_PARENT_DEV,
+};
+
+/**
  * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
  *	and devfreq_add_device
  * @parent:	the devfreq instance of parent device.
@@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
  *			using governors except for passive governor.
  *			If the devfreq device has the specific method to decide
  *			the next frequency, should use this callback.
- * @this:	the devfreq instance of own device.
- * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
+ * @parent_type		parent type of the device
+ * @this:		the devfreq instance of own device.
+ * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
+ * @cpu_state:		the state min/max/current frequency of all online cpu's
  *
  * The devfreq_passive_data have to set the devfreq instance of parent
  * device with governors except for the passive governor. But, don't need to
- * initialize the 'this' and 'nb' field because the devfreq core will handle
- * them.
+ * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
+ * will handle them.
  */
 struct devfreq_passive_data {
 	/* Should set the devfreq instance of parent device */
@@ -303,9 +331,13 @@ struct devfreq_passive_data {
 	/* Optional callback to decide the next frequency of passvice device */
 	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
 
+	/* Should set the type of parent device */
+	enum devfreq_parent_dev_type parent_type;
+
 	/* For passive governor's internal use. Don't need to set them */
 	struct devfreq *this;
 	struct notifier_block nb;
+	struct devfreq_cpu_state *cpu_state[NR_CPUS];
 };
 #endif
 
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 07/12] cpufreq: mediatek: Enable clock and regulator
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (5 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-20  3:43   ` [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

Need to enable regulator,
so that the max/min requested value will be recorded
even it is not applied right away.

Intermediate clock is not always enabled by ccf in different projects,
so cpufreq should always enable used clock by itself.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/cpufreq/mediatek-cpufreq.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 0c98dd08273d..4b479c110cc9 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -350,6 +350,11 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 		ret = PTR_ERR(proc_reg);
 		goto out_free_resources;
 	}
+	ret = regulator_enable(proc_reg);
+	if (ret) {
+		pr_warn("enable vproc for cpu%d fail\n", cpu);
+		goto out_free_resources;
+	}
 
 	/* Both presence and absence of sram regulator are valid cases. */
 	sram_reg = regulator_get_exclusive(cpu_dev, "sram");
@@ -368,13 +373,21 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 		goto out_free_resources;
 	}
 
+	ret = clk_prepare_enable(cpu_clk);
+	if (ret)
+		goto out_free_opp_table;
+
+	ret = clk_prepare_enable(inter_clk);
+	if (ret)
+		goto out_disable_mux_clock;
+
 	/* Search a safe voltage for intermediate frequency. */
 	rate = clk_get_rate(inter_clk);
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
 	if (IS_ERR(opp)) {
 		pr_err("failed to get intermediate opp for cpu%d\n", cpu);
 		ret = PTR_ERR(opp);
-		goto out_free_opp_table;
+		goto out_disable_inter_clock;
 	}
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
 	dev_pm_opp_put(opp);
@@ -393,6 +406,12 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 	return 0;
 
+out_disable_inter_clock:
+	clk_disable_unprepare(inter_clk);
+
+out_disable_mux_clock:
+	clk_disable_unprepare(cpu_clk);
+
 out_free_opp_table:
 	dev_pm_opp_of_cpumask_remove_table(&info->cpus);
 
@@ -411,14 +430,20 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 static void mtk_cpu_dvfs_info_release(struct mtk_cpu_dvfs_info *info)
 {
-	if (!IS_ERR(info->proc_reg))
+	if (!IS_ERR(info->proc_reg)) {
+		regulator_disable(info->proc_reg);
 		regulator_put(info->proc_reg);
+	}
 	if (!IS_ERR(info->sram_reg))
 		regulator_put(info->sram_reg);
-	if (!IS_ERR(info->cpu_clk))
+	if (!IS_ERR(info->cpu_clk)) {
+		clk_disable_unprepare(info->cpu_clk);
 		clk_put(info->cpu_clk);
-	if (!IS_ERR(info->inter_clk))
+	}
+	if (!IS_ERR(info->inter_clk)) {
+		clk_disable_unprepare(info->inter_clk);
 		clk_put(info->inter_clk);
+	}
 
 	dev_pm_opp_of_cpumask_remove_table(&info->cpus);
 }
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (6 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 07/12] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-28  7:42     ` Chanwoo Choi
  2020-05-20  3:43   ` [PATCH 09/12] devfreq: add mediatek " Andrew-sh.Cheng
                     ` (5 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

This adds dt-binding documentation of cci devfreq
for Mediatek MT8183 SoC platform.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 .../devicetree/bindings/devfreq/mt8183-cci.yaml    | 51 ++++++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml

diff --git a/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
new file mode 100644
index 000000000000..a7341fd94097
--- /dev/null
+++ b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/devfreq/mt8183-cci.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: CCI_DEVFREQ driver for MT8183.
+
+maintainers:
+  - Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
+
+description: |
+  This module is used to create CCI DEVFREQ.
+  The performance will depend on both CCI frequency and CPU frequency.
+  For MT8183, CCI co-buck with Little core.
+  Contain CCI opp table for voltage and frequency scaling.
+
+properties:
+  compatible:
+    const: "mediatek,mt8183-cci"
+
+  clocks:
+    maxItems: 1
+
+  clock-names:
+    const: "cci"
+
+  operating-points-v2: true
+  opp-table: true
+
+  proc-supply:
+    description:
+      Phandle of the regulator that provides the supply voltage.
+
+required:
+  - compatible
+  - clocks
+  - clock-names
+  - proc-supply
+
+examples:
+  - |
+    #include <dt-bindings/clock/mt8183-clk.h>
+    cci: cci {
+      compatible = "mediatek,mt8183-cci";
+      clocks = <&apmixedsys CLK_APMIXED_CCIPLL>;
+      clock-names = "cci";
+      operating-points-v2 = <&cci_opp>;
+      proc-supply = <&mt6358_vproc12_reg>;
+    };
+
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (7 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-20 12:31     ` Mark Brown
  2020-05-28  7:35     ` Chanwoo Choi
  2020-05-20  3:43   ` [PATCH 10/12] opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it is disabled Andrew-sh.Cheng
                     ` (4 subsequent siblings)
  13 siblings, 2 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
of the Mediatek MT8183.

On the MT8183 the CCI is supplied by the same regulator as the LITTLE
cores. The driver is notified when the regulator voltage changes
(driven by cpufreq) and adjusts the CCI frequency to the maximum
possible value.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/devfreq/Kconfig              |  10 ++
 drivers/devfreq/Makefile             |   1 +
 drivers/devfreq/mt8183-cci-devfreq.c | 206 +++++++++++++++++++++++++++++++++++
 3 files changed, 217 insertions(+)
 create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index d9067950af6a..4ed7116271ee 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -103,6 +103,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
 	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
 	  adjusting DRAM frequency.
 
+config ARM_MT8183_CCI_DEVFREQ
+	tristate "MT8183 CCI DEVFREQ Driver"
+	depends on ARM_MEDIATEK_CPUFREQ
+	help
+		This adds a devfreq driver for Cache Coherent Interconnect
+		of Mediatek MT8183, which is shared the same regulator
+		with cpu cluster.
+		It can track buck voltage and update a proper cci frequency.
+		Use notification to get regulator status.
+
 config ARM_TEGRA_DEVFREQ
 	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
 	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
index 3eb4d5e6635c..5b1b670c954d 100644
--- a/drivers/devfreq/Makefile
+++ b/drivers/devfreq/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
 # DEVFREQ Drivers
 obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
 obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
+obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
 obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
 obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
 obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)	+= tegra20-devfreq.o
diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
new file mode 100644
index 000000000000..cd7929a83bf8
--- /dev/null
+++ b/drivers/devfreq/mt8183-cci-devfreq.c
@@ -0,0 +1,206 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019 MediaTek Inc.
+
+ * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
+ */
+
+#include <linux/clk.h>
+#include <linux/devfreq.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/regulator/consumer.h>
+#include <linux/time.h>
+
+#include "governor.h"
+
+#define MAX_VOLT_LIMIT		(1150000)
+
+struct cci_devfreq {
+	struct devfreq *devfreq;
+	struct regulator *proc_reg;
+	struct clk *cci_clk;
+	int old_vproc;
+	unsigned long old_freq;
+};
+
+static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
+{
+	int ret;
+
+	ret = regulator_set_voltage(cci_df->proc_reg, vproc,
+				    MAX_VOLT_LIMIT);
+	if (!ret)
+		cci_df->old_vproc = vproc;
+	return ret;
+}
+
+static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
+				  u32 flags)
+{
+	int ret;
+	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
+	struct dev_pm_opp *opp;
+	unsigned long opp_rate, opp_voltage, old_voltage;
+
+	if (!cci_df)
+		return -EINVAL;
+
+	if (cci_df->old_freq == *freq)
+		return 0;
+
+	opp_rate = *freq;
+	opp = dev_pm_opp_find_freq_floor(dev, &opp_rate);
+	opp_voltage = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
+
+	old_voltage = cci_df->old_vproc;
+	if (old_voltage == 0)
+		old_voltage = regulator_get_voltage(cci_df->proc_reg);
+
+	// scale up: set voltage first then freq
+	if (opp_voltage > old_voltage) {
+		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
+		if (ret) {
+			pr_err("cci: failed to scale up voltage\n");
+			return ret;
+		}
+	}
+
+	ret = clk_set_rate(cci_df->cci_clk, *freq);
+	if (ret) {
+		pr_err("%s: failed cci to set rate: %d\n", __func__,
+		       ret);
+		mtk_cci_set_voltage(cci_df, old_voltage);
+		return ret;
+	}
+
+	// scale down: set freq first then voltage
+	if (opp_voltage < old_voltage) {
+		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
+		if (ret) {
+			pr_err("cci: failed to scale down voltage\n");
+			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
+			return ret;
+		}
+	}
+
+	cci_df->old_freq = *freq;
+
+	return 0;
+}
+
+static struct devfreq_dev_profile cci_devfreq_profile = {
+	.target = mtk_cci_devfreq_target,
+};
+
+static int mtk_cci_devfreq_probe(struct platform_device *pdev)
+{
+	struct device *cci_dev = &pdev->dev;
+	struct cci_devfreq *cci_df;
+	struct devfreq_passive_data *passive_data;
+	int ret;
+
+	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
+	if (!cci_df)
+		return -ENOMEM;
+
+	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
+	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
+	if (ret) {
+		if (ret != -EPROBE_DEFER)
+			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
+				ret);
+		return ret;
+	}
+	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
+	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
+	if (ret) {
+		if (ret != -EPROBE_DEFER)
+			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
+				ret);
+		return ret;
+	}
+	ret = regulator_enable(cci_df->proc_reg);
+	if (ret) {
+		pr_warn("enable buck for cci fail\n");
+		return ret;
+	}
+
+	ret = dev_pm_opp_of_add_table(cci_dev);
+	if (ret) {
+		dev_err(cci_dev, "Fail to init CCI OPP table: %d\n", ret);
+		return ret;
+	}
+
+	platform_set_drvdata(pdev, cci_df);
+
+	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
+	if (!passive_data)
+		return -ENOMEM;
+
+	passive_data->parent_type = CPUFREQ_PARENT_DEV;
+
+	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
+						  &cci_devfreq_profile,
+						  DEVFREQ_GOV_PASSIVE,
+						  passive_data);
+	if (IS_ERR(cci_df->devfreq)) {
+		ret = PTR_ERR(cci_df->devfreq);
+		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
+		dev_pm_opp_of_remove_table(cci_dev);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int mtk_cci_devfreq_remove(struct platform_device *pdev)
+{
+	struct device *cci_dev = &pdev->dev;
+	struct cci_devfreq *cci_df;
+	struct notifier_block *opp_nb;
+
+	cci_df = platform_get_drvdata(pdev);
+	opp_nb = &cci_df->opp_nb;
+
+	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);
+	devm_devfreq_remove_device(cci_dev, cci_df->devfreq);
+	dev_pm_opp_of_remove_table(cci_dev);
+	regulator_disable(cci_df->proc_reg);
+
+	return 0;
+}
+
+static const __maybe_unused struct of_device_id
+	mediatek_cci_devfreq_of_match[] = {
+	{ .compatible = "mediatek,mt8183-cci" },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, mediatek_cci_devfreq_of_match);
+
+static struct platform_driver cci_devfreq_driver = {
+	.probe	= mtk_cci_devfreq_probe,
+	.remove	= mtk_cci_devfreq_remove,
+	.driver = {
+		.name = "mediatek-cci-devfreq",
+		.of_match_table = of_match_ptr(mediatek_cci_devfreq_of_match),
+	},
+};
+
+static int __init mtk_cci_devfreq_init(void)
+{
+	return platform_driver_register(&cci_devfreq_driver);
+}
+module_init(mtk_cci_devfreq_init)
+
+static void __exit mtk_cci_devfreq_exit(void)
+{
+	platform_driver_unregister(&cci_devfreq_driver);
+}
+module_exit(mtk_cci_devfreq_exit)
+
+MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
+MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
+MODULE_LICENSE("GPL v2");
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/12] opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it is disabled
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (8 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 09/12] devfreq: add mediatek " Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-20  3:43   ` [PATCH 11/12] cpufreq: mediatek: add opp notification for SVS support Andrew-sh.Cheng
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

Modify dev_pm_opp_get_freq() to return freqeuncy
even this opp item is not available.
So that we can get the information of disable opp items.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/opp/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 284b01223831..04d9171604c5 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -118,7 +118,7 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
  */
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 {
-	if (IS_ERR_OR_NULL(opp) || !opp->available) {
+	if (IS_ERR_OR_NULL(opp)) {
 		pr_err("%s: Invalid parameters\n", __func__);
 		return 0;
 	}
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 11/12] cpufreq: mediatek: add opp notification for SVS support
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (9 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 10/12] opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it is disabled Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-20  3:43   ` [PATCH 12/12] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

cpufreq should listen opp notification and do proper actions
when receiving disable and voltage adjustment events,
which are triggered when SVS is enabled.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/cpufreq/mediatek-cpufreq.c | 89 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 85 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 4b479c110cc9..71395ab87ac7 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -42,6 +42,11 @@ struct mtk_cpu_dvfs_info {
 	struct list_head list_head;
 	int intermediate_voltage;
 	bool need_voltage_tracking;
+	struct mutex lock; /* avoid notify and policy race condition */
+	struct notifier_block opp_nb;
+	int opp_cpu;
+	unsigned long opp_freq;
+	int old_vproc;
 };
 
 static LIST_HEAD(dvfs_info_list);
@@ -192,11 +197,16 @@ static int mtk_cpufreq_voltage_tracking(struct mtk_cpu_dvfs_info *info,
 
 static int mtk_cpufreq_set_voltage(struct mtk_cpu_dvfs_info *info, int vproc)
 {
+	int ret;
+
 	if (info->need_voltage_tracking)
-		return mtk_cpufreq_voltage_tracking(info, vproc);
+		ret = mtk_cpufreq_voltage_tracking(info, vproc);
 	else
-		return regulator_set_voltage(info->proc_reg, vproc,
-					     vproc + VOLT_TOL);
+		ret = regulator_set_voltage(info->proc_reg, vproc,
+					    MAX_VOLT_LIMIT);
+	if (!ret)
+		info->old_vproc = vproc;
+	return ret;
 }
 
 static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
@@ -214,7 +224,9 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 	inter_vproc = info->intermediate_voltage;
 
 	old_freq_hz = clk_get_rate(cpu_clk);
-	old_vproc = regulator_get_voltage(info->proc_reg);
+	old_vproc = info->old_vproc;
+	if (old_vproc == 0)
+		old_vproc = regulator_get_voltage(info->proc_reg);
 	if (old_vproc < 0) {
 		pr_err("%s: invalid Vproc value: %d\n", __func__, old_vproc);
 		return old_vproc;
@@ -231,6 +243,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 	vproc = dev_pm_opp_get_voltage(opp);
 	dev_pm_opp_put(opp);
 
+	mutex_lock(&info->lock);
 	/*
 	 * If the new voltage or the intermediate voltage is higher than the
 	 * current voltage, scale up voltage first.
@@ -242,6 +255,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 			pr_err("cpu%d: failed to scale up voltage!\n",
 			       policy->cpu);
 			mtk_cpufreq_set_voltage(info, old_vproc);
+			mutex_unlock(&info->lock);
 			return ret;
 		}
 	}
@@ -253,6 +267,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 		       policy->cpu);
 		mtk_cpufreq_set_voltage(info, old_vproc);
 		WARN_ON(1);
+		mutex_unlock(&info->lock);
 		return ret;
 	}
 
@@ -263,6 +278,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 		       policy->cpu);
 		clk_set_parent(cpu_clk, armpll);
 		mtk_cpufreq_set_voltage(info, old_vproc);
+		mutex_unlock(&info->lock);
 		return ret;
 	}
 
@@ -273,6 +289,7 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 		       policy->cpu);
 		mtk_cpufreq_set_voltage(info, inter_vproc);
 		WARN_ON(1);
+		mutex_unlock(&info->lock);
 		return ret;
 	}
 
@@ -288,15 +305,69 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 			clk_set_parent(cpu_clk, info->inter_clk);
 			clk_set_rate(armpll, old_freq_hz);
 			clk_set_parent(cpu_clk, armpll);
+			mutex_unlock(&info->lock);
 			return ret;
 		}
 	}
 
+	info->opp_freq = freq_hz;
+	mutex_unlock(&info->lock);
+
 	return 0;
 }
 
 #define DYNAMIC_POWER "dynamic-power-coefficient"
 
+static int mtk_cpufreq_opp_notifier(struct notifier_block *nb,
+				    unsigned long event, void *data)
+{
+	struct dev_pm_opp *opp = data;
+	struct dev_pm_opp *new_opp;
+	struct mtk_cpu_dvfs_info *info;
+	unsigned long freq, volt;
+	struct cpufreq_policy *policy;
+	int ret = 0;
+
+	info = container_of(nb, struct mtk_cpu_dvfs_info, opp_nb);
+
+	if (event == OPP_EVENT_ADJUST_VOLTAGE) {
+		freq = dev_pm_opp_get_freq(opp);
+
+		mutex_lock(&info->lock);
+		if (info->opp_freq == freq) {
+			volt = dev_pm_opp_get_voltage(opp);
+			ret = mtk_cpufreq_set_voltage(info, volt);
+			if (ret)
+				dev_err(info->cpu_dev, "failed to scale voltage: %d\n",
+					ret);
+		}
+		mutex_unlock(&info->lock);
+	} else if (event == OPP_EVENT_DISABLE) {
+		freq = dev_pm_opp_get_freq(opp);
+		/* case of current opp item is disabled */
+		if (info->opp_freq == freq) {
+			freq = 1;
+			new_opp = dev_pm_opp_find_freq_ceil(info->cpu_dev,
+							    &freq);
+			if (!IS_ERR(new_opp)) {
+				dev_pm_opp_put(new_opp);
+				policy = cpufreq_cpu_get(info->opp_cpu);
+				if (policy) {
+					cpufreq_driver_target(policy,
+						freq / 1000,
+						CPUFREQ_RELATION_L);
+					cpufreq_cpu_put(policy);
+				}
+			} else {
+				pr_err("%s: all opp items are disabled\n",
+				       __func__);
+			}
+		}
+	}
+
+	return notifier_from_errno(ret);
+}
+
 static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 {
 	struct device *cpu_dev;
@@ -392,11 +463,21 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
 	dev_pm_opp_put(opp);
 
+	info->opp_cpu = cpu;
+	info->opp_nb.notifier_call = mtk_cpufreq_opp_notifier;
+	ret = dev_pm_opp_register_notifier(cpu_dev, &info->opp_nb);
+	if (ret) {
+		pr_warn("cannot register opp notification\n");
+		goto out_disable_inter_clock;
+	}
+
+	mutex_init(&info->lock);
 	info->cpu_dev = cpu_dev;
 	info->proc_reg = proc_reg;
 	info->sram_reg = IS_ERR(sram_reg) ? NULL : sram_reg;
 	info->cpu_clk = cpu_clk;
 	info->inter_clk = inter_clk;
+	info->opp_freq = clk_get_rate(cpu_clk);
 
 	/*
 	 * If SRAM regulator is present, software "voltage tracking" is needed
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 12/12] devfreq: mediatek: cci devfreq register opp notification for SVS support
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (10 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 11/12] cpufreq: mediatek: add opp notification for SVS support Andrew-sh.Cheng
@ 2020-05-20  3:43   ` Andrew-sh.Cheng
  2020-05-20  4:10   ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and " Chanwoo Choi
  2020-06-15  7:31   ` Viresh Kumar
  13 siblings, 0 replies; 35+ messages in thread
From: Andrew-sh.Cheng @ 2020-05-20  3:43 UTC (permalink / raw)
  To: MyungJoo Ham, Kyungmin Park, Chanwoo Choi, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Andrew-sh.Cheng, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel

SVS will change the voltage of opp item.
CCI devfreq need to react to change frequency.

Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
---
 drivers/devfreq/mt8183-cci-devfreq.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
index cd7929a83bf8..3e03c1cac1a1 100644
--- a/drivers/devfreq/mt8183-cci-devfreq.c
+++ b/drivers/devfreq/mt8183-cci-devfreq.c
@@ -23,6 +23,7 @@ struct cci_devfreq {
 	struct clk *cci_clk;
 	int old_vproc;
 	unsigned long old_freq;
+	struct notifier_block opp_nb;
 };
 
 static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
@@ -91,6 +92,26 @@ static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
 	return 0;
 }
 
+static int ccidevfreq_opp_notifier(struct notifier_block *nb,
+				   unsigned long event, void *data)
+{
+	struct dev_pm_opp *opp = data;
+	struct cci_devfreq *cci_df = container_of(nb, struct cci_devfreq,
+						  opp_nb);
+	unsigned long	freq, volt;
+
+	if (event == OPP_EVENT_ADJUST_VOLTAGE) {
+		freq = dev_pm_opp_get_freq(opp);
+		/* current opp item is changed */
+		if (freq == cci_df->old_freq) {
+			volt = dev_pm_opp_get_voltage(opp);
+			mtk_cci_set_voltage(cci_df, volt);
+		}
+	}
+
+	return 0;
+}
+
 static struct devfreq_dev_profile cci_devfreq_profile = {
 	.target = mtk_cci_devfreq_target,
 };
@@ -100,12 +121,15 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
 	struct device *cci_dev = &pdev->dev;
 	struct cci_devfreq *cci_df;
 	struct devfreq_passive_data *passive_data;
+	struct notifier_block *opp_nb;
 	int ret;
 
 	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
 	if (!cci_df)
 		return -ENOMEM;
 
+	opp_nb = &cci_df->opp_nb;
+
 	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
 	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
 	if (ret) {
@@ -153,6 +177,9 @@ static int mtk_cci_devfreq_probe(struct platform_device *pdev)
 		return ret;
 	}
 
+	opp_nb->notifier_call = ccidevfreq_opp_notifier;
+	dev_pm_opp_register_notifier(cci_dev, opp_nb);
+
 	return 0;
 }
 
-- 
2.12.5
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (11 preceding siblings ...)
  2020-05-20  3:43   ` [PATCH 12/12] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
@ 2020-05-20  4:10   ` Chanwoo Choi
  2020-05-20  5:36     ` andrew-sh.cheng
  2020-06-15  7:31   ` Viresh Kumar
  13 siblings, 1 reply; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-20  4:10 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, linux-arm-kernel

Hi Andrew,

Could you explain the base commit of these patches?
When I tried to apply them to v5.7-rc1 for testing,
the merge conflict occurs.

Thanks,
Chanwoo Choi

On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
> MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
> So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.
> 
> For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
> 
> Change since v5:
> 	- Changing dt-binding format to yaml.
> 	- Extending current devfreq passive_governor instead of create a new one.
> 	- Resend depending patches of Sravana Kannan base on kernel-5.7
> 
> 
> Andrew-sh.Cheng (6):
>   cpufreq: mediatek: add clock and regulator enable for intermediate
>     clock
>   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
>   devfreq: add mediatek cci devfreq
>   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
>     is disabled
>   cpufreq: mediatek: add opp notification for SVS support
>   devfreq: mediatek: cci devfreq register opp notification for SVS
>     support
> 
> Saravana Kannan (6):
>   OPP: Allow required-opps even if the device doesn't have power-domains
>   OPP: Add function to look up required OPP's for a given OPP
>   OPP: Improve required-opps linking
>   PM / devfreq: Cache OPP table reference in devfreq
>   PM / devfreq: Add required OPPs support to passive governor
>   PM / devfreq: Add cpu based scaling support to passive_governor
> 
>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
>  drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
>  drivers/devfreq/Kconfig                            |  12 +
>  drivers/devfreq/Makefile                           |   1 +
>  drivers/devfreq/devfreq.c                          |   6 +
>  drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
>  drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
>  drivers/opp/core.c                                 |  85 +++++-
>  drivers/opp/of.c                                   | 108 ++++----
>  drivers/opp/opp.h                                  |   5 +
>  include/linux/devfreq.h                            |  42 ++-
>  include/linux/pm_opp.h                             |  11 +
>  12 files changed, 874 insertions(+), 100 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
  2020-05-20  4:10   ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and " Chanwoo Choi
@ 2020-05-20  5:36     ` andrew-sh.cheng
  2020-05-20  6:24       ` Chanwoo Choi
  0 siblings, 1 reply; 35+ messages in thread
From: andrew-sh.cheng @ 2020-05-20  5:36 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Mark Brown, Rafael J . Wysocki,
	Liam Girdwood, Rob Herring, linux-kernel, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree

On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
> Hi Andrew,
> 
> Could you explain the base commit of these patches?
> When I tried to apply them to v5.7-rc1 for testing,
> the merge conflict occurs.
> 
> Thanks,
> Chanwoo Choi

Hi Chanwoo Choi,

My base commit is
commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Apr 12 12:35:55 2020 -0700

    Linux 5.7-rc1

Could you show me the conflict error?

BR,
Andrew-sh.Cheng
> 
> On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
> > MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
> > So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.
> > 
> > For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
> > 
> > Change since v5:
> > 	- Changing dt-binding format to yaml.
> > 	- Extending current devfreq passive_governor instead of create a new one.
> > 	- Resend depending patches of Sravana Kannan base on kernel-5.7
> > 
> > 
> > Andrew-sh.Cheng (6):
> >   cpufreq: mediatek: add clock and regulator enable for intermediate
> >     clock
> >   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
> >   devfreq: add mediatek cci devfreq
> >   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
> >     is disabled
> >   cpufreq: mediatek: add opp notification for SVS support
> >   devfreq: mediatek: cci devfreq register opp notification for SVS
> >     support
> > 
> > Saravana Kannan (6):
> >   OPP: Allow required-opps even if the device doesn't have power-domains
> >   OPP: Add function to look up required OPP's for a given OPP
> >   OPP: Improve required-opps linking
> >   PM / devfreq: Cache OPP table reference in devfreq
> >   PM / devfreq: Add required OPPs support to passive governor
> >   PM / devfreq: Add cpu based scaling support to passive_governor
> > 
> >  .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
> >  drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
> >  drivers/devfreq/Kconfig                            |  12 +
> >  drivers/devfreq/Makefile                           |   1 +
> >  drivers/devfreq/devfreq.c                          |   6 +
> >  drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
> >  drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
> >  drivers/opp/core.c                                 |  85 +++++-
> >  drivers/opp/of.c                                   | 108 ++++----
> >  drivers/opp/opp.h                                  |   5 +
> >  include/linux/devfreq.h                            |  42 ++-
> >  include/linux/pm_opp.h                             |  11 +
> >  12 files changed, 874 insertions(+), 100 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> >  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> > 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
  2020-05-20  5:36     ` andrew-sh.cheng
@ 2020-05-20  6:24       ` Chanwoo Choi
  2020-05-20  7:10         ` andrew-sh.cheng
  0 siblings, 1 reply; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-20  6:24 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Mark Brown, Rafael J . Wysocki,
	Liam Girdwood, Rob Herring, linux-kernel, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree

Hi,

On 5/20/20 2:36 PM, andrew-sh.cheng wrote:
> On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
>> Hi Andrew,
>>
>> Could you explain the base commit of these patches?
>> When I tried to apply them to v5.7-rc1 for testing,
>> the merge conflict occurs.
>>
>> Thanks,

>> Chanwoo Choi
> 
> Hi Chanwoo Choi,
> 
> My base commit is
> commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Sun Apr 12 12:35:55 2020 -0700
> 
>     Linux 5.7-rc1
> 
> Could you show me the conflict error?


When I tried to apply first patch with 'git am',
the merge conflict occurred.

git am \[PATCH\ 01_12\]\ OPP\:\ Allow\ required-opps\ even\ if\ the\ device\ doesn\'t\ have\ power-domains.eml
Applying: OPP: Allow required-opps even if the device doesn't have power-domains
error: patch failed: drivers/opp/core.c:755
error: drivers/opp/core.c: patch does not apply
error: patch failed: drivers/opp/of.c:195																																																																												
error: drivers/opp/of.c: patch does not apply
Patch failed at 0001 OPP: Allow required-opps even if the device doesn't have power-domains
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Regards,
Chanwoo Choi

> 
> BR,
> Andrew-sh.Cheng
>>
>> On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
>>> MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
>>> So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.
>>>
>>> For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
>>>
>>> Change since v5:
>>> 	- Changing dt-binding format to yaml.
>>> 	- Extending current devfreq passive_governor instead of create a new one.
>>> 	- Resend depending patches of Sravana Kannan base on kernel-5.7
>>>
>>>
>>> Andrew-sh.Cheng (6):
>>>   cpufreq: mediatek: add clock and regulator enable for intermediate
>>>     clock
>>>   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
>>>   devfreq: add mediatek cci devfreq
>>>   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
>>>     is disabled
>>>   cpufreq: mediatek: add opp notification for SVS support
>>>   devfreq: mediatek: cci devfreq register opp notification for SVS
>>>     support
>>>
>>> Saravana Kannan (6):
>>>   OPP: Allow required-opps even if the device doesn't have power-domains
>>>   OPP: Add function to look up required OPP's for a given OPP
>>>   OPP: Improve required-opps linking
>>>   PM / devfreq: Cache OPP table reference in devfreq
>>>   PM / devfreq: Add required OPPs support to passive governor
>>>   PM / devfreq: Add cpu based scaling support to passive_governor
>>>
>>>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
>>>  drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
>>>  drivers/devfreq/Kconfig                            |  12 +
>>>  drivers/devfreq/Makefile                           |   1 +
>>>  drivers/devfreq/devfreq.c                          |   6 +
>>>  drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
>>>  drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
>>>  drivers/opp/core.c                                 |  85 +++++-
>>>  drivers/opp/of.c                                   | 108 ++++----
>>>  drivers/opp/opp.h                                  |   5 +
>>>  include/linux/devfreq.h                            |  42 ++-
>>>  include/linux/pm_opp.h                             |  11 +
>>>  12 files changed, 874 insertions(+), 100 deletions(-)
>>>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
>>>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
>>>
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
  2020-05-20  6:24       ` Chanwoo Choi
@ 2020-05-20  7:10         ` andrew-sh.cheng
  2020-05-20 14:53           ` Matthias Brugger
  0 siblings, 1 reply; 35+ messages in thread
From: andrew-sh.cheng @ 2020-05-20  7:10 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Mark Brown, Rafael J . Wysocki,
	Liam Girdwood, Rob Herring, linux-kernel, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree

On Wed, 2020-05-20 at 15:24 +0900, Chanwoo Choi wrote:
> Hi,
> 
> On 5/20/20 2:36 PM, andrew-sh.cheng wrote:
> > On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
> >> Hi Andrew,
> >>
> >> Could you explain the base commit of these patches?
> >> When I tried to apply them to v5.7-rc1 for testing,
> >> the merge conflict occurs.
> >>
> >> Thanks,
> 
> >> Chanwoo Choi
> > 
> > Hi Chanwoo Choi,
> > 
> > My base commit is
> > commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
> > Author: Linus Torvalds <torvalds@linux-foundation.org>
> > Date:   Sun Apr 12 12:35:55 2020 -0700
> > 
> >     Linux 5.7-rc1
> > 
> > Could you show me the conflict error?
> 
> 
> When I tried to apply first patch with 'git am',
> the merge conflict occurred.
> 
> git am \[PATCH\ 01_12\]\ OPP\:\ Allow\ required-opps\ even\ if\ the\ device\ doesn\'t\ have\ power-domains.eml
> Applying: OPP: Allow required-opps even if the device doesn't have power-domains
> error: patch failed: drivers/opp/core.c:755
> error: drivers/opp/core.c: patch does not apply
> error: patch failed: drivers/opp/of.c:195																																																																												
> error: drivers/opp/of.c: patch does not apply
> Patch failed at 0001 OPP: Allow required-opps even if the device doesn't have power-domains
> Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> Regards,
> Chanwoo Choi

Hi Chanwoo,

I just make a new folder to get code and check.
Below is my command list.
Please help check the different with you.
  505  repo init -u http://gerrit.mediatek.inc:8080/cros-kernel/manifest
-b upstream
  506  repo sync -j8
  507  repo start kern-dev --all
  508   git remote add main
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  509  git remote add main
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  510  ls
  511  cd kernel/mediatek/
  512   git remote add main
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  513  git fetch main
  514  git checkout v5.7-rc1
  515  git am
Add-cpufreq-and-cci-devfreq-for-mt8183-and-SVS-support.patch
  516  history


BR,
Andrew-sh.Cheng
> 
> > 
> > BR,
> > Andrew-sh.Cheng
> >>
> >> On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
> >>> MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the same voltage domain.
> >>> So, this series is to add drivers to handle the voltage coupling between CPU and CCI DVFS.
> >>>
> >>> For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
> >>>
> >>> Change since v5:
> >>> 	- Changing dt-binding format to yaml.
> >>> 	- Extending current devfreq passive_governor instead of create a new one.
> >>> 	- Resend depending patches of Sravana Kannan base on kernel-5.7
> >>>
> >>>
> >>> Andrew-sh.Cheng (6):
> >>>   cpufreq: mediatek: add clock and regulator enable for intermediate
> >>>     clock
> >>>   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
> >>>   devfreq: add mediatek cci devfreq
> >>>   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
> >>>     is disabled
> >>>   cpufreq: mediatek: add opp notification for SVS support
> >>>   devfreq: mediatek: cci devfreq register opp notification for SVS
> >>>     support
> >>>
> >>> Saravana Kannan (6):
> >>>   OPP: Allow required-opps even if the device doesn't have power-domains
> >>>   OPP: Add function to look up required OPP's for a given OPP
> >>>   OPP: Improve required-opps linking
> >>>   PM / devfreq: Cache OPP table reference in devfreq
> >>>   PM / devfreq: Add required OPPs support to passive governor
> >>>   PM / devfreq: Add cpu based scaling support to passive_governor
> >>>
> >>>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    |  51 ++++
> >>>  drivers/cpufreq/mediatek-cpufreq.c                 | 122 ++++++++-
> >>>  drivers/devfreq/Kconfig                            |  12 +
> >>>  drivers/devfreq/Makefile                           |   1 +
> >>>  drivers/devfreq/devfreq.c                          |   6 +
> >>>  drivers/devfreq/governor_passive.c                 | 298 +++++++++++++++++++--
> >>>  drivers/devfreq/mt8183-cci-devfreq.c               | 233 ++++++++++++++++
> >>>  drivers/opp/core.c                                 |  85 +++++-
> >>>  drivers/opp/of.c                                   | 108 ++++----
> >>>  drivers/opp/opp.h                                  |   5 +
> >>>  include/linux/devfreq.h                            |  42 ++-
> >>>  include/linux/pm_opp.h                             |  11 +
> >>>  12 files changed, 874 insertions(+), 100 deletions(-)
> >>>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> >>>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> >>>
> > 
> 
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-20  3:43   ` [PATCH 09/12] devfreq: add mediatek " Andrew-sh.Cheng
@ 2020-05-20 12:31     ` Mark Brown
  2020-05-21  8:52       ` andrew-sh.cheng
  2020-05-28  7:35     ` Chanwoo Choi
  1 sibling, 1 reply; 35+ messages in thread
From: Mark Brown @ 2020-05-20 12:31 UTC (permalink / raw)
  To: Andrew-sh.Cheng
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Rafael J . Wysocki, Liam Girdwood,
	Rob Herring, linux-kernel, Chanwoo Choi, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree


[-- Attachment #1.1: Type: text/plain, Size: 502 bytes --]

On Wed, May 20, 2020 at 11:43:04AM +0800, Andrew-sh.Cheng wrote:

> +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
> +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> +				ret);
> +		return ret;
> +	}
> +	ret = regulator_enable(cci_df->proc_reg);

The code appears to require a regulator (and I'm guessing the device
needs power) so why is this using regulator_get_optional()?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 170 bytes --]

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
  2020-05-20  7:10         ` andrew-sh.cheng
@ 2020-05-20 14:53           ` Matthias Brugger
  0 siblings, 0 replies; 35+ messages in thread
From: Matthias Brugger @ 2020-05-20 14:53 UTC (permalink / raw)
  To: andrew-sh.cheng, Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Mark Brown, Rafael J . Wysocki,
	Liam Girdwood, Rob Herring, linux-kernel, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, devicetree



On 20/05/2020 09:10, andrew-sh.cheng wrote:
> On Wed, 2020-05-20 at 15:24 +0900, Chanwoo Choi wrote:
>> Hi,
>>
>> On 5/20/20 2:36 PM, andrew-sh.cheng wrote:
>>> On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
>>>> Hi Andrew,
>>>>
>>>> Could you explain the base commit of these patches?
>>>> When I tried to apply them to v5.7-rc1 for testing,
>>>> the merge conflict occurs.
>>>>
>>>> Thanks,
>>
>>>> Chanwoo Choi
>>>
>>> Hi Chanwoo Choi,
>>>
>>> My base commit is
>>> commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
>>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>>> Date:   Sun Apr 12 12:35:55 2020 -0700
>>>
>>>     Linux 5.7-rc1
>>>
>>> Could you show me the conflict error?
>>
>>
>> When I tried to apply first patch with 'git am',
>> the merge conflict occurred.
>>
>> git am \[PATCH\ 01_12\]\ OPP\:\ Allow\ required-opps\ even\ if\ the\ device\ doesn\'t\ have\ power-domains.eml
>> Applying: OPP: Allow required-opps even if the device doesn't have power-domains
>> error: patch failed: drivers/opp/core.c:755
>> error: drivers/opp/core.c: patch does not apply
>> error: patch failed: drivers/opp/of.c:195																																																																												
>> error: drivers/opp/of.c: patch does not apply
>> Patch failed at 0001 OPP: Allow required-opps even if the device doesn't have power-domains
>> Use 'git am --show-current-patch' to see the failed patch
>> When you have resolved this problem, run "git am --continue".
>> If you prefer to skip this patch, run "git am --skip" instead.
>> To restore the original branch and stop patching, run "git am --abort".
>>
>> Regards,
>> Chanwoo Choi
> 
> Hi Chanwoo,
> 
> I just make a new folder to get code and check.
> Below is my command list.
> Please help check the different with you.
>   505  repo init -u http://gerrit.mediatek.inc:8080/cros-kernel/manifest
> -b upstream
>   506  repo sync -j8
>   507  repo start kern-dev --all
>   508   git remote add main
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   509  git remote add main
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   510  ls
>   511  cd kernel/mediatek/
>   512   git remote add main
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   513  git fetch main
>   514  git checkout v5.7-rc1
>   515  git am
> Add-cpufreq-and-cci-devfreq-for-mt8183-and-SVS-support.patch
>   516  history
> 

For reference I just tried with b4.sh [1]:
# b4.sh am -l -o /tmp -n patch  1589958625.23971.2.camel@mtksdaap41
# git am -3 -s /tmp/patch.mbx

Applies without conflicts.

Regards,
Matthias

[1] https://git.kernel.org/pub/scm/utils/b4/b4.git

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains
  2020-05-20  3:42   ` [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains Andrew-sh.Cheng
@ 2020-05-20 14:54     ` Matthias Brugger
  2020-05-21  1:50       ` andrew-sh.cheng
  0 siblings, 1 reply; 35+ messages in thread
From: Matthias Brugger @ 2020-05-20 14:54 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
	Rob Herring, Mark Rutland, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, Saravana Kannan, srv_heupstream, linux-pm,
	linux-kernel, linux-mediatek, linux-arm-kernel



On 20/05/2020 05:42, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <saravanak@google.com>
> 
> A Device-A can have a (minimum) performance requirement on another
> Device-B to be able to function correctly. This performance requirement
> on Device-B can also change based on the current performance level of
> Device-A.
> 
> The existing required-opps feature fits well to describe this need. So,
> instead of limiting required-opps to point to only PM-domain devices,
> allow it to point to any device.
> 
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Please check all patches, they are missing your
Signed-off-by

Regards,
Matthias

> ---
>  drivers/opp/core.c |  2 +-
>  drivers/opp/of.c   | 11 -----------
>  2 files changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index ba43e6a3dc0a..51403c1f2481 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -755,7 +755,7 @@ static int _set_required_opps(struct device *dev,
>  		return 0;
>  
>  	/* Single genpd case */
> -	if (!genpd_virt_devs) {
> +	if (!genpd_virt_devs && required_opp_tables[0]->is_genpd) {
>  		pstate = likely(opp) ? opp->required_opps[0]->pstate : 0;
>  		ret = dev_pm_genpd_set_performance_state(dev, pstate);
>  		if (ret) {
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 9cd8f0adacae..6d33de668a7b 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -195,17 +195,6 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
>  
>  		if (IS_ERR(required_opp_tables[i]))
>  			goto free_required_tables;
> -
> -		/*
> -		 * We only support genpd's OPPs in the "required-opps" for now,
> -		 * as we don't know how much about other cases. Error out if the
> -		 * required OPP doesn't belong to a genpd.
> -		 */
> -		if (!required_opp_tables[i]->is_genpd) {
> -			dev_err(dev, "required-opp doesn't belong to genpd: %pOF\n",
> -				required_np);
> -			goto free_required_tables;
> -		}
>  	}
>  
>  	goto put_np;
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains
  2020-05-20 14:54     ` Matthias Brugger
@ 2020-05-21  1:50       ` andrew-sh.cheng
  0 siblings, 0 replies; 35+ messages in thread
From: andrew-sh.cheng @ 2020-05-21  1:50 UTC (permalink / raw)
  To: Matthias Brugger
  Cc: Mark Rutland, Nishanth Menon, Saravana Kannan, srv_heupstream,
	linux-pm, Stephen Boyd, Viresh Kumar, Mark Brown,
	Rafael J . Wysocki, Liam Girdwood, Rob Herring, linux-kernel,
	Chanwoo Choi, Kyungmin Park, MyungJoo Ham, linux-mediatek,
	linux-arm-kernel, devicetree

On Wed, 2020-05-20 at 16:54 +0200, Matthias Brugger wrote:
> 
> On 20/05/2020 05:42, Andrew-sh.Cheng wrote:
> > From: Saravana Kannan <saravanak@google.com>
> > 
> > A Device-A can have a (minimum) performance requirement on another
> > Device-B to be able to function correctly. This performance requirement
> > on Device-B can also change based on the current performance level of
> > Device-A.
> > 
> > The existing required-opps feature fits well to describe this need. So,
> > instead of limiting required-opps to point to only PM-domain devices,
> > allow it to point to any device.
> > 
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> 
> Please check all patches, they are missing your
> Signed-off-by
> 
> Regards,
> Matthias

Hi Matthias,

I modify patch [6/12] to meet kernel-5.7 data structure and add
signed-off.
For [1/12] to [5/12], I didn't modify them.
Should I also add signed-off ?

BR,
Andrew-sh.Cheng
> 
> > ---
> >  drivers/opp/core.c |  2 +-
> >  drivers/opp/of.c   | 11 -----------
> >  2 files changed, 1 insertion(+), 12 deletions(-)
> > 
> > diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> > index ba43e6a3dc0a..51403c1f2481 100644
> > --- a/drivers/opp/core.c
> > +++ b/drivers/opp/core.c
> > @@ -755,7 +755,7 @@ static int _set_required_opps(struct device *dev,
> >  		return 0;
> >  
> >  	/* Single genpd case */
> > -	if (!genpd_virt_devs) {
> > +	if (!genpd_virt_devs && required_opp_tables[0]->is_genpd) {
> >  		pstate = likely(opp) ? opp->required_opps[0]->pstate : 0;
> >  		ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >  		if (ret) {
> > diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> > index 9cd8f0adacae..6d33de668a7b 100644
> > --- a/drivers/opp/of.c
> > +++ b/drivers/opp/of.c
> > @@ -195,17 +195,6 @@ static void _opp_table_alloc_required_tables(struct opp_table *opp_table,
> >  
> >  		if (IS_ERR(required_opp_tables[i]))
> >  			goto free_required_tables;
> > -
> > -		/*
> > -		 * We only support genpd's OPPs in the "required-opps" for now,
> > -		 * as we don't know how much about other cases. Error out if the
> > -		 * required OPP doesn't belong to a genpd.
> > -		 */
> > -		if (!required_opp_tables[i]->is_genpd) {
> > -			dev_err(dev, "required-opp doesn't belong to genpd: %pOF\n",
> > -				required_np);
> > -			goto free_required_tables;
> > -		}
> >  	}
> >  
> >  	goto put_np;
> > 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-20 12:31     ` Mark Brown
@ 2020-05-21  8:52       ` andrew-sh.cheng
  0 siblings, 0 replies; 35+ messages in thread
From: andrew-sh.cheng @ 2020-05-21  8:52 UTC (permalink / raw)
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Rafael J . Wysocki, Liam Girdwood,
	Rob Herring, linux-kernel, Chanwoo Choi, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree


On Wed, 2020-05-20 at 13:31 +0100, Mark Brown wrote:
> On Wed, May 20, 2020 at 11:43:04AM +0800, Andrew-sh.Cheng wrote:
> 
> > +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
> > +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
> > +	if (ret) {
> > +		if (ret != -EPROBE_DEFER)
> > +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> > +				ret);
> > +		return ret;
> > +	}
> > +	ret = regulator_enable(cci_df->proc_reg);
> 
> The code appears to require a regulator (and I'm guessing the device
> needs power) so why is this using regulator_get_optional()?

Hi Mark,

Do you mean, why not use regulator_get_exclusive() or regulator_get()?
Because cci and cpu litter core shared buck, it cannot use
regulator_get_exclusive().
Because both cci and cpu want to tune voltage, it cannot use
regulator_get(), otherwise it will get dummy regulator even this buck
doesn't register.as regulator.

BR,
Andrew-sh.Cheng
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-20  3:43   ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
@ 2020-05-28  5:03     ` Chanwoo Choi
  2020-05-28  6:14     ` Chanwoo Choi
  1 sibling, 0 replies; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-28  5:03 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	Saravana Kannan, linux-mediatek, Sibi Sankar, linux-arm-kernel

Hi Andrew-sh.Cheng,

Thanks for your posting. I like this approach absolutely.
I think that it is necessary. When I developed the embedded product,
I needed this feature always. 

I add the comments on below.

On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <skannan@codeaurora.org>
> 
> Many CPU architectures have caches that can scale independent of the
> CPUs. Frequency scaling of the caches is necessary to make sure that the
> cache is not a performance bottleneck that leads to poor performance and
> power. The same idea applies for RAM/DDR.
> 
> To achieve this, this patch adds support for cpu based scaling to the
> passive governor. This is accomplished by taking the current frequency
> of each CPU frequency domain and then adjust the frequency of the cache
> (or any devfreq device) based on the frequency of the CPUs. It listens
> to CPU frequency transition notifiers to keep itself up to date on the
> current CPU frequency.
> 
> To decide the frequency of the device, the governor does one of the
> following:
> * Derives the optimal devfreq device opp from required-opps property of
>   the parent cpu opp_table.
> 
> * Scales the device frequency in proportion to the CPU frequency. So, if
>   the CPUs are running at their max frequency, the device runs at its
>   max frequency. If the CPUs are running at their min frequency, the
>   device runs at its min frequency. It is interpolated for frequencies
>   in between.
> 
> Andrew-sh.Cheng change
> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> for kernel-5.7
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig            |   2 +
>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>  include/linux/devfreq.h            |  40 +++++-
>  3 files changed, 299 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 0b1df12e0f21..d9067950af6a 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>  	  device. This governor does not change the frequency by itself
>  	  through sysfs entries. The passive governor recommends that
>  	  devfreq device uses the OPP table to get the frequency/voltage.
> +	  Alternatively the governor can also be chosen to scale based on
> +	  the online CPUs current frequency.
>  
>  comment "DEVFREQ Drivers"
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 2d67d6c12dce..7dcda02a5bb7 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -8,11 +8,89 @@
>   */
>  
>  #include <linux/module.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
>  #include <linux/device.h>
>  #include <linux/devfreq.h>
> +#include <linux/slab.h>
>  #include "governor.h"
>  
> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,

Need to change 'unsigned int' to 'unsigned long'.

> +					     unsigned int cpu)
> +{
> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;

Better to define them separately as following and then need to rename
the variable. Usually, use the 'min_freq' and 'max_freq' word for
the minimum/maximum frequency.

	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
	unsigned long dev_min_freq, dev_max_freq, dev_max_state,

The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
and 'unsigned int'. You need to handle them properly.


> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	unsigned long *freq_table = devfreq->profile->freq_table;

In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
So, I think 'dev_freq_table' is proper name instead of 'freq_table'
for the readability.

	freq_table -> dev_freq_table

> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;

In the get_target_freq_with_devfreq(), use 'p_opp' indicating
the OPP of parent device. For the consistency, I think that
use 'p_opp' instead of 'cpu_opp'. 

> +	unsigned long cpu_freq, freq;

Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
	cpu_freq -> cpu_curr_freq.

> +
> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> +	    !cpu_state->opp_table || !devfreq->opp_table)
> +		return 0;
> +
> +	cpu_freq = cpu_state->freq * 1000;
> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> +	if (IS_ERR(cpu_opp))
> +		return 0;
> +
> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> +					    devfreq->opp_table, cpu_opp);
> +	dev_pm_opp_put(cpu_opp);
> +
> +	if (!IS_ERR(opp)) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		dev_pm_opp_put(opp);

Better to add the 'out' goto statement.
If you use 'goto out', you can reduce the one indentation
without 'else' statement.
	

> +	} else {

As I commented, when dev_pm_opp_xlate_required_opp() return successfully
, use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.


> +		/* Use Interpolation if required opps is not available */
> +		cpu_min = cpu_state->min_freq;
> +		cpu_max = cpu_state->max_freq;
> +		cpu_freq = cpu_state->freq;
> +
> +		if (freq_table) {
> +			/* Get minimum frequency according to sorting order */
> +			max_state = freq_table[devfreq->profile->max_state - 1];
> +			if (freq_table[0] < max_state) {
> +				dev_min = freq_table[0];
> +				dev_max = max_state;
> +			} else {
> +				dev_min = max_state;
> +				dev_max = freq_table[0];
> +			}
> +		} else {
> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> +				return 0;
> +			dev_min =
> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> +			dev_max =
> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;

I think it is not proper to access the variable of pm_qos structure directly.
Instead of direct access, you have to use the exported PM QoS function such as
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);

> +		}
> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> +	}


I think that you better to add 'out' jump label as following:

out:

> +
> +	return freq;
> +}
> +
> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> +					unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	unsigned int cpu, target_freq = 0;

Need to define 'target_freq' with 'unsigned long' type.

> +
> +	for_each_online_cpu(cpu)
> +		target_freq = max(target_freq,
> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> +
> +	*freq = target_freq;
> +
> +	return 0;
> +}
> +
> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>  					unsigned long *freq)
>  {
>  	struct devfreq_passive_data *p_data
> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	int i, count, ret = 0;
>  
>  	/*
> -	 * If the devfreq device with passive governor has the specific method
> -	 * to determine the next frequency, should use the get_target_freq()
> -	 * of struct devfreq_passive_data.
> -	 */
> -	if (p_data->get_target_freq) {
> -		ret = p_data->get_target_freq(devfreq, freq);
> -		goto out;
> -	}
> -
> -	/*
>  	 * If the parent and passive devfreq device uses the OPP table,
>  	 * get the next frequency by using the OPP table.
>  	 */
> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	return ret;
>  }
>  
> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +					   unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	int ret;
> +
> +	/*
> +	 * If the devfreq device with passive governor has the specific method
> +	 * to determine the next frequency, should use the get_target_freq()
> +	 * of struct devfreq_passive_data.
> +	 */
> +	if (p_data->get_target_freq)
> +		return p_data->get_target_freq(devfreq, freq);
> +
> +	switch (p_data->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>  {
>  	int ret;
> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> +					 unsigned long event, void *ptr)
> +{
> +	struct devfreq_passive_data *data =
> +			container_of(nb, struct devfreq_passive_data, nb);
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_freqs *freq = ptr;

How about changing 'freq' to 'cpu_freqs'?

In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
the instance of 'struct cpufreq_freqs'. And in order to
identfy, how about adding 'cpu_' prefix for variable name?

> +	unsigned int current_freq;

Need to define curr_freq with 'unsigned long' type
and better to use 'curr_freq' variable name.

> +	int ret;
> +
> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> +	    !data->cpu_state[freq->policy->cpu])
> +		return 0;
> +
> +	cpu_state = data->cpu_state[freq->policy->cpu];
> +	if (cpu_state->freq == freq->new)
> +		return 0;
> +
> +	/* Backup current freq and pre-update cpu state freq*/
> +	current_freq = cpu_state->freq;
> +	cpu_state->freq = freq->new;
> +
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret) {
> +		cpu_state->freq = current_freq;
> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct device *dev = devfreq->dev.parent;
> +	struct opp_table *opp_table = NULL;
> +	struct devfreq_cpu_state *state;

For the readability, I thinkt 'cpu_state' is proper instead of 'state'.

> +	struct cpufreq_policy *policy;
> +	struct device *cpu_dev;
> +	unsigned int cpu;
> +	int ret;
> +
> +	get_online_cpus();

Add blank line.

> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> +	ret = cpufreq_register_notifier(&data->nb,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret) {
> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> +		data->nb.notifier_call = NULL;
> +		goto out;
> +	}
> +
> +	/* Populate devfreq_cpu_state */
> +	for_each_online_cpu(cpu) {
> +		if (data->cpu_state[cpu])
> +			continue;
> +
> +		policy = cpufreq_cpu_get(cpu);

cpufreq_cpu_get() might return 'NULL'. I think you need to handle
return value as following:

		if (!policy) {
			ret = -EINVAL;
			goto out;
		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
			goto out;
		} else if (IS_ERR(policy) {
			ret = PTR_ERR(policy);
			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
			goto out;
		}

If cpufreq_cpu_get() return successfully, to do next.
It reduces the one indentaion.



> +		if (policy) {
> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> +			if (!state) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +
> +			cpu_dev = get_cpu_device(cpu);
> +			if (!cpu_dev) {
> +				dev_err(dev, "Couldn't get cpu device.\n");
> +				ret = -ENODEV;
> +				goto out;
> +			}
> +
> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> +			if (IS_ERR(devfreq->opp_table)) {
> +				ret = PTR_ERR(opp_table);
> +				goto out;
> +			}
> +
> +			state->dev = cpu_dev;
> +			state->opp_table = opp_table;
> +			state->first_cpu = cpumask_first(policy->related_cpus);
> +			state->freq = policy->cur;
> +			state->min_freq = policy->cpuinfo.min_freq;
> +			state->max_freq = policy->cpuinfo.max_freq;
> +			data->cpu_state[cpu] = state;

Add blank line.

> +			cpufreq_cpu_put(policy);
> +		} else {
> +			ret = -EPROBE_DEFER;
> +			goto out;
> +		}
> +	}

Add blank line.

> +out:
> +	put_online_cpus();
> +	if (ret)
> +		return ret;
> +
> +	/* Update devfreq */
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret)
> +		dev_err(dev, "Couldn't update the frequency.\n");
> +
> +	return ret;
> +}
> +
> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq_cpu_state *cpu_state;
> +	int cpu;
> +
> +	if (data->nb.notifier_call)
> +		cpufreq_unregister_notifier(&data->nb,
> +					    CPUFREQ_TRANSITION_NOTIFIER);
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_state = data->cpu_state[cpu];
> +		if (cpu_state) {
> +			if (cpu_state->opp_table)
> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> +			kfree(cpu_state);
> +			cpu_state = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  				unsigned int event, void *data)
>  {
> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  	struct notifier_block *nb = &p_data->nb;
>  	int ret = 0;
>  
> -	if (!parent)
> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>  		return -EPROBE_DEFER;

If you modify the devfreq_passive_event_handler() as following,
you can move this condition for DEVFREQ_PARENT_DEV into 
(register|unregister)_parent_dev_notifier.

	switch (event) {                                                                                  
	case DEVFREQ_GOV_START:                                               
		ret = register_parent_dev_notifier(p_data);
		break;
	case DEVFREQ_GOV_STOP:                                             
		ret = unregister_parent_dev_notifier(p_data);
		break;
	default: 
		ret = -EINVAL;
		break;
	}
                                                                                              
	return ret;

>  
>  	switch (event) {
> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  		if (!p_data->this)
>  			p_data->this = devfreq;
>  
> -		nb->notifier_call = devfreq_passive_notifier_call;
> -		ret = devfreq_register_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER);
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> +			nb->notifier_call = devfreq_passive_notifier_call;
> +			ret = devfreq_register_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER);
> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> +			ret = cpufreq_passive_register(&p_data);

I think that we better to collect the code related to notifier registration
into one function like devfreq_pass_register_notifier() instead of
cpufreq_passive_register() as following: I think it is more simple and readable.

If you have more proper function name of register_parent_dev_notifier,
please give your opinion.


	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
		switch (p_data->parent_type) {
		case DEVFREQ_PARENT_DEV:
			nb->notifier_call = devfreq_passive_notifier_call;
			ret = devfreq_register_notifier(parent, nb,
			break;
		case CPUFREQ_PARENT_DEV:
			cpufreq_register_notifier(...)
			...
			break;
		}
		

> +		} else {
> +			ret = -EINVAL;
> +		}
>  		break;
>  	case DEVFREQ_GOV_STOP:
> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER));
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER));
> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> +			cpufreq_passive_unregister(&p_data);
> +		else
> +			ret = -EINVAL;

ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)

>  		break;
>  	default:
>  		break;
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index a4b19d593151..04ce576fd6f1 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>  
>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>  /**
> + * struct devfreq_cpu_state - holds the per-cpu state
> + * @freq:	the current frequency of the cpu.
> + * @min_freq:	the min frequency of the cpu.
> + * @max_freq:	the max frequency of the cpu.
> + * @first_cpu:	the cpumask of the first cpu of a policy.
> + * @dev:	reference to cpu device.
> + * @opp_table:	reference to cpu opp table.
> + *
> + * This structure stores the required cpu_state of a cpu.
> + * This is auto-populated by the governor.
> + */
> +struct devfreq_cpu_state {> +	unsigned int freq;

It is better to change from 'freq' to 'curr_freq'
for more correct expression.

> +	unsigned int min_freq;
> +	unsigned int max_freq;
> +	unsigned int first_cpu;
> +	struct device *dev;

How about changing the name 'dev' to 'cpu_dev'?


> +	struct opp_table *opp_table;
> +};

devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.

So, you can move it into drivers/devfreq/governor_passive.c
and just add the definition into include/linux/devfreq.h as following:
It is able to prevent the access of variable of 'struct devfreq_cpu_state'
outside.

	struct devfreq_cpu_state;

> +
> +enum devfreq_parent_dev_type {
> +	DEVFREQ_PARENT_DEV,
> +	CPUFREQ_PARENT_DEV,
> +};
> +
> +/**
>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>   *	and devfreq_add_device
>   * @parent:	the devfreq instance of parent device.
> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>   *			using governors except for passive governor.
>   *			If the devfreq device has the specific method to decide
>   *			the next frequency, should use this callback.
> - * @this:	the devfreq instance of own device.
> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> + * @parent_type		parent type of the device

Need to add ':' at the end of word. -> "parent_type:".

> + * @this:		the devfreq instance of own device.
> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list

I knew that you make them with same indentation.
But, actually, it is not related to this patch like clean-up code.
Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.

> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>   *
>   * The devfreq_passive_data have to set the devfreq instance of parent
>   * device with governors except for the passive governor. But, don't need to
> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> - * them.
> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> + * will handle them.
>   */
>  struct devfreq_passive_data {
>  	/* Should set the devfreq instance of parent device */
> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>  	/* Optional callback to decide the next frequency of passvice device */
>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>  
> +	/* Should set the type of parent device */
> +	enum devfreq_parent_dev_type parent_type;
> +
>  	/* For passive governor's internal use. Don't need to set them */
>  	struct devfreq *this;
>  	struct notifier_block nb;
> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>  };
>  #endif
>  
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-20  3:43   ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
  2020-05-28  5:03     ` Chanwoo Choi
@ 2020-05-28  6:14     ` Chanwoo Choi
  2020-05-28  7:17       ` Chanwoo Choi
  2020-06-02 11:43       ` andrew-sh.cheng
  1 sibling, 2 replies; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-28  6:14 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, Sibi Sankar, linux-arm-kernel

Hi Andrew-sh.Cheng,

Thanks for your posting. I like this approach absolutely.
I think that it is necessary. When I developed the embedded product,
I needed this feature always. 

I add the comments on below.


And the following email is not valid. So, I dropped this email
from Cc list.
Saravana Kannan <skannan@codeaurora.org>


On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> From: Saravana Kannan <skannan@codeaurora.org>
> 
> Many CPU architectures have caches that can scale independent of the
> CPUs. Frequency scaling of the caches is necessary to make sure that the
> cache is not a performance bottleneck that leads to poor performance and
> power. The same idea applies for RAM/DDR.
> 
> To achieve this, this patch adds support for cpu based scaling to the
> passive governor. This is accomplished by taking the current frequency
> of each CPU frequency domain and then adjust the frequency of the cache
> (or any devfreq device) based on the frequency of the CPUs. It listens
> to CPU frequency transition notifiers to keep itself up to date on the
> current CPU frequency.
> 
> To decide the frequency of the device, the governor does one of the
> following:
> * Derives the optimal devfreq device opp from required-opps property of
>   the parent cpu opp_table.
> 
> * Scales the device frequency in proportion to the CPU frequency. So, if
>   the CPUs are running at their max frequency, the device runs at its
>   max frequency. If the CPUs are running at their min frequency, the
>   device runs at its min frequency. It is interpolated for frequencies
>   in between.
> 
> Andrew-sh.Cheng change
> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> for kernel-5.7
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig            |   2 +
>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>  include/linux/devfreq.h            |  40 +++++-
>  3 files changed, 299 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 0b1df12e0f21..d9067950af6a 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>  	  device. This governor does not change the frequency by itself
>  	  through sysfs entries. The passive governor recommends that
>  	  devfreq device uses the OPP table to get the frequency/voltage.
> +	  Alternatively the governor can also be chosen to scale based on
> +	  the online CPUs current frequency.
>  
>  comment "DEVFREQ Drivers"
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 2d67d6c12dce..7dcda02a5bb7 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -8,11 +8,89 @@
>   */
>  
>  #include <linux/module.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
>  #include <linux/device.h>
>  #include <linux/devfreq.h>
> +#include <linux/slab.h>
>  #include "governor.h"
>  
> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,

Need to change 'unsigned int' to 'unsigned long'.

> +					     unsigned int cpu)
> +{
> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;

Better to define them separately as following and then need to rename
the variable. Usually, use the 'min_freq' and 'max_freq' word for
the minimum/maximum frequency.

	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
	unsigned long dev_min_freq, dev_max_freq, dev_max_state,

The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
and 'unsigned int'. You need to handle them properly.


> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	unsigned long *freq_table = devfreq->profile->freq_table;

In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
So, I think 'dev_freq_table' is proper name instead of 'freq_table'
for the readability.

	freq_table -> dev_freq_table

> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;

In the get_target_freq_with_devfreq(), use 'p_opp' indicating
the OPP of parent device. For the consistency, I think that
use 'p_opp' instead of 'cpu_opp'. 

> +	unsigned long cpu_freq, freq;

Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
	cpu_freq -> cpu_curr_freq.

> +
> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> +	    !cpu_state->opp_table || !devfreq->opp_table)
> +		return 0;
> +
> +	cpu_freq = cpu_state->freq * 1000;
> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> +	if (IS_ERR(cpu_opp))
> +		return 0;
> +
> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> +					    devfreq->opp_table, cpu_opp);
> +	dev_pm_opp_put(cpu_opp);
> +
> +	if (!IS_ERR(opp)) {
> +		freq = dev_pm_opp_get_freq(opp);
> +		dev_pm_opp_put(opp);

Better to add the 'out' goto statement.
If you use 'goto out', you can reduce the one indentation
without 'else' statement.
	

> +	} else {

As I commented, when dev_pm_opp_xlate_required_opp() return successfully
, use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.


> +		/* Use Interpolation if required opps is not available */
> +		cpu_min = cpu_state->min_freq;
> +		cpu_max = cpu_state->max_freq;
> +		cpu_freq = cpu_state->freq;
> +
> +		if (freq_table) {
> +			/* Get minimum frequency according to sorting order */
> +			max_state = freq_table[devfreq->profile->max_state - 1];
> +			if (freq_table[0] < max_state) {
> +				dev_min = freq_table[0];
> +				dev_max = max_state;
> +			} else {
> +				dev_min = max_state;
> +				dev_max = freq_table[0];
> +			}
> +		} else {
> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> +				return 0;
> +			dev_min =
> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> +			dev_max =
> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;

I think it is not proper to access the variable of pm_qos structure directly.
Instead of direct access, you have to use the exported PM QoS function such as
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
- pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);

> +		}
> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> +	}


I think that you better to add 'out' jump label as following:

out:

> +
> +	return freq;
> +}
> +
> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> +					unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	unsigned int cpu, target_freq = 0;

Need to define 'target_freq' with 'unsigned long' type.

> +
> +	for_each_online_cpu(cpu)
> +		target_freq = max(target_freq,
> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> +
> +	*freq = target_freq;
> +
> +	return 0;
> +}
> +
> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>  					unsigned long *freq)
>  {
>  	struct devfreq_passive_data *p_data
> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	int i, count, ret = 0;
>  
>  	/*
> -	 * If the devfreq device with passive governor has the specific method
> -	 * to determine the next frequency, should use the get_target_freq()
> -	 * of struct devfreq_passive_data.
> -	 */
> -	if (p_data->get_target_freq) {
> -		ret = p_data->get_target_freq(devfreq, freq);
> -		goto out;
> -	}
> -
> -	/*
>  	 * If the parent and passive devfreq device uses the OPP table,
>  	 * get the next frequency by using the OPP table.
>  	 */
> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	return ret;
>  }
>  
> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> +					   unsigned long *freq)
> +{
> +	struct devfreq_passive_data *p_data =
> +				(struct devfreq_passive_data *)devfreq->data;
> +	int ret;
> +
> +	/*
> +	 * If the devfreq device with passive governor has the specific method
> +	 * to determine the next frequency, should use the get_target_freq()
> +	 * of struct devfreq_passive_data.
> +	 */
> +	if (p_data->get_target_freq)
> +		return p_data->get_target_freq(devfreq, freq);
> +
> +	switch (p_data->parent_type) {
> +	case DEVFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> +		break;
> +	case CPUFREQ_PARENT_DEV:
> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>  {
>  	int ret;
> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> +					 unsigned long event, void *ptr)
> +{
> +	struct devfreq_passive_data *data =
> +			container_of(nb, struct devfreq_passive_data, nb);
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct devfreq_cpu_state *cpu_state;
> +	struct cpufreq_freqs *freq = ptr;

How about changing 'freq' to 'cpu_freqs'?

In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
the instance of 'struct cpufreq_freqs'. And in order to
identfy, how about adding 'cpu_' prefix for variable name?

> +	unsigned int current_freq;

Need to define curr_freq with 'unsigned long' type
and better to use 'curr_freq' variable name.

> +	int ret;
> +
> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> +	    !data->cpu_state[freq->policy->cpu])
> +		return 0;
> +
> +	cpu_state = data->cpu_state[freq->policy->cpu];
> +	if (cpu_state->freq == freq->new)
> +		return 0;
> +
> +	/* Backup current freq and pre-update cpu state freq*/
> +	current_freq = cpu_state->freq;
> +	cpu_state->freq = freq->new;
> +
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret) {
> +		cpu_state->freq = current_freq;
> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> +	struct device *dev = devfreq->dev.parent;
> +	struct opp_table *opp_table = NULL;
> +	struct devfreq_cpu_state *state;

For the readability, I thinkt 'cpu_state' is proper instead of 'state'.

> +	struct cpufreq_policy *policy;
> +	struct device *cpu_dev;
> +	unsigned int cpu;
> +	int ret;
> +
> +	get_online_cpus();

Add blank line.

> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> +	ret = cpufreq_register_notifier(&data->nb,
> +					CPUFREQ_TRANSITION_NOTIFIER);
> +	if (ret) {
> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> +		data->nb.notifier_call = NULL;
> +		goto out;
> +	}
> +
> +	/* Populate devfreq_cpu_state */
> +	for_each_online_cpu(cpu) {
> +		if (data->cpu_state[cpu])
> +			continue;
> +
> +		policy = cpufreq_cpu_get(cpu);

cpufreq_cpu_get() might return 'NULL'. I think you need to handle
return value as following:

		if (!policy) {
			ret = -EINVAL;
			goto out;
		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
			goto out;
		} else if (IS_ERR(policy) {
			ret = PTR_ERR(policy);
			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
			goto out;
		}

If cpufreq_cpu_get() return successfully, to do next.
It reduces the one indentaion.



> +		if (policy) {
> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> +			if (!state) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +
> +			cpu_dev = get_cpu_device(cpu);
> +			if (!cpu_dev) {
> +				dev_err(dev, "Couldn't get cpu device.\n");
> +				ret = -ENODEV;
> +				goto out;
> +			}
> +
> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> +			if (IS_ERR(devfreq->opp_table)) {
> +				ret = PTR_ERR(opp_table);
> +				goto out;
> +			}
> +
> +			state->dev = cpu_dev;
> +			state->opp_table = opp_table;
> +			state->first_cpu = cpumask_first(policy->related_cpus);
> +			state->freq = policy->cur;
> +			state->min_freq = policy->cpuinfo.min_freq;
> +			state->max_freq = policy->cpuinfo.max_freq;
> +			data->cpu_state[cpu] = state;

Add blank line.

> +			cpufreq_cpu_put(policy);
> +		} else {
> +			ret = -EPROBE_DEFER;
> +			goto out;
> +		}
> +	}

Add blank line.

> +out:
> +	put_online_cpus();
> +	if (ret)
> +		return ret;
> +
> +	/* Update devfreq */
> +	mutex_lock(&devfreq->lock);
> +	ret = update_devfreq(devfreq);
> +	mutex_unlock(&devfreq->lock);
> +	if (ret)
> +		dev_err(dev, "Couldn't update the frequency.\n");
> +
> +	return ret;
> +}
> +
> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> +{
> +	struct devfreq_passive_data *data = *p_data;
> +	struct devfreq_cpu_state *cpu_state;
> +	int cpu;
> +
> +	if (data->nb.notifier_call)
> +		cpufreq_unregister_notifier(&data->nb,
> +					    CPUFREQ_TRANSITION_NOTIFIER);
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_state = data->cpu_state[cpu];
> +		if (cpu_state) {
> +			if (cpu_state->opp_table)
> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> +			kfree(cpu_state);
> +			cpu_state = NULL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  				unsigned int event, void *data)
>  {
> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  	struct notifier_block *nb = &p_data->nb;
>  	int ret = 0;
>  
> -	if (!parent)
> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>  		return -EPROBE_DEFER;

If you modify the devfreq_passive_event_handler() as following,
you can move this condition for DEVFREQ_PARENT_DEV into 
(register|unregister)_parent_dev_notifier.

	switch (event) {                                                                                  
	case DEVFREQ_GOV_START:                                               
		ret = register_parent_dev_notifier(p_data);
		break;
	case DEVFREQ_GOV_STOP:                                             
		ret = unregister_parent_dev_notifier(p_data);
		break;
	default: 
		ret = -EINVAL;
		break;
	}
                                                                                              
	return ret;

>  
>  	switch (event) {
> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>  		if (!p_data->this)
>  			p_data->this = devfreq;
>  
> -		nb->notifier_call = devfreq_passive_notifier_call;
> -		ret = devfreq_register_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER);
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> +			nb->notifier_call = devfreq_passive_notifier_call;
> +			ret = devfreq_register_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER);
> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> +			ret = cpufreq_passive_register(&p_data);

I think that we better to collect the code related to notifier registration
into one function like devfreq_pass_register_notifier() instead of
cpufreq_passive_register() as following: I think it is more simple and readable.

If you have more proper function name of register_parent_dev_notifier,
please give your opinion.


	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
		switch (p_data->parent_type) {
		case DEVFREQ_PARENT_DEV:
			nb->notifier_call = devfreq_passive_notifier_call;
			ret = devfreq_register_notifier(parent, nb,
			break;
		case CPUFREQ_PARENT_DEV:
			cpufreq_register_notifier(...)
			...
			break;
		}
		

> +		} else {
> +			ret = -EINVAL;
> +		}
>  		break;
>  	case DEVFREQ_GOV_STOP:
> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> -					DEVFREQ_TRANSITION_NOTIFIER));
> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> +						DEVFREQ_TRANSITION_NOTIFIER));
> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> +			cpufreq_passive_unregister(&p_data);
> +		else
> +			ret = -EINVAL;

ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)

>  		break;
>  	default:
>  		break;
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index a4b19d593151..04ce576fd6f1 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>  
>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>  /**
> + * struct devfreq_cpu_state - holds the per-cpu state
> + * @freq:	the current frequency of the cpu.
> + * @min_freq:	the min frequency of the cpu.
> + * @max_freq:	the max frequency of the cpu.
> + * @first_cpu:	the cpumask of the first cpu of a policy.
> + * @dev:	reference to cpu device.
> + * @opp_table:	reference to cpu opp table.
> + *
> + * This structure stores the required cpu_state of a cpu.
> + * This is auto-populated by the governor.
> + */
> +struct devfreq_cpu_state {> +	unsigned int freq;

It is better to change from 'freq' to 'curr_freq'
for more correct expression.

> +	unsigned int min_freq;
> +	unsigned int max_freq;
> +	unsigned int first_cpu;
> +	struct device *dev;

How about changing the name 'dev' to 'cpu_dev'?


> +	struct opp_table *opp_table;
> +};

devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.

So, you can move it into drivers/devfreq/governor_passive.c
and just add the definition into include/linux/devfreq.h as following:
It is able to prevent the access of variable of 'struct devfreq_cpu_state'
outside.

	struct devfreq_cpu_state;

> +
> +enum devfreq_parent_dev_type {
> +	DEVFREQ_PARENT_DEV,
> +	CPUFREQ_PARENT_DEV,
> +};
> +
> +/**
>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>   *	and devfreq_add_device
>   * @parent:	the devfreq instance of parent device.
> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>   *			using governors except for passive governor.
>   *			If the devfreq device has the specific method to decide
>   *			the next frequency, should use this callback.
> - * @this:	the devfreq instance of own device.
> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> + * @parent_type		parent type of the device

Need to add ':' at the end of word. -> "parent_type:".

> + * @this:		the devfreq instance of own device.
> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list

I knew that you make them with same indentation.
But, actually, it is not related to this patch like clean-up code.
Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.

> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>   *
>   * The devfreq_passive_data have to set the devfreq instance of parent
>   * device with governors except for the passive governor. But, don't need to
> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> - * them.
> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> + * will handle them.
>   */
>  struct devfreq_passive_data {
>  	/* Should set the devfreq instance of parent device */
> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>  	/* Optional callback to decide the next frequency of passvice device */
>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>  
> +	/* Should set the type of parent device */
> +	enum devfreq_parent_dev_type parent_type;
> +
>  	/* For passive governor's internal use. Don't need to set them */
>  	struct devfreq *this;
>  	struct notifier_block nb;
> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>  };
>  #endif
>  
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-28  6:14     ` Chanwoo Choi
@ 2020-05-28  7:17       ` Chanwoo Choi
  2020-06-02 12:23         ` andrew-sh.cheng
  2020-06-02 11:43       ` andrew-sh.cheng
  1 sibling, 1 reply; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-28  7:17 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, Sibi Sankar, linux-arm-kernel

Hi Andrew-sh.Cheng,

The exynos-bus.c used the passive governor.
Even if don't make the problem because DEVFREQ_PARENT_DEV is zero,
you need to initialize the parent_type with DEVFREQ_PARENT_DEV as following:

diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
index 8fa8eb541373..1c71c47bc2ac 100644
--- a/drivers/devfreq/exynos-bus.c
+++ b/drivers/devfreq/exynos-bus.c
@@ -369,6 +369,7 @@ static int exynos_bus_profile_init_passive(struct exynos_bus *bus,
                return -ENOMEM;
 
        passive_data->parent = parent_devfreq;
+       passive_data->parent_type = DEVFREQ_PARENT_DEV;
 
        /* Add devfreq device for exynos bus with passive governor */
        bus->devfreq = devm_devfreq_add_device(dev, profile, DEVFREQ_GOV_PASSIVE,


On 5/28/20 3:14 PM, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> Thanks for your posting. I like this approach absolutely.
> I think that it is necessary. When I developed the embedded product,
> I needed this feature always. 
> 
> I add the comments on below.
> 
> 
> And the following email is not valid. So, I dropped this email
> from Cc list.
> Saravana Kannan <skannan@codeaurora.org>
> 
> 
> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
>> From: Saravana Kannan <skannan@codeaurora.org>
>>
>> Many CPU architectures have caches that can scale independent of the
>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>> cache is not a performance bottleneck that leads to poor performance and
>> power. The same idea applies for RAM/DDR.
>>
>> To achieve this, this patch adds support for cpu based scaling to the
>> passive governor. This is accomplished by taking the current frequency
>> of each CPU frequency domain and then adjust the frequency of the cache
>> (or any devfreq device) based on the frequency of the CPUs. It listens
>> to CPU frequency transition notifiers to keep itself up to date on the
>> current CPU frequency.
>>
>> To decide the frequency of the device, the governor does one of the
>> following:
>> * Derives the optimal devfreq device opp from required-opps property of
>>   the parent cpu opp_table.
>>
>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>   the CPUs are running at their max frequency, the device runs at its
>>   max frequency. If the CPUs are running at their min frequency, the
>>   device runs at its min frequency. It is interpolated for frequencies
>>   in between.
>>
>> Andrew-sh.Cheng change
>> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
>> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>> for kernel-5.7
>>
>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>> ---
>>  drivers/devfreq/Kconfig            |   2 +
>>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>>  include/linux/devfreq.h            |  40 +++++-
>>  3 files changed, 299 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>> index 0b1df12e0f21..d9067950af6a 100644
>> --- a/drivers/devfreq/Kconfig
>> +++ b/drivers/devfreq/Kconfig
>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>  	  device. This governor does not change the frequency by itself
>>  	  through sysfs entries. The passive governor recommends that
>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>> +	  Alternatively the governor can also be chosen to scale based on
>> +	  the online CPUs current frequency.
>>  
>>  comment "DEVFREQ Drivers"
>>  
>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>> index 2d67d6c12dce..7dcda02a5bb7 100644
>> --- a/drivers/devfreq/governor_passive.c
>> +++ b/drivers/devfreq/governor_passive.c
>> @@ -8,11 +8,89 @@
>>   */
>>  
>>  #include <linux/module.h>
>> +#include <linux/cpu.h>
>> +#include <linux/cpufreq.h>
>> +#include <linux/cpumask.h>
>>  #include <linux/device.h>
>>  #include <linux/devfreq.h>
>> +#include <linux/slab.h>
>>  #include "governor.h"
>>  
>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> 
> Need to change 'unsigned int' to 'unsigned long'.
> 
>> +					     unsigned int cpu)
>> +{
>> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
> 
> Better to define them separately as following and then need to rename
> the variable. Usually, use the 'min_freq' and 'max_freq' word for
> the minimum/maximum frequency.
> 
> 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
> 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
> 
> The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
> and 'unsigned int'. You need to handle them properly.
> 
> 
>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>> +	unsigned long *freq_table = devfreq->profile->freq_table;
> 
> In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
> So, I think 'dev_freq_table' is proper name instead of 'freq_table'
> for the readability.
> 
> 	freq_table -> dev_freq_table
> 
>> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
> 
> In the get_target_freq_with_devfreq(), use 'p_opp' indicating
> the OPP of parent device. For the consistency, I think that
> use 'p_opp' instead of 'cpu_opp'. 
> 
>> +	unsigned long cpu_freq, freq;
> 
> Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
> 	cpu_freq -> cpu_curr_freq.
> 
>> +
>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>> +		return 0;
>> +
>> +	cpu_freq = cpu_state->freq * 1000;
>> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
>> +	if (IS_ERR(cpu_opp))
>> +		return 0;
>> +
>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>> +					    devfreq->opp_table, cpu_opp);
>> +	dev_pm_opp_put(cpu_opp);
>> +
>> +	if (!IS_ERR(opp)) {
>> +		freq = dev_pm_opp_get_freq(opp);
>> +		dev_pm_opp_put(opp);
> 
> Better to add the 'out' goto statement.
> If you use 'goto out', you can reduce the one indentation
> without 'else' statement.
> 	
> 
>> +	} else {
> 
> As I commented, when dev_pm_opp_xlate_required_opp() return successfully
> , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
> 
> 
>> +		/* Use Interpolation if required opps is not available */
>> +		cpu_min = cpu_state->min_freq;
>> +		cpu_max = cpu_state->max_freq;
>> +		cpu_freq = cpu_state->freq;
>> +
>> +		if (freq_table) {
>> +			/* Get minimum frequency according to sorting order */
>> +			max_state = freq_table[devfreq->profile->max_state - 1];
>> +			if (freq_table[0] < max_state) {
>> +				dev_min = freq_table[0];
>> +				dev_max = max_state;
>> +			} else {
>> +				dev_min = max_state;
>> +				dev_max = freq_table[0];
>> +			}
>> +		} else {
>> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
>> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
>> +				return 0;
>> +			dev_min =
>> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
>> +			dev_max =
>> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
> 
> I think it is not proper to access the variable of pm_qos structure directly.
> Instead of direct access, you have to use the exported PM QoS function such as
> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
> 
>> +		}
>> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
>> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>> +	}
> 
> 
> I think that you better to add 'out' jump label as following:
> 
> out:
> 
>> +
>> +	return freq;
>> +}
>> +
>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>> +					unsigned long *freq)
>> +{
>> +	struct devfreq_passive_data *p_data =
>> +				(struct devfreq_passive_data *)devfreq->data;
>> +	unsigned int cpu, target_freq = 0;
> 
> Need to define 'target_freq' with 'unsigned long' type.
> 
>> +
>> +	for_each_online_cpu(cpu)
>> +		target_freq = max(target_freq,
>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>> +
>> +	*freq = target_freq;
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>  					unsigned long *freq)
>>  {
>>  	struct devfreq_passive_data *p_data
>> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>  	int i, count, ret = 0;
>>  
>>  	/*
>> -	 * If the devfreq device with passive governor has the specific method
>> -	 * to determine the next frequency, should use the get_target_freq()
>> -	 * of struct devfreq_passive_data.
>> -	 */
>> -	if (p_data->get_target_freq) {
>> -		ret = p_data->get_target_freq(devfreq, freq);
>> -		goto out;
>> -	}
>> -
>> -	/*
>>  	 * If the parent and passive devfreq device uses the OPP table,
>>  	 * get the next frequency by using the OPP table.
>>  	 */
>> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>  	return ret;
>>  }
>>  
>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>> +					   unsigned long *freq)
>> +{
>> +	struct devfreq_passive_data *p_data =
>> +				(struct devfreq_passive_data *)devfreq->data;
>> +	int ret;
>> +
>> +	/*
>> +	 * If the devfreq device with passive governor has the specific method
>> +	 * to determine the next frequency, should use the get_target_freq()
>> +	 * of struct devfreq_passive_data.
>> +	 */
>> +	if (p_data->get_target_freq)
>> +		return p_data->get_target_freq(devfreq, freq);
>> +
>> +	switch (p_data->parent_type) {
>> +	case DEVFREQ_PARENT_DEV:
>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>> +		break;
>> +	case CPUFREQ_PARENT_DEV:
>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>> +		break;
>> +	default:
>> +		ret = -EINVAL;
>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>>  {
>>  	int ret;
>> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>  	return NOTIFY_DONE;
>>  }
>>  
>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>> +					 unsigned long event, void *ptr)
>> +{
>> +	struct devfreq_passive_data *data =
>> +			container_of(nb, struct devfreq_passive_data, nb);
>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>> +	struct devfreq_cpu_state *cpu_state;
>> +	struct cpufreq_freqs *freq = ptr;
> 
> How about changing 'freq' to 'cpu_freqs'?
> 
> In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
> the instance of 'struct cpufreq_freqs'. And in order to
> identfy, how about adding 'cpu_' prefix for variable name?
> 
>> +	unsigned int current_freq;
> 
> Need to define curr_freq with 'unsigned long' type
> and better to use 'curr_freq' variable name.
> 
>> +	int ret;
>> +
>> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
>> +	    !data->cpu_state[freq->policy->cpu])
>> +		return 0;
>> +
>> +	cpu_state = data->cpu_state[freq->policy->cpu];
>> +	if (cpu_state->freq == freq->new)
>> +		return 0;
>> +
>> +	/* Backup current freq and pre-update cpu state freq*/
>> +	current_freq = cpu_state->freq;
>> +	cpu_state->freq = freq->new;
>> +
>> +	mutex_lock(&devfreq->lock);
>> +	ret = update_devfreq(devfreq);
>> +	mutex_unlock(&devfreq->lock);
>> +	if (ret) {
>> +		cpu_state->freq = current_freq;
>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>> +		return ret;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>> +{
>> +	struct devfreq_passive_data *data = *p_data;
>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>> +	struct device *dev = devfreq->dev.parent;
>> +	struct opp_table *opp_table = NULL;
>> +	struct devfreq_cpu_state *state;
> 
> For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
> 
>> +	struct cpufreq_policy *policy;
>> +	struct device *cpu_dev;
>> +	unsigned int cpu;
>> +	int ret;
>> +
>> +	get_online_cpus();
> 
> Add blank line.
> 
>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>> +	ret = cpufreq_register_notifier(&data->nb,
>> +					CPUFREQ_TRANSITION_NOTIFIER);
>> +	if (ret) {
>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>> +		data->nb.notifier_call = NULL;
>> +		goto out;
>> +	}
>> +
>> +	/* Populate devfreq_cpu_state */
>> +	for_each_online_cpu(cpu) {
>> +		if (data->cpu_state[cpu])
>> +			continue;
>> +
>> +		policy = cpufreq_cpu_get(cpu);
> 
> cpufreq_cpu_get() might return 'NULL'. I think you need to handle
> return value as following:
> 
> 		if (!policy) {
> 			ret = -EINVAL;
> 			goto out;
> 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> 			goto out;
> 		} else if (IS_ERR(policy) {
> 			ret = PTR_ERR(policy);
> 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> 			goto out;
> 		}
> 
> If cpufreq_cpu_get() return successfully, to do next.
> It reduces the one indentaion.
> 
> 
> 
>> +		if (policy) {
>> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
>> +			if (!state) {
>> +				ret = -ENOMEM;
>> +				goto out;
>> +			}
>> +
>> +			cpu_dev = get_cpu_device(cpu);
>> +			if (!cpu_dev) {
>> +				dev_err(dev, "Couldn't get cpu device.\n");
>> +				ret = -ENODEV;
>> +				goto out;
>> +			}
>> +
>> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>> +			if (IS_ERR(devfreq->opp_table)) {
>> +				ret = PTR_ERR(opp_table);
>> +				goto out;
>> +			}
>> +
>> +			state->dev = cpu_dev;
>> +			state->opp_table = opp_table;
>> +			state->first_cpu = cpumask_first(policy->related_cpus);
>> +			state->freq = policy->cur;
>> +			state->min_freq = policy->cpuinfo.min_freq;
>> +			state->max_freq = policy->cpuinfo.max_freq;
>> +			data->cpu_state[cpu] = state;
> 
> Add blank line.
> 
>> +			cpufreq_cpu_put(policy);
>> +		} else {
>> +			ret = -EPROBE_DEFER;
>> +			goto out;
>> +		}
>> +	}
> 
> Add blank line.
> 
>> +out:
>> +	put_online_cpus();
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* Update devfreq */
>> +	mutex_lock(&devfreq->lock);
>> +	ret = update_devfreq(devfreq);
>> +	mutex_unlock(&devfreq->lock);
>> +	if (ret)
>> +		dev_err(dev, "Couldn't update the frequency.\n");
>> +
>> +	return ret;
>> +}
>> +
>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>> +{
>> +	struct devfreq_passive_data *data = *p_data;
>> +	struct devfreq_cpu_state *cpu_state;
>> +	int cpu;
>> +
>> +	if (data->nb.notifier_call)
>> +		cpufreq_unregister_notifier(&data->nb,
>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		cpu_state = data->cpu_state[cpu];
>> +		if (cpu_state) {
>> +			if (cpu_state->opp_table)
>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>> +			kfree(cpu_state);
>> +			cpu_state = NULL;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>  				unsigned int event, void *data)
>>  {
>> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>  	struct notifier_block *nb = &p_data->nb;
>>  	int ret = 0;
>>  
>> -	if (!parent)
>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>  		return -EPROBE_DEFER;
> 
> If you modify the devfreq_passive_event_handler() as following,
> you can move this condition for DEVFREQ_PARENT_DEV into 
> (register|unregister)_parent_dev_notifier.
> 
> 	switch (event) {                                                                                  
> 	case DEVFREQ_GOV_START:                                               
> 		ret = register_parent_dev_notifier(p_data);
> 		break;
> 	case DEVFREQ_GOV_STOP:                                             
> 		ret = unregister_parent_dev_notifier(p_data);
> 		break;
> 	default: 
> 		ret = -EINVAL;
> 		break;
> 	}
>                                                                                               
> 	return ret;
> 
>>  
>>  	switch (event) {
>> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>  		if (!p_data->this)
>>  			p_data->this = devfreq;
>>  
>> -		nb->notifier_call = devfreq_passive_notifier_call;
>> -		ret = devfreq_register_notifier(parent, nb,
>> -					DEVFREQ_TRANSITION_NOTIFIER);
>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
>> +			nb->notifier_call = devfreq_passive_notifier_call;
>> +			ret = devfreq_register_notifier(parent, nb,
>> +						DEVFREQ_TRANSITION_NOTIFIER);
>> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
>> +			ret = cpufreq_passive_register(&p_data);
> 
> I think that we better to collect the code related to notifier registration
> into one function like devfreq_pass_register_notifier() instead of
> cpufreq_passive_register() as following: I think it is more simple and readable.
> 
> If you have more proper function name of register_parent_dev_notifier,
> please give your opinion.
> 
> 
> 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> 		switch (p_data->parent_type) {
> 		case DEVFREQ_PARENT_DEV:
> 			nb->notifier_call = devfreq_passive_notifier_call;
> 			ret = devfreq_register_notifier(parent, nb,
> 			break;
> 		case CPUFREQ_PARENT_DEV:
> 			cpufreq_register_notifier(...)
> 			...
> 			break;
> 		}
> 		
> 
>> +		} else {
>> +			ret = -EINVAL;
>> +		}
>>  		break;
>>  	case DEVFREQ_GOV_STOP:
>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>> -					DEVFREQ_TRANSITION_NOTIFIER));
>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
>> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
>> +						DEVFREQ_TRANSITION_NOTIFIER));
>> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
>> +			cpufreq_passive_unregister(&p_data);
>> +		else
>> +			ret = -EINVAL;
> 
> ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> 
>>  		break;
>>  	default:
>>  		break;
>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>> index a4b19d593151..04ce576fd6f1 100644
>> --- a/include/linux/devfreq.h
>> +++ b/include/linux/devfreq.h
>> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>>  
>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>  /**
>> + * struct devfreq_cpu_state - holds the per-cpu state
>> + * @freq:	the current frequency of the cpu.
>> + * @min_freq:	the min frequency of the cpu.
>> + * @max_freq:	the max frequency of the cpu.
>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>> + * @dev:	reference to cpu device.
>> + * @opp_table:	reference to cpu opp table.
>> + *
>> + * This structure stores the required cpu_state of a cpu.
>> + * This is auto-populated by the governor.
>> + */
>> +struct devfreq_cpu_state {> +	unsigned int freq;
> 
> It is better to change from 'freq' to 'curr_freq'
> for more correct expression.
> 
>> +	unsigned int min_freq;
>> +	unsigned int max_freq;
>> +	unsigned int first_cpu;
>> +	struct device *dev;
> 
> How about changing the name 'dev' to 'cpu_dev'?
> 
> 
>> +	struct opp_table *opp_table;
>> +};
> 
> devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
> 
> So, you can move it into drivers/devfreq/governor_passive.c
> and just add the definition into include/linux/devfreq.h as following:
> It is able to prevent the access of variable of 'struct devfreq_cpu_state'
> outside.
> 
> 	struct devfreq_cpu_state;
> 
>> +
>> +enum devfreq_parent_dev_type {
>> +	DEVFREQ_PARENT_DEV,
>> +	CPUFREQ_PARENT_DEV,
>> +};
>> +
>> +/**
>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>   *	and devfreq_add_device
>>   * @parent:	the devfreq instance of parent device.
>> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>>   *			using governors except for passive governor.
>>   *			If the devfreq device has the specific method to decide
>>   *			the next frequency, should use this callback.
>> - * @this:	the devfreq instance of own device.
>> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>> + * @parent_type		parent type of the device
> 
> Need to add ':' at the end of word. -> "parent_type:".
> 
>> + * @this:		the devfreq instance of own device.
>> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> 
> I knew that you make them with same indentation.
> But, actually, it is not related to this patch like clean-up code.
> Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
> 
>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>   *
>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>   * device with governors except for the passive governor. But, don't need to
>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>> - * them.
>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>> + * will handle them.
>>   */
>>  struct devfreq_passive_data {
>>  	/* Should set the devfreq instance of parent device */
>> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>>  	/* Optional callback to decide the next frequency of passvice device */
>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>  
>> +	/* Should set the type of parent device */
>> +	enum devfreq_parent_dev_type parent_type;
>> +
>>  	/* For passive governor's internal use. Don't need to set them */
>>  	struct devfreq *this;
>>  	struct notifier_block nb;
>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>  };
>>  #endif
>>  
>>
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-20  3:43   ` [PATCH 09/12] devfreq: add mediatek " Andrew-sh.Cheng
  2020-05-20 12:31     ` Mark Brown
@ 2020-05-28  7:35     ` Chanwoo Choi
  2020-05-28  8:00       ` Chanwoo Choi
  1 sibling, 1 reply; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-28  7:35 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, linux-arm-kernel

Hi Andrew-sh.Cheng,

On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
> of the Mediatek MT8183.
> 
> On the MT8183 the CCI is supplied by the same regulator as the LITTLE
> cores. The driver is notified when the regulator voltage changes
> (driven by cpufreq) and adjusts the CCI frequency to the maximum
> possible value.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  drivers/devfreq/Kconfig              |  10 ++
>  drivers/devfreq/Makefile             |   1 +
>  drivers/devfreq/mt8183-cci-devfreq.c | 206 +++++++++++++++++++++++++++++++++++

The mt8183-cci.c is enough for driver name.

>  3 files changed, 217 insertions(+)
>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index d9067950af6a..4ed7116271ee 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -103,6 +103,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
>  	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
>  	  adjusting DRAM frequency.
>  
> +config ARM_MT8183_CCI_DEVFREQ
> +	tristate "MT8183 CCI DEVFREQ Driver"
> +	depends on ARM_MEDIATEK_CPUFREQ
> +	help
> +		This adds a devfreq driver for Cache Coherent Interconnect
> +		of Mediatek MT8183, which is shared the same regulator
> +		with cpu cluster.
> +		It can track buck voltage and update a proper cci frequency.

s/cci/CCI

> +		Use notification to get regulator status.
> +
>  config ARM_TEGRA_DEVFREQ
>  	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
>  	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> index 3eb4d5e6635c..5b1b670c954d 100644
> --- a/drivers/devfreq/Makefile
> +++ b/drivers/devfreq/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
>  # DEVFREQ Drivers
>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
>  obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
> +obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
>  obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)	+= tegra20-devfreq.o
> diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
> new file mode 100644
> index 000000000000..cd7929a83bf8
> --- /dev/null
> +++ b/drivers/devfreq/mt8183-cci-devfreq.c
> @@ -0,0 +1,206 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019 MediaTek Inc.

s/2019/2020

> +
> + * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/devfreq.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/regulator/consumer.h>
> +#include <linux/time.h>
> +
> +#include "governor.h"

It is not needed. Please remove it.

> +
> +#define MAX_VOLT_LIMIT		(1150000)
> +
> +struct cci_devfreq {
> +	struct devfreq *devfreq;
> +	struct regulator *proc_reg;

'proc' means the 'processor'?
Instead of 'proc_reg', you better to use 'cpu_reg'.

> +	struct clk *cci_clk;
> +	int old_vproc;
> +	unsigned long old_freq;
> +};
> +
> +static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
> +{
> +	int ret;
> +
> +	ret = regulator_set_voltage(cci_df->proc_reg, vproc,
> +				    MAX_VOLT_LIMIT);
> +	if (!ret)
> +		cci_df->old_vproc = vproc;
> +	return ret;
> +}
> +
> +static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
> +				  u32 flags)
> +{
> +	int ret;
> +	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
> +	struct dev_pm_opp *opp;
> +	unsigned long opp_rate, opp_voltage, old_voltage;
> +
> +	if (!cci_df)
> +		return -EINVAL;
> +
> +	if (cci_df->old_freq == *freq)
> +		return 0;
> +
> +	opp_rate = *freq;
> +	opp = dev_pm_opp_find_freq_floor(dev, &opp_rate);
> +	opp_voltage = dev_pm_opp_get_voltage(opp);
> +	dev_pm_opp_put(opp);


You can use the helper function for getting the rate 
with devfreq_recommended_opp(). You can refer the following code
in drivers/devfreq/exynos-bus.c

	opp = devfreq_recommended_opp(dev, freq, flags);
	if (IS_ERR(opp)) {
		dev_err(dev, "failed to get recommended opp instance\n");
		return PTR_ERR(opp);
	}
	dev_pm_opp_put(opp);

> +
> +	old_voltage = cci_df->old_vproc;
> +	if (old_voltage == 0)
> +		old_voltage = regulator_get_voltage(cci_df->proc_reg);
> +
> +	// scale up: set voltage first then freq
> +	if (opp_voltage > old_voltage) {
> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> +		if (ret) {
> +			pr_err("cci: failed to scale up voltage\n");
> +			return ret;
> +		}
> +	}
> +
> +	ret = clk_set_rate(cci_df->cci_clk, *freq);
> +	if (ret) {
> +		pr_err("%s: failed cci to set rate: %d\n", __func__,
> +		       ret);
> +		mtk_cci_set_voltage(cci_df, old_voltage);
> +		return ret;
> +	}
> +
> +	// scale down: set freq first then voltage
> +	if (opp_voltage < old_voltage) {
> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
> +		if (ret) {
> +			pr_err("cci: failed to scale down voltage\n");
> +			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
> +			return ret;
> +		}
> +	}


I recommend that dev_pm_opp_set_rate() and dev_pm_opp_set_regulator()
instead of 'clk_set_rate' and 'regulator_set_voltage'.
In the dev_pm_opp_set_rate(), handle the these sequence.
You can refer the merged patch[1].

[1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
- PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()


> +
> +	cci_df->old_freq = *freq;
> +
> +	return 0;
> +}
> +
> +static struct devfreq_dev_profile cci_devfreq_profile = {
> +	.target = mtk_cci_devfreq_target,

Need to add '.exit' for calling dev_pm_opp_of_remove_table().
You can refer the merged devfreq patches like exynos_bus.c, imx-bus.c.

> +};
> +
> +static int mtk_cci_devfreq_probe(struct platform_device *pdev)
> +{
> +	struct device *cci_dev = &pdev->dev;
> +	struct cci_devfreq *cci_df;
> +	struct devfreq_passive_data *passive_data;
> +	int ret;
> +
> +	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
> +	if (!cci_df)
> +		return -ENOMEM;
> +
> +	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
> +	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
> +				ret);
> +		return ret;
> +	}
> +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
> +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
> +	if (ret) {
> +		if (ret != -EPROBE_DEFER)
> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
> +				ret);
> +		return ret;
> +	}

I recommend that use dev_pm_opp_set_regulators.
You can refer the merged patch[1].

[1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
- PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()


> +	ret = regulator_enable(cci_df->proc_reg);
> +	if (ret) {
> +		pr_warn("enable buck for cci fail\n");

Use dev_err instead of 'pr_warn'.

> +		return ret;
> +	}
> +
> +	ret = dev_pm_opp_of_add_table(cci_dev);
> +	if (ret) {
> +		dev_err(cci_dev, "Fail to init CCI OPP table: %d\n", ret);

How about changing the error log as following
because in this driver, use the 'failed to' sentence for error handling?

	failed to get OPP table for CCI:L %d

> +		return ret;
> +	}
> +
> +	platform_set_drvdata(pdev, cci_df);
> +
> +	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
> +	if (!passive_data)
> +		return -ENOMEM;

On this error case, you have to call dev_pm_opp_of_remove_table().
You better to make the 'err_opp' jump lable and then add 'goto err_opp'.

> +
> +	passive_data->parent_type = CPUFREQ_PARENT_DEV;
> +
> +	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
> +						  &cci_devfreq_profile,
> +						  DEVFREQ_GOV_PASSIVE,
> +						  passive_data);
> +	if (IS_ERR(cci_df->devfreq)) {
> +		ret = PTR_ERR(cci_df->devfreq);
> +		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
> +		dev_pm_opp_of_remove_table(cci_dev);

Instead of direct call, use 'goto err_opp'.

> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int mtk_cci_devfreq_remove(struct platform_device *pdev)
> +{
> +	struct device *cci_dev = &pdev->dev;
> +	struct cci_devfreq *cci_df;
> +	struct notifier_block *opp_nb;
> +
> +	cci_df = platform_get_drvdata(pdev);
> +	opp_nb = &cci_df->opp_nb;
> +
> +	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);

This patch doesn't call the dev_pm_opp_register_notifier.
Please remove it.

> +	devm_devfreq_remove_device(cci_dev, cci_df->devfreq);

Don't need to call this function because you used devm_devfreq_add_device().

> +	dev_pm_opp_of_remove_table(cci_dev)> +	regulator_disable(cci_df->proc_reg);
> +
> +	return 0;
> +}
> +
> +static const __maybe_unused struct of_device_id
> +	mediatek_cci_devfreq_of_match[] = {

Make it on one line and remove '__maybe_unused' keyword.
- mediatek_cci_devfreq_of_match-> mediatek_cci_of_match

> +	{ .compatible = "mediatek,mt8183-cci" },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(of, mediatek_cci_devfreq_of_match);

ditto.

> +
> +static struct platform_driver cci_devfreq_driver = {
> +	.probe	= mtk_cci_devfreq_probe,
> +	.remove	= mtk_cci_devfreq_remove,
> +	.driver = {
> +		.name = "mediatek-cci-devfreq",
> +		.of_match_table = of_match_ptr(mediatek_cci_devfreq_of_match),

ditto.

> +	},
> +};
> +
> +static int __init mtk_cci_devfreq_init(void)
> +{
> +	return platform_driver_register(&cci_devfreq_driver);
> +}
> +module_init(mtk_cci_devfreq_init)
> +
> +static void __exit mtk_cci_devfreq_exit(void)
> +{
> +	platform_driver_unregister(&cci_devfreq_driver);
> +}
> +module_exit(mtk_cci_devfreq_exit)

Use 'module_platform_driver' instead of module_init and module_exit.

> +
> +MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
> +MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
> +MODULE_LICENSE("GPL v2");
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq
  2020-05-20  3:43   ` [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
@ 2020-05-28  7:42     ` Chanwoo Choi
  2020-06-17 12:05       ` andrew-sh.cheng
  0 siblings, 1 reply; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-28  7:42 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, linux-arm-kernel

Hi,

On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> This adds dt-binding documentation of cci devfreq
> for Mediatek MT8183 SoC platform.
> 
> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> ---
>  .../devicetree/bindings/devfreq/mt8183-cci.yaml    | 51 ++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> new file mode 100644
> index 000000000000..a7341fd94097
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> @@ -0,0 +1,51 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: https://protect2.fireeye.com/url?k=33f1f15d-6e23ea05-33f07a12-0cc47a31c8b4-91b3f8aeecce95dc&q=1&u=http%3A%2F%2Fdevicetree.org%2Fschemas%2Fdevfreq%2Fmt8183-cci.yaml%23
> +$schema: https://protect2.fireeye.com/url?k=fc7d9089-a1af8bd1-fc7c1bc6-0cc47a31c8b4-b46f5afc59faf86d&q=1&u=http%3A%2F%2Fdevicetree.org%2Fmeta-schemas%2Fcore.yaml%23
> +
> +title: CCI_DEVFREQ driver for MT8183.
> +
> +maintainers:
> +  - Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> +
> +description: |
> +  This module is used to create CCI DEVFREQ.
> +  The performance will depend on both CCI frequency and CPU frequency.
> +  For MT8183, CCI co-buck with Little core.
> +  Contain CCI opp table for voltage and frequency scaling.
> +
> +properties:
> +  compatible:
> +    const: "mediatek,mt8183-cci"
> +
> +  clocks:
> +    maxItems: 1
> +
> +  clock-names:
> +    const: "cci"
> +
> +  operating-points-v2: true
> +  opp-table: true
> +
> +  proc-supply:
> +    description:
> +      Phandle of the regulator that provides the supply voltage.
> +
> +required:
> +  - compatible
> +  - clocks
> +  - clock-names
> +  - proc-supply
> +
> +examples:
> +  - |
> +    #include <dt-bindings/clock/mt8183-clk.h>
> +    cci: cci {
> +      compatible = "mediatek,mt8183-cci";
> +      clocks = <&apmixedsys CLK_APMIXED_CCIPLL>;
> +      clock-names = "cci";
> +      operating-points-v2 = <&cci_opp>;
> +      proc-supply = <&mt6358_vproc12_reg>;
> +    };
> +
> 

I recommend that add the more detailed example
with OPP table with CPU node.


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/12] devfreq: add mediatek cci devfreq
  2020-05-28  7:35     ` Chanwoo Choi
@ 2020-05-28  8:00       ` Chanwoo Choi
  0 siblings, 0 replies; 35+ messages in thread
From: Chanwoo Choi @ 2020-05-28  8:00 UTC (permalink / raw)
  To: Andrew-sh.Cheng, MyungJoo Ham, Kyungmin Park, Rob Herring,
	Mark Rutland, Matthias Brugger, Rafael J . Wysocki, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Liam Girdwood, Mark Brown
  Cc: devicetree, srv_heupstream, linux-pm, linux-kernel,
	linux-mediatek, linux-arm-kernel

Hi Andrew-sh.Cheng,

On 5/28/20 4:35 PM, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
>> This adds a devfreq driver for the Cache Coherent Interconnect (CCI)
>> of the Mediatek MT8183.
>>
>> On the MT8183 the CCI is supplied by the same regulator as the LITTLE
>> cores. The driver is notified when the regulator voltage changes
>> (driven by cpufreq) and adjusts the CCI frequency to the maximum
>> possible value.

I understood that the mt8183-cci.c and mt8183 cpufreq driver (ARM_MEDIATEK_CPUFREQ)
shared the same regulator. So, when mt8183 cpufreq driver
changes the CPU frequency and voltage, the mt8183-cci.c
changes the CCI frequency according to the new mt8183 frequency
with passive governor. 

I think that mt8183-cci.c don't need to change the voltage
because already mt8183 cpufreq changed the voltage of shared regulator.
Why do you change the voltage in this driver?

>>
>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>> ---
>>  drivers/devfreq/Kconfig              |  10 ++
>>  drivers/devfreq/Makefile             |   1 +
>>  drivers/devfreq/mt8183-cci-devfreq.c | 206 +++++++++++++++++++++++++++++++++++
> 
> The mt8183-cci.c is enough for driver name.
> 
>>  3 files changed, 217 insertions(+)
>>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
>>
>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>> index d9067950af6a..4ed7116271ee 100644
>> --- a/drivers/devfreq/Kconfig
>> +++ b/drivers/devfreq/Kconfig
>> @@ -103,6 +103,16 @@ config ARM_IMX8M_DDRC_DEVFREQ
>>  	  This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows
>>  	  adjusting DRAM frequency.
>>  
>> +config ARM_MT8183_CCI_DEVFREQ
>> +	tristate "MT8183 CCI DEVFREQ Driver"
>> +	depends on ARM_MEDIATEK_CPUFREQ
>> +	help
>> +		This adds a devfreq driver for Cache Coherent Interconnect
>> +		of Mediatek MT8183, which is shared the same regulator
>> +		with cpu cluster.
>> +		It can track buck voltage and update a proper cci frequency.
> 
> s/cci/CCI
> 
>> +		Use notification to get regulator status.
>> +
>>  config ARM_TEGRA_DEVFREQ
>>  	tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver"
>>  	depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \
>> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
>> index 3eb4d5e6635c..5b1b670c954d 100644
>> --- a/drivers/devfreq/Makefile
>> +++ b/drivers/devfreq/Makefile
>> @@ -10,6 +10,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
>>  # DEVFREQ Drivers
>>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
>>  obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ)	+= imx8m-ddrc.o
>> +obj-$(CONFIG_ARM_MT8183_CCI_DEVFREQ)	+= mt8183-cci-devfreq.o
>>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
>>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
>>  obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)	+= tegra20-devfreq.o
>> diff --git a/drivers/devfreq/mt8183-cci-devfreq.c b/drivers/devfreq/mt8183-cci-devfreq.c
>> new file mode 100644
>> index 000000000000..cd7929a83bf8
>> --- /dev/null
>> +++ b/drivers/devfreq/mt8183-cci-devfreq.c
>> @@ -0,0 +1,206 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2019 MediaTek Inc.
> 
> s/2019/2020
> 
>> +
>> + * Author: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>> + */
>> +
>> +#include <linux/clk.h>
>> +#include <linux/devfreq.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/regulator/consumer.h>
>> +#include <linux/time.h>
>> +
>> +#include "governor.h"
> 
> It is not needed. Please remove it.
> 
>> +
>> +#define MAX_VOLT_LIMIT		(1150000)
>> +
>> +struct cci_devfreq {
>> +	struct devfreq *devfreq;
>> +	struct regulator *proc_reg;
> 
> 'proc' means the 'processor'?
> Instead of 'proc_reg', you better to use 'cpu_reg'.
> 
>> +	struct clk *cci_clk;
>> +	int old_vproc;
>> +	unsigned long old_freq;
>> +};
>> +
>> +static int mtk_cci_set_voltage(struct cci_devfreq *cci_df, int vproc)
>> +{
>> +	int ret;
>> +
>> +	ret = regulator_set_voltage(cci_df->proc_reg, vproc,
>> +				    MAX_VOLT_LIMIT);
>> +	if (!ret)
>> +		cci_df->old_vproc = vproc;
>> +	return ret;
>> +}
>> +
>> +static int mtk_cci_devfreq_target(struct device *dev, unsigned long *freq,
>> +				  u32 flags)
>> +{
>> +	int ret;
>> +	struct cci_devfreq *cci_df = dev_get_drvdata(dev);
>> +	struct dev_pm_opp *opp;
>> +	unsigned long opp_rate, opp_voltage, old_voltage;
>> +
>> +	if (!cci_df)
>> +		return -EINVAL;
>> +
>> +	if (cci_df->old_freq == *freq)
>> +		return 0;
>> +
>> +	opp_rate = *freq;
>> +	opp = dev_pm_opp_find_freq_floor(dev, &opp_rate);
>> +	opp_voltage = dev_pm_opp_get_voltage(opp);
>> +	dev_pm_opp_put(opp);
> 
> 
> You can use the helper function for getting the rate 
> with devfreq_recommended_opp(). You can refer the following code
> in drivers/devfreq/exynos-bus.c
> 
> 	opp = devfreq_recommended_opp(dev, freq, flags);
> 	if (IS_ERR(opp)) {
> 		dev_err(dev, "failed to get recommended opp instance\n");
> 		return PTR_ERR(opp);
> 	}
> 	dev_pm_opp_put(opp);
> 
>> +
>> +	old_voltage = cci_df->old_vproc;
>> +	if (old_voltage == 0)
>> +		old_voltage = regulator_get_voltage(cci_df->proc_reg);
>> +
>> +	// scale up: set voltage first then freq
>> +	if (opp_voltage > old_voltage) {
>> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
>> +		if (ret) {
>> +			pr_err("cci: failed to scale up voltage\n");
>> +			return ret;
>> +		}
>> +	}
>> +
>> +	ret = clk_set_rate(cci_df->cci_clk, *freq);
>> +	if (ret) {
>> +		pr_err("%s: failed cci to set rate: %d\n", __func__,
>> +		       ret);
>> +		mtk_cci_set_voltage(cci_df, old_voltage);
>> +		return ret;
>> +	}
>> +
>> +	// scale down: set freq first then voltage
>> +	if (opp_voltage < old_voltage) {
>> +		ret = mtk_cci_set_voltage(cci_df, opp_voltage);
>> +		if (ret) {
>> +			pr_err("cci: failed to scale down voltage\n");
>> +			clk_set_rate(cci_df->cci_clk, cci_df->old_freq);
>> +			return ret;
>> +		}
>> +	}
> 
> 
> I recommend that dev_pm_opp_set_rate() and dev_pm_opp_set_regulator()
> instead of 'clk_set_rate' and 'regulator_set_voltage'.
> In the dev_pm_opp_set_rate(), handle the these sequence.
> You can refer the merged patch[1].
> 
> [1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
> - PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()
> 
> 
>> +
>> +	cci_df->old_freq = *freq;
>> +
>> +	return 0;
>> +}
>> +
>> +static struct devfreq_dev_profile cci_devfreq_profile = {
>> +	.target = mtk_cci_devfreq_target,
> 
> Need to add '.exit' for calling dev_pm_opp_of_remove_table().
> You can refer the merged devfreq patches like exynos_bus.c, imx-bus.c.
> 
>> +};
>> +
>> +static int mtk_cci_devfreq_probe(struct platform_device *pdev)
>> +{
>> +	struct device *cci_dev = &pdev->dev;
>> +	struct cci_devfreq *cci_df;
>> +	struct devfreq_passive_data *passive_data;
>> +	int ret;
>> +
>> +	cci_df = devm_kzalloc(cci_dev, sizeof(*cci_df), GFP_KERNEL);
>> +	if (!cci_df)
>> +		return -ENOMEM;
>> +
>> +	cci_df->cci_clk = devm_clk_get(cci_dev, "cci_clock");
>> +	ret = PTR_ERR_OR_ZERO(cci_df->cci_clk);
>> +	if (ret) {
>> +		if (ret != -EPROBE_DEFER)
>> +			dev_err(cci_dev, "failed to get clock for CCI: %d\n",
>> +				ret);
>> +		return ret;
>> +	}
>> +	cci_df->proc_reg = devm_regulator_get_optional(cci_dev, "proc");
>> +	ret = PTR_ERR_OR_ZERO(cci_df->proc_reg);
>> +	if (ret) {
>> +		if (ret != -EPROBE_DEFER)
>> +			dev_err(cci_dev, "failed to get regulator for CCI: %d\n",
>> +				ret);
>> +		return ret;
>> +	}
> 
> I recommend that use dev_pm_opp_set_regulators.
> You can refer the merged patch[1].
> 
> [1] commit 4294a779bd8dff6c65e7e85ffe7a1ea236e92a68
> - PM / devfreq: exynos-bus: Convert to use dev_pm_opp_set_rate()
> 
> 
>> +	ret = regulator_enable(cci_df->proc_reg);
>> +	if (ret) {
>> +		pr_warn("enable buck for cci fail\n");
> 
> Use dev_err instead of 'pr_warn'.
> 
>> +		return ret;
>> +	}
>> +
>> +	ret = dev_pm_opp_of_add_table(cci_dev);
>> +	if (ret) {
>> +		dev_err(cci_dev, "Fail to init CCI OPP table: %d\n", ret);
> 
> How about changing the error log as following
> because in this driver, use the 'failed to' sentence for error handling?
> 
> 	failed to get OPP table for CCI:L %d
> 
>> +		return ret;
>> +	}
>> +
>> +	platform_set_drvdata(pdev, cci_df);
>> +
>> +	passive_data = devm_kzalloc(cci_dev, sizeof(*passive_data), GFP_KERNEL);
>> +	if (!passive_data)
>> +		return -ENOMEM;
> 
> On this error case, you have to call dev_pm_opp_of_remove_table().
> You better to make the 'err_opp' jump lable and then add 'goto err_opp'.
> 
>> +
>> +	passive_data->parent_type = CPUFREQ_PARENT_DEV;
>> +
>> +	cci_df->devfreq = devm_devfreq_add_device(cci_dev,
>> +						  &cci_devfreq_profile,
>> +						  DEVFREQ_GOV_PASSIVE,
>> +						  passive_data);
>> +	if (IS_ERR(cci_df->devfreq)) {
>> +		ret = PTR_ERR(cci_df->devfreq);
>> +		dev_err(cci_dev, "cannot create cci devfreq device:%d\n", ret);
>> +		dev_pm_opp_of_remove_table(cci_dev);
> 
> Instead of direct call, use 'goto err_opp'.
> 
>> +		return ret;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int mtk_cci_devfreq_remove(struct platform_device *pdev)
>> +{
>> +	struct device *cci_dev = &pdev->dev;
>> +	struct cci_devfreq *cci_df;
>> +	struct notifier_block *opp_nb;
>> +
>> +	cci_df = platform_get_drvdata(pdev);
>> +	opp_nb = &cci_df->opp_nb;
>> +
>> +	dev_pm_opp_unregister_notifier(cci_dev, opp_nb);
> 
> This patch doesn't call the dev_pm_opp_register_notifier.
> Please remove it.
> 
>> +	devm_devfreq_remove_device(cci_dev, cci_df->devfreq);
> 
> Don't need to call this function because you used devm_devfreq_add_device().
> 
>> +	dev_pm_opp_of_remove_table(cci_dev)> +	regulator_disable(cci_df->proc_reg);
>> +
>> +	return 0;
>> +}
>> +
>> +static const __maybe_unused struct of_device_id
>> +	mediatek_cci_devfreq_of_match[] = {
> 
> Make it on one line and remove '__maybe_unused' keyword.
> - mediatek_cci_devfreq_of_match-> mediatek_cci_of_match
> 
>> +	{ .compatible = "mediatek,mt8183-cci" },
>> +	{ },
>> +};
>> +MODULE_DEVICE_TABLE(of, mediatek_cci_devfreq_of_match);
> 
> ditto.
> 
>> +
>> +static struct platform_driver cci_devfreq_driver = {
>> +	.probe	= mtk_cci_devfreq_probe,
>> +	.remove	= mtk_cci_devfreq_remove,
>> +	.driver = {
>> +		.name = "mediatek-cci-devfreq",
>> +		.of_match_table = of_match_ptr(mediatek_cci_devfreq_of_match),
> 
> ditto.
> 
>> +	},
>> +};
>> +
>> +static int __init mtk_cci_devfreq_init(void)
>> +{
>> +	return platform_driver_register(&cci_devfreq_driver);
>> +}
>> +module_init(mtk_cci_devfreq_init)
>> +
>> +static void __exit mtk_cci_devfreq_exit(void)
>> +{
>> +	platform_driver_unregister(&cci_devfreq_driver);
>> +}
>> +module_exit(mtk_cci_devfreq_exit)
> 
> Use 'module_platform_driver' instead of module_init and module_exit.
> 
>> +
>> +MODULE_DESCRIPTION("Mediatek CCI devfreq driver");
>> +MODULE_AUTHOR("Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>");
>> +MODULE_LICENSE("GPL v2");
>>
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-28  6:14     ` Chanwoo Choi
  2020-05-28  7:17       ` Chanwoo Choi
@ 2020-06-02 11:43       ` andrew-sh.cheng
  2020-06-03  4:07         ` Chanwoo Choi
  1 sibling, 1 reply; 35+ messages in thread
From: andrew-sh.cheng @ 2020-06-02 11:43 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, devicetree,
	Stephen Boyd, Viresh Kumar, Mark Brown, linux-pm,
	Rafael J . Wysocki, Liam Girdwood, Rob Herring, linux-kernel,
	Kyungmin Park, MyungJoo Ham, linux-mediatek, Sibi Sankar,
	Matthias Brugger, linux-arm-kernel

On Thu, 2020-05-28 at 15:14 +0900, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> Thanks for your posting. I like this approach absolutely.
> I think that it is necessary. When I developed the embedded product,
> I needed this feature always. 
> 
> I add the comments on below.
> 
> 
> And the following email is not valid. So, I dropped this email
> from Cc list.
> Saravana Kannan <skannan@codeaurora.org>
> 
> 
> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> > From: Saravana Kannan <skannan@codeaurora.org>
> > 
> > Many CPU architectures have caches that can scale independent of the
> > CPUs. Frequency scaling of the caches is necessary to make sure that the
> > cache is not a performance bottleneck that leads to poor performance and
> > power. The same idea applies for RAM/DDR.
> > 
> > To achieve this, this patch adds support for cpu based scaling to the
> > passive governor. This is accomplished by taking the current frequency
> > of each CPU frequency domain and then adjust the frequency of the cache
> > (or any devfreq device) based on the frequency of the CPUs. It listens
> > to CPU frequency transition notifiers to keep itself up to date on the
> > current CPU frequency.
> > 
> > To decide the frequency of the device, the governor does one of the
> > following:
> > * Derives the optimal devfreq device opp from required-opps property of
> >   the parent cpu opp_table.
> > 
> > * Scales the device frequency in proportion to the CPU frequency. So, if
> >   the CPUs are running at their max frequency, the device runs at its
> >   max frequency. If the CPUs are running at their min frequency, the
> >   device runs at its min frequency. It is interpolated for frequencies
> >   in between.
> > 
> > Andrew-sh.Cheng change
> > dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> > to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> > for kernel-5.7
> > 
> > Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> > [Sibi: Integrated cpu-freqmap governor into passive_governor]
> > Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  drivers/devfreq/Kconfig            |   2 +
> >  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
> >  include/linux/devfreq.h            |  40 +++++-
> >  3 files changed, 299 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> > index 0b1df12e0f21..d9067950af6a 100644
> > --- a/drivers/devfreq/Kconfig
> > +++ b/drivers/devfreq/Kconfig
> > @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
> >  	  device. This governor does not change the frequency by itself
> >  	  through sysfs entries. The passive governor recommends that
> >  	  devfreq device uses the OPP table to get the frequency/voltage.
> > +	  Alternatively the governor can also be chosen to scale based on
> > +	  the online CPUs current frequency.
> >  
> >  comment "DEVFREQ Drivers"
> >  
> > diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> > index 2d67d6c12dce..7dcda02a5bb7 100644
> > --- a/drivers/devfreq/governor_passive.c
> > +++ b/drivers/devfreq/governor_passive.c
> > @@ -8,11 +8,89 @@
> >   */
> >  
> >  #include <linux/module.h>
> > +#include <linux/cpu.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/cpumask.h>
> >  #include <linux/device.h>
> >  #include <linux/devfreq.h>
> > +#include <linux/slab.h>
> >  #include "governor.h"
> >  
> > -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> > +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> 
> Need to change 'unsigned int' to 'unsigned long'
Get it.
> .
> 
> > +					     unsigned int cpu)
> > +{
> > +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
> 
> Better to define them separately as following and then need to rename
> the variable. Usually, use the 'min_freq' and 'max_freq' word for
> the minimum/maximum frequency.
> 
> 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
> 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
> 
> The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
> and 'unsigned int'. You need to handle them properly.
Get it.
For cpu_freq, I separate it into "unsigned long cpu_curr_freq" and
"unsigned int cpu_curr_freq_khz"
> 
> 
> > +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> > +	struct devfreq *devfreq = (struct devfreq *)data->this;
> > +	unsigned long *freq_table = devfreq->profile->freq_table;
> 
> In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
> So, I think 'dev_freq_table' is proper name instead of 'freq_table'
> for the readability.
> 
> 	freq_table -> dev_freq_table
> 
> > +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
> 
> In the get_target_freq_with_devfreq(), use 'p_opp' indicating
> the OPP of parent device. For the consistency, I think that
> use 'p_opp' instead of 'cpu_opp'. 
> 
> > +	unsigned long cpu_freq, freq;
> 
> Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
> 	cpu_freq -> cpu_curr_freq.
Get it.
Will modify them for readability.
> 
> > +
> > +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> > +	    !cpu_state->opp_table || !devfreq->opp_table)
> > +		return 0;
> > +
> > +	cpu_freq = cpu_state->freq * 1000;
> > +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> > +	if (IS_ERR(cpu_opp))
> > +		return 0;
> > +
> > +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> > +					    devfreq->opp_table, cpu_opp);
> > +	dev_pm_opp_put(cpu_opp);
> > +
> > +	if (!IS_ERR(opp)) {
> > +		freq = dev_pm_opp_get_freq(opp);
> > +		dev_pm_opp_put(opp);
> 
> Better to add the 'out' goto statement.
> If you use 'goto out', you can reduce the one indentation
> without 'else' statement.
Get it.
> 	
> 
> > +	} else {
> 
> As I commented, when dev_pm_opp_xlate_required_opp() return successfully
> , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
> 
> 
> > +		/* Use Interpolation if required opps is not available */
> > +		cpu_min = cpu_state->min_freq;
> > +		cpu_max = cpu_state->max_freq;
> > +		cpu_freq = cpu_state->freq;
> > +
> > +		if (freq_table) {
> > +			/* Get minimum frequency according to sorting order */
> > +			max_state = freq_table[devfreq->profile->max_state - 1];
> > +			if (freq_table[0] < max_state) {
> > +				dev_min = freq_table[0];
> > +				dev_max = max_state;
> > +			} else {
> > +				dev_min = max_state;
> > +				dev_max = freq_table[0];
> > +			}
> > +		} else {
> > +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> > +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> > +				return 0;
> > +			dev_min =
> > +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> > +			dev_max =
> > +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
> 
> I think it is not proper to access the variable of pm_qos structure directly.
> Instead of direct access, you have to use the exported PM QoS function such as
> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
Get it.
> 
> > +		}
> > +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> > +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> > +	}
> 
> 
> I think that you better to add 'out' jump label as following:
> 
> out:
> 
> > +
> > +	return freq;
> > +}
> > +
> > +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> > +					unsigned long *freq)
> > +{
> > +	struct devfreq_passive_data *p_data =
> > +				(struct devfreq_passive_data *)devfreq->data;
> > +	unsigned int cpu, target_freq = 0;
> 
> Need to define 'target_freq' with 'unsigned long' type.
Get it.
> 
> > +
> > +	for_each_online_cpu(cpu)
> > +		target_freq = max(target_freq,
> > +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> > +
> > +	*freq = target_freq;
> > +
> > +	return 0;
> > +}
> > +
> > +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
> >  					unsigned long *freq)
> >  {
> >  	struct devfreq_passive_data *p_data
> > @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >  	int i, count, ret = 0;
> >  
> >  	/*
> > -	 * If the devfreq device with passive governor has the specific method
> > -	 * to determine the next frequency, should use the get_target_freq()
> > -	 * of struct devfreq_passive_data.
> > -	 */
> > -	if (p_data->get_target_freq) {
> > -		ret = p_data->get_target_freq(devfreq, freq);
> > -		goto out;
> > -	}
> > -
> > -	/*
> >  	 * If the parent and passive devfreq device uses the OPP table,
> >  	 * get the next frequency by using the OPP table.
> >  	 */
> > @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >  	return ret;
> >  }
> >  
> > +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> > +					   unsigned long *freq)
> > +{
> > +	struct devfreq_passive_data *p_data =
> > +				(struct devfreq_passive_data *)devfreq->data;
> > +	int ret;
> > +
> > +	/*
> > +	 * If the devfreq device with passive governor has the specific method
> > +	 * to determine the next frequency, should use the get_target_freq()
> > +	 * of struct devfreq_passive_data.
> > +	 */
> > +	if (p_data->get_target_freq)
> > +		return p_data->get_target_freq(devfreq, freq);
> > +
> > +	switch (p_data->parent_type) {
> > +	case DEVFREQ_PARENT_DEV:
> > +		ret = get_target_freq_with_devfreq(devfreq, freq);
> > +		break;
> > +	case CPUFREQ_PARENT_DEV:
> > +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> > +		break;
> > +	default:
> > +		ret = -EINVAL;
> > +		dev_err(&devfreq->dev, "Invalid parent type\n");
> > +		break;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> >  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
> >  {
> >  	int ret;
> > @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >  	return NOTIFY_DONE;
> >  }
> >  
> > +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> > +					 unsigned long event, void *ptr)
> > +{
> > +	struct devfreq_passive_data *data =
> > +			container_of(nb, struct devfreq_passive_data, nb);
> > +	struct devfreq *devfreq = (struct devfreq *)data->this;
> > +	struct devfreq_cpu_state *cpu_state;
> > +	struct cpufreq_freqs *freq = ptr;
> 
> How about changing 'freq' to 'cpu_freqs'?
> 
> In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
> the instance of 'struct cpufreq_freqs'. And in order to
> identfy, how about adding 'cpu_' prefix for variable name?
> 
> > +	unsigned int current_freq;
> 
> Need to define curr_freq with 'unsigned long' type
> and better to use 'curr_freq' variable name.
It is good to change current_freq to curr_freq, but why should it us
'unsigned long'?
I think it is 'unsigned int'.
> 
> > +	int ret;
> > +
> > +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> > +	    !data->cpu_state[freq->policy->cpu])
> > +		return 0;
> > +
> > +	cpu_state = data->cpu_state[freq->policy->cpu];
> > +	if (cpu_state->freq == freq->new)
> > +		return 0;
> > +
> > +	/* Backup current freq and pre-update cpu state freq*/
> > +	current_freq = cpu_state->freq;
> > +	cpu_state->freq = freq->new;
> > +
> > +	mutex_lock(&devfreq->lock);
> > +	ret = update_devfreq(devfreq);
> > +	mutex_unlock(&devfreq->lock);
> > +	if (ret) {
> > +		cpu_state->freq = current_freq;
> > +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> > +		return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> > +{
> > +	struct devfreq_passive_data *data = *p_data;
> > +	struct devfreq *devfreq = (struct devfreq *)data->this;
> > +	struct device *dev = devfreq->dev.parent;
> > +	struct opp_table *opp_table = NULL;
> > +	struct devfreq_cpu_state *state;
> 
> For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
Get it.
> 
> > +	struct cpufreq_policy *policy;
> > +	struct device *cpu_dev;
> > +	unsigned int cpu;
> > +	int ret;
> > +
> > +	get_online_cpus();
> 
> Add blank line.
Get it.
> 
> > +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> > +	ret = cpufreq_register_notifier(&data->nb,
> > +					CPUFREQ_TRANSITION_NOTIFIER);
> > +	if (ret) {
> > +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> > +		data->nb.notifier_call = NULL;
> > +		goto out;
> > +	}
> > +
> > +	/* Populate devfreq_cpu_state */
> > +	for_each_online_cpu(cpu) {
> > +		if (data->cpu_state[cpu])
> > +			continue;
> > +
> > +		policy = cpufreq_cpu_get(cpu);
> 
> cpufreq_cpu_get() might return 'NULL'. I think you need to handle
> return value as following:
> 
> 		if (!policy) {
> 			ret = -EINVAL;
> 			goto out;
> 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> 			goto out;
> 		} else if (IS_ERR(policy) {
> 			ret = PTR_ERR(policy);
> 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> 			goto out;
> 		}
> 
> If cpufreq_cpu_get() return successfully, to do next.
> It reduces the one indentaion.
> 
> 
Get it.
> 
> > +		if (policy) {
> > +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> > +			if (!state) {
> > +				ret = -ENOMEM;
> > +				goto out;
> > +			}
> > +
> > +			cpu_dev = get_cpu_device(cpu);
> > +			if (!cpu_dev) {
> > +				dev_err(dev, "Couldn't get cpu device.\n");
> > +				ret = -ENODEV;
> > +				goto out;
> > +			}
> > +
> > +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> > +			if (IS_ERR(devfreq->opp_table)) {
> > +				ret = PTR_ERR(opp_table);
> > +				goto out;
> > +			}
> > +
> > +			state->dev = cpu_dev;
> > +			state->opp_table = opp_table;
> > +			state->first_cpu = cpumask_first(policy->related_cpus);
> > +			state->freq = policy->cur;
> > +			state->min_freq = policy->cpuinfo.min_freq;
> > +			state->max_freq = policy->cpuinfo.max_freq;
> > +			data->cpu_state[cpu] = state;
> 
> Add blank line.
> 
> > +			cpufreq_cpu_put(policy);
> > +		} else {
> > +			ret = -EPROBE_DEFER;
> > +			goto out;
> > +		}
> > +	}
> 
> Add blank line.
Get it.
> > +out:
> > +	put_online_cpus();
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Update devfreq */
> > +	mutex_lock(&devfreq->lock);
> > +	ret = update_devfreq(devfreq);
> > +	mutex_unlock(&devfreq->lock);
> > +	if (ret)
> > +		dev_err(dev, "Couldn't update the frequency.\n");
> > +
> > +	return ret;
> > +}
> > +
> > +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> > +{
> > +	struct devfreq_passive_data *data = *p_data;
> > +	struct devfreq_cpu_state *cpu_state;
> > +	int cpu;
> > +
> > +	if (data->nb.notifier_call)
> > +		cpufreq_unregister_notifier(&data->nb,
> > +					    CPUFREQ_TRANSITION_NOTIFIER);
> > +
> > +	for_each_possible_cpu(cpu) {
> > +		cpu_state = data->cpu_state[cpu];
> > +		if (cpu_state) {
> > +			if (cpu_state->opp_table)
> > +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> > +			kfree(cpu_state);
> > +			cpu_state = NULL;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >  				unsigned int event, void *data)
> >  {
> > @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >  	struct notifier_block *nb = &p_data->nb;
> >  	int ret = 0;
> >  
> > -	if (!parent)
> > +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
> >  		return -EPROBE_DEFER;
> 
> If you modify the devfreq_passive_event_handler() as following,
> you can move this condition for DEVFREQ_PARENT_DEV into 
> (register|unregister)_parent_dev_notifier.
> 
> 	switch (event) {                                                                                  
> 	case DEVFREQ_GOV_START:                                               
> 		ret = register_parent_dev_notifier(p_data);
> 		break;
> 	case DEVFREQ_GOV_STOP:                                             
> 		ret = unregister_parent_dev_notifier(p_data);
> 		break;
> 	default: 
> 		ret = -EINVAL;
> 		break;
> 	}
>                                                                                               
> 	return ret;
> 
Get it.
> >  
> >  	switch (event) {
> > @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >  		if (!p_data->this)
> >  			p_data->this = devfreq;
> >  
> > -		nb->notifier_call = devfreq_passive_notifier_call;
> > -		ret = devfreq_register_notifier(parent, nb,
> > -					DEVFREQ_TRANSITION_NOTIFIER);
> > +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> > +			nb->notifier_call = devfreq_passive_notifier_call;
> > +			ret = devfreq_register_notifier(parent, nb,
> > +						DEVFREQ_TRANSITION_NOTIFIER);
> > +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> > +			ret = cpufreq_passive_register(&p_data);
> 
> I think that we better to collect the code related to notifier registration
> into one function like devfreq_pass_register_notifier() instead of
> cpufreq_passive_register() as following: I think it is more simple and readable.
> 
> If you have more proper function name of register_parent_dev_notifier,
> please give your opinion.
> 
> 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> 		switch (p_data->parent_type) {
> 		case DEVFREQ_PARENT_DEV:
> 			nb->notifier_call = devfreq_passive_notifier_call;
> 			ret = devfreq_register_notifier(parent, nb,
> 			break;
> 		case CPUFREQ_PARENT_DEV:
> 			cpufreq_register_notifier(...)
> 			...
> 			break;
> 		}
Not fully understanding.
Do you mean expanding cpufreq_passive_register()?
I think leave it in function will be with clean for this code segment.

> 		
> 
> > +		} else {
> > +			ret = -EINVAL;
> > +		}
> >  		break;
> >  	case DEVFREQ_GOV_STOP:
> > -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> > -					DEVFREQ_TRANSITION_NOTIFIER));
> > +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> > +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> > +						DEVFREQ_TRANSITION_NOTIFIER));
> > +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> > +			cpufreq_passive_unregister(&p_data);
> > +		else
> > +			ret = -EINVAL;
> 
> ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
Get it.
> 
> >  		break;
> >  	default:
> >  		break;
> > diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> > index a4b19d593151..04ce576fd6f1 100644
> > --- a/include/linux/devfreq.h
> > +++ b/include/linux/devfreq.h
> > @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
> >  
> >  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
> >  /**
> > + * struct devfreq_cpu_state - holds the per-cpu state
> > + * @freq:	the current frequency of the cpu.
> > + * @min_freq:	the min frequency of the cpu.
> > + * @max_freq:	the max frequency of the cpu.
> > + * @first_cpu:	the cpumask of the first cpu of a policy.
> > + * @dev:	reference to cpu device.
> > + * @opp_table:	reference to cpu opp table.
> > + *
> > + * This structure stores the required cpu_state of a cpu.
> > + * This is auto-populated by the governor.
> > + */
> > +struct devfreq_cpu_state {> +	unsigned int freq;
> 
> It is better to change from 'freq' to 'curr_freq'
> for more correct expression.
Get it.
> 
> > +	unsigned int min_freq;
> > +	unsigned int max_freq;
> > +	unsigned int first_cpu;
> > +	struct device *dev;
> 
> How about changing the name 'dev' to 'cpu_dev'?
Okay.
> 
> 
> > +	struct opp_table *opp_table;
> > +};
> 
> devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
> 
> So, you can move it into drivers/devfreq/governor_passive.c
> and just add the definition into include/linux/devfreq.h as following:
> It is able to prevent the access of variable of 'struct devfreq_cpu_state'
> outside.
> 
> 	struct devfreq_cpu_state;
Get it.
> 
> > +
> > +enum devfreq_parent_dev_type {
> > +	DEVFREQ_PARENT_DEV,
> > +	CPUFREQ_PARENT_DEV,
> > +};
> > +
> > +/**
> >   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
> >   *	and devfreq_add_device
> >   * @parent:	the devfreq instance of parent device.
> > @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
> >   *			using governors except for passive governor.
> >   *			If the devfreq device has the specific method to decide
> >   *			the next frequency, should use this callback.
> > - * @this:	the devfreq instance of own device.
> > - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> > + * @parent_type		parent type of the device
> 
> Need to add ':' at the end of word. -> "parent_type:".
> 
> > + * @this:		the devfreq instance of own device.
> > + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> 
> I knew that you make them with same indentation.
> But, actually, it is not related to this patch like clean-up code.
> Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
Get it.
> 
> > + * @cpu_state:		the state min/max/current frequency of all online cpu's
> >   *
> >   * The devfreq_passive_data have to set the devfreq instance of parent
> >   * device with governors except for the passive governor. But, don't need to
> > - * initialize the 'this' and 'nb' field because the devfreq core will handle
> > - * them.
> > + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> > + * will handle them.
> >   */
> >  struct devfreq_passive_data {
> >  	/* Should set the devfreq instance of parent device */
> > @@ -303,9 +331,13 @@ struct devfreq_passive_data {
> >  	/* Optional callback to decide the next frequency of passvice device */
> >  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
> >  
> > +	/* Should set the type of parent device */
> > +	enum devfreq_parent_dev_type parent_type;
> > +
> >  	/* For passive governor's internal use. Don't need to set them */
> >  	struct devfreq *this;
> >  	struct notifier_block nb;
> > +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
> >  };
> >  #endif
> >  
> > 
> 
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-05-28  7:17       ` Chanwoo Choi
@ 2020-06-02 12:23         ` andrew-sh.cheng
  2020-06-03  4:12           ` Chanwoo Choi
  0 siblings, 1 reply; 35+ messages in thread
From: andrew-sh.cheng @ 2020-06-02 12:23 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, devicetree,
	Stephen Boyd, Viresh Kumar, Mark Brown, linux-pm,
	Rafael J . Wysocki, Liam Girdwood, Rob Herring, linux-kernel,
	Kyungmin Park, MyungJoo Ham, linux-mediatek, Sibi Sankar,
	Matthias Brugger, linux-arm-kernel

On Thu, 2020-05-28 at 16:17 +0900, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> The exynos-bus.c used the passive governor.
> Even if don't make the problem because DEVFREQ_PARENT_DEV is zero,
> you need to initialize the parent_type with DEVFREQ_PARENT_DEV as following:
> 
> diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
> index 8fa8eb541373..1c71c47bc2ac 100644
> --- a/drivers/devfreq/exynos-bus.c
> +++ b/drivers/devfreq/exynos-bus.c
> @@ -369,6 +369,7 @@ static int exynos_bus_profile_init_passive(struct exynos_bus *bus,
>                 return -ENOMEM;
>  
>         passive_data->parent = parent_devfreq;
> +       passive_data->parent_type = DEVFREQ_PARENT_DEV;
>  
>         /* Add devfreq device for exynos bus with passive governor */
>         bus->devfreq = devm_devfreq_add_device(dev, profile, DEVFREQ_GOV_PASSIVE,
Hi Chanwoo Choi,
Do you just remind me to initialize it to DEVFREQ_PARENT_DEV whn use
this governor?
I will do it and thank you for reminding.
> 
> 
> On 5/28/20 3:14 PM, Chanwoo Choi wrote:
> > Hi Andrew-sh.Cheng,
> > 
> > Thanks for your posting. I like this approach absolutely.
> > I think that it is necessary. When I developed the embedded product,
> > I needed this feature always. 
> > 
> > I add the comments on below.
> > 
> > 
> > And the following email is not valid. So, I dropped this email
> > from Cc list.
> > Saravana Kannan <skannan@codeaurora.org>
> > 
> > 
> > On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> >> From: Saravana Kannan <skannan@codeaurora.org>
> >>
> >> Many CPU architectures have caches that can scale independent of the
> >> CPUs. Frequency scaling of the caches is necessary to make sure that the
> >> cache is not a performance bottleneck that leads to poor performance and
> >> power. The same idea applies for RAM/DDR.
> >>
> >> To achieve this, this patch adds support for cpu based scaling to the
> >> passive governor. This is accomplished by taking the current frequency
> >> of each CPU frequency domain and then adjust the frequency of the cache
> >> (or any devfreq device) based on the frequency of the CPUs. It listens
> >> to CPU frequency transition notifiers to keep itself up to date on the
> >> current CPU frequency.
> >>
> >> To decide the frequency of the device, the governor does one of the
> >> following:
> >> * Derives the optimal devfreq device opp from required-opps property of
> >>   the parent cpu opp_table.
> >>
> >> * Scales the device frequency in proportion to the CPU frequency. So, if
> >>   the CPUs are running at their max frequency, the device runs at its
> >>   max frequency. If the CPUs are running at their min frequency, the
> >>   device runs at its min frequency. It is interpolated for frequencies
> >>   in between.
> >>
> >> Andrew-sh.Cheng change
> >> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> >> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> >> for kernel-5.7
> >>
> >> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> >> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> >> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> >> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> >> ---
> >>  drivers/devfreq/Kconfig            |   2 +
> >>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
> >>  include/linux/devfreq.h            |  40 +++++-
> >>  3 files changed, 299 insertions(+), 21 deletions(-)
> >>
> >> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> >> index 0b1df12e0f21..d9067950af6a 100644
> >> --- a/drivers/devfreq/Kconfig
> >> +++ b/drivers/devfreq/Kconfig
> >> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
> >>  	  device. This governor does not change the frequency by itself
> >>  	  through sysfs entries. The passive governor recommends that
> >>  	  devfreq device uses the OPP table to get the frequency/voltage.
> >> +	  Alternatively the governor can also be chosen to scale based on
> >> +	  the online CPUs current frequency.
> >>  
> >>  comment "DEVFREQ Drivers"
> >>  
> >> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> >> index 2d67d6c12dce..7dcda02a5bb7 100644
> >> --- a/drivers/devfreq/governor_passive.c
> >> +++ b/drivers/devfreq/governor_passive.c
> >> @@ -8,11 +8,89 @@
> >>   */
> >>  
> >>  #include <linux/module.h>
> >> +#include <linux/cpu.h>
> >> +#include <linux/cpufreq.h>
> >> +#include <linux/cpumask.h>
> >>  #include <linux/device.h>
> >>  #include <linux/devfreq.h>
> >> +#include <linux/slab.h>
> >>  #include "governor.h"
> >>  
> >> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> > 
> > Need to change 'unsigned int' to 'unsigned long'.
> > 
> >> +					     unsigned int cpu)
> >> +{
> >> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
> > 
> > Better to define them separately as following and then need to rename
> > the variable. Usually, use the 'min_freq' and 'max_freq' word for
> > the minimum/maximum frequency.
> > 
> > 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
> > 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
> > 
> > The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
> > and 'unsigned int'. You need to handle them properly.
> > 
> > 
> >> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> >> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >> +	unsigned long *freq_table = devfreq->profile->freq_table;
> > 
> > In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
> > So, I think 'dev_freq_table' is proper name instead of 'freq_table'
> > for the readability.
> > 
> > 	freq_table -> dev_freq_table
> > 
> >> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
> > 
> > In the get_target_freq_with_devfreq(), use 'p_opp' indicating
> > the OPP of parent device. For the consistency, I think that
> > use 'p_opp' instead of 'cpu_opp'. 
> > 
> >> +	unsigned long cpu_freq, freq;
> > 
> > Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
> > 	cpu_freq -> cpu_curr_freq.
> > 
> >> +
> >> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> >> +	    !cpu_state->opp_table || !devfreq->opp_table)
> >> +		return 0;
> >> +
> >> +	cpu_freq = cpu_state->freq * 1000;
> >> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> >> +	if (IS_ERR(cpu_opp))
> >> +		return 0;
> >> +
> >> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> >> +					    devfreq->opp_table, cpu_opp);
> >> +	dev_pm_opp_put(cpu_opp);
> >> +
> >> +	if (!IS_ERR(opp)) {
> >> +		freq = dev_pm_opp_get_freq(opp);
> >> +		dev_pm_opp_put(opp);
> > 
> > Better to add the 'out' goto statement.
> > If you use 'goto out', you can reduce the one indentation
> > without 'else' statement.
> > 	
> > 
> >> +	} else {
> > 
> > As I commented, when dev_pm_opp_xlate_required_opp() return successfully
> > , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
> > 
> > 
> >> +		/* Use Interpolation if required opps is not available */
> >> +		cpu_min = cpu_state->min_freq;
> >> +		cpu_max = cpu_state->max_freq;
> >> +		cpu_freq = cpu_state->freq;
> >> +
> >> +		if (freq_table) {
> >> +			/* Get minimum frequency according to sorting order */
> >> +			max_state = freq_table[devfreq->profile->max_state - 1];
> >> +			if (freq_table[0] < max_state) {
> >> +				dev_min = freq_table[0];
> >> +				dev_max = max_state;
> >> +			} else {
> >> +				dev_min = max_state;
> >> +				dev_max = freq_table[0];
> >> +			}
> >> +		} else {
> >> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> >> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> >> +				return 0;
> >> +			dev_min =
> >> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> >> +			dev_max =
> >> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
> > 
> > I think it is not proper to access the variable of pm_qos structure directly.
> > Instead of direct access, you have to use the exported PM QoS function such as
> > - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
> > - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
> > 
> >> +		}
> >> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> >> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> >> +	}
> > 
> > 
> > I think that you better to add 'out' jump label as following:
> > 
> > out:
> > 
> >> +
> >> +	return freq;
> >> +}
> >> +
> >> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> >> +					unsigned long *freq)
> >> +{
> >> +	struct devfreq_passive_data *p_data =
> >> +				(struct devfreq_passive_data *)devfreq->data;
> >> +	unsigned int cpu, target_freq = 0;
> > 
> > Need to define 'target_freq' with 'unsigned long' type.
> > 
> >> +
> >> +	for_each_online_cpu(cpu)
> >> +		target_freq = max(target_freq,
> >> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> >> +
> >> +	*freq = target_freq;
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
> >>  					unsigned long *freq)
> >>  {
> >>  	struct devfreq_passive_data *p_data
> >> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>  	int i, count, ret = 0;
> >>  
> >>  	/*
> >> -	 * If the devfreq device with passive governor has the specific method
> >> -	 * to determine the next frequency, should use the get_target_freq()
> >> -	 * of struct devfreq_passive_data.
> >> -	 */
> >> -	if (p_data->get_target_freq) {
> >> -		ret = p_data->get_target_freq(devfreq, freq);
> >> -		goto out;
> >> -	}
> >> -
> >> -	/*
> >>  	 * If the parent and passive devfreq device uses the OPP table,
> >>  	 * get the next frequency by using the OPP table.
> >>  	 */
> >> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>  	return ret;
> >>  }
> >>  
> >> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >> +					   unsigned long *freq)
> >> +{
> >> +	struct devfreq_passive_data *p_data =
> >> +				(struct devfreq_passive_data *)devfreq->data;
> >> +	int ret;
> >> +
> >> +	/*
> >> +	 * If the devfreq device with passive governor has the specific method
> >> +	 * to determine the next frequency, should use the get_target_freq()
> >> +	 * of struct devfreq_passive_data.
> >> +	 */
> >> +	if (p_data->get_target_freq)
> >> +		return p_data->get_target_freq(devfreq, freq);
> >> +
> >> +	switch (p_data->parent_type) {
> >> +	case DEVFREQ_PARENT_DEV:
> >> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> >> +		break;
> >> +	case CPUFREQ_PARENT_DEV:
> >> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> >> +		break;
> >> +	default:
> >> +		ret = -EINVAL;
> >> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> >> +		break;
> >> +	}
> >> +
> >> +	return ret;
> >> +}
> >> +
> >>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
> >>  {
> >>  	int ret;
> >> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >>  	return NOTIFY_DONE;
> >>  }
> >>  
> >> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> >> +					 unsigned long event, void *ptr)
> >> +{
> >> +	struct devfreq_passive_data *data =
> >> +			container_of(nb, struct devfreq_passive_data, nb);
> >> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >> +	struct devfreq_cpu_state *cpu_state;
> >> +	struct cpufreq_freqs *freq = ptr;
> > 
> > How about changing 'freq' to 'cpu_freqs'?
> > 
> > In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
> > the instance of 'struct cpufreq_freqs'. And in order to
> > identfy, how about adding 'cpu_' prefix for variable name?
> > 
> >> +	unsigned int current_freq;
> > 
> > Need to define curr_freq with 'unsigned long' type
> > and better to use 'curr_freq' variable name.
> > 
> >> +	int ret;
> >> +
> >> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> >> +	    !data->cpu_state[freq->policy->cpu])
> >> +		return 0;
> >> +
> >> +	cpu_state = data->cpu_state[freq->policy->cpu];
> >> +	if (cpu_state->freq == freq->new)
> >> +		return 0;
> >> +
> >> +	/* Backup current freq and pre-update cpu state freq*/
> >> +	current_freq = cpu_state->freq;
> >> +	cpu_state->freq = freq->new;
> >> +
> >> +	mutex_lock(&devfreq->lock);
> >> +	ret = update_devfreq(devfreq);
> >> +	mutex_unlock(&devfreq->lock);
> >> +	if (ret) {
> >> +		cpu_state->freq = current_freq;
> >> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> >> +		return ret;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> >> +{
> >> +	struct devfreq_passive_data *data = *p_data;
> >> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >> +	struct device *dev = devfreq->dev.parent;
> >> +	struct opp_table *opp_table = NULL;
> >> +	struct devfreq_cpu_state *state;
> > 
> > For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
> > 
> >> +	struct cpufreq_policy *policy;
> >> +	struct device *cpu_dev;
> >> +	unsigned int cpu;
> >> +	int ret;
> >> +
> >> +	get_online_cpus();
> > 
> > Add blank line.
> > 
> >> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> >> +	ret = cpufreq_register_notifier(&data->nb,
> >> +					CPUFREQ_TRANSITION_NOTIFIER);
> >> +	if (ret) {
> >> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> >> +		data->nb.notifier_call = NULL;
> >> +		goto out;
> >> +	}
> >> +
> >> +	/* Populate devfreq_cpu_state */
> >> +	for_each_online_cpu(cpu) {
> >> +		if (data->cpu_state[cpu])
> >> +			continue;
> >> +
> >> +		policy = cpufreq_cpu_get(cpu);
> > 
> > cpufreq_cpu_get() might return 'NULL'. I think you need to handle
> > return value as following:
> > 
> > 		if (!policy) {
> > 			ret = -EINVAL;
> > 			goto out;
> > 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> > 			goto out;
> > 		} else if (IS_ERR(policy) {
> > 			ret = PTR_ERR(policy);
> > 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> > 			goto out;
> > 		}
> > 
> > If cpufreq_cpu_get() return successfully, to do next.
> > It reduces the one indentaion.
> > 
> > 
> > 
> >> +		if (policy) {
> >> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> >> +			if (!state) {
> >> +				ret = -ENOMEM;
> >> +				goto out;
> >> +			}
> >> +
> >> +			cpu_dev = get_cpu_device(cpu);
> >> +			if (!cpu_dev) {
> >> +				dev_err(dev, "Couldn't get cpu device.\n");
> >> +				ret = -ENODEV;
> >> +				goto out;
> >> +			}
> >> +
> >> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> >> +			if (IS_ERR(devfreq->opp_table)) {
> >> +				ret = PTR_ERR(opp_table);
> >> +				goto out;
> >> +			}
> >> +
> >> +			state->dev = cpu_dev;
> >> +			state->opp_table = opp_table;
> >> +			state->first_cpu = cpumask_first(policy->related_cpus);
> >> +			state->freq = policy->cur;
> >> +			state->min_freq = policy->cpuinfo.min_freq;
> >> +			state->max_freq = policy->cpuinfo.max_freq;
> >> +			data->cpu_state[cpu] = state;
> > 
> > Add blank line.
> > 
> >> +			cpufreq_cpu_put(policy);
> >> +		} else {
> >> +			ret = -EPROBE_DEFER;
> >> +			goto out;
> >> +		}
> >> +	}
> > 
> > Add blank line.
> > 
> >> +out:
> >> +	put_online_cpus();
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	/* Update devfreq */
> >> +	mutex_lock(&devfreq->lock);
> >> +	ret = update_devfreq(devfreq);
> >> +	mutex_unlock(&devfreq->lock);
> >> +	if (ret)
> >> +		dev_err(dev, "Couldn't update the frequency.\n");
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> >> +{
> >> +	struct devfreq_passive_data *data = *p_data;
> >> +	struct devfreq_cpu_state *cpu_state;
> >> +	int cpu;
> >> +
> >> +	if (data->nb.notifier_call)
> >> +		cpufreq_unregister_notifier(&data->nb,
> >> +					    CPUFREQ_TRANSITION_NOTIFIER);
> >> +
> >> +	for_each_possible_cpu(cpu) {
> >> +		cpu_state = data->cpu_state[cpu];
> >> +		if (cpu_state) {
> >> +			if (cpu_state->opp_table)
> >> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> >> +			kfree(cpu_state);
> >> +			cpu_state = NULL;
> >> +		}
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>  				unsigned int event, void *data)
> >>  {
> >> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>  	struct notifier_block *nb = &p_data->nb;
> >>  	int ret = 0;
> >>  
> >> -	if (!parent)
> >> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
> >>  		return -EPROBE_DEFER;
> > 
> > If you modify the devfreq_passive_event_handler() as following,
> > you can move this condition for DEVFREQ_PARENT_DEV into 
> > (register|unregister)_parent_dev_notifier.
> > 
> > 	switch (event) {                                                                                  
> > 	case DEVFREQ_GOV_START:                                               
> > 		ret = register_parent_dev_notifier(p_data);
> > 		break;
> > 	case DEVFREQ_GOV_STOP:                                             
> > 		ret = unregister_parent_dev_notifier(p_data);
> > 		break;
> > 	default: 
> > 		ret = -EINVAL;
> > 		break;
> > 	}
> >                                                                                               
> > 	return ret;
> > 
> >>  
> >>  	switch (event) {
> >> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>  		if (!p_data->this)
> >>  			p_data->this = devfreq;
> >>  
> >> -		nb->notifier_call = devfreq_passive_notifier_call;
> >> -		ret = devfreq_register_notifier(parent, nb,
> >> -					DEVFREQ_TRANSITION_NOTIFIER);
> >> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> >> +			nb->notifier_call = devfreq_passive_notifier_call;
> >> +			ret = devfreq_register_notifier(parent, nb,
> >> +						DEVFREQ_TRANSITION_NOTIFIER);
> >> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> >> +			ret = cpufreq_passive_register(&p_data);
> > 
> > I think that we better to collect the code related to notifier registration
> > into one function like devfreq_pass_register_notifier() instead of
> > cpufreq_passive_register() as following: I think it is more simple and readable.
> > 
> > If you have more proper function name of register_parent_dev_notifier,
> > please give your opinion.
> > 
> > 
> > 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > 		switch (p_data->parent_type) {
> > 		case DEVFREQ_PARENT_DEV:
> > 			nb->notifier_call = devfreq_passive_notifier_call;
> > 			ret = devfreq_register_notifier(parent, nb,
> > 			break;
> > 		case CPUFREQ_PARENT_DEV:
> > 			cpufreq_register_notifier(...)
> > 			...
> > 			break;
> > 		}
> > 		
> > 
> >> +		} else {
> >> +			ret = -EINVAL;
> >> +		}
> >>  		break;
> >>  	case DEVFREQ_GOV_STOP:
> >> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> >> -					DEVFREQ_TRANSITION_NOTIFIER));
> >> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> >> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> >> +						DEVFREQ_TRANSITION_NOTIFIER));
> >> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> >> +			cpufreq_passive_unregister(&p_data);
> >> +		else
> >> +			ret = -EINVAL;
> > 
> > ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > 
> >>  		break;
> >>  	default:
> >>  		break;
> >> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> >> index a4b19d593151..04ce576fd6f1 100644
> >> --- a/include/linux/devfreq.h
> >> +++ b/include/linux/devfreq.h
> >> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
> >>  
> >>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
> >>  /**
> >> + * struct devfreq_cpu_state - holds the per-cpu state
> >> + * @freq:	the current frequency of the cpu.
> >> + * @min_freq:	the min frequency of the cpu.
> >> + * @max_freq:	the max frequency of the cpu.
> >> + * @first_cpu:	the cpumask of the first cpu of a policy.
> >> + * @dev:	reference to cpu device.
> >> + * @opp_table:	reference to cpu opp table.
> >> + *
> >> + * This structure stores the required cpu_state of a cpu.
> >> + * This is auto-populated by the governor.
> >> + */
> >> +struct devfreq_cpu_state {> +	unsigned int freq;
> > 
> > It is better to change from 'freq' to 'curr_freq'
> > for more correct expression.
> > 
> >> +	unsigned int min_freq;
> >> +	unsigned int max_freq;
> >> +	unsigned int first_cpu;
> >> +	struct device *dev;
> > 
> > How about changing the name 'dev' to 'cpu_dev'?
> > 
> > 
> >> +	struct opp_table *opp_table;
> >> +};
> > 
> > devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
> > 
> > So, you can move it into drivers/devfreq/governor_passive.c
> > and just add the definition into include/linux/devfreq.h as following:
> > It is able to prevent the access of variable of 'struct devfreq_cpu_state'
> > outside.
> > 
> > 	struct devfreq_cpu_state;
> > 
> >> +
> >> +enum devfreq_parent_dev_type {
> >> +	DEVFREQ_PARENT_DEV,
> >> +	CPUFREQ_PARENT_DEV,
> >> +};
> >> +
> >> +/**
> >>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
> >>   *	and devfreq_add_device
> >>   * @parent:	the devfreq instance of parent device.
> >> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
> >>   *			using governors except for passive governor.
> >>   *			If the devfreq device has the specific method to decide
> >>   *			the next frequency, should use this callback.
> >> - * @this:	the devfreq instance of own device.
> >> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> >> + * @parent_type		parent type of the device
> > 
> > Need to add ':' at the end of word. -> "parent_type:".
> > 
> >> + * @this:		the devfreq instance of own device.
> >> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> > 
> > I knew that you make them with same indentation.
> > But, actually, it is not related to this patch like clean-up code.
> > Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
> > 
> >> + * @cpu_state:		the state min/max/current frequency of all online cpu's
> >>   *
> >>   * The devfreq_passive_data have to set the devfreq instance of parent
> >>   * device with governors except for the passive governor. But, don't need to
> >> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> >> - * them.
> >> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> >> + * will handle them.
> >>   */
> >>  struct devfreq_passive_data {
> >>  	/* Should set the devfreq instance of parent device */
> >> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
> >>  	/* Optional callback to decide the next frequency of passvice device */
> >>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
> >>  
> >> +	/* Should set the type of parent device */
> >> +	enum devfreq_parent_dev_type parent_type;
> >> +
> >>  	/* For passive governor's internal use. Don't need to set them */
> >>  	struct devfreq *this;
> >>  	struct notifier_block nb;
> >> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
> >>  };
> >>  #endif
> >>  
> >>
> > 
> > 
> 
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-06-02 11:43       ` andrew-sh.cheng
@ 2020-06-03  4:07         ` Chanwoo Choi
  2020-06-17  7:59           ` andrew-sh.cheng
  0 siblings, 1 reply; 35+ messages in thread
From: Chanwoo Choi @ 2020-06-03  4:07 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, devicetree,
	Stephen Boyd, Viresh Kumar, Mark Brown, linux-pm,
	Rafael J . Wysocki, Liam Girdwood, Rob Herring, linux-kernel,
	Kyungmin Park, MyungJoo Ham, linux-mediatek, Sibi Sankar,
	Matthias Brugger, linux-arm-kernel

Hi Andrew-sh.Cheng,

Do you know that why cannot show the patches sent from you on mailing list?

Even if you sent them to linux-pm mailing list, I cannot find
your patches on linux-pm's patchwork[1] and others.
[1] https://patchwork.kernel.org/project/linux-pm/list/

Could you find you patch on mailing list?
Do you use git send-email when you send these patches?

I used the thunderbird tool and gmail for reading the patches.
When I tried to read the original source of this patch,
it looks like that the body of patch is encoded.
I cannot read the plain text of patch body.
- When gmail, use 'Show original'
- When thunderbird, use 'More -> View Source'

If I'm missing something to check this patch,
please let me know. I'll fix my environment.
It is strange situation on my case.


On 6/2/20 8:43 PM, andrew-sh.cheng wrote:
> On Thu, 2020-05-28 at 15:14 +0900, Chanwoo Choi wrote:
>> Hi Andrew-sh.Cheng,
>>
>> Thanks for your posting. I like this approach absolutely.
>> I think that it is necessary. When I developed the embedded product,
>> I needed this feature always. 
>>
>> I add the comments on below.
>>
>>
>> And the following email is not valid. So, I dropped this email
>> from Cc list.
>> Saravana Kannan <skannan@codeaurora.org>
>>
>>
>> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
>>> From: Saravana Kannan <skannan@codeaurora.org>
>>>
>>> Many CPU architectures have caches that can scale independent of the
>>> CPUs. Frequency scaling of the caches is necessary to make sure that the
>>> cache is not a performance bottleneck that leads to poor performance and
>>> power. The same idea applies for RAM/DDR.
>>>
>>> To achieve this, this patch adds support for cpu based scaling to the
>>> passive governor. This is accomplished by taking the current frequency
>>> of each CPU frequency domain and then adjust the frequency of the cache
>>> (or any devfreq device) based on the frequency of the CPUs. It listens
>>> to CPU frequency transition notifiers to keep itself up to date on the
>>> current CPU frequency.
>>>
>>> To decide the frequency of the device, the governor does one of the
>>> following:
>>> * Derives the optimal devfreq device opp from required-opps property of
>>>   the parent cpu opp_table.
>>>
>>> * Scales the device frequency in proportion to the CPU frequency. So, if
>>>   the CPUs are running at their max frequency, the device runs at its
>>>   max frequency. If the CPUs are running at their min frequency, the
>>>   device runs at its min frequency. It is interpolated for frequencies
>>>   in between.
>>>
>>> Andrew-sh.Cheng change
>>> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
>>> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
>>> for kernel-5.7
>>>
>>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
>>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
>>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
>>> ---
>>>  drivers/devfreq/Kconfig            |   2 +
>>>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
>>>  include/linux/devfreq.h            |  40 +++++-
>>>  3 files changed, 299 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>> index 0b1df12e0f21..d9067950af6a 100644
>>> --- a/drivers/devfreq/Kconfig
>>> +++ b/drivers/devfreq/Kconfig
>>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
>>>  	  device. This governor does not change the frequency by itself
>>>  	  through sysfs entries. The passive governor recommends that
>>>  	  devfreq device uses the OPP table to get the frequency/voltage.
>>> +	  Alternatively the governor can also be chosen to scale based on
>>> +	  the online CPUs current frequency.
>>>  
>>>  comment "DEVFREQ Drivers"
>>>  
>>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
>>> index 2d67d6c12dce..7dcda02a5bb7 100644
>>> --- a/drivers/devfreq/governor_passive.c
>>> +++ b/drivers/devfreq/governor_passive.c
>>> @@ -8,11 +8,89 @@
>>>   */
>>>  
>>>  #include <linux/module.h>
>>> +#include <linux/cpu.h>
>>> +#include <linux/cpufreq.h>
>>> +#include <linux/cpumask.h>
>>>  #include <linux/device.h>
>>>  #include <linux/devfreq.h>
>>> +#include <linux/slab.h>
>>>  #include "governor.h"
>>>  
>>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
>>
>> Need to change 'unsigned int' to 'unsigned long'
> Get it.

If you add the blank line before/after of your reply,
it is better to catch your reply. Please add the blank line for me.

>> .
>>
>>> +					     unsigned int cpu)
>>> +{
>>> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
>>
>> Better to define them separately as following and then need to rename
>> the variable. Usually, use the 'min_freq' and 'max_freq' word for
>> the minimum/maximum frequency.
>>
>> 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
>> 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
>>
>> The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
>> and 'unsigned int'. You need to handle them properly.
> Get it.
> For cpu_freq, I separate it into "unsigned long cpu_curr_freq" and
> "unsigned int cpu_curr_freq_khz"
>>
>>
>>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	unsigned long *freq_table = devfreq->profile->freq_table;
>>
>> In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
>> So, I think 'dev_freq_table' is proper name instead of 'freq_table'
>> for the readability.
>>
>> 	freq_table -> dev_freq_table
>>
>>> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
>>
>> In the get_target_freq_with_devfreq(), use 'p_opp' indicating
>> the OPP of parent device. For the consistency, I think that
>> use 'p_opp' instead of 'cpu_opp'. 
>>
>>> +	unsigned long cpu_freq, freq;
>>
>> Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
>> 	cpu_freq -> cpu_curr_freq.
> Get it.
> Will modify them for readability.
>>
>>> +
>>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
>>> +	    !cpu_state->opp_table || !devfreq->opp_table)
>>> +		return 0;
>>> +
>>> +	cpu_freq = cpu_state->freq * 1000;
>>> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
>>> +	if (IS_ERR(cpu_opp))
>>> +		return 0;
>>> +
>>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
>>> +					    devfreq->opp_table, cpu_opp);
>>> +	dev_pm_opp_put(cpu_opp);
>>> +
>>> +	if (!IS_ERR(opp)) {
>>> +		freq = dev_pm_opp_get_freq(opp);
>>> +		dev_pm_opp_put(opp);
>>
>> Better to add the 'out' goto statement.
>> If you use 'goto out', you can reduce the one indentation
>> without 'else' statement.
> Get it.
>> 	
>>
>>> +	} else {
>>
>> As I commented, when dev_pm_opp_xlate_required_opp() return successfully
>> , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
>>
>>
>>> +		/* Use Interpolation if required opps is not available */
>>> +		cpu_min = cpu_state->min_freq;
>>> +		cpu_max = cpu_state->max_freq;
>>> +		cpu_freq = cpu_state->freq;
>>> +
>>> +		if (freq_table) {
>>> +			/* Get minimum frequency according to sorting order */
>>> +			max_state = freq_table[devfreq->profile->max_state - 1];
>>> +			if (freq_table[0] < max_state) {
>>> +				dev_min = freq_table[0];
>>> +				dev_max = max_state;
>>> +			} else {
>>> +				dev_min = max_state;
>>> +				dev_max = freq_table[0];
>>> +			}
>>> +		} else {
>>> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
>>> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
>>> +				return 0;
>>> +			dev_min =
>>> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
>>> +			dev_max =
>>> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
>>
>> I think it is not proper to access the variable of pm_qos structure directly.
>> Instead of direct access, you have to use the exported PM QoS function such as
>> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
>> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
> Get it.
>>
>>> +		}
>>> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
>>> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
>>> +	}
>>
>>
>> I think that you better to add 'out' jump label as following:
>>
>> out:
>>
>>> +
>>> +	return freq;
>>> +}
>>> +
>>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
>>> +					unsigned long *freq)
>>> +{
>>> +	struct devfreq_passive_data *p_data =
>>> +				(struct devfreq_passive_data *)devfreq->data;
>>> +	unsigned int cpu, target_freq = 0;
>>
>> Need to define 'target_freq' with 'unsigned long' type.
> Get it.
>>
>>> +
>>> +	for_each_online_cpu(cpu)
>>> +		target_freq = max(target_freq,
>>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
>>> +
>>> +	*freq = target_freq;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
>>>  					unsigned long *freq)
>>>  {
>>>  	struct devfreq_passive_data *p_data
>>> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>  	int i, count, ret = 0;
>>>  
>>>  	/*
>>> -	 * If the devfreq device with passive governor has the specific method
>>> -	 * to determine the next frequency, should use the get_target_freq()
>>> -	 * of struct devfreq_passive_data.
>>> -	 */
>>> -	if (p_data->get_target_freq) {
>>> -		ret = p_data->get_target_freq(devfreq, freq);
>>> -		goto out;
>>> -	}
>>> -
>>> -	/*
>>>  	 * If the parent and passive devfreq device uses the OPP table,
>>>  	 * get the next frequency by using the OPP table.
>>>  	 */
>>> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>>  	return ret;
>>>  }
>>>  
>>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>>> +					   unsigned long *freq)
>>> +{
>>> +	struct devfreq_passive_data *p_data =
>>> +				(struct devfreq_passive_data *)devfreq->data;
>>> +	int ret;
>>> +
>>> +	/*
>>> +	 * If the devfreq device with passive governor has the specific method
>>> +	 * to determine the next frequency, should use the get_target_freq()
>>> +	 * of struct devfreq_passive_data.
>>> +	 */
>>> +	if (p_data->get_target_freq)
>>> +		return p_data->get_target_freq(devfreq, freq);
>>> +
>>> +	switch (p_data->parent_type) {
>>> +	case DEVFREQ_PARENT_DEV:
>>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
>>> +		break;
>>> +	case CPUFREQ_PARENT_DEV:
>>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
>>> +		break;
>>> +	default:
>>> +		ret = -EINVAL;
>>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
>>> +		break;
>>> +	}
>>> +
>>> +	return ret;
>>> +}
>>> +
>>>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
>>>  {
>>>  	int ret;
>>> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
>>>  	return NOTIFY_DONE;
>>>  }
>>>  
>>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
>>> +					 unsigned long event, void *ptr)
>>> +{
>>> +	struct devfreq_passive_data *data =
>>> +			container_of(nb, struct devfreq_passive_data, nb);
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	struct cpufreq_freqs *freq = ptr;
>>
>> How about changing 'freq' to 'cpu_freqs'?
>>
>> In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
>> the instance of 'struct cpufreq_freqs'. And in order to
>> identfy, how about adding 'cpu_' prefix for variable name?
>>
>>> +	unsigned int current_freq;
>>
>> Need to define curr_freq with 'unsigned long' type
>> and better to use 'curr_freq' variable name.
> It is good to change current_freq to curr_freq, but why should it us
> 'unsigned long'?
> I think it is 'unsigned int'.

I think that 'curr_freq' is proper. Yes, it is 'unsigned int'.
When you changing the cpu frequency to device frequency,
recommend to handle them between unsigned int and unsigned long.

>>
>>> +	int ret;
>>> +
>>> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
>>> +	    !data->cpu_state[freq->policy->cpu])
>>> +		return 0;
>>> +
>>> +	cpu_state = data->cpu_state[freq->policy->cpu];
>>> +	if (cpu_state->freq == freq->new)
>>> +		return 0;
>>> +
>>> +	/* Backup current freq and pre-update cpu state freq*/
>>> +	current_freq = cpu_state->freq;
>>> +	cpu_state->freq = freq->new;
>>> +
>>> +	mutex_lock(&devfreq->lock);
>>> +	ret = update_devfreq(devfreq);
>>> +	mutex_unlock(&devfreq->lock);
>>> +	if (ret) {
>>> +		cpu_state->freq = current_freq;
>>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
>>> +{
>>> +	struct devfreq_passive_data *data = *p_data;
>>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
>>> +	struct device *dev = devfreq->dev.parent;
>>> +	struct opp_table *opp_table = NULL;
>>> +	struct devfreq_cpu_state *state;
>>
>> For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
> Get it.
>>
>>> +	struct cpufreq_policy *policy;
>>> +	struct device *cpu_dev;
>>> +	unsigned int cpu;
>>> +	int ret;
>>> +
>>> +	get_online_cpus();
>>
>> Add blank line.
> Get it.
>>
>>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
>>> +	ret = cpufreq_register_notifier(&data->nb,
>>> +					CPUFREQ_TRANSITION_NOTIFIER);
>>> +	if (ret) {
>>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
>>> +		data->nb.notifier_call = NULL;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* Populate devfreq_cpu_state */
>>> +	for_each_online_cpu(cpu) {
>>> +		if (data->cpu_state[cpu])
>>> +			continue;
>>> +
>>> +		policy = cpufreq_cpu_get(cpu);
>>
>> cpufreq_cpu_get() might return 'NULL'. I think you need to handle
>> return value as following:
>>
>> 		if (!policy) {
>> 			ret = -EINVAL;
>> 			goto out;
>> 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
>> 			goto out;
>> 		} else if (IS_ERR(policy) {
>> 			ret = PTR_ERR(policy);
>> 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
>> 			goto out;
>> 		}
>>
>> If cpufreq_cpu_get() return successfully, to do next.
>> It reduces the one indentaion.
>>
>>
> Get it.
>>
>>> +		if (policy) {
>>> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
>>> +			if (!state) {
>>> +				ret = -ENOMEM;
>>> +				goto out;
>>> +			}
>>> +
>>> +			cpu_dev = get_cpu_device(cpu);
>>> +			if (!cpu_dev) {
>>> +				dev_err(dev, "Couldn't get cpu device.\n");
>>> +				ret = -ENODEV;
>>> +				goto out;
>>> +			}
>>> +
>>> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
>>> +			if (IS_ERR(devfreq->opp_table)) {
>>> +				ret = PTR_ERR(opp_table);
>>> +				goto out;
>>> +			}
>>> +
>>> +			state->dev = cpu_dev;
>>> +			state->opp_table = opp_table;
>>> +			state->first_cpu = cpumask_first(policy->related_cpus);
>>> +			state->freq = policy->cur;
>>> +			state->min_freq = policy->cpuinfo.min_freq;
>>> +			state->max_freq = policy->cpuinfo.max_freq;
>>> +			data->cpu_state[cpu] = state;
>>
>> Add blank line.
>>
>>> +			cpufreq_cpu_put(policy);
>>> +		} else {
>>> +			ret = -EPROBE_DEFER;
>>> +			goto out;
>>> +		}
>>> +	}
>>
>> Add blank line.
> Get it.
>>> +out:
>>> +	put_online_cpus();
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/* Update devfreq */
>>> +	mutex_lock(&devfreq->lock);
>>> +	ret = update_devfreq(devfreq);
>>> +	mutex_unlock(&devfreq->lock);
>>> +	if (ret)
>>> +		dev_err(dev, "Couldn't update the frequency.\n");
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
>>> +{
>>> +	struct devfreq_passive_data *data = *p_data;
>>> +	struct devfreq_cpu_state *cpu_state;
>>> +	int cpu;
>>> +
>>> +	if (data->nb.notifier_call)
>>> +		cpufreq_unregister_notifier(&data->nb,
>>> +					    CPUFREQ_TRANSITION_NOTIFIER);
>>> +
>>> +	for_each_possible_cpu(cpu) {
>>> +		cpu_state = data->cpu_state[cpu];
>>> +		if (cpu_state) {
>>> +			if (cpu_state->opp_table)
>>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
>>> +			kfree(cpu_state);
>>> +			cpu_state = NULL;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  				unsigned int event, void *data)
>>>  {
>>> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  	struct notifier_block *nb = &p_data->nb;
>>>  	int ret = 0;
>>>  
>>> -	if (!parent)
>>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
>>>  		return -EPROBE_DEFER;
>>
>> If you modify the devfreq_passive_event_handler() as following,
>> you can move this condition for DEVFREQ_PARENT_DEV into 
>> (register|unregister)_parent_dev_notifier.
>>
>> 	switch (event) {                                                                                  
>> 	case DEVFREQ_GOV_START:                                               
>> 		ret = register_parent_dev_notifier(p_data);
>> 		break;
>> 	case DEVFREQ_GOV_STOP:                                             
>> 		ret = unregister_parent_dev_notifier(p_data);
>> 		break;
>> 	default: 
>> 		ret = -EINVAL;
>> 		break;
>> 	}
>>                                                                                               
>> 	return ret;
>>
> Get it.
>>>  
>>>  	switch (event) {
>>> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
>>>  		if (!p_data->this)
>>>  			p_data->this = devfreq;
>>>  
>>> -		nb->notifier_call = devfreq_passive_notifier_call;
>>> -		ret = devfreq_register_notifier(parent, nb,
>>> -					DEVFREQ_TRANSITION_NOTIFIER);
>>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
>>> +			nb->notifier_call = devfreq_passive_notifier_call;
>>> +			ret = devfreq_register_notifier(parent, nb,
>>> +						DEVFREQ_TRANSITION_NOTIFIER);
>>> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
>>> +			ret = cpufreq_passive_register(&p_data);
>>
>> I think that we better to collect the code related to notifier registration
>> into one function like devfreq_pass_register_notifier() instead of
>> cpufreq_passive_register() as following: I think it is more simple and readable.
>>
>> If you have more proper function name of register_parent_dev_notifier,
>> please give your opinion.
>>
>> 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
>> 		switch (p_data->parent_type) {
>> 		case DEVFREQ_PARENT_DEV:
>> 			nb->notifier_call = devfreq_passive_notifier_call;
>> 			ret = devfreq_register_notifier(parent, nb,
>> 			break;
>> 		case CPUFREQ_PARENT_DEV:
>> 			cpufreq_register_notifier(...)
>> 			...
>> 			break;
>> 		}
> Not fully understanding.
> Do you mean expanding cpufreq_passive_register()?

Yes and rename it for both cpufreq and devfreq.

> I think leave it in function will be with clean for this code segment.

I want that one function handle the notifier register
for both cpufreq and devfreq so that we make it more simply as following:
On the step hanling the governor event, don't need to consider
the type of parent device of devfreq deivce with this style.

	case DEVFREQ_GOV_START:
		ret = register_notifier(...);
		break;
	case DEVFREQ_GOV_STOP:
		ret = unregister_notifier(...);
		break;

> 
>> 		
>>
>>> +		} else {
>>> +			ret = -EINVAL;
>>> +		}
>>>  		break;
>>>  	case DEVFREQ_GOV_STOP:
>>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
>>> -					DEVFREQ_TRANSITION_NOTIFIER));
>>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
>>> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
>>> +						DEVFREQ_TRANSITION_NOTIFIER));
>>> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
>>> +			cpufreq_passive_unregister(&p_data);
>>> +		else
>>> +			ret = -EINVAL;
>>
>> ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> Get it.

ditto. As I aboved commented.

>>
>>>  		break;
>>>  	default:
>>>  		break;
>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>> index a4b19d593151..04ce576fd6f1 100644
>>> --- a/include/linux/devfreq.h
>>> +++ b/include/linux/devfreq.h
>>> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
>>>  
>>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
>>>  /**
>>> + * struct devfreq_cpu_state - holds the per-cpu state
>>> + * @freq:	the current frequency of the cpu.
>>> + * @min_freq:	the min frequency of the cpu.
>>> + * @max_freq:	the max frequency of the cpu.
>>> + * @first_cpu:	the cpumask of the first cpu of a policy.
>>> + * @dev:	reference to cpu device.
>>> + * @opp_table:	reference to cpu opp table.
>>> + *
>>> + * This structure stores the required cpu_state of a cpu.
>>> + * This is auto-populated by the governor.
>>> + */
>>> +struct devfreq_cpu_state {> +	unsigned int freq;
>>
>> It is better to change from 'freq' to 'curr_freq'
>> for more correct expression.
> Get it.
>>
>>> +	unsigned int min_freq;
>>> +	unsigned int max_freq;
>>> +	unsigned int first_cpu;
>>> +	struct device *dev;
>>
>> How about changing the name 'dev' to 'cpu_dev'?
> Okay.
>>
>>
>>> +	struct opp_table *opp_table;
>>> +};
>>
>> devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
>>
>> So, you can move it into drivers/devfreq/governor_passive.c
>> and just add the definition into include/linux/devfreq.h as following:
>> It is able to prevent the access of variable of 'struct devfreq_cpu_state'
>> outside.
>>
>> 	struct devfreq_cpu_state;
> Get it.
>>
>>> +
>>> +enum devfreq_parent_dev_type {
>>> +	DEVFREQ_PARENT_DEV,
>>> +	CPUFREQ_PARENT_DEV,
>>> +};
>>> +
>>> +/**
>>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
>>>   *	and devfreq_add_device
>>>   * @parent:	the devfreq instance of parent device.
>>> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
>>>   *			using governors except for passive governor.
>>>   *			If the devfreq device has the specific method to decide
>>>   *			the next frequency, should use this callback.
>>> - * @this:	the devfreq instance of own device.
>>> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>> + * @parent_type		parent type of the device
>>
>> Need to add ':' at the end of word. -> "parent_type:".
>>
>>> + * @this:		the devfreq instance of own device.
>>> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
>>
>> I knew that you make them with same indentation.
>> But, actually, it is not related to this patch like clean-up code.
>> Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
> Get it.
>>
>>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
>>>   *
>>>   * The devfreq_passive_data have to set the devfreq instance of parent
>>>   * device with governors except for the passive governor. But, don't need to
>>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
>>> - * them.
>>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
>>> + * will handle them.
>>>   */
>>>  struct devfreq_passive_data {
>>>  	/* Should set the devfreq instance of parent device */
>>> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
>>>  	/* Optional callback to decide the next frequency of passvice device */
>>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
>>>  
>>> +	/* Should set the type of parent device */
>>> +	enum devfreq_parent_dev_type parent_type;
>>> +
>>>  	/* For passive governor's internal use. Don't need to set them */
>>>  	struct devfreq *this;
>>>  	struct notifier_block nb;
>>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
>>>  };
>>>  #endif
>>>  
>>>
>>
>>
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-06-02 12:23         ` andrew-sh.cheng
@ 2020-06-03  4:12           ` Chanwoo Choi
  0 siblings, 0 replies; 35+ messages in thread
From: Chanwoo Choi @ 2020-06-03  4:12 UTC (permalink / raw)
  To: andrew-sh.cheng
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, devicetree,
	Stephen Boyd, Viresh Kumar, Mark Brown, linux-pm,
	Rafael J . Wysocki, Liam Girdwood, Rob Herring, linux-kernel,
	Kyungmin Park, MyungJoo Ham, linux-mediatek, Sibi Sankar,
	Matthias Brugger, linux-arm-kernel

Hi Andrew-sh.Cheng,

On 6/2/20 9:23 PM, andrew-sh.cheng wrote:
> On Thu, 2020-05-28 at 16:17 +0900, Chanwoo Choi wrote:
>> Hi Andrew-sh.Cheng,
>>
>> The exynos-bus.c used the passive governor.
>> Even if don't make the problem because DEVFREQ_PARENT_DEV is zero,
>> you need to initialize the parent_type with DEVFREQ_PARENT_DEV as following:
>>
>> diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
>> index 8fa8eb541373..1c71c47bc2ac 100644
>> --- a/drivers/devfreq/exynos-bus.c
>> +++ b/drivers/devfreq/exynos-bus.c
>> @@ -369,6 +369,7 @@ static int exynos_bus_profile_init_passive(struct exynos_bus *bus,
>>                 return -ENOMEM;
>>  
>>         passive_data->parent = parent_devfreq;
>> +       passive_data->parent_type = DEVFREQ_PARENT_DEV;
>>  
>>         /* Add devfreq device for exynos bus with passive governor */
>>         bus->devfreq = devm_devfreq_add_device(dev, profile, DEVFREQ_GOV_PASSIVE,
> Hi Chanwoo Choi,
> Do you just remind me to initialize it to DEVFREQ_PARENT_DEV whn use
> this governor?

Yes. This change was not included in this patchset.

> I will do it and thank you for reminding.

Thanks.

(snip)


And, this patchset doesn't include the dt-binding example
and any real example in devicetree. If possible, I recommend
you better to update dt-binding document with example.

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support
  2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
                     ` (12 preceding siblings ...)
  2020-05-20  4:10   ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and " Chanwoo Choi
@ 2020-06-15  7:31   ` Viresh Kumar
  13 siblings, 0 replies; 35+ messages in thread
From: Viresh Kumar @ 2020-06-15  7:31 UTC (permalink / raw)
  To: Andrew-sh.Cheng
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Mark Brown, Rafael J . Wysocki, Liam Girdwood,
	Rob Herring, linux-kernel, Chanwoo Choi, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree

On 20-05-20, 11:42, Andrew-sh.Cheng wrote:
> 	- Resend depending patches of Sravana Kannan base on kernel-5.7

Saravana's patches were never accepted and I suggested him this which
he never tested I believe.

https://lore.kernel.org/lkml/20191125112812.26jk5hsdwqfnofc2@vireshk-i7/

There is no point rebasing your stuff on a series which hasn't
concluded or is accepted, at least logically.

-- 
viresh

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
  2020-06-03  4:07         ` Chanwoo Choi
@ 2020-06-17  7:59           ` andrew-sh.cheng
  0 siblings, 0 replies; 35+ messages in thread
From: andrew-sh.cheng @ 2020-06-17  7:59 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, devicetree,
	Stephen Boyd, Viresh Kumar, Mark Brown, linux-pm,
	Rafael J . Wysocki, Liam Girdwood, Rob Herring, linux-kernel,
	Kyungmin Park, MyungJoo Ham, linux-mediatek, Sibi Sankar,
	Matthias Brugger, linux-arm-kernel

On Wed, 2020-06-03 at 13:07 +0900, Chanwoo Choi wrote:
> Hi Andrew-sh.Cheng,
> 
> Do you know that why cannot show the patches sent from you on mailing list?
> 
> Even if you sent them to linux-pm mailing list, I cannot find
> your patches on linux-pm's patchwork[1] and others.
> [1] https://patchwork.kernel.org/project/linux-pm/list/
> 
> Could you find you patch on mailing list?
> Do you use git send-email when you send these patches?
> 
> I used the thunderbird tool and gmail for reading the patches.
> When I tried to read the original source of this patch,
> it looks like that the body of patch is encoded.
> I cannot read the plain text of patch body.
> - When gmail, use 'Show original'
> - When thunderbird, use 'More -> View Source'
> 
> If I'm missing something to check this patch,
> please let me know. I'll fix my environment.
> It is strange situation on my case.
> 

Hi Chanwoo Choi~
I cannot find the patch in linux-pm, either.
It should be firewall problem of MTK. (I got some notify from IT.)
I will request the right to send mail to "linux-pm@vger.kernel.org"
Thank you for reminding.

> 
> On 6/2/20 8:43 PM, andrew-sh.cheng wrote:
> > On Thu, 2020-05-28 at 15:14 +0900, Chanwoo Choi wrote:
> >> Hi Andrew-sh.Cheng,
> >>
> >> Thanks for your posting. I like this approach absolutely.
> >> I think that it is necessary. When I developed the embedded product,
> >> I needed this feature always. 
> >>
> >> I add the comments on below.
> >>
> >>
> >> And the following email is not valid. So, I dropped this email
> >> from Cc list.
> >> Saravana Kannan <skannan@codeaurora.org>
> >>
> >>
> >> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> >>> From: Saravana Kannan <skannan@codeaurora.org>
> >>>
> >>> Many CPU architectures have caches that can scale independent of the
> >>> CPUs. Frequency scaling of the caches is necessary to make sure that the
> >>> cache is not a performance bottleneck that leads to poor performance and
> >>> power. The same idea applies for RAM/DDR.
> >>>
> >>> To achieve this, this patch adds support for cpu based scaling to the
> >>> passive governor. This is accomplished by taking the current frequency
> >>> of each CPU frequency domain and then adjust the frequency of the cache
> >>> (or any devfreq device) based on the frequency of the CPUs. It listens
> >>> to CPU frequency transition notifiers to keep itself up to date on the
> >>> current CPU frequency.
> >>>
> >>> To decide the frequency of the device, the governor does one of the
> >>> following:
> >>> * Derives the optimal devfreq device opp from required-opps property of
> >>>   the parent cpu opp_table.
> >>>
> >>> * Scales the device frequency in proportion to the CPU frequency. So, if
> >>>   the CPUs are running at their max frequency, the device runs at its
> >>>   max frequency. If the CPUs are running at their min frequency, the
> >>>   device runs at its min frequency. It is interpolated for frequencies
> >>>   in between.
> >>>
> >>> Andrew-sh.Cheng change
> >>> dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq
> >>> to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value
> >>> for kernel-5.7
> >>>
> >>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> >>> [Sibi: Integrated cpu-freqmap governor into passive_governor]
> >>> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
> >>> Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> >>> ---
> >>>  drivers/devfreq/Kconfig            |   2 +
> >>>  drivers/devfreq/governor_passive.c | 278 ++++++++++++++++++++++++++++++++++---
> >>>  include/linux/devfreq.h            |  40 +++++-
> >>>  3 files changed, 299 insertions(+), 21 deletions(-)
> >>>
> >>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> >>> index 0b1df12e0f21..d9067950af6a 100644
> >>> --- a/drivers/devfreq/Kconfig
> >>> +++ b/drivers/devfreq/Kconfig
> >>> @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE
> >>>  	  device. This governor does not change the frequency by itself
> >>>  	  through sysfs entries. The passive governor recommends that
> >>>  	  devfreq device uses the OPP table to get the frequency/voltage.
> >>> +	  Alternatively the governor can also be chosen to scale based on
> >>> +	  the online CPUs current frequency.
> >>>  
> >>>  comment "DEVFREQ Drivers"
> >>>  
> >>> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> >>> index 2d67d6c12dce..7dcda02a5bb7 100644
> >>> --- a/drivers/devfreq/governor_passive.c
> >>> +++ b/drivers/devfreq/governor_passive.c
> >>> @@ -8,11 +8,89 @@
> >>>   */
> >>>  
> >>>  #include <linux/module.h>
> >>> +#include <linux/cpu.h>
> >>> +#include <linux/cpufreq.h>
> >>> +#include <linux/cpumask.h>
> >>>  #include <linux/device.h>
> >>>  #include <linux/devfreq.h>
> >>> +#include <linux/slab.h>
> >>>  #include "governor.h"
> >>>  
> >>> -static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>> +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data *data,
> >>
> >> Need to change 'unsigned int' to 'unsigned long'
> > Get it.
> 
> If you add the blank line before/after of your reply,
> it is better to catch your reply. Please add the blank line for me.
> 

Thank you for teaching this norm~

> >> .
> >>
> >>> +					     unsigned int cpu)
> >>> +{
> >>> +	unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state;
> >>
> >> Better to define them separately as following and then need to rename
> >> the variable. Usually, use the 'min_freq' and 'max_freq' word for
> >> the minimum/maximum frequency.
> >>
> >> 	unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent;
> >> 	unsigned long dev_min_freq, dev_max_freq, dev_max_state,
> >>
> >> The devfreq used 'unsigned long'. The cpufreq used 'unsigned long'
> >> and 'unsigned int'. You need to handle them properly.
> > Get it.
> > For cpu_freq, I separate it into "unsigned long cpu_curr_freq" and
> > "unsigned int cpu_curr_freq_khz"
> >>
> >>
> >>> +	struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu];
> >>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >>> +	unsigned long *freq_table = devfreq->profile->freq_table;
> >>
> >> In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq.
> >> So, I think 'dev_freq_table' is proper name instead of 'freq_table'
> >> for the readability.
> >>
> >> 	freq_table -> dev_freq_table
> >>
> >>> +	struct dev_pm_opp *opp = NULL, *cpu_opp = NULL;
> >>
> >> In the get_target_freq_with_devfreq(), use 'p_opp' indicating
> >> the OPP of parent device. For the consistency, I think that
> >> use 'p_opp' instead of 'cpu_opp'. 
> >>
> >>> +	unsigned long cpu_freq, freq;
> >>
> >> Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition.
> >> 	cpu_freq -> cpu_curr_freq.
> > Get it.
> > Will modify them for readability.
> >>
> >>> +
> >>> +	if (!cpu_state || cpu_state->first_cpu != cpu ||
> >>> +	    !cpu_state->opp_table || !devfreq->opp_table)
> >>> +		return 0;
> >>> +
> >>> +	cpu_freq = cpu_state->freq * 1000;
> >>> +	cpu_opp = devfreq_recommended_opp(cpu_state->dev, &cpu_freq, 0);
> >>> +	if (IS_ERR(cpu_opp))
> >>> +		return 0;
> >>> +
> >>> +	opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table,
> >>> +					    devfreq->opp_table, cpu_opp);
> >>> +	dev_pm_opp_put(cpu_opp);
> >>> +
> >>> +	if (!IS_ERR(opp)) {
> >>> +		freq = dev_pm_opp_get_freq(opp);
> >>> +		dev_pm_opp_put(opp);
> >>
> >> Better to add the 'out' goto statement.
> >> If you use 'goto out', you can reduce the one indentation
> >> without 'else' statement.
> > Get it.
> >> 	
> >>
> >>> +	} else {
> >>
> >> As I commented, when dev_pm_opp_xlate_required_opp() return successfully
> >> , use 'goto out'. We can remove 'else' and then reduce the unneeded indentation.
> >>
> >>
> >>> +		/* Use Interpolation if required opps is not available */
> >>> +		cpu_min = cpu_state->min_freq;
> >>> +		cpu_max = cpu_state->max_freq;
> >>> +		cpu_freq = cpu_state->freq;
> >>> +
> >>> +		if (freq_table) {
> >>> +			/* Get minimum frequency according to sorting order */
> >>> +			max_state = freq_table[devfreq->profile->max_state - 1];
> >>> +			if (freq_table[0] < max_state) {
> >>> +				dev_min = freq_table[0];
> >>> +				dev_max = max_state;
> >>> +			} else {
> >>> +				dev_min = max_state;
> >>> +				dev_max = freq_table[0];
> >>> +			}
> >>> +		} else {
> >>> +			if (devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value
> >>> +			    <= devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value)
> >>> +				return 0;
> >>> +			dev_min =
> >>> +			devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value;
> >>> +			dev_max =
> >>> +			devfreq->user_max_freq_req.data.freq.qos->max_freq.target_value;
> >>
> >> I think it is not proper to access the variable of pm_qos structure directly.
> >> Instead of direct access, you have to use the exported PM QoS function such as
> >> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MIN_FREQUENCY);
> >> - pm_qos_read_value(devfreq->dev.parent, DEV_PM_QOS_MAX_FREQUENCY);
> > Get it.
> >>
> >>> +		}
> >>> +		cpu_percent = ((cpu_freq - cpu_min) * 100) / cpu_max - cpu_min;
> >>> +		freq = dev_min + mult_frac(dev_max - dev_min, cpu_percent, 100);
> >>> +	}
> >>
> >>
> >> I think that you better to add 'out' jump label as following:
> >>
> >> out:
> >>
> >>> +
> >>> +	return freq;
> >>> +}
> >>> +
> >>> +static int get_target_freq_with_cpufreq(struct devfreq *devfreq,
> >>> +					unsigned long *freq)
> >>> +{
> >>> +	struct devfreq_passive_data *p_data =
> >>> +				(struct devfreq_passive_data *)devfreq->data;
> >>> +	unsigned int cpu, target_freq = 0;
> >>
> >> Need to define 'target_freq' with 'unsigned long' type.
> > Get it.
> >>
> >>> +
> >>> +	for_each_online_cpu(cpu)
> >>> +		target_freq = max(target_freq,
> >>> +				  xlate_cpufreq_to_devfreq(p_data, cpu));
> >>> +
> >>> +	*freq = target_freq;
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int get_target_freq_with_devfreq(struct devfreq *devfreq,
> >>>  					unsigned long *freq)
> >>>  {
> >>>  	struct devfreq_passive_data *p_data
> >>> @@ -23,16 +101,6 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>>  	int i, count, ret = 0;
> >>>  
> >>>  	/*
> >>> -	 * If the devfreq device with passive governor has the specific method
> >>> -	 * to determine the next frequency, should use the get_target_freq()
> >>> -	 * of struct devfreq_passive_data.
> >>> -	 */
> >>> -	if (p_data->get_target_freq) {
> >>> -		ret = p_data->get_target_freq(devfreq, freq);
> >>> -		goto out;
> >>> -	}
> >>> -
> >>> -	/*
> >>>  	 * If the parent and passive devfreq device uses the OPP table,
> >>>  	 * get the next frequency by using the OPP table.
> >>>  	 */
> >>> @@ -102,6 +170,37 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>>  	return ret;
> >>>  }
> >>>  
> >>> +static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
> >>> +					   unsigned long *freq)
> >>> +{
> >>> +	struct devfreq_passive_data *p_data =
> >>> +				(struct devfreq_passive_data *)devfreq->data;
> >>> +	int ret;
> >>> +
> >>> +	/*
> >>> +	 * If the devfreq device with passive governor has the specific method
> >>> +	 * to determine the next frequency, should use the get_target_freq()
> >>> +	 * of struct devfreq_passive_data.
> >>> +	 */
> >>> +	if (p_data->get_target_freq)
> >>> +		return p_data->get_target_freq(devfreq, freq);
> >>> +
> >>> +	switch (p_data->parent_type) {
> >>> +	case DEVFREQ_PARENT_DEV:
> >>> +		ret = get_target_freq_with_devfreq(devfreq, freq);
> >>> +		break;
> >>> +	case CPUFREQ_PARENT_DEV:
> >>> +		ret = get_target_freq_with_cpufreq(devfreq, freq);
> >>> +		break;
> >>> +	default:
> >>> +		ret = -EINVAL;
> >>> +		dev_err(&devfreq->dev, "Invalid parent type\n");
> >>> +		break;
> >>> +	}
> >>> +
> >>> +	return ret;
> >>> +}
> >>> +
> >>>  static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
> >>>  {
> >>>  	int ret;
> >>> @@ -156,6 +255,140 @@ static int devfreq_passive_notifier_call(struct notifier_block *nb,
> >>>  	return NOTIFY_DONE;
> >>>  }
> >>>  
> >>> +static int cpufreq_passive_notifier_call(struct notifier_block *nb,
> >>> +					 unsigned long event, void *ptr)
> >>> +{
> >>> +	struct devfreq_passive_data *data =
> >>> +			container_of(nb, struct devfreq_passive_data, nb);
> >>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >>> +	struct devfreq_cpu_state *cpu_state;
> >>> +	struct cpufreq_freqs *freq = ptr;
> >>
> >> How about changing 'freq' to 'cpu_freqs'?
> >>
> >> In the drivers/cpufreq/cpufreq.c, use 'freqs' name indicating
> >> the instance of 'struct cpufreq_freqs'. And in order to
> >> identfy, how about adding 'cpu_' prefix for variable name?
> >>
> >>> +	unsigned int current_freq;
> >>
> >> Need to define curr_freq with 'unsigned long' type
> >> and better to use 'curr_freq' variable name.
> > It is good to change current_freq to curr_freq, but why should it us
> > 'unsigned long'?
> > I think it is 'unsigned int'.
> 
> I think that 'curr_freq' is proper. Yes, it is 'unsigned int'.
> When you changing the cpu frequency to device frequency,
> recommend to handle them between unsigned int and unsigned long.
> 

Got it.

> >>
> >>> +	int ret;
> >>> +
> >>> +	if (event != CPUFREQ_POSTCHANGE || !freq ||
> >>> +	    !data->cpu_state[freq->policy->cpu])
> >>> +		return 0;
> >>> +
> >>> +	cpu_state = data->cpu_state[freq->policy->cpu];
> >>> +	if (cpu_state->freq == freq->new)
> >>> +		return 0;
> >>> +
> >>> +	/* Backup current freq and pre-update cpu state freq*/
> >>> +	current_freq = cpu_state->freq;
> >>> +	cpu_state->freq = freq->new;
> >>> +
> >>> +	mutex_lock(&devfreq->lock);
> >>> +	ret = update_devfreq(devfreq);
> >>> +	mutex_unlock(&devfreq->lock);
> >>> +	if (ret) {
> >>> +		cpu_state->freq = current_freq;
> >>> +		dev_err(&devfreq->dev, "Couldn't update the frequency.\n");
> >>> +		return ret;
> >>> +	}
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int cpufreq_passive_register(struct devfreq_passive_data **p_data)
> >>> +{
> >>> +	struct devfreq_passive_data *data = *p_data;
> >>> +	struct devfreq *devfreq = (struct devfreq *)data->this;
> >>> +	struct device *dev = devfreq->dev.parent;
> >>> +	struct opp_table *opp_table = NULL;
> >>> +	struct devfreq_cpu_state *state;
> >>
> >> For the readability, I thinkt 'cpu_state' is proper instead of 'state'.
> > Get it.
> >>
> >>> +	struct cpufreq_policy *policy;
> >>> +	struct device *cpu_dev;
> >>> +	unsigned int cpu;
> >>> +	int ret;
> >>> +
> >>> +	get_online_cpus();
> >>
> >> Add blank line.
> > Get it.
> >>
> >>> +	data->nb.notifier_call = cpufreq_passive_notifier_call;
> >>> +	ret = cpufreq_register_notifier(&data->nb,
> >>> +					CPUFREQ_TRANSITION_NOTIFIER);
> >>> +	if (ret) {
> >>> +		dev_err(dev, "Couldn't register cpufreq notifier.\n");
> >>> +		data->nb.notifier_call = NULL;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/* Populate devfreq_cpu_state */
> >>> +	for_each_online_cpu(cpu) {
> >>> +		if (data->cpu_state[cpu])
> >>> +			continue;
> >>> +
> >>> +		policy = cpufreq_cpu_get(cpu);
> >>
> >> cpufreq_cpu_get() might return 'NULL'. I think you need to handle
> >> return value as following:
> >>
> >> 		if (!policy) {
> >> 			ret = -EINVAL;
> >> 			goto out;
> >> 		} else if (PTR_ERR(policy) == -EPROBE_DEFER) {
> >> 			goto out;
> >> 		} else if (IS_ERR(policy) {
> >> 			ret = PTR_ERR(policy);
> >> 			dev_err(dev, "Couldn't get the cpufreq_poliy.\n");
> >> 			goto out;
> >> 		}
> >>
> >> If cpufreq_cpu_get() return successfully, to do next.
> >> It reduces the one indentaion.
> >>
> >>
> > Get it.
> >>
> >>> +		if (policy) {
> >>> +			state = kzalloc(sizeof(*state), GFP_KERNEL);
> >>> +			if (!state) {
> >>> +				ret = -ENOMEM;
> >>> +				goto out;
> >>> +			}
> >>> +
> >>> +			cpu_dev = get_cpu_device(cpu);
> >>> +			if (!cpu_dev) {
> >>> +				dev_err(dev, "Couldn't get cpu device.\n");
> >>> +				ret = -ENODEV;
> >>> +				goto out;
> >>> +			}
> >>> +
> >>> +			opp_table = dev_pm_opp_get_opp_table(cpu_dev);
> >>> +			if (IS_ERR(devfreq->opp_table)) {
> >>> +				ret = PTR_ERR(opp_table);
> >>> +				goto out;
> >>> +			}
> >>> +
> >>> +			state->dev = cpu_dev;
> >>> +			state->opp_table = opp_table;
> >>> +			state->first_cpu = cpumask_first(policy->related_cpus);
> >>> +			state->freq = policy->cur;
> >>> +			state->min_freq = policy->cpuinfo.min_freq;
> >>> +			state->max_freq = policy->cpuinfo.max_freq;
> >>> +			data->cpu_state[cpu] = state;
> >>
> >> Add blank line.
> >>
> >>> +			cpufreq_cpu_put(policy);
> >>> +		} else {
> >>> +			ret = -EPROBE_DEFER;
> >>> +			goto out;
> >>> +		}
> >>> +	}
> >>
> >> Add blank line.
> > Get it.
> >>> +out:
> >>> +	put_online_cpus();
> >>> +	if (ret)
> >>> +		return ret;
> >>> +
> >>> +	/* Update devfreq */
> >>> +	mutex_lock(&devfreq->lock);
> >>> +	ret = update_devfreq(devfreq);
> >>> +	mutex_unlock(&devfreq->lock);
> >>> +	if (ret)
> >>> +		dev_err(dev, "Couldn't update the frequency.\n");
> >>> +
> >>> +	return ret;
> >>> +}
> >>> +
> >>> +static int cpufreq_passive_unregister(struct devfreq_passive_data **p_data)
> >>> +{
> >>> +	struct devfreq_passive_data *data = *p_data;
> >>> +	struct devfreq_cpu_state *cpu_state;
> >>> +	int cpu;
> >>> +
> >>> +	if (data->nb.notifier_call)
> >>> +		cpufreq_unregister_notifier(&data->nb,
> >>> +					    CPUFREQ_TRANSITION_NOTIFIER);
> >>> +
> >>> +	for_each_possible_cpu(cpu) {
> >>> +		cpu_state = data->cpu_state[cpu];
> >>> +		if (cpu_state) {
> >>> +			if (cpu_state->opp_table)
> >>> +				dev_pm_opp_put_opp_table(cpu_state->opp_table);
> >>> +			kfree(cpu_state);
> >>> +			cpu_state = NULL;
> >>> +		}
> >>> +	}
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>>  static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>>  				unsigned int event, void *data)
> >>>  {
> >>> @@ -165,7 +398,7 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>>  	struct notifier_block *nb = &p_data->nb;
> >>>  	int ret = 0;
> >>>  
> >>> -	if (!parent)
> >>> +	if (p_data->parent_type == DEVFREQ_PARENT_DEV && !parent)
> >>>  		return -EPROBE_DEFER;
> >>
> >> If you modify the devfreq_passive_event_handler() as following,
> >> you can move this condition for DEVFREQ_PARENT_DEV into 
> >> (register|unregister)_parent_dev_notifier.
> >>
> >> 	switch (event) {                                                                                  
> >> 	case DEVFREQ_GOV_START:                                               
> >> 		ret = register_parent_dev_notifier(p_data);
> >> 		break;
> >> 	case DEVFREQ_GOV_STOP:                                             
> >> 		ret = unregister_parent_dev_notifier(p_data);
> >> 		break;
> >> 	default: 
> >> 		ret = -EINVAL;
> >> 		break;
> >> 	}
> >>                                                                                               
> >> 	return ret;
> >>
> > Get it.
> >>>  
> >>>  	switch (event) {
> >>> @@ -173,13 +406,24 @@ static int devfreq_passive_event_handler(struct devfreq *devfreq,
> >>>  		if (!p_data->this)
> >>>  			p_data->this = devfreq;
> >>>  
> >>> -		nb->notifier_call = devfreq_passive_notifier_call;
> >>> -		ret = devfreq_register_notifier(parent, nb,
> >>> -					DEVFREQ_TRANSITION_NOTIFIER);
> >>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV) {
> >>> +			nb->notifier_call = devfreq_passive_notifier_call;
> >>> +			ret = devfreq_register_notifier(parent, nb,
> >>> +						DEVFREQ_TRANSITION_NOTIFIER);
> >>> +		} else if (p_data->parent_type == CPUFREQ_PARENT_DEV) {
> >>> +			ret = cpufreq_passive_register(&p_data);
> >>
> >> I think that we better to collect the code related to notifier registration
> >> into one function like devfreq_pass_register_notifier() instead of
> >> cpufreq_passive_register() as following: I think it is more simple and readable.
> >>
> >> If you have more proper function name of register_parent_dev_notifier,
> >> please give your opinion.
> >>
> >> 	int register_parent_dev_notifier(struct devfreq_passive_data **p_data)
> >> 		switch (p_data->parent_type) {
> >> 		case DEVFREQ_PARENT_DEV:
> >> 			nb->notifier_call = devfreq_passive_notifier_call;
> >> 			ret = devfreq_register_notifier(parent, nb,
> >> 			break;
> >> 		case CPUFREQ_PARENT_DEV:
> >> 			cpufreq_register_notifier(...)
> >> 			...
> >> 			break;
> >> 		}
> > Not fully understanding.
> > Do you mean expanding cpufreq_passive_register()?
> 
> Yes and rename it for both cpufreq and devfreq.
> 
> > I think leave it in function will be with clean for this code segment.
> 
> I want that one function handle the notifier register
> for both cpufreq and devfreq so that we make it more simply as following:
> On the step hanling the governor event, don't need to consider
> the type of parent device of devfreq deivce with this style.
> 
> 	case DEVFREQ_GOV_START:
> 		ret = register_notifier(...);
> 		break;
> 	case DEVFREQ_GOV_STOP:
> 		ret = unregister_notifier(...);
> 		break;
> 

Got it.
I will call the same function register_parent_dev_notifier() in case
DEVFREQ_GOV_START, checking parent_type and doing corresponding jobs
inside register_parent_dev_notifier()

> > 
> >> 		
> >>
> >>> +		} else {
> >>> +			ret = -EINVAL;
> >>> +		}
> >>>  		break;
> >>>  	case DEVFREQ_GOV_STOP:
> >>> -		WARN_ON(devfreq_unregister_notifier(parent, nb,
> >>> -					DEVFREQ_TRANSITION_NOTIFIER));
> >>> +		if (p_data->parent_type == DEVFREQ_PARENT_DEV)
> >>> +			WARN_ON(devfreq_unregister_notifier(parent, nb,
> >>> +						DEVFREQ_TRANSITION_NOTIFIER));
> >>> +		else if (p_data->parent_type == CPUFREQ_PARENT_DEV)
> >>> +			cpufreq_passive_unregister(&p_data);
> >>> +		else
> >>> +			ret = -EINVAL;
> >>
> >> ditto. unregister_parent_dev_notifier(struct devfreq_passive_data **p_data)
> > Get it.
> 
> ditto. As I aboved commented.
> 
> >>
> >>>  		break;
> >>>  	default:
> >>>  		break;
> >>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> >>> index a4b19d593151..04ce576fd6f1 100644
> >>> --- a/include/linux/devfreq.h
> >>> +++ b/include/linux/devfreq.h
> >>> @@ -278,6 +278,32 @@ struct devfreq_simple_ondemand_data {
> >>>  
> >>>  #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
> >>>  /**
> >>> + * struct devfreq_cpu_state - holds the per-cpu state
> >>> + * @freq:	the current frequency of the cpu.
> >>> + * @min_freq:	the min frequency of the cpu.
> >>> + * @max_freq:	the max frequency of the cpu.
> >>> + * @first_cpu:	the cpumask of the first cpu of a policy.
> >>> + * @dev:	reference to cpu device.
> >>> + * @opp_table:	reference to cpu opp table.
> >>> + *
> >>> + * This structure stores the required cpu_state of a cpu.
> >>> + * This is auto-populated by the governor.
> >>> + */
> >>> +struct devfreq_cpu_state {> +	unsigned int freq;
> >>
> >> It is better to change from 'freq' to 'curr_freq'
> >> for more correct expression.
> > Get it.
> >>
> >>> +	unsigned int min_freq;
> >>> +	unsigned int max_freq;
> >>> +	unsigned int first_cpu;
> >>> +	struct device *dev;
> >>
> >> How about changing the name 'dev' to 'cpu_dev'?
> > Okay.
> >>
> >>
> >>> +	struct opp_table *opp_table;
> >>> +};
> >>
> >> devfreq_cpu_state is only handled by within driver/devfreq/governor_passive.c.
> >>
> >> So, you can move it into drivers/devfreq/governor_passive.c
> >> and just add the definition into include/linux/devfreq.h as following:
> >> It is able to prevent the access of variable of 'struct devfreq_cpu_state'
> >> outside.
> >>
> >> 	struct devfreq_cpu_state;
> > Get it.
> >>
> >>> +
> >>> +enum devfreq_parent_dev_type {
> >>> +	DEVFREQ_PARENT_DEV,
> >>> +	CPUFREQ_PARENT_DEV,
> >>> +};
> >>> +
> >>> +/**
> >>>   * struct devfreq_passive_data - ``void *data`` fed to struct devfreq
> >>>   *	and devfreq_add_device
> >>>   * @parent:	the devfreq instance of parent device.
> >>> @@ -288,13 +314,15 @@ struct devfreq_simple_ondemand_data {
> >>>   *			using governors except for passive governor.
> >>>   *			If the devfreq device has the specific method to decide
> >>>   *			the next frequency, should use this callback.
> >>> - * @this:	the devfreq instance of own device.
> >>> - * @nb:		the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> >>> + * @parent_type		parent type of the device
> >>
> >> Need to add ':' at the end of word. -> "parent_type:".
> >>
> >>> + * @this:		the devfreq instance of own device.
> >>> + * @nb:			the notifier block for DEVFREQ_TRANSITION_NOTIFIER list
> >>
> >> I knew that you make them with same indentation.
> >> But, actually, it is not related to this patch like clean-up code.
> >> Even if it is not pretty, you better to don't touch 'this' and 'nb' indentaion.
> > Get it.
> >>
> >>> + * @cpu_state:		the state min/max/current frequency of all online cpu's
> >>>   *
> >>>   * The devfreq_passive_data have to set the devfreq instance of parent
> >>>   * device with governors except for the passive governor. But, don't need to
> >>> - * initialize the 'this' and 'nb' field because the devfreq core will handle
> >>> - * them.
> >>> + * initialize the 'this', 'nb' and 'cpu_state' field because the devfreq core
> >>> + * will handle them.
> >>>   */
> >>>  struct devfreq_passive_data {
> >>>  	/* Should set the devfreq instance of parent device */
> >>> @@ -303,9 +331,13 @@ struct devfreq_passive_data {
> >>>  	/* Optional callback to decide the next frequency of passvice device */
> >>>  	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
> >>>  
> >>> +	/* Should set the type of parent device */
> >>> +	enum devfreq_parent_dev_type parent_type;
> >>> +
> >>>  	/* For passive governor's internal use. Don't need to set them */
> >>>  	struct devfreq *this;
> >>>  	struct notifier_block nb;
> >>> +	struct devfreq_cpu_state *cpu_state[NR_CPUS];
> >>>  };
> >>>  #endif
> >>>  
> >>>
> >>
> >>
> > 
> 
> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq
  2020-05-28  7:42     ` Chanwoo Choi
@ 2020-06-17 12:05       ` andrew-sh.cheng
  0 siblings, 0 replies; 35+ messages in thread
From: andrew-sh.cheng @ 2020-06-17 12:05 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Mark Rutland, Nishanth Menon, srv_heupstream, linux-pm,
	Stephen Boyd, Viresh Kumar, Mark Brown, Rafael J . Wysocki,
	Liam Girdwood, Rob Herring, linux-kernel, Kyungmin Park,
	MyungJoo Ham, linux-mediatek, linux-arm-kernel, Matthias Brugger,
	devicetree

On Thu, 2020-05-28 at 16:42 +0900, Chanwoo Choi wrote:
> Hi,
> 
> On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote:
> > This adds dt-binding documentation of cci devfreq
> > for Mediatek MT8183 SoC platform.
> > 
> > Signed-off-by: Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > ---
> >  .../devicetree/bindings/devfreq/mt8183-cci.yaml    | 51 ++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> > 
> > diff --git a/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> > new file mode 100644
> > index 000000000000..a7341fd94097
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
> > @@ -0,0 +1,51 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: https://protect2.fireeye.com/url?k=33f1f15d-6e23ea05-33f07a12-0cc47a31c8b4-91b3f8aeecce95dc&q=1&u=http%3A%2F%2Fdevicetree.org%2Fschemas%2Fdevfreq%2Fmt8183-cci.yaml%23
> > +$schema: https://protect2.fireeye.com/url?k=fc7d9089-a1af8bd1-fc7c1bc6-0cc47a31c8b4-b46f5afc59faf86d&q=1&u=http%3A%2F%2Fdevicetree.org%2Fmeta-schemas%2Fcore.yaml%23
> > +
> > +title: CCI_DEVFREQ driver for MT8183.
> > +
> > +maintainers:
> > +  - Andrew-sh.Cheng <andrew-sh.cheng@mediatek.com>
> > +
> > +description: |
> > +  This module is used to create CCI DEVFREQ.
> > +  The performance will depend on both CCI frequency and CPU frequency.
> > +  For MT8183, CCI co-buck with Little core.
> > +  Contain CCI opp table for voltage and frequency scaling.
> > +
> > +properties:
> > +  compatible:
> > +    const: "mediatek,mt8183-cci"
> > +
> > +  clocks:
> > +    maxItems: 1
> > +
> > +  clock-names:
> > +    const: "cci"
> > +
> > +  operating-points-v2: true
> > +  opp-table: true
> > +
> > +  proc-supply:
> > +    description:
> > +      Phandle of the regulator that provides the supply voltage.
> > +
> > +required:
> > +  - compatible
> > +  - clocks
> > +  - clock-names
> > +  - proc-supply
> > +
> > +examples:
> > +  - |
> > +    #include <dt-bindings/clock/mt8183-clk.h>
> > +    cci: cci {
> > +      compatible = "mediatek,mt8183-cci";
> > +      clocks = <&apmixedsys CLK_APMIXED_CCIPLL>;
> > +      clock-names = "cci";
> > +      operating-points-v2 = <&cci_opp>;
> > +      proc-supply = <&mt6358_vproc12_reg>;
> > +    };
> > +
> > 
> 
> I recommend that add the more detailed example
> with OPP table with CPU node.
> 

Hi Chanwoo Choi,

Actually, for previous versions of my patch set, I didn't use
governor_passive as cci_devfreq governor.
So I think it is okay that not provide CPU OPP node for this cci device
node.

> 

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2020-06-17 12:05 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20200520034324epcas1p3affbd24bd1f3fe40d51baade07c1abba@epcas1p3.samsung.com>
2020-05-20  3:42 ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support Andrew-sh.Cheng
2020-05-20  3:42   ` [PATCH 01/12] OPP: Allow required-opps even if the device doesn't have power-domains Andrew-sh.Cheng
2020-05-20 14:54     ` Matthias Brugger
2020-05-21  1:50       ` andrew-sh.cheng
2020-05-20  3:42   ` [PATCH 02/12] OPP: Add function to look up required OPP's for a given OPP Andrew-sh.Cheng
2020-05-20  3:42   ` [PATCH 03/12] OPP: Improve required-opps linking Andrew-sh.Cheng
2020-05-20  3:42   ` [PATCH 04/12] PM / devfreq: Cache OPP table reference in devfreq Andrew-sh.Cheng
2020-05-20  3:43   ` [PATCH 05/12] PM / devfreq: Add required OPPs support to passive governor Andrew-sh.Cheng
2020-05-20  3:43   ` [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor Andrew-sh.Cheng
2020-05-28  5:03     ` Chanwoo Choi
2020-05-28  6:14     ` Chanwoo Choi
2020-05-28  7:17       ` Chanwoo Choi
2020-06-02 12:23         ` andrew-sh.cheng
2020-06-03  4:12           ` Chanwoo Choi
2020-06-02 11:43       ` andrew-sh.cheng
2020-06-03  4:07         ` Chanwoo Choi
2020-06-17  7:59           ` andrew-sh.cheng
2020-05-20  3:43   ` [PATCH 07/12] cpufreq: mediatek: Enable clock and regulator Andrew-sh.Cheng
2020-05-20  3:43   ` [PATCH 08/12] dt-bindings: devfreq: add compatible for mt8183 cci devfreq Andrew-sh.Cheng
2020-05-28  7:42     ` Chanwoo Choi
2020-06-17 12:05       ` andrew-sh.cheng
2020-05-20  3:43   ` [PATCH 09/12] devfreq: add mediatek " Andrew-sh.Cheng
2020-05-20 12:31     ` Mark Brown
2020-05-21  8:52       ` andrew-sh.cheng
2020-05-28  7:35     ` Chanwoo Choi
2020-05-28  8:00       ` Chanwoo Choi
2020-05-20  3:43   ` [PATCH 10/12] opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it is disabled Andrew-sh.Cheng
2020-05-20  3:43   ` [PATCH 11/12] cpufreq: mediatek: add opp notification for SVS support Andrew-sh.Cheng
2020-05-20  3:43   ` [PATCH 12/12] devfreq: mediatek: cci devfreq register " Andrew-sh.Cheng
2020-05-20  4:10   ` [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and " Chanwoo Choi
2020-05-20  5:36     ` andrew-sh.cheng
2020-05-20  6:24       ` Chanwoo Choi
2020-05-20  7:10         ` andrew-sh.cheng
2020-05-20 14:53           ` Matthias Brugger
2020-06-15  7:31   ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).