linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
@ 2023-10-18  8:06 Stephan Gerhold
  2023-10-18  8:06 ` [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation Stephan Gerhold
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-18  8:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Andy Gross, Bjorn Andersson, Konrad Dybcio, Ilia Lin,
	Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, Stephan Gerhold, Stephan Gerhold, stable

Add the necessary definitions to the qcom-cpufreq-nvmem driver to
support basic cpufreq scaling on the Qualcomm MSM8909 SoC. In practice
the necessary power domains vary depending on the actual PMIC the SoC
was combined with. With PM8909 the VDD_APC power domain is shared with
VDD_CX so the RPM firmware handles all voltage adjustments, while with
PM8916 and PM660 Linux is responsible to do adaptive voltage scaling
of a dedicated CPU regulator using CPR.

Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
---
Changes in v2:
- Reword commit messages based on discussion with Uffe
- Use generic power domain name "perf" (Uffe)
- Fix pm_runtime error handling (Uffe)
- Add allocation cleanup patch as preparation
- Fix ordering of qcom,msm8909 compatible (Konrad)
- cpufreq-dt-platdev blocklist/dt-bindings patches were applied already
- Link to v1: https://lore.kernel.org/r/20230912-msm8909-cpufreq-v1-0-767ce66b544b@kernkonzept.com

---
Stephan Gerhold (3):
      cpufreq: qcom-nvmem: Simplify driver data allocation
      cpufreq: qcom-nvmem: Enable virtual power domain devices
      cpufreq: qcom-nvmem: Add MSM8909

 drivers/cpufreq/qcom-cpufreq-nvmem.c | 124 +++++++++++++++++++++++++----------
 1 file changed, 90 insertions(+), 34 deletions(-)
---
base-commit: 2e12b516f5e6046ceabd4d24e24297e4d130b148
change-id: 20230906-msm8909-cpufreq-dff238de9ff3

Best regards,
-- 
Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Kernkonzept GmbH at Dresden, Germany, HRB 31129, CEO Dr.-Ing. Michael Hohmuth


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation
  2023-10-18  8:06 [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Stephan Gerhold
@ 2023-10-18  8:06 ` Stephan Gerhold
  2023-10-18  8:45   ` Konrad Dybcio
  2023-10-18  8:06 ` [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices Stephan Gerhold
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-18  8:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Andy Gross, Bjorn Andersson, Konrad Dybcio, Ilia Lin,
	Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, Stephan Gerhold, Stephan Gerhold

Simplify the allocation and cleanup of driver data by using devm
together with a flexible array. Prepare for adding additional per-CPU
data by defining a struct qcom_cpufreq_drv_cpu instead of storing the
opp_tokens directly.

Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
---
 drivers/cpufreq/qcom-cpufreq-nvmem.c | 49 +++++++++++++-----------------------
 1 file changed, 18 insertions(+), 31 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index 3fa12648ceb6..82a244f3fa52 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -45,10 +45,14 @@ struct qcom_cpufreq_match_data {
 	const char **genpd_names;
 };
 
+struct qcom_cpufreq_drv_cpu {
+	int opp_token;
+};
+
 struct qcom_cpufreq_drv {
-	int *opp_tokens;
 	u32 versions;
 	const struct qcom_cpufreq_match_data *data;
+	struct qcom_cpufreq_drv_cpu cpus[];
 };
 
 static struct platform_device *cpufreq_dt_pdev, *cpufreq_pdev;
@@ -290,42 +294,32 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 		return -ENOENT;
 	}
 
-	drv = kzalloc(sizeof(*drv), GFP_KERNEL);
+	drv = devm_kzalloc(&pdev->dev, struct_size(drv, cpus, num_possible_cpus()),
+		           GFP_KERNEL);
 	if (!drv)
 		return -ENOMEM;
 
 	match = pdev->dev.platform_data;
 	drv->data = match->data;
-	if (!drv->data) {
-		ret = -ENODEV;
-		goto free_drv;
-	}
+	if (!drv->data)
+		return -ENODEV;
 
 	if (drv->data->get_version) {
 		speedbin_nvmem = of_nvmem_cell_get(np, NULL);
-		if (IS_ERR(speedbin_nvmem)) {
-			ret = dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
-					    "Could not get nvmem cell\n");
-			goto free_drv;
-		}
+		if (IS_ERR(speedbin_nvmem))
+			return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
+					     "Could not get nvmem cell\n");
 
 		ret = drv->data->get_version(cpu_dev,
 							speedbin_nvmem, &pvs_name, drv);
 		if (ret) {
 			nvmem_cell_put(speedbin_nvmem);
-			goto free_drv;
+			return ret;
 		}
 		nvmem_cell_put(speedbin_nvmem);
 	}
 	of_node_put(np);
 
-	drv->opp_tokens = kcalloc(num_possible_cpus(), sizeof(*drv->opp_tokens),
-				  GFP_KERNEL);
-	if (!drv->opp_tokens) {
-		ret = -ENOMEM;
-		goto free_drv;
-	}
-
 	for_each_possible_cpu(cpu) {
 		struct dev_pm_opp_config config = {
 			.supported_hw = NULL,
@@ -351,9 +345,9 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 		}
 
 		if (config.supported_hw || config.genpd_names) {
-			drv->opp_tokens[cpu] = dev_pm_opp_set_config(cpu_dev, &config);
-			if (drv->opp_tokens[cpu] < 0) {
-				ret = drv->opp_tokens[cpu];
+			drv->cpus[cpu].opp_token = dev_pm_opp_set_config(cpu_dev, &config);
+			if (drv->cpus[cpu].opp_token < 0) {
+				ret = drv->cpus[cpu].opp_token;
 				dev_err(cpu_dev, "Failed to set OPP config\n");
 				goto free_opp;
 			}
@@ -372,11 +366,7 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 
 free_opp:
 	for_each_possible_cpu(cpu)
-		dev_pm_opp_clear_config(drv->opp_tokens[cpu]);
-	kfree(drv->opp_tokens);
-free_drv:
-	kfree(drv);
-
+		dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
 	return ret;
 }
 
@@ -388,10 +378,7 @@ static void qcom_cpufreq_remove(struct platform_device *pdev)
 	platform_device_unregister(cpufreq_dt_pdev);
 
 	for_each_possible_cpu(cpu)
-		dev_pm_opp_clear_config(drv->opp_tokens[cpu]);
-
-	kfree(drv->opp_tokens);
-	kfree(drv);
+		dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
 }
 
 static struct platform_driver qcom_cpufreq_driver = {

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-18  8:06 [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Stephan Gerhold
  2023-10-18  8:06 ` [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation Stephan Gerhold
@ 2023-10-18  8:06 ` Stephan Gerhold
  2023-10-19 10:24   ` Ulf Hansson
  2023-10-18  8:06 ` [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909 Stephan Gerhold
  2023-10-19  6:16 ` [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Viresh Kumar
  3 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-18  8:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Andy Gross, Bjorn Andersson, Konrad Dybcio, Ilia Lin,
	Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, Stephan Gerhold, Stephan Gerhold, stable

The genpd core caches performance state votes from devices that are
runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
runtime PM performance state handling"). They get applied once the
device becomes active again.

To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
devices that use runtime PM only to control the enable and performance
state for the attached power domain.

However, at the moment nothing ever resumes the virtual devices created
for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
means that performance state votes made during cpufreq scaling get
always cached and never applied to the hardware.

Fix this by enabling the devices after attaching them and use
dev_pm_syscore_device() to ensure the power domains also stay on when
going to suspend. Since it supplies the CPU we can never turn it off
from Linux. There are other mechanisms to turn it off when needed,
usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).

Without this fix performance states votes are silently ignored, and the
CPU/CPR voltage is never adjusted. This has been broken since 5.14 but
for some reason no one noticed this on QCS404 so far.

Cc: stable@vger.kernel.org
Fixes: 1cb8339ca225 ("cpufreq: qcom: Add support for qcs404 on nvmem driver")
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
---
 drivers/cpufreq/qcom-cpufreq-nvmem.c | 49 +++++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index 82a244f3fa52..3794390089b0 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -25,6 +25,7 @@
 #include <linux/platform_device.h>
 #include <linux/pm_domain.h>
 #include <linux/pm_opp.h>
+#include <linux/pm_runtime.h>
 #include <linux/slab.h>
 #include <linux/soc/qcom/smem.h>
 
@@ -47,6 +48,7 @@ struct qcom_cpufreq_match_data {
 
 struct qcom_cpufreq_drv_cpu {
 	int opp_token;
+	struct device **virt_devs;
 };
 
 struct qcom_cpufreq_drv {
@@ -268,6 +270,18 @@ static const struct qcom_cpufreq_match_data match_data_ipq8074 = {
 	.get_version = qcom_cpufreq_ipq8074_name_version,
 };
 
+static void qcom_cpufreq_put_virt_devs(struct qcom_cpufreq_drv *drv, unsigned cpu)
+{
+	const char * const *name = drv->data->genpd_names;
+	int i;
+
+	if (!drv->cpus[cpu].virt_devs)
+		return;
+
+	for (i = 0; *name; i++, name++)
+		pm_runtime_put(drv->cpus[cpu].virt_devs[i]);
+}
+
 static int qcom_cpufreq_probe(struct platform_device *pdev)
 {
 	struct qcom_cpufreq_drv *drv;
@@ -321,6 +335,7 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 	of_node_put(np);
 
 	for_each_possible_cpu(cpu) {
+		struct device **virt_devs = NULL;
 		struct dev_pm_opp_config config = {
 			.supported_hw = NULL,
 		};
@@ -341,7 +356,7 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 
 		if (drv->data->genpd_names) {
 			config.genpd_names = drv->data->genpd_names;
-			config.virt_devs = NULL;
+			config.virt_devs = &virt_devs;
 		}
 
 		if (config.supported_hw || config.genpd_names) {
@@ -352,6 +367,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 				goto free_opp;
 			}
 		}
+
+		if (virt_devs) {
+			const char * const *name = config.genpd_names;
+			int i, j;
+
+			for (i = 0; *name; i++, name++) {
+				ret = pm_runtime_resume_and_get(virt_devs[i]);
+				if (ret) {
+					dev_err(cpu_dev, "failed to resume %s: %d\n",
+						*name, ret);
+
+					/* Rollback previous PM runtime calls */
+					name = config.genpd_names;
+					for (j = 0; *name && j < i; j++, name++)
+						pm_runtime_put(virt_devs[j]);
+
+					goto free_opp;
+				}
+
+				/* Keep CPU power domain always-on */
+				dev_pm_syscore_device(virt_devs[i], true);
+			}
+			drv->cpus[cpu].virt_devs = virt_devs;
+		}
 	}
 
 	cpufreq_dt_pdev = platform_device_register_simple("cpufreq-dt", -1,
@@ -365,8 +404,10 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 	dev_err(cpu_dev, "Failed to register platform device\n");
 
 free_opp:
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		qcom_cpufreq_put_virt_devs(drv, cpu);
 		dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
+	}
 	return ret;
 }
 
@@ -377,8 +418,10 @@ static void qcom_cpufreq_remove(struct platform_device *pdev)
 
 	platform_device_unregister(cpufreq_dt_pdev);
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		qcom_cpufreq_put_virt_devs(drv, cpu);
 		dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
+	}
 }
 
 static struct platform_driver qcom_cpufreq_driver = {

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909
  2023-10-18  8:06 [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Stephan Gerhold
  2023-10-18  8:06 ` [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation Stephan Gerhold
  2023-10-18  8:06 ` [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices Stephan Gerhold
@ 2023-10-18  8:06 ` Stephan Gerhold
  2023-10-18  8:42   ` Konrad Dybcio
  2023-10-19 10:50   ` Ulf Hansson
  2023-10-19  6:16 ` [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Viresh Kumar
  3 siblings, 2 replies; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-18  8:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Andy Gross, Bjorn Andersson, Konrad Dybcio, Ilia Lin,
	Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, Stephan Gerhold, Stephan Gerhold

When the MSM8909 SoC is used together with the PM8909 PMIC the primary
power supply for the CPU (VDD_APC) is shared with other components to
the SoC, namely the VDD_CX power domain typically supplied by the PM8909
S1 regulator. This means that all votes for necessary performance states
go via the RPM firmware which collects the requirements from all the
processors in the SoC. The RPM firmware then chooses the actual voltage
based on the performance states ("corners"), depending on calibration
values in the NVMEM and other factors.

The MSM8909 SoC is also sometimes used with the PM8916 or PM660 PMIC.
In that case there is a dedicated regulator connected to VDD_APC and
Linux is responsible to do adaptive voltage scaling using CPR (similar
to the existing code for QCS404).

This difference can be described in the device tree, by either assigning
the CPU a power domain from RPMPD or from the CPR driver.

Describe this using "perf" as generic power domain name, which is also
used already for SCMI based platforms.

Also add a simple function that reads the speedbin from a NVMEM cell
and sets it as-is for opp-supported-hw. The actual bit position can be
described in the device tree without additional driver changes.

Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
---
 drivers/cpufreq/qcom-cpufreq-nvmem.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index 3794390089b0..e52031863350 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -59,6 +59,24 @@ struct qcom_cpufreq_drv {
 
 static struct platform_device *cpufreq_dt_pdev, *cpufreq_pdev;
 
+static int qcom_cpufreq_simple_get_version(struct device *cpu_dev,
+					   struct nvmem_cell *speedbin_nvmem,
+					   char **pvs_name,
+					   struct qcom_cpufreq_drv *drv)
+{
+	u8 *speedbin;
+
+	*pvs_name = NULL;
+	speedbin = nvmem_cell_read(speedbin_nvmem, NULL);
+	if (IS_ERR(speedbin))
+		return PTR_ERR(speedbin);
+
+	dev_dbg(cpu_dev, "speedbin: %d\n", *speedbin);
+	drv->versions = 1 << *speedbin;
+	kfree(speedbin);
+	return 0;
+}
+
 static void get_krait_bin_format_a(struct device *cpu_dev,
 					  int *speed, int *pvs, int *pvs_ver,
 					  u8 *buf)
@@ -252,6 +270,8 @@ static int qcom_cpufreq_ipq8074_name_version(struct device *cpu_dev,
 	return 0;
 }
 
+static const char *generic_genpd_names[] = { "perf", NULL };
+
 static const struct qcom_cpufreq_match_data match_data_kryo = {
 	.get_version = qcom_cpufreq_kryo_name_version,
 };
@@ -260,6 +280,11 @@ static const struct qcom_cpufreq_match_data match_data_krait = {
 	.get_version = qcom_cpufreq_krait_name_version,
 };
 
+static const struct qcom_cpufreq_match_data match_data_msm8909 = {
+	.get_version = qcom_cpufreq_simple_get_version,
+	.genpd_names = generic_genpd_names,
+};
+
 static const char *qcs404_genpd_names[] = { "cpr", NULL };
 
 static const struct qcom_cpufreq_match_data match_data_qcs404 = {
@@ -434,6 +459,7 @@ static struct platform_driver qcom_cpufreq_driver = {
 
 static const struct of_device_id qcom_cpufreq_match_list[] __initconst = {
 	{ .compatible = "qcom,apq8096", .data = &match_data_kryo },
+	{ .compatible = "qcom,msm8909", .data = &match_data_msm8909 },
 	{ .compatible = "qcom,msm8996", .data = &match_data_kryo },
 	{ .compatible = "qcom,qcs404", .data = &match_data_qcs404 },
 	{ .compatible = "qcom,ipq8064", .data = &match_data_krait },

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909
  2023-10-18  8:06 ` [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909 Stephan Gerhold
@ 2023-10-18  8:42   ` Konrad Dybcio
  2023-10-19 10:50   ` Ulf Hansson
  1 sibling, 0 replies; 28+ messages in thread
From: Konrad Dybcio @ 2023-10-18  8:42 UTC (permalink / raw)
  To: Stephan Gerhold, Viresh Kumar
  Cc: Andy Gross, Bjorn Andersson, Ilia Lin, Rafael J. Wysocki,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, linux-pm,
	linux-arm-msm, linux-kernel, devicetree, Ulf Hansson,
	Stephan Gerhold



On 10/18/23 10:06, Stephan Gerhold wrote:
> When the MSM8909 SoC is used together with the PM8909 PMIC the primary
> power supply for the CPU (VDD_APC) is shared with other components to
> the SoC, namely the VDD_CX power domain typically supplied by the PM8909
> S1 regulator. This means that all votes for necessary performance states
> go via the RPM firmware which collects the requirements from all the
> processors in the SoC. The RPM firmware then chooses the actual voltage
> based on the performance states ("corners"), depending on calibration
> values in the NVMEM and other factors.
> 
> The MSM8909 SoC is also sometimes used with the PM8916 or PM660 PMIC.
> In that case there is a dedicated regulator connected to VDD_APC and
> Linux is responsible to do adaptive voltage scaling using CPR (similar
> to the existing code for QCS404).
> 
> This difference can be described in the device tree, by either assigning
> the CPU a power domain from RPMPD or from the CPR driver.
> 
> Describe this using "perf" as generic power domain name, which is also
> used already for SCMI based platforms.
> 
> Also add a simple function that reads the speedbin from a NVMEM cell
> and sets it as-is for opp-supported-hw. The actual bit position can be
> described in the device tree without additional driver changes.
> 
> Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> ---
Acked-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Konrad

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation
  2023-10-18  8:06 ` [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation Stephan Gerhold
@ 2023-10-18  8:45   ` Konrad Dybcio
  0 siblings, 0 replies; 28+ messages in thread
From: Konrad Dybcio @ 2023-10-18  8:45 UTC (permalink / raw)
  To: Stephan Gerhold, Viresh Kumar
  Cc: Andy Gross, Bjorn Andersson, Ilia Lin, Rafael J. Wysocki,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, linux-pm,
	linux-arm-msm, linux-kernel, devicetree, Ulf Hansson,
	Stephan Gerhold



On 10/18/23 10:06, Stephan Gerhold wrote:
> Simplify the allocation and cleanup of driver data by using devm
> together with a flexible array. Prepare for adding additional per-CPU
> data by defining a struct qcom_cpufreq_drv_cpu instead of storing the
> opp_tokens directly.
> 
> Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> ---
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Konrad

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
  2023-10-18  8:06 [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Stephan Gerhold
                   ` (2 preceding siblings ...)
  2023-10-18  8:06 ` [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909 Stephan Gerhold
@ 2023-10-19  6:16 ` Viresh Kumar
  2023-10-19 10:19   ` Ulf Hansson
  2023-10-19 10:23   ` Viresh Kumar
  3 siblings, 2 replies; 28+ messages in thread
From: Viresh Kumar @ 2023-10-19  6:16 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Andy Gross, Bjorn Andersson, Konrad Dybcio, Ilia Lin,
	Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, Stephan Gerhold, stable

On 18-10-23, 10:06, Stephan Gerhold wrote:
> Add the necessary definitions to the qcom-cpufreq-nvmem driver to
> support basic cpufreq scaling on the Qualcomm MSM8909 SoC. In practice
> the necessary power domains vary depending on the actual PMIC the SoC
> was combined with. With PM8909 the VDD_APC power domain is shared with
> VDD_CX so the RPM firmware handles all voltage adjustments, while with
> PM8916 and PM660 Linux is responsible to do adaptive voltage scaling
> of a dedicated CPU regulator using CPR.
> 
> Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>

Applied patch 1 and 3. Thanks.

-- 
viresh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
  2023-10-19  6:16 ` [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Viresh Kumar
@ 2023-10-19 10:19   ` Ulf Hansson
  2023-10-19 10:21     ` Viresh Kumar
  2023-10-19 10:23   ` Viresh Kumar
  1 sibling, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-19 10:19 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Stephan Gerhold, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Thu, 19 Oct 2023 at 08:16, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 18-10-23, 10:06, Stephan Gerhold wrote:
> > Add the necessary definitions to the qcom-cpufreq-nvmem driver to
> > support basic cpufreq scaling on the Qualcomm MSM8909 SoC. In practice
> > the necessary power domains vary depending on the actual PMIC the SoC
> > was combined with. With PM8909 the VDD_APC power domain is shared with
> > VDD_CX so the RPM firmware handles all voltage adjustments, while with
> > PM8916 and PM660 Linux is responsible to do adaptive voltage scaling
> > of a dedicated CPU regulator using CPR.
> >
> > Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
>
> Applied patch 1 and 3. Thanks.

I did spend quite some time reviewing the previous version. Please
allow me to have a look at this too before applying.

Kind regards
Uffe

>
> --
> viresh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
  2023-10-19 10:19   ` Ulf Hansson
@ 2023-10-19 10:21     ` Viresh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2023-10-19 10:21 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephan Gerhold, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On 19-10-23, 12:19, Ulf Hansson wrote:
> On Thu, 19 Oct 2023 at 08:16, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 18-10-23, 10:06, Stephan Gerhold wrote:
> > > Add the necessary definitions to the qcom-cpufreq-nvmem driver to
> > > support basic cpufreq scaling on the Qualcomm MSM8909 SoC. In practice
> > > the necessary power domains vary depending on the actual PMIC the SoC
> > > was combined with. With PM8909 the VDD_APC power domain is shared with
> > > VDD_CX so the RPM firmware handles all voltage adjustments, while with
> > > PM8916 and PM660 Linux is responsible to do adaptive voltage scaling
> > > of a dedicated CPU regulator using CPR.
> > >
> > > Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> >
> > Applied patch 1 and 3. Thanks.
> 
> I did spend quite some time reviewing the previous version. Please
> allow me to have a look at this too before applying.

My branch isn't immutable, I keep on changing it. Please provide your
feedback :)

-- 
viresh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
  2023-10-19  6:16 ` [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Viresh Kumar
  2023-10-19 10:19   ` Ulf Hansson
@ 2023-10-19 10:23   ` Viresh Kumar
  2023-10-19 13:48     ` Stephan Gerhold
  1 sibling, 1 reply; 28+ messages in thread
From: Viresh Kumar @ 2023-10-19 10:23 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Andy Gross, Bjorn Andersson, Konrad Dybcio, Ilia Lin,
	Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, Stephan Gerhold, stable

On 19-10-23, 11:46, Viresh Kumar wrote:
> On 18-10-23, 10:06, Stephan Gerhold wrote:
> > Add the necessary definitions to the qcom-cpufreq-nvmem driver to
> > support basic cpufreq scaling on the Qualcomm MSM8909 SoC. In practice
> > the necessary power domains vary depending on the actual PMIC the SoC
> > was combined with. With PM8909 the VDD_APC power domain is shared with
> > VDD_CX so the RPM firmware handles all voltage adjustments, while with
> > PM8916 and PM660 Linux is responsible to do adaptive voltage scaling
> > of a dedicated CPU regulator using CPR.
> > 
> > Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> 
> Applied patch 1 and 3. Thanks.

Hi Stephan,

I think your platform has exactly what I am looking for. Can you
please help me test this, before it lands into linux-next :)

https://lore.kernel.org/cover.1697710527.git.viresh.kumar@linaro.org

TIA.

-- 
viresh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-18  8:06 ` [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices Stephan Gerhold
@ 2023-10-19 10:24   ` Ulf Hansson
  2023-10-19 11:26     ` Ulf Hansson
  0 siblings, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-19 10:24 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
<stephan.gerhold@kernkonzept.com> wrote:
>
> The genpd core caches performance state votes from devices that are
> runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> runtime PM performance state handling"). They get applied once the
> device becomes active again.
>
> To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> devices that use runtime PM only to control the enable and performance
> state for the attached power domain.
>
> However, at the moment nothing ever resumes the virtual devices created
> for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> means that performance state votes made during cpufreq scaling get
> always cached and never applied to the hardware.
>
> Fix this by enabling the devices after attaching them and use
> dev_pm_syscore_device() to ensure the power domains also stay on when
> going to suspend. Since it supplies the CPU we can never turn it off
> from Linux. There are other mechanisms to turn it off when needed,
> usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).

I believe we discussed using dev_pm_syscore_device() for the previous
version. It's not intended to be used for things like the above.

Moreover, I was under the impression that it wasn't really needed. In
fact, I would think that this actually breaks things for system
suspend/resume, as in this case the cpr driver's genpd
->power_on|off() callbacks are no longer getting called due this,
which means that the cpr state machine isn't going to be restored
properly. Or did I get this wrong?

Kind regards
Uffe

>
> Without this fix performance states votes are silently ignored, and the
> CPU/CPR voltage is never adjusted. This has been broken since 5.14 but
> for some reason no one noticed this on QCS404 so far.
>
> Cc: stable@vger.kernel.org
> Fixes: 1cb8339ca225 ("cpufreq: qcom: Add support for qcs404 on nvmem driver")
> Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> ---
>  drivers/cpufreq/qcom-cpufreq-nvmem.c | 49 +++++++++++++++++++++++++++++++++---
>  1 file changed, 46 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
> index 82a244f3fa52..3794390089b0 100644
> --- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
> +++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
> @@ -25,6 +25,7 @@
>  #include <linux/platform_device.h>
>  #include <linux/pm_domain.h>
>  #include <linux/pm_opp.h>
> +#include <linux/pm_runtime.h>
>  #include <linux/slab.h>
>  #include <linux/soc/qcom/smem.h>
>
> @@ -47,6 +48,7 @@ struct qcom_cpufreq_match_data {
>
>  struct qcom_cpufreq_drv_cpu {
>         int opp_token;
> +       struct device **virt_devs;
>  };
>
>  struct qcom_cpufreq_drv {
> @@ -268,6 +270,18 @@ static const struct qcom_cpufreq_match_data match_data_ipq8074 = {
>         .get_version = qcom_cpufreq_ipq8074_name_version,
>  };
>
> +static void qcom_cpufreq_put_virt_devs(struct qcom_cpufreq_drv *drv, unsigned cpu)
> +{
> +       const char * const *name = drv->data->genpd_names;
> +       int i;
> +
> +       if (!drv->cpus[cpu].virt_devs)
> +               return;
> +
> +       for (i = 0; *name; i++, name++)
> +               pm_runtime_put(drv->cpus[cpu].virt_devs[i]);
> +}
> +
>  static int qcom_cpufreq_probe(struct platform_device *pdev)
>  {
>         struct qcom_cpufreq_drv *drv;
> @@ -321,6 +335,7 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
>         of_node_put(np);
>
>         for_each_possible_cpu(cpu) {
> +               struct device **virt_devs = NULL;
>                 struct dev_pm_opp_config config = {
>                         .supported_hw = NULL,
>                 };
> @@ -341,7 +356,7 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
>
>                 if (drv->data->genpd_names) {
>                         config.genpd_names = drv->data->genpd_names;
> -                       config.virt_devs = NULL;
> +                       config.virt_devs = &virt_devs;
>                 }
>
>                 if (config.supported_hw || config.genpd_names) {
> @@ -352,6 +367,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
>                                 goto free_opp;
>                         }
>                 }
> +
> +               if (virt_devs) {
> +                       const char * const *name = config.genpd_names;
> +                       int i, j;
> +
> +                       for (i = 0; *name; i++, name++) {
> +                               ret = pm_runtime_resume_and_get(virt_devs[i]);
> +                               if (ret) {
> +                                       dev_err(cpu_dev, "failed to resume %s: %d\n",
> +                                               *name, ret);
> +
> +                                       /* Rollback previous PM runtime calls */
> +                                       name = config.genpd_names;
> +                                       for (j = 0; *name && j < i; j++, name++)
> +                                               pm_runtime_put(virt_devs[j]);
> +
> +                                       goto free_opp;
> +                               }
> +
> +                               /* Keep CPU power domain always-on */
> +                               dev_pm_syscore_device(virt_devs[i], true);
> +                       }
> +                       drv->cpus[cpu].virt_devs = virt_devs;
> +               }
>         }
>
>         cpufreq_dt_pdev = platform_device_register_simple("cpufreq-dt", -1,
> @@ -365,8 +404,10 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
>         dev_err(cpu_dev, "Failed to register platform device\n");
>
>  free_opp:
> -       for_each_possible_cpu(cpu)
> +       for_each_possible_cpu(cpu) {
> +               qcom_cpufreq_put_virt_devs(drv, cpu);
>                 dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
> +       }
>         return ret;
>  }
>
> @@ -377,8 +418,10 @@ static void qcom_cpufreq_remove(struct platform_device *pdev)
>
>         platform_device_unregister(cpufreq_dt_pdev);
>
> -       for_each_possible_cpu(cpu)
> +       for_each_possible_cpu(cpu) {
> +               qcom_cpufreq_put_virt_devs(drv, cpu);
>                 dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
> +       }
>  }
>
>  static struct platform_driver qcom_cpufreq_driver = {
>
> --
> 2.39.2
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909
  2023-10-18  8:06 ` [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909 Stephan Gerhold
  2023-10-18  8:42   ` Konrad Dybcio
@ 2023-10-19 10:50   ` Ulf Hansson
  1 sibling, 0 replies; 28+ messages in thread
From: Ulf Hansson @ 2023-10-19 10:50 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold

On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
<stephan.gerhold@kernkonzept.com> wrote:
>
> When the MSM8909 SoC is used together with the PM8909 PMIC the primary
> power supply for the CPU (VDD_APC) is shared with other components to
> the SoC, namely the VDD_CX power domain typically supplied by the PM8909
> S1 regulator. This means that all votes for necessary performance states
> go via the RPM firmware which collects the requirements from all the
> processors in the SoC. The RPM firmware then chooses the actual voltage
> based on the performance states ("corners"), depending on calibration
> values in the NVMEM and other factors.
>
> The MSM8909 SoC is also sometimes used with the PM8916 or PM660 PMIC.
> In that case there is a dedicated regulator connected to VDD_APC and
> Linux is responsible to do adaptive voltage scaling using CPR (similar
> to the existing code for QCS404).
>
> This difference can be described in the device tree, by either assigning
> the CPU a power domain from RPMPD or from the CPR driver.
>
> Describe this using "perf" as generic power domain name, which is also
> used already for SCMI based platforms.
>
> Also add a simple function that reads the speedbin from a NVMEM cell
> and sets it as-is for opp-supported-hw. The actual bit position can be
> described in the device tree without additional driver changes.
>
> Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> ---
>  drivers/cpufreq/qcom-cpufreq-nvmem.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
> index 3794390089b0..e52031863350 100644
> --- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
> +++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
> @@ -59,6 +59,24 @@ struct qcom_cpufreq_drv {
>
>  static struct platform_device *cpufreq_dt_pdev, *cpufreq_pdev;
>
> +static int qcom_cpufreq_simple_get_version(struct device *cpu_dev,
> +                                          struct nvmem_cell *speedbin_nvmem,
> +                                          char **pvs_name,
> +                                          struct qcom_cpufreq_drv *drv)
> +{
> +       u8 *speedbin;
> +
> +       *pvs_name = NULL;
> +       speedbin = nvmem_cell_read(speedbin_nvmem, NULL);
> +       if (IS_ERR(speedbin))
> +               return PTR_ERR(speedbin);
> +
> +       dev_dbg(cpu_dev, "speedbin: %d\n", *speedbin);
> +       drv->versions = 1 << *speedbin;
> +       kfree(speedbin);
> +       return 0;
> +}
> +
>  static void get_krait_bin_format_a(struct device *cpu_dev,
>                                           int *speed, int *pvs, int *pvs_ver,
>                                           u8 *buf)
> @@ -252,6 +270,8 @@ static int qcom_cpufreq_ipq8074_name_version(struct device *cpu_dev,
>         return 0;
>  }
>
> +static const char *generic_genpd_names[] = { "perf", NULL };
> +

As discussed, using "perf" as a generic name for a performance domain
for CPUs makes perfect sense to me. However, we need to update the DT
doc bindings for this too. At least we should update
Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml as a
part of $subject series.

At a later step, we should have a look at updating the description for
the power-domain-names in the common
Documentation/devicetree/bindings/arm/cpus.yaml, I think.

>  static const struct qcom_cpufreq_match_data match_data_kryo = {
>         .get_version = qcom_cpufreq_kryo_name_version,
>  };
> @@ -260,6 +280,11 @@ static const struct qcom_cpufreq_match_data match_data_krait = {
>         .get_version = qcom_cpufreq_krait_name_version,
>  };
>
> +static const struct qcom_cpufreq_match_data match_data_msm8909 = {
> +       .get_version = qcom_cpufreq_simple_get_version,
> +       .genpd_names = generic_genpd_names,
> +};
> +
>  static const char *qcs404_genpd_names[] = { "cpr", NULL };
>
>  static const struct qcom_cpufreq_match_data match_data_qcs404 = {
> @@ -434,6 +459,7 @@ static struct platform_driver qcom_cpufreq_driver = {
>
>  static const struct of_device_id qcom_cpufreq_match_list[] __initconst = {
>         { .compatible = "qcom,apq8096", .data = &match_data_kryo },
> +       { .compatible = "qcom,msm8909", .data = &match_data_msm8909 },
>         { .compatible = "qcom,msm8996", .data = &match_data_kryo },
>         { .compatible = "qcom,qcs404", .data = &match_data_qcs404 },
>         { .compatible = "qcom,ipq8064", .data = &match_data_krait },
>
> --
> 2.39.2
>

Other than the above, feel free to add:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 10:24   ` Ulf Hansson
@ 2023-10-19 11:26     ` Ulf Hansson
  2023-10-19 13:05       ` Stephan Gerhold
  2023-10-24 12:03       ` Stephan Gerhold
  0 siblings, 2 replies; 28+ messages in thread
From: Ulf Hansson @ 2023-10-19 11:26 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> <stephan.gerhold@kernkonzept.com> wrote:
> >
> > The genpd core caches performance state votes from devices that are
> > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > runtime PM performance state handling"). They get applied once the
> > device becomes active again.
> >
> > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > devices that use runtime PM only to control the enable and performance
> > state for the attached power domain.
> >
> > However, at the moment nothing ever resumes the virtual devices created
> > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > means that performance state votes made during cpufreq scaling get
> > always cached and never applied to the hardware.
> >
> > Fix this by enabling the devices after attaching them and use
> > dev_pm_syscore_device() to ensure the power domains also stay on when
> > going to suspend. Since it supplies the CPU we can never turn it off
> > from Linux. There are other mechanisms to turn it off when needed,
> > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
>
> I believe we discussed using dev_pm_syscore_device() for the previous
> version. It's not intended to be used for things like the above.
>
> Moreover, I was under the impression that it wasn't really needed. In
> fact, I would think that this actually breaks things for system
> suspend/resume, as in this case the cpr driver's genpd
> ->power_on|off() callbacks are no longer getting called due this,
> which means that the cpr state machine isn't going to be restored
> properly. Or did I get this wrong?

BTW, if you really need something like the above, the proper way to do
it would instead be to call device_set_awake_path() for the device.

This informs genpd that the device needs to stay powered-on during
system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
for it), hence it will keep the corresponding PM domain powered-on
too.

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 11:26     ` Ulf Hansson
@ 2023-10-19 13:05       ` Stephan Gerhold
  2023-10-19 14:12         ` Ulf Hansson
  2023-10-24 12:03       ` Stephan Gerhold
  1 sibling, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-19 13:05 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > <stephan.gerhold@kernkonzept.com> wrote:
> > >
> > > The genpd core caches performance state votes from devices that are
> > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > runtime PM performance state handling"). They get applied once the
> > > device becomes active again.
> > >
> > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > devices that use runtime PM only to control the enable and performance
> > > state for the attached power domain.
> > >
> > > However, at the moment nothing ever resumes the virtual devices created
> > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > means that performance state votes made during cpufreq scaling get
> > > always cached and never applied to the hardware.
> > >
> > > Fix this by enabling the devices after attaching them and use
> > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > going to suspend. Since it supplies the CPU we can never turn it off
> > > from Linux. There are other mechanisms to turn it off when needed,
> > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> >
> > I believe we discussed using dev_pm_syscore_device() for the previous
> > version. It's not intended to be used for things like the above.
> >

Sorry, looks like we still had a misunderstanding in the conclusion of
the previous discussion. :')

> > Moreover, I was under the impression that it wasn't really needed. In
> > fact, I would think that this actually breaks things for system
> > suspend/resume, as in this case the cpr driver's genpd
> > ->power_on|off() callbacks are no longer getting called due this,
> > which means that the cpr state machine isn't going to be restored
> > properly. Or did I get this wrong?
> 

We strictly need the RPMPDs to be always-on, also across system suspend
[1]. The RPM firmware will drop the votes internally as soon as the
CPU(s) have entered deep cpuidle. We can't do this from Linux, because
we need the CPU to continue running until it was shut down cleanly.

For CPR, we strictly need the backing regulator to be always-on, also
across system suspend. Typically the hardware will turn off the
regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't
do this from Linux, because we need the CPU to continue running until it
was shut down cleanly.

My understanding was that we're going to pause the CPR state machine
using the system suspend/resume callbacks on the driver, instead of
using the genpd->power_on|off() callbacks [2]. I can submit a separate
patch for this.

I didn't prioritize this because QCS404 (as the only current user of
CPR) doesn't have proper deep cpuidle/power management set up yet. It's
not entirely clear to me if there is any advantage (or perhaps even
disadvantage) if we pause the CPR state machine while the shared L2
cache is still being actively powered by the CPR power rail during
system suspend. I suspect this is a configuration that was never
considered in the hardware design.

Given the strict requirement for the RPMPDs, I only see two options:

 1. Have an always-on consumer that prevents the power domains to be
    powered off during system suspend. This is what this patch tries to
    achieve.

Or:

 2. Come up with a way to register the RPMPDs used by the CPU with
    GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as
    straightfoward as "regulator-always-on" in the DT because the rpmpd
    DT node represents multiple genpds in a single DT node [3].

What do you think? Do you see some other solution perhaps? I hope we can
clear up the misunderstanding. :-)

[1]: https://lore.kernel.org/linux-arm-msm/ZQGqfMigCFZP_HLA@gerhold.net/
[2]: https://lore.kernel.org/linux-arm-msm/CAPDyKFoiup8KNv=1LFGKDdDLA1pHsdJUgTTWMdgxnikEmReXzg@mail.gmail.com/
[3]: https://lore.kernel.org/linux-arm-msm/ZSg-XtwMxg3_fWxc@gerhold.net/

> BTW, if you really need something like the above, the proper way to do
> it would instead be to call device_set_awake_path() for the device.
> 
> This informs genpd that the device needs to stay powered-on during
> system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> for it), hence it will keep the corresponding PM domain powered-on
> too.
> 

Thanks, I can try if this works as alternative to the
dev_pm_syscore_device()!

I will wait for your thoughts on the above before accidentally going
into the wrong direction again. :-)

Thanks!
Stephan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
  2023-10-19 10:23   ` Viresh Kumar
@ 2023-10-19 13:48     ` Stephan Gerhold
  2023-10-20  3:21       ` Viresh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-19 13:48 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Stephan Gerhold, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, stable

On Thu, Oct 19, 2023 at 03:53:42PM +0530, Viresh Kumar wrote:
> On 19-10-23, 11:46, Viresh Kumar wrote:
> > On 18-10-23, 10:06, Stephan Gerhold wrote:
> > > Add the necessary definitions to the qcom-cpufreq-nvmem driver to
> > > support basic cpufreq scaling on the Qualcomm MSM8909 SoC. In practice
> > > the necessary power domains vary depending on the actual PMIC the SoC
> > > was combined with. With PM8909 the VDD_APC power domain is shared with
> > > VDD_CX so the RPM firmware handles all voltage adjustments, while with
> > > PM8916 and PM660 Linux is responsible to do adaptive voltage scaling
> > > of a dedicated CPU regulator using CPR.
> > > 
> > > Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
> > 
> > Applied patch 1 and 3. Thanks.
> 
> Hi Stephan,
> 
> I think your platform has exactly what I am looking for. Can you
> please help me test this, before it lands into linux-next :)
> 
> https://lore.kernel.org/cover.1697710527.git.viresh.kumar@linaro.org
> 

Sure, I will try to test it until end of next week, with both single and
multiple power domains assigned to the CPU. Is there something
particular you would like me to look for? Or just that the scaling still
works correctly as before?

Stephan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 13:05       ` Stephan Gerhold
@ 2023-10-19 14:12         ` Ulf Hansson
  2023-10-19 14:48           ` Stephan Gerhold
  0 siblings, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-19 14:12 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@gerhold.net> wrote:
>
> On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > <stephan.gerhold@kernkonzept.com> wrote:
> > > >
> > > > The genpd core caches performance state votes from devices that are
> > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > runtime PM performance state handling"). They get applied once the
> > > > device becomes active again.
> > > >
> > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > devices that use runtime PM only to control the enable and performance
> > > > state for the attached power domain.
> > > >
> > > > However, at the moment nothing ever resumes the virtual devices created
> > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > means that performance state votes made during cpufreq scaling get
> > > > always cached and never applied to the hardware.
> > > >
> > > > Fix this by enabling the devices after attaching them and use
> > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > >
> > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > version. It's not intended to be used for things like the above.
> > >
>
> Sorry, looks like we still had a misunderstanding in the conclusion of
> the previous discussion. :')
>
> > > Moreover, I was under the impression that it wasn't really needed. In
> > > fact, I would think that this actually breaks things for system
> > > suspend/resume, as in this case the cpr driver's genpd
> > > ->power_on|off() callbacks are no longer getting called due this,
> > > which means that the cpr state machine isn't going to be restored
> > > properly. Or did I get this wrong?
> >
>
> We strictly need the RPMPDs to be always-on, also across system suspend
> [1]. The RPM firmware will drop the votes internally as soon as the
> CPU(s) have entered deep cpuidle. We can't do this from Linux, because
> we need the CPU to continue running until it was shut down cleanly.
>
> For CPR, we strictly need the backing regulator to be always-on, also
> across system suspend. Typically the hardware will turn off the
> regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't
> do this from Linux, because we need the CPU to continue running until it
> was shut down cleanly.
>
> My understanding was that we're going to pause the CPR state machine
> using the system suspend/resume callbacks on the driver, instead of
> using the genpd->power_on|off() callbacks [2]. I can submit a separate
> patch for this.

If we are going to do 1) as described below, this looks to me that
it's going to be needed.

How will otherwise the cpr state machine be saved/restored during
system suspend/resume? Note that, beyond 1), the genpd's
->power_on|off() callbacks are no longer going to be called during
system suspend/resume.

In a way this also means that the cpr genpd provider might as well
also have GENPD_FLAG_ALWAYS_ON set for it.

>
> I didn't prioritize this because QCS404 (as the only current user of
> CPR) doesn't have proper deep cpuidle/power management set up yet. It's
> not entirely clear to me if there is any advantage (or perhaps even
> disadvantage) if we pause the CPR state machine while the shared L2
> cache is still being actively powered by the CPR power rail during
> system suspend. I suspect this is a configuration that was never
> considered in the hardware design.

I see.

>
> Given the strict requirement for the RPMPDs, I only see two options:
>
>  1. Have an always-on consumer that prevents the power domains to be
>     powered off during system suspend. This is what this patch tries to
>     achieve.
>
> Or:
>
>  2. Come up with a way to register the RPMPDs used by the CPU with
>     GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as
>     straightfoward as "regulator-always-on" in the DT because the rpmpd
>     DT node represents multiple genpds in a single DT node [3].

Yes, it sounds like it may be easier to do 1).

>
> What do you think? Do you see some other solution perhaps? I hope we can
> clear up the misunderstanding. :-)

Yes, thanks!

>
> [1]: https://lore.kernel.org/linux-arm-msm/ZQGqfMigCFZP_HLA@gerhold.net/
> [2]: https://lore.kernel.org/linux-arm-msm/CAPDyKFoiup8KNv=1LFGKDdDLA1pHsdJUgTTWMdgxnikEmReXzg@mail.gmail.com/
> [3]: https://lore.kernel.org/linux-arm-msm/ZSg-XtwMxg3_fWxc@gerhold.net/
>
> > BTW, if you really need something like the above, the proper way to do
> > it would instead be to call device_set_awake_path() for the device.
> >
> > This informs genpd that the device needs to stay powered-on during
> > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > for it), hence it will keep the corresponding PM domain powered-on
> > too.
> >
>
> Thanks, I can try if this works as alternative to the
> dev_pm_syscore_device()!

Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.

>
> I will wait for your thoughts on the above before accidentally going
> into the wrong direction again. :-)

No worries, we are moving forward. :-)

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 14:12         ` Ulf Hansson
@ 2023-10-19 14:48           ` Stephan Gerhold
  2023-10-19 15:19             ` Ulf Hansson
  0 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-19 14:48 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Thu, Oct 19, 2023 at 04:12:56PM +0200, Ulf Hansson wrote:
> On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@gerhold.net> wrote:
> > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > >
> > > > > The genpd core caches performance state votes from devices that are
> > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > runtime PM performance state handling"). They get applied once the
> > > > > device becomes active again.
> > > > >
> > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > devices that use runtime PM only to control the enable and performance
> > > > > state for the attached power domain.
> > > > >
> > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > means that performance state votes made during cpufreq scaling get
> > > > > always cached and never applied to the hardware.
> > > > >
> > > > > Fix this by enabling the devices after attaching them and use
> > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > >
> > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > version. It's not intended to be used for things like the above.
> > > >
> >
> > Sorry, looks like we still had a misunderstanding in the conclusion of
> > the previous discussion. :')
> >
> > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > fact, I would think that this actually breaks things for system
> > > > suspend/resume, as in this case the cpr driver's genpd
> > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > which means that the cpr state machine isn't going to be restored
> > > > properly. Or did I get this wrong?
> > >
> >
> > We strictly need the RPMPDs to be always-on, also across system suspend
> > [1]. The RPM firmware will drop the votes internally as soon as the
> > CPU(s) have entered deep cpuidle. We can't do this from Linux, because
> > we need the CPU to continue running until it was shut down cleanly.
> >
> > For CPR, we strictly need the backing regulator to be always-on, also
> > across system suspend. Typically the hardware will turn off the
> > regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't
> > do this from Linux, because we need the CPU to continue running until it
> > was shut down cleanly.
> >
> > My understanding was that we're going to pause the CPR state machine
> > using the system suspend/resume callbacks on the driver, instead of
> > using the genpd->power_on|off() callbacks [2]. I can submit a separate
> > patch for this.
> 
> If we are going to do 1) as described below, this looks to me that
> it's going to be needed.
> 

Yep.

> How will otherwise the cpr state machine be saved/restored during
> system suspend/resume? Note that, beyond 1), the genpd's
> ->power_on|off() callbacks are no longer going to be called during
> system suspend/resume.
> 

(Side note: I think "save/restore" might be the wrong words for
 suspend/resume of CPR. Looking at the code most of the configuration
 appears to be preserved across suspend/resume. Nothing is saved, it
 literally just disables the state machine during suspend and re-enables
 it during resume.

 I'm not entirely sure what's the reason for doing this. Perhaps the
 main goal is just to prevent the CPR state machine from getting stuck
 or sending pointless IRQs that won't be handled while Linux is
 suspended.)

> In a way this also means that the cpr genpd provider might as well
> also have GENPD_FLAG_ALWAYS_ON set for it.

Conceptually I would consider CPR to be a generic power domain provider
that could supply any kind of device. I know at least of CPUs and GPUs.
We need "always-on" only for the CPU, but not necessarily for other
devices.

For a GPU, the Linux driver (running on the CPU) can stop the GPU, wait
for completion and then invoke the ->power_off() callback of CPR. In
that case it is also safe to disable the backing regulator from Linux.
(I briefly mentioned this already in the previous discussion I think.)

We could set GENPD_FLAG_ALWAYS_ON for the CPR compatibles where we know
that they are only used to supply CPUs, but if we're going to do (1)
anyway there might not be much of an advantage for the extra complexity.

> 
> >
> > I didn't prioritize this because QCS404 (as the only current user of
> > CPR) doesn't have proper deep cpuidle/power management set up yet. It's
> > not entirely clear to me if there is any advantage (or perhaps even
> > disadvantage) if we pause the CPR state machine while the shared L2
> > cache is still being actively powered by the CPR power rail during
> > system suspend. I suspect this is a configuration that was never
> > considered in the hardware design.
> 
> I see.
> 
> >
> > Given the strict requirement for the RPMPDs, I only see two options:
> >
> >  1. Have an always-on consumer that prevents the power domains to be
> >     powered off during system suspend. This is what this patch tries to
> >     achieve.
> >
> > Or:
> >
> >  2. Come up with a way to register the RPMPDs used by the CPU with
> >     GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as
> >     straightfoward as "regulator-always-on" in the DT because the rpmpd
> >     DT node represents multiple genpds in a single DT node [3].
> 
> Yes, it sounds like it may be easier to do 1).
> 
> >
> > What do you think? Do you see some other solution perhaps? I hope we can
> > clear up the misunderstanding. :-)
> 
> Yes, thanks!
> 
> >
> > [1]: https://lore.kernel.org/linux-arm-msm/ZQGqfMigCFZP_HLA@gerhold.net/
> > [2]: https://lore.kernel.org/linux-arm-msm/CAPDyKFoiup8KNv=1LFGKDdDLA1pHsdJUgTTWMdgxnikEmReXzg@mail.gmail.com/
> > [3]: https://lore.kernel.org/linux-arm-msm/ZSg-XtwMxg3_fWxc@gerhold.net/
> >
> > > BTW, if you really need something like the above, the proper way to do
> > > it would instead be to call device_set_awake_path() for the device.
> > >
> > > This informs genpd that the device needs to stay powered-on during
> > > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > > for it), hence it will keep the corresponding PM domain powered-on
> > > too.
> > >
> >
> > Thanks, I can try if this works as alternative to the
> > dev_pm_syscore_device()!
> 
> Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.
> 

Could you clarify the idea behind GENPD_FLAG_ACTIVE_WAKEUP? Would I set
it conditionally for all RPMPDs or just the ones consumed by the CPU?
How does the genpd *provider* know if one of its *consumer* devices
needs to have its power domain kept on for wakeup?

Thanks!
Stephan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 14:48           ` Stephan Gerhold
@ 2023-10-19 15:19             ` Ulf Hansson
  2023-10-19 17:07               ` Stephan Gerhold
  0 siblings, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-19 15:19 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Thu, 19 Oct 2023 at 16:49, Stephan Gerhold <stephan@gerhold.net> wrote:
>
> On Thu, Oct 19, 2023 at 04:12:56PM +0200, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@gerhold.net> wrote:
> > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > > >
> > > > > > The genpd core caches performance state votes from devices that are
> > > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > > runtime PM performance state handling"). They get applied once the
> > > > > > device becomes active again.
> > > > > >
> > > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > > devices that use runtime PM only to control the enable and performance
> > > > > > state for the attached power domain.
> > > > > >
> > > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > > means that performance state votes made during cpufreq scaling get
> > > > > > always cached and never applied to the hardware.
> > > > > >
> > > > > > Fix this by enabling the devices after attaching them and use
> > > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > > >
> > > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > > version. It's not intended to be used for things like the above.
> > > > >
> > >
> > > Sorry, looks like we still had a misunderstanding in the conclusion of
> > > the previous discussion. :')
> > >
> > > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > > fact, I would think that this actually breaks things for system
> > > > > suspend/resume, as in this case the cpr driver's genpd
> > > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > > which means that the cpr state machine isn't going to be restored
> > > > > properly. Or did I get this wrong?
> > > >
> > >
> > > We strictly need the RPMPDs to be always-on, also across system suspend
> > > [1]. The RPM firmware will drop the votes internally as soon as the
> > > CPU(s) have entered deep cpuidle. We can't do this from Linux, because
> > > we need the CPU to continue running until it was shut down cleanly.
> > >
> > > For CPR, we strictly need the backing regulator to be always-on, also
> > > across system suspend. Typically the hardware will turn off the
> > > regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't
> > > do this from Linux, because we need the CPU to continue running until it
> > > was shut down cleanly.
> > >
> > > My understanding was that we're going to pause the CPR state machine
> > > using the system suspend/resume callbacks on the driver, instead of
> > > using the genpd->power_on|off() callbacks [2]. I can submit a separate
> > > patch for this.
> >
> > If we are going to do 1) as described below, this looks to me that
> > it's going to be needed.
> >
>
> Yep.
>
> > How will otherwise the cpr state machine be saved/restored during
> > system suspend/resume? Note that, beyond 1), the genpd's
> > ->power_on|off() callbacks are no longer going to be called during
> > system suspend/resume.
> >
>
> (Side note: I think "save/restore" might be the wrong words for
>  suspend/resume of CPR. Looking at the code most of the configuration
>  appears to be preserved across suspend/resume. Nothing is saved, it
>  literally just disables the state machine during suspend and re-enables
>  it during resume.
>
>  I'm not entirely sure what's the reason for doing this. Perhaps the
>  main goal is just to prevent the CPR state machine from getting stuck
>  or sending pointless IRQs that won't be handled while Linux is
>  suspended.)

If only the latter, that is a very good reason too. Drivers should
take care of their devices to make sure they are not triggering
spurious irqs during system suspend.

>
> > In a way this also means that the cpr genpd provider might as well
> > also have GENPD_FLAG_ALWAYS_ON set for it.
>
> Conceptually I would consider CPR to be a generic power domain provider
> that could supply any kind of device. I know at least of CPUs and GPUs.
> We need "always-on" only for the CPU, but not necessarily for other
> devices.
>
> For a GPU, the Linux driver (running on the CPU) can stop the GPU, wait
> for completion and then invoke the ->power_off() callback of CPR. In
> that case it is also safe to disable the backing regulator from Linux.
> (I briefly mentioned this already in the previous discussion I think.)
>
> We could set GENPD_FLAG_ALWAYS_ON for the CPR compatibles where we know
> that they are only used to supply CPUs, but if we're going to do (1)
> anyway there might not be much of an advantage for the extra complexity.

Okay, fair enough. Let's just stick with 1) and skip using
GENPD_FLAG_ALWAYS_ON bit for the cpr genpd provider.

>
> >
> > >
> > > I didn't prioritize this because QCS404 (as the only current user of
> > > CPR) doesn't have proper deep cpuidle/power management set up yet. It's
> > > not entirely clear to me if there is any advantage (or perhaps even
> > > disadvantage) if we pause the CPR state machine while the shared L2
> > > cache is still being actively powered by the CPR power rail during
> > > system suspend. I suspect this is a configuration that was never
> > > considered in the hardware design.
> >
> > I see.
> >
> > >
> > > Given the strict requirement for the RPMPDs, I only see two options:
> > >
> > >  1. Have an always-on consumer that prevents the power domains to be
> > >     powered off during system suspend. This is what this patch tries to
> > >     achieve.
> > >
> > > Or:
> > >
> > >  2. Come up with a way to register the RPMPDs used by the CPU with
> > >     GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as
> > >     straightfoward as "regulator-always-on" in the DT because the rpmpd
> > >     DT node represents multiple genpds in a single DT node [3].
> >
> > Yes, it sounds like it may be easier to do 1).
> >
> > >
> > > What do you think? Do you see some other solution perhaps? I hope we can
> > > clear up the misunderstanding. :-)
> >
> > Yes, thanks!
> >
> > >
> > > [1]: https://lore.kernel.org/linux-arm-msm/ZQGqfMigCFZP_HLA@gerhold.net/
> > > [2]: https://lore.kernel.org/linux-arm-msm/CAPDyKFoiup8KNv=1LFGKDdDLA1pHsdJUgTTWMdgxnikEmReXzg@mail.gmail.com/
> > > [3]: https://lore.kernel.org/linux-arm-msm/ZSg-XtwMxg3_fWxc@gerhold.net/
> > >
> > > > BTW, if you really need something like the above, the proper way to do
> > > > it would instead be to call device_set_awake_path() for the device.
> > > >
> > > > This informs genpd that the device needs to stay powered-on during
> > > > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > > > for it), hence it will keep the corresponding PM domain powered-on
> > > > too.
> > > >
> > >
> > > Thanks, I can try if this works as alternative to the
> > > dev_pm_syscore_device()!
> >
> > Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.
> >
>
> Could you clarify the idea behind GENPD_FLAG_ACTIVE_WAKEUP? Would I set
> it conditionally for all RPMPDs or just the ones consumed by the CPU?
> How does the genpd *provider* know if one of its *consumer* devices
> needs to have its power domain kept on for wakeup?

We are thinking of the GENPD_FLAG_ACTIVE_WAKEUP as a platform
configuration type of flag for the genpd in question. The consumer
driver shouldn't need to know about the details of what is happening
on the PM domain level - only whether it needs its device to remain
powered-on during system suspend or not.

I suspect that the GENPD_FLAG_ACTIVE_WAKEUP is probably okay to set
for most genpds, but there may be some exceptions.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 15:19             ` Ulf Hansson
@ 2023-10-19 17:07               ` Stephan Gerhold
  2023-10-20 10:20                 ` Ulf Hansson
  0 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-19 17:07 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Thu, Oct 19, 2023 at 05:19:53PM +0200, Ulf Hansson wrote:
> On Thu, 19 Oct 2023 at 16:49, Stephan Gerhold <stephan@gerhold.net> wrote:
> > On Thu, Oct 19, 2023 at 04:12:56PM +0200, Ulf Hansson wrote:
> > > On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@gerhold.net> wrote:
> > > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > > BTW, if you really need something like the above, the proper way to do
> > > > > it would instead be to call device_set_awake_path() for the device.
> > > > >
> > > > > This informs genpd that the device needs to stay powered-on during
> > > > > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > > > > for it), hence it will keep the corresponding PM domain powered-on
> > > > > too.
> > > >
> > > > Thanks, I can try if this works as alternative to the
> > > > dev_pm_syscore_device()!
> > >
> > > Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.
> >
> > Could you clarify the idea behind GENPD_FLAG_ACTIVE_WAKEUP? Would I set
> > it conditionally for all RPMPDs or just the ones consumed by the CPU?
> > How does the genpd *provider* know if one of its *consumer* devices
> > needs to have its power domain kept on for wakeup?
> 
> We are thinking of the GENPD_FLAG_ACTIVE_WAKEUP as a platform
> configuration type of flag for the genpd in question. The consumer
> driver shouldn't need to know about the details of what is happening
> on the PM domain level - only whether it needs its device to remain
> powered-on during system suspend or not.
> 

Thanks! I will test if this works for RPMPD and post new versions of the
patches. By coincidence I think this flag might actually be useful as
temporary solution for CPR. If I:

 1. Change $subject patch to use device_set_awake_path() instead, and
 2. Set GENPD_FLAG_ACTIVE_WAKEUP for the RPMPD genpds, but
 3. Do *not* set GENPD_FLAG_ACTIVE_WAKEUP for the CPR genpd.

Then the genpd ->power_on|off() callbacks should still be called
for CPR during system suspend, right? :D

> I suspect that the GENPD_FLAG_ACTIVE_WAKEUP is probably okay to set
> for most genpds, but there may be some exceptions.
> 

Out of curiosity, do you have an example for such an exception where
GENPD_FLAG_ACTIVE_WAKEUP shouldn't be set, aside from workarounds like
I just described?

As you said, the consumer device should just say that it wants to stay
powered for wakeup during suspend. But if its power domains get powered
off, I would expect that to break. How could a genpd driver still
provide power without being powered on? Wouldn't that rather be a low
performance state?

Thanks,
Stephan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909
  2023-10-19 13:48     ` Stephan Gerhold
@ 2023-10-20  3:21       ` Viresh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2023-10-20  3:21 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Stephan Gerhold, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Ulf Hansson, stable

On 19-10-23, 15:48, Stephan Gerhold wrote:
> Sure, I will try to test it until end of next week, with both single and
> multiple power domains assigned to the CPU. Is there something
> particular you would like me to look for? Or just that the scaling still
> works correctly as before?

Yeah, simple test to ensure scaling works fine is all I need.
Shouldn't take a lot of time. Thanks.

-- 
viresh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 17:07               ` Stephan Gerhold
@ 2023-10-20 10:20                 ` Ulf Hansson
  0 siblings, 0 replies; 28+ messages in thread
From: Ulf Hansson @ 2023-10-20 10:20 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Thu, 19 Oct 2023 at 19:08, Stephan Gerhold <stephan@gerhold.net> wrote:
>
> On Thu, Oct 19, 2023 at 05:19:53PM +0200, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 16:49, Stephan Gerhold <stephan@gerhold.net> wrote:
> > > On Thu, Oct 19, 2023 at 04:12:56PM +0200, Ulf Hansson wrote:
> > > > On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold <stephan@gerhold.net> wrote:
> > > > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > > > BTW, if you really need something like the above, the proper way to do
> > > > > > it would instead be to call device_set_awake_path() for the device.
> > > > > >
> > > > > > This informs genpd that the device needs to stay powered-on during
> > > > > > system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set
> > > > > > for it), hence it will keep the corresponding PM domain powered-on
> > > > > > too.
> > > > >
> > > > > Thanks, I can try if this works as alternative to the
> > > > > dev_pm_syscore_device()!
> > > >
> > > > Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.
> > >
> > > Could you clarify the idea behind GENPD_FLAG_ACTIVE_WAKEUP? Would I set
> > > it conditionally for all RPMPDs or just the ones consumed by the CPU?
> > > How does the genpd *provider* know if one of its *consumer* devices
> > > needs to have its power domain kept on for wakeup?
> >
> > We are thinking of the GENPD_FLAG_ACTIVE_WAKEUP as a platform
> > configuration type of flag for the genpd in question. The consumer
> > driver shouldn't need to know about the details of what is happening
> > on the PM domain level - only whether it needs its device to remain
> > powered-on during system suspend or not.
> >
>
> Thanks! I will test if this works for RPMPD and post new versions of the
> patches. By coincidence I think this flag might actually be useful as
> temporary solution for CPR. If I:
>
>  1. Change $subject patch to use device_set_awake_path() instead, and
>  2. Set GENPD_FLAG_ACTIVE_WAKEUP for the RPMPD genpds, but
>  3. Do *not* set GENPD_FLAG_ACTIVE_WAKEUP for the CPR genpd.
>
> Then the genpd ->power_on|off() callbacks should still be called
> for CPR during system suspend, right? :D

Yes, correct, that should work fine!

>
> > I suspect that the GENPD_FLAG_ACTIVE_WAKEUP is probably okay to set
> > for most genpds, but there may be some exceptions.
> >
>
> Out of curiosity, do you have an example for such an exception where
> GENPD_FLAG_ACTIVE_WAKEUP shouldn't be set, aside from workarounds like
> I just described?
>
> As you said, the consumer device should just say that it wants to stay
> powered for wakeup during suspend. But if its power domains get powered
> off, I would expect that to break. How could a genpd driver still
> provide power without being powered on? Wouldn't that rather be a low
> performance state?

I think this boils down to how the power-rail that the genpd manages,
is handled by the platform during system suspend.

In principle there could be some other separate logic that helps a
FW/PMIC to understand whether it needs to keep the power-rail on or
not - no matter whether the genpd releases its vote for it during
system suspend.

This becomes mostly hypothetical, but clearly there are a lot of
genpd/platforms that don't use GENPD_FLAG_ACTIVE_WAKEUP too. If those
are just mistakes or just not needed, I don't actually know.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-19 11:26     ` Ulf Hansson
  2023-10-19 13:05       ` Stephan Gerhold
@ 2023-10-24 12:03       ` Stephan Gerhold
  2023-10-24 12:49         ` Ulf Hansson
  1 sibling, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-24 12:03 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > <stephan.gerhold@kernkonzept.com> wrote:
> > >
> > > The genpd core caches performance state votes from devices that are
> > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > runtime PM performance state handling"). They get applied once the
> > > device becomes active again.
> > >
> > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > devices that use runtime PM only to control the enable and performance
> > > state for the attached power domain.
> > >
> > > However, at the moment nothing ever resumes the virtual devices created
> > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > means that performance state votes made during cpufreq scaling get
> > > always cached and never applied to the hardware.
> > >
> > > Fix this by enabling the devices after attaching them and use
> > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > going to suspend. Since it supplies the CPU we can never turn it off
> > > from Linux. There are other mechanisms to turn it off when needed,
> > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> >
> > I believe we discussed using dev_pm_syscore_device() for the previous
> > version. It's not intended to be used for things like the above.
> >
> > Moreover, I was under the impression that it wasn't really needed. In
> > fact, I would think that this actually breaks things for system
> > suspend/resume, as in this case the cpr driver's genpd
> > ->power_on|off() callbacks are no longer getting called due this,
> > which means that the cpr state machine isn't going to be restored
> > properly. Or did I get this wrong?
> 
> BTW, if you really need something like the above, the proper way to do
> it would instead be to call device_set_awake_path() for the device.
> 

Unfortunately this does not work correctly. When I use
device_set_awake_path() it does set dev->power.wakeup_path = true.
However, this flag is cleared again in device_prepare() when entering
suspend. To me it looks a bit like wakeup_path is not supposed to be set
directly by drivers? Before and after your commit 8512220c5782 ("PM /
core: Assign the wakeup_path status flag in __device_prepare()") it
seems to be internally bound to device_may_wakeup().

It works if I make device_may_wakeup() return true, with

	device_set_wakeup_capable(dev, true);
	device_wakeup_enable(dev);

but that also allows *disabling* the wakeup from sysfs which doesn't
really make sense for the CPU.

Any ideas?

Thanks!
--
Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Kernkonzept GmbH at Dresden, Germany, HRB 31129, CEO Dr.-Ing. Michael Hohmuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-24 12:03       ` Stephan Gerhold
@ 2023-10-24 12:49         ` Ulf Hansson
  2023-10-24 13:07           ` Stephan Gerhold
  0 siblings, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-24 12:49 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Tue, 24 Oct 2023 at 14:03, Stephan Gerhold
<stephan.gerhold@kernkonzept.com> wrote:
>
> On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > >
> > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > <stephan.gerhold@kernkonzept.com> wrote:
> > > >
> > > > The genpd core caches performance state votes from devices that are
> > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > runtime PM performance state handling"). They get applied once the
> > > > device becomes active again.
> > > >
> > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > devices that use runtime PM only to control the enable and performance
> > > > state for the attached power domain.
> > > >
> > > > However, at the moment nothing ever resumes the virtual devices created
> > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > means that performance state votes made during cpufreq scaling get
> > > > always cached and never applied to the hardware.
> > > >
> > > > Fix this by enabling the devices after attaching them and use
> > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > >
> > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > version. It's not intended to be used for things like the above.
> > >
> > > Moreover, I was under the impression that it wasn't really needed. In
> > > fact, I would think that this actually breaks things for system
> > > suspend/resume, as in this case the cpr driver's genpd
> > > ->power_on|off() callbacks are no longer getting called due this,
> > > which means that the cpr state machine isn't going to be restored
> > > properly. Or did I get this wrong?
> >
> > BTW, if you really need something like the above, the proper way to do
> > it would instead be to call device_set_awake_path() for the device.
> >
>
> Unfortunately this does not work correctly. When I use
> device_set_awake_path() it does set dev->power.wakeup_path = true.
> However, this flag is cleared again in device_prepare() when entering
> suspend. To me it looks a bit like wakeup_path is not supposed to be set
> directly by drivers? Before and after your commit 8512220c5782 ("PM /
> core: Assign the wakeup_path status flag in __device_prepare()") it
> seems to be internally bound to device_may_wakeup().
>
> It works if I make device_may_wakeup() return true, with
>
>         device_set_wakeup_capable(dev, true);
>         device_wakeup_enable(dev);
>
> but that also allows *disabling* the wakeup from sysfs which doesn't
> really make sense for the CPU.
>
> Any ideas?

The device_set_awake_path() should be called from a system suspend
callback. So you need to add that callback for the cpufreq driver.

Sorry, if that wasn't clear.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-24 12:49         ` Ulf Hansson
@ 2023-10-24 13:07           ` Stephan Gerhold
  2023-10-24 16:11             ` Ulf Hansson
  0 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-24 13:07 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Tue, Oct 24, 2023 at 02:49:32PM +0200, Ulf Hansson wrote:
> On Tue, 24 Oct 2023 at 14:03, Stephan Gerhold
> <stephan.gerhold@kernkonzept.com> wrote:
> >
> > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > >
> > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > >
> > > > > The genpd core caches performance state votes from devices that are
> > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > runtime PM performance state handling"). They get applied once the
> > > > > device becomes active again.
> > > > >
> > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > devices that use runtime PM only to control the enable and performance
> > > > > state for the attached power domain.
> > > > >
> > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > means that performance state votes made during cpufreq scaling get
> > > > > always cached and never applied to the hardware.
> > > > >
> > > > > Fix this by enabling the devices after attaching them and use
> > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > >
> > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > version. It's not intended to be used for things like the above.
> > > >
> > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > fact, I would think that this actually breaks things for system
> > > > suspend/resume, as in this case the cpr driver's genpd
> > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > which means that the cpr state machine isn't going to be restored
> > > > properly. Or did I get this wrong?
> > >
> > > BTW, if you really need something like the above, the proper way to do
> > > it would instead be to call device_set_awake_path() for the device.
> > >
> >
> > Unfortunately this does not work correctly. When I use
> > device_set_awake_path() it does set dev->power.wakeup_path = true.
> > However, this flag is cleared again in device_prepare() when entering
> > suspend. To me it looks a bit like wakeup_path is not supposed to be set
> > directly by drivers? Before and after your commit 8512220c5782 ("PM /
> > core: Assign the wakeup_path status flag in __device_prepare()") it
> > seems to be internally bound to device_may_wakeup().
> >
> > It works if I make device_may_wakeup() return true, with
> >
> >         device_set_wakeup_capable(dev, true);
> >         device_wakeup_enable(dev);
> >
> > but that also allows *disabling* the wakeup from sysfs which doesn't
> > really make sense for the CPU.
> >
> > Any ideas?
> 
> The device_set_awake_path() should be called from a system suspend
> callback. So you need to add that callback for the cpufreq driver.
> 
> Sorry, if that wasn't clear.
> 

Hmm, but at the moment I'm calling this on the virtual genpd devices.
How would it work for them? I don't have a suspend callback for them.

I guess could loop over the virtual devices in the cpufreq driver
suspend callback, but is my driver suspend callback really guaranteed to
run before the device_prepare() that clears "wakeup_path" on the virtual
devices?

Or is this the point where we need device links to make that work?
A quick look suggests "wakeup_path" is just propagated to parents but
not device links, so I don't think that would help, either.

Thanks,
-- 
Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Kernkonzept GmbH at Dresden, Germany, HRB 31129, CEO Dr.-Ing. Michael Hohmuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-24 13:07           ` Stephan Gerhold
@ 2023-10-24 16:11             ` Ulf Hansson
  2023-10-24 16:25               ` Stephan Gerhold
  0 siblings, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-24 16:11 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Viresh Kumar, Andy Gross, Bjorn Andersson, Konrad Dybcio,
	Ilia Lin, Rafael J. Wysocki, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, linux-pm, linux-arm-msm, linux-kernel, devicetree,
	Stephan Gerhold, stable

On Tue, 24 Oct 2023 at 15:07, Stephan Gerhold
<stephan.gerhold@kernkonzept.com> wrote:
>
> On Tue, Oct 24, 2023 at 02:49:32PM +0200, Ulf Hansson wrote:
> > On Tue, 24 Oct 2023 at 14:03, Stephan Gerhold
> > <stephan.gerhold@kernkonzept.com> wrote:
> > >
> > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > >
> > > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > > >
> > > > > > The genpd core caches performance state votes from devices that are
> > > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > > runtime PM performance state handling"). They get applied once the
> > > > > > device becomes active again.
> > > > > >
> > > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > > devices that use runtime PM only to control the enable and performance
> > > > > > state for the attached power domain.
> > > > > >
> > > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > > means that performance state votes made during cpufreq scaling get
> > > > > > always cached and never applied to the hardware.
> > > > > >
> > > > > > Fix this by enabling the devices after attaching them and use
> > > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > > >
> > > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > > version. It's not intended to be used for things like the above.
> > > > >
> > > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > > fact, I would think that this actually breaks things for system
> > > > > suspend/resume, as in this case the cpr driver's genpd
> > > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > > which means that the cpr state machine isn't going to be restored
> > > > > properly. Or did I get this wrong?
> > > >
> > > > BTW, if you really need something like the above, the proper way to do
> > > > it would instead be to call device_set_awake_path() for the device.
> > > >
> > >
> > > Unfortunately this does not work correctly. When I use
> > > device_set_awake_path() it does set dev->power.wakeup_path = true.
> > > However, this flag is cleared again in device_prepare() when entering
> > > suspend. To me it looks a bit like wakeup_path is not supposed to be set
> > > directly by drivers? Before and after your commit 8512220c5782 ("PM /
> > > core: Assign the wakeup_path status flag in __device_prepare()") it
> > > seems to be internally bound to device_may_wakeup().
> > >
> > > It works if I make device_may_wakeup() return true, with
> > >
> > >         device_set_wakeup_capable(dev, true);
> > >         device_wakeup_enable(dev);
> > >
> > > but that also allows *disabling* the wakeup from sysfs which doesn't
> > > really make sense for the CPU.
> > >
> > > Any ideas?
> >
> > The device_set_awake_path() should be called from a system suspend
> > callback. So you need to add that callback for the cpufreq driver.
> >
> > Sorry, if that wasn't clear.
> >
>
> Hmm, but at the moment I'm calling this on the virtual genpd devices.
> How would it work for them? I don't have a suspend callback for them.
>
> I guess could loop over the virtual devices in the cpufreq driver
> suspend callback, but is my driver suspend callback really guaranteed to
> run before the device_prepare() that clears "wakeup_path" on the virtual
> devices?

Yes, that's guaranteed. dpm_prepare() (which calls device_prepare())
is always being executed before dpm_suspend().

>
> Or is this the point where we need device links to make that work?
> A quick look suggests "wakeup_path" is just propagated to parents but
> not device links, so I don't think that would help, either.

I don't think we need device-links for this, at least the way things
are working currently.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-24 16:11             ` Ulf Hansson
@ 2023-10-24 16:25               ` Stephan Gerhold
  2023-10-25 10:05                 ` Ulf Hansson
  0 siblings, 1 reply; 28+ messages in thread
From: Stephan Gerhold @ 2023-10-24 16:25 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Tue, Oct 24, 2023 at 06:11:34PM +0200, Ulf Hansson wrote:
> On Tue, 24 Oct 2023 at 15:07, Stephan Gerhold
> <stephan.gerhold@kernkonzept.com> wrote:
> >
> > On Tue, Oct 24, 2023 at 02:49:32PM +0200, Ulf Hansson wrote:
> > > On Tue, 24 Oct 2023 at 14:03, Stephan Gerhold
> > > <stephan.gerhold@kernkonzept.com> wrote:
> > > >
> > > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > > >
> > > > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > > > >
> > > > > > > The genpd core caches performance state votes from devices that are
> > > > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > > > runtime PM performance state handling"). They get applied once the
> > > > > > > device becomes active again.
> > > > > > >
> > > > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > > > devices that use runtime PM only to control the enable and performance
> > > > > > > state for the attached power domain.
> > > > > > >
> > > > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > > > means that performance state votes made during cpufreq scaling get
> > > > > > > always cached and never applied to the hardware.
> > > > > > >
> > > > > > > Fix this by enabling the devices after attaching them and use
> > > > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > > > >
> > > > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > > > version. It's not intended to be used for things like the above.
> > > > > >
> > > > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > > > fact, I would think that this actually breaks things for system
> > > > > > suspend/resume, as in this case the cpr driver's genpd
> > > > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > > > which means that the cpr state machine isn't going to be restored
> > > > > > properly. Or did I get this wrong?
> > > > >
> > > > > BTW, if you really need something like the above, the proper way to do
> > > > > it would instead be to call device_set_awake_path() for the device.
> > > > >
> > > >
> > > > Unfortunately this does not work correctly. When I use
> > > > device_set_awake_path() it does set dev->power.wakeup_path = true.
> > > > However, this flag is cleared again in device_prepare() when entering
> > > > suspend. To me it looks a bit like wakeup_path is not supposed to be set
> > > > directly by drivers? Before and after your commit 8512220c5782 ("PM /
> > > > core: Assign the wakeup_path status flag in __device_prepare()") it
> > > > seems to be internally bound to device_may_wakeup().
> > > >
> > > > It works if I make device_may_wakeup() return true, with
> > > >
> > > >         device_set_wakeup_capable(dev, true);
> > > >         device_wakeup_enable(dev);
> > > >
> > > > but that also allows *disabling* the wakeup from sysfs which doesn't
> > > > really make sense for the CPU.
> > > >
> > > > Any ideas?
> > >
> > > The device_set_awake_path() should be called from a system suspend
> > > callback. So you need to add that callback for the cpufreq driver.
> > >
> > > Sorry, if that wasn't clear.
> > >
> >
> > Hmm, but at the moment I'm calling this on the virtual genpd devices.
> > How would it work for them? I don't have a suspend callback for them.
> >
> > I guess could loop over the virtual devices in the cpufreq driver
> > suspend callback, but is my driver suspend callback really guaranteed to
> > run before the device_prepare() that clears "wakeup_path" on the virtual
> > devices?
> 
> Yes, that's guaranteed. dpm_prepare() (which calls device_prepare())
> is always being executed before dpm_suspend().
> 

Thanks, I think I understand. Maybe. :-)

Just to confirm, I should call device_set_awake_path() for the virtual
genpd devices as part of the PM ->suspend() callback? And this will be
guaranteed to run after the "prepare" phase but before the
"suspend_noirq" phase where the genpd core will check the wakeup flag?

Thanks,
Stepan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-24 16:25               ` Stephan Gerhold
@ 2023-10-25 10:05                 ` Ulf Hansson
  2023-11-01 14:56                   ` Stephan Gerhold
  0 siblings, 1 reply; 28+ messages in thread
From: Ulf Hansson @ 2023-10-25 10:05 UTC (permalink / raw)
  To: Stephan Gerhold
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Tue, 24 Oct 2023 at 18:25, Stephan Gerhold <stephan@gerhold.net> wrote:
>
> On Tue, Oct 24, 2023 at 06:11:34PM +0200, Ulf Hansson wrote:
> > On Tue, 24 Oct 2023 at 15:07, Stephan Gerhold
> > <stephan.gerhold@kernkonzept.com> wrote:
> > >
> > > On Tue, Oct 24, 2023 at 02:49:32PM +0200, Ulf Hansson wrote:
> > > > On Tue, 24 Oct 2023 at 14:03, Stephan Gerhold
> > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > >
> > > > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > > > >
> > > > > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > > > > >
> > > > > > > > The genpd core caches performance state votes from devices that are
> > > > > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > > > > runtime PM performance state handling"). They get applied once the
> > > > > > > > device becomes active again.
> > > > > > > >
> > > > > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > > > > devices that use runtime PM only to control the enable and performance
> > > > > > > > state for the attached power domain.
> > > > > > > >
> > > > > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > > > > means that performance state votes made during cpufreq scaling get
> > > > > > > > always cached and never applied to the hardware.
> > > > > > > >
> > > > > > > > Fix this by enabling the devices after attaching them and use
> > > > > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > > > > >
> > > > > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > > > > version. It's not intended to be used for things like the above.
> > > > > > >
> > > > > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > > > > fact, I would think that this actually breaks things for system
> > > > > > > suspend/resume, as in this case the cpr driver's genpd
> > > > > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > > > > which means that the cpr state machine isn't going to be restored
> > > > > > > properly. Or did I get this wrong?
> > > > > >
> > > > > > BTW, if you really need something like the above, the proper way to do
> > > > > > it would instead be to call device_set_awake_path() for the device.
> > > > > >
> > > > >
> > > > > Unfortunately this does not work correctly. When I use
> > > > > device_set_awake_path() it does set dev->power.wakeup_path = true.
> > > > > However, this flag is cleared again in device_prepare() when entering
> > > > > suspend. To me it looks a bit like wakeup_path is not supposed to be set
> > > > > directly by drivers? Before and after your commit 8512220c5782 ("PM /
> > > > > core: Assign the wakeup_path status flag in __device_prepare()") it
> > > > > seems to be internally bound to device_may_wakeup().
> > > > >
> > > > > It works if I make device_may_wakeup() return true, with
> > > > >
> > > > >         device_set_wakeup_capable(dev, true);
> > > > >         device_wakeup_enable(dev);
> > > > >
> > > > > but that also allows *disabling* the wakeup from sysfs which doesn't
> > > > > really make sense for the CPU.
> > > > >
> > > > > Any ideas?
> > > >
> > > > The device_set_awake_path() should be called from a system suspend
> > > > callback. So you need to add that callback for the cpufreq driver.
> > > >
> > > > Sorry, if that wasn't clear.
> > > >
> > >
> > > Hmm, but at the moment I'm calling this on the virtual genpd devices.
> > > How would it work for them? I don't have a suspend callback for them.
> > >
> > > I guess could loop over the virtual devices in the cpufreq driver
> > > suspend callback, but is my driver suspend callback really guaranteed to
> > > run before the device_prepare() that clears "wakeup_path" on the virtual
> > > devices?
> >
> > Yes, that's guaranteed. dpm_prepare() (which calls device_prepare())
> > is always being executed before dpm_suspend().
> >
>
> Thanks, I think I understand. Maybe. :-)
>
> Just to confirm, I should call device_set_awake_path() for the virtual
> genpd devices as part of the PM ->suspend() callback? And this will be
> guaranteed to run after the "prepare" phase but before the
> "suspend_noirq" phase where the genpd core will check the wakeup flag?

Correct!

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices
  2023-10-25 10:05                 ` Ulf Hansson
@ 2023-11-01 14:56                   ` Stephan Gerhold
  0 siblings, 0 replies; 28+ messages in thread
From: Stephan Gerhold @ 2023-11-01 14:56 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephan Gerhold, Viresh Kumar, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Ilia Lin, Rafael J. Wysocki, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-pm, linux-arm-msm,
	linux-kernel, devicetree, stable

On Wed, Oct 25, 2023 at 12:05:49PM +0200, Ulf Hansson wrote:
> On Tue, 24 Oct 2023 at 18:25, Stephan Gerhold <stephan@gerhold.net> wrote:
> > On Tue, Oct 24, 2023 at 06:11:34PM +0200, Ulf Hansson wrote:
> > > On Tue, 24 Oct 2023 at 15:07, Stephan Gerhold
> > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > On Tue, Oct 24, 2023 at 02:49:32PM +0200, Ulf Hansson wrote:
> > > > > On Tue, 24 Oct 2023 at 14:03, Stephan Gerhold
> > > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > > > On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
> > > > > > > On Thu, 19 Oct 2023 at 12:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > > > > > On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold
> > > > > > > > <stephan.gerhold@kernkonzept.com> wrote:
> > > > > > > > >
> > > > > > > > > The genpd core caches performance state votes from devices that are
> > > > > > > > > runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve
> > > > > > > > > runtime PM performance state handling"). They get applied once the
> > > > > > > > > device becomes active again.
> > > > > > > > >
> > > > > > > > > To attach the power domains needed by qcom-cpufreq-nvmem the OPP core
> > > > > > > > > calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy
> > > > > > > > > devices that use runtime PM only to control the enable and performance
> > > > > > > > > state for the attached power domain.
> > > > > > > > >
> > > > > > > > > However, at the moment nothing ever resumes the virtual devices created
> > > > > > > > > for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This
> > > > > > > > > means that performance state votes made during cpufreq scaling get
> > > > > > > > > always cached and never applied to the hardware.
> > > > > > > > >
> > > > > > > > > Fix this by enabling the devices after attaching them and use
> > > > > > > > > dev_pm_syscore_device() to ensure the power domains also stay on when
> > > > > > > > > going to suspend. Since it supplies the CPU we can never turn it off
> > > > > > > > > from Linux. There are other mechanisms to turn it off when needed,
> > > > > > > > > usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
> > > > > > > >
> > > > > > > > I believe we discussed using dev_pm_syscore_device() for the previous
> > > > > > > > version. It's not intended to be used for things like the above.
> > > > > > > >
> > > > > > > > Moreover, I was under the impression that it wasn't really needed. In
> > > > > > > > fact, I would think that this actually breaks things for system
> > > > > > > > suspend/resume, as in this case the cpr driver's genpd
> > > > > > > > ->power_on|off() callbacks are no longer getting called due this,
> > > > > > > > which means that the cpr state machine isn't going to be restored
> > > > > > > > properly. Or did I get this wrong?
> > > > > > >
> > > > > > > BTW, if you really need something like the above, the proper way to do
> > > > > > > it would instead be to call device_set_awake_path() for the device.
> > > > > > >
> > > > > >
> > > > > > Unfortunately this does not work correctly. When I use
> > > > > > device_set_awake_path() it does set dev->power.wakeup_path = true.
> > > > > > However, this flag is cleared again in device_prepare() when entering
> > > > > > suspend. To me it looks a bit like wakeup_path is not supposed to be set
> > > > > > directly by drivers? Before and after your commit 8512220c5782 ("PM /
> > > > > > core: Assign the wakeup_path status flag in __device_prepare()") it
> > > > > > seems to be internally bound to device_may_wakeup().
> > > > > >
> > > > > > It works if I make device_may_wakeup() return true, with
> > > > > >
> > > > > >         device_set_wakeup_capable(dev, true);
> > > > > >         device_wakeup_enable(dev);
> > > > > >
> > > > > > but that also allows *disabling* the wakeup from sysfs which doesn't
> > > > > > really make sense for the CPU.
> > > > > >
> > > > > > Any ideas?
> > > > >
> > > > > The device_set_awake_path() should be called from a system suspend
> > > > > callback. So you need to add that callback for the cpufreq driver.
> > > > >
> > > > > Sorry, if that wasn't clear.
> > > > >
> > > >
> > > > Hmm, but at the moment I'm calling this on the virtual genpd devices.
> > > > How would it work for them? I don't have a suspend callback for them.
> > > >
> > > > I guess could loop over the virtual devices in the cpufreq driver
> > > > suspend callback, but is my driver suspend callback really guaranteed to
> > > > run before the device_prepare() that clears "wakeup_path" on the virtual
> > > > devices?
> > >
> > > Yes, that's guaranteed. dpm_prepare() (which calls device_prepare())
> > > is always being executed before dpm_suspend().
> > >
> >
> > Thanks, I think I understand. Maybe. :-)
> >
> > Just to confirm, I should call device_set_awake_path() for the virtual
> > genpd devices as part of the PM ->suspend() callback? And this will be
> > guaranteed to run after the "prepare" phase but before the
> > "suspend_noirq" phase where the genpd core will check the wakeup flag?
> 
> Correct!
> 

Thanks, this seems to works as intended! The diff I tested is below, in
case you already have some comments.

I'll put this into proper patches and will send a v3 after the merge
window.

Stephan

diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index 7e9202080c98..e0c82c958018 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -23,6 +23,7 @@
 #include <linux/nvmem-consumer.h>
 #include <linux/of.h>
 #include <linux/platform_device.h>
+#include <linux/pm.h>
 #include <linux/pm_domain.h>
 #include <linux/pm_opp.h>
 #include <linux/pm_runtime.h>
@@ -426,6 +427,18 @@ static const struct qcom_cpufreq_match_data match_data_ipq8074 = {
 	.get_version = qcom_cpufreq_ipq8074_name_version,
 };
 
+static void qcom_cpufreq_suspend_virt_devs(struct qcom_cpufreq_drv *drv, unsigned cpu)
+{
+	const char * const *name = drv->data->genpd_names;
+	int i;
+
+	if (!drv->cpus[cpu].virt_devs)
+		return;
+
+	for (i = 0; *name; i++, name++)
+		device_set_awake_path(drv->cpus[cpu].virt_devs[i]);
+}
+
 static void qcom_cpufreq_put_virt_devs(struct qcom_cpufreq_drv *drv, unsigned cpu)
 {
 	const char * const *name = drv->data->genpd_names;
@@ -542,9 +555,6 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
 
 					goto free_opp;
 				}
-
-				/* Keep CPU power domain always-on */
-				dev_pm_syscore_device(virt_devs[i], true);
 			}
 			drv->cpus[cpu].virt_devs = virt_devs;
 		}
@@ -581,11 +591,25 @@ static void qcom_cpufreq_remove(struct platform_device *pdev)
 	}
 }
 
+static int qcom_cpufreq_suspend(struct device *dev)
+{
+	struct qcom_cpufreq_drv *drv = dev_get_drvdata(dev);
+	unsigned int cpu;
+
+	for_each_possible_cpu(cpu)
+		qcom_cpufreq_suspend_virt_devs(drv, cpu);
+
+	return 0;
+}
+
+static DEFINE_SIMPLE_DEV_PM_OPS(qcom_cpufreq_pm_ops, qcom_cpufreq_suspend, NULL);
+
 static struct platform_driver qcom_cpufreq_driver = {
 	.probe = qcom_cpufreq_probe,
 	.remove_new = qcom_cpufreq_remove,
 	.driver = {
 		.name = "qcom-cpufreq-nvmem",
+		.pm = pm_sleep_ptr(&qcom_cpufreq_pm_ops),
 	},
 };
 
diff --git a/drivers/pmdomain/qcom/rpmpd.c b/drivers/pmdomain/qcom/rpmpd.c
index abb62e4a2bdf..0f91e00b5909 100644
--- a/drivers/pmdomain/qcom/rpmpd.c
+++ b/drivers/pmdomain/qcom/rpmpd.c
@@ -1050,6 +1050,7 @@ static int rpmpd_probe(struct platform_device *pdev)
 		rpmpds[i]->pd.power_off = rpmpd_power_off;
 		rpmpds[i]->pd.power_on = rpmpd_power_on;
 		rpmpds[i]->pd.set_performance_state = rpmpd_set_performance;
+		rpmpds[i]->pd.flags = GENPD_FLAG_ACTIVE_WAKEUP;
 		pm_genpd_init(&rpmpds[i]->pd, NULL, true);
 
 		data->domains[i] = &rpmpds[i]->pd;

-- 
Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Kernkonzept GmbH at Dresden, Germany, HRB 31129, CEO Dr.-Ing. Michael Hohmuth

^ permalink raw reply related	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2023-11-01 14:56 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-18  8:06 [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Stephan Gerhold
2023-10-18  8:06 ` [PATCH v2 1/3] cpufreq: qcom-nvmem: Simplify driver data allocation Stephan Gerhold
2023-10-18  8:45   ` Konrad Dybcio
2023-10-18  8:06 ` [PATCH v2 2/3] cpufreq: qcom-nvmem: Enable virtual power domain devices Stephan Gerhold
2023-10-19 10:24   ` Ulf Hansson
2023-10-19 11:26     ` Ulf Hansson
2023-10-19 13:05       ` Stephan Gerhold
2023-10-19 14:12         ` Ulf Hansson
2023-10-19 14:48           ` Stephan Gerhold
2023-10-19 15:19             ` Ulf Hansson
2023-10-19 17:07               ` Stephan Gerhold
2023-10-20 10:20                 ` Ulf Hansson
2023-10-24 12:03       ` Stephan Gerhold
2023-10-24 12:49         ` Ulf Hansson
2023-10-24 13:07           ` Stephan Gerhold
2023-10-24 16:11             ` Ulf Hansson
2023-10-24 16:25               ` Stephan Gerhold
2023-10-25 10:05                 ` Ulf Hansson
2023-11-01 14:56                   ` Stephan Gerhold
2023-10-18  8:06 ` [PATCH v2 3/3] cpufreq: qcom-nvmem: Add MSM8909 Stephan Gerhold
2023-10-18  8:42   ` Konrad Dybcio
2023-10-19 10:50   ` Ulf Hansson
2023-10-19  6:16 ` [PATCH v2 0/3] cpufreq: Add basic cpufreq scaling for Qualcomm MSM8909 Viresh Kumar
2023-10-19 10:19   ` Ulf Hansson
2023-10-19 10:21     ` Viresh Kumar
2023-10-19 10:23   ` Viresh Kumar
2023-10-19 13:48     ` Stephan Gerhold
2023-10-20  3:21       ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).