All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v12 0/4] PM / Domains: Support hierarchical CPU arrangement (PSCI/ARM)
@ 2019-02-27 19:58 ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Ulf Hansson, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, linux-arm-kernel,
	linux-arm-msm, linux-kernel

Changes in v12:
 - Drop the patches for restructuring tick_nohz_get_sleep_length(). Instead
replace them all with a new timer/cpuidle patch, according to suggestions by
Rafael.
 - The entire v12 series, including the PSCI/ARM changes are available in a
git branch [2].

Changes in v11:
 - This version contains only the infrastructure changes that is needed for
deployment. The PSCI/ARM changes have also been updated and tested, but I will
post them separately. Still, to provide completeness, I have published a branch
containing everything to a git tree [1], feel free to have a look and test.
 - The v10 series contained a patch, "timer: Export next wakeup time of a CPU",
which has been replaced by a couple of new patches, whom reworks the existing
tick_nohz_get_sleep_length() function, to provide the next timer expiration
instead of the duration.
 - More changelogs are available per patch.

Changes in v10:
 - Quite significant changes have been to the PSCI driver deployment. According
   to an agreement with Lorenzo, the hierarchical CPU layout for PSCI should be
   orthogonal to whether the PSCI FW supports OSI or not. This has been taken
   care of in this version.
 - Drop the generic attach/detach helpers of CPUs to genpd, instead make that
   related code internal to PSCI, for now.
 - Fix "BUG: sleeping for invalid context" for hotplug, as reported by Raju.
 - Addressed various comments from version 8 and 9.
 - Clarified changelogs and re-wrote the cover-letter to better explain the
   motivations behind these changes.

Background:

For ARM64/ARM based platforms CPUs are often arranged in a hierarchical manner.
>From a CPU idle state perspective, this means some states may be shared among a
group of CPUs (aka CPU cluster).

To deal with idle management of a group of CPUs, sometimes the kernel needs to
be involved to manage the last-man standing algorithm, simply because it can't
rely solely on power management FWs to deal with this. Depending on the
platform, of course.

There are a couple of typical scenarios for when the kernel needs to be in
control, dealing with synchronization of when the last CPU in a cluster is about
to enter a deep idle state.

1)
The kernel needs to carry out so called last-man activities before the
CPU cluster can enter a deep idle state. This may for example involve to
configure external logics for wakeups, as the GIC may no longer be functional
once a deep cluster idle state have been entered. Likewise, these operations
may need to be restored, when the first CPU wakes up.

2)
Other more generic I/O devices, such as an MMC controller for example, may be a
part of the same power domain as the CPU cluster, due to a shared power-rail.
For these scenarios, when the MMC controller is in use dealing with an MMC
request, a deeper idle state of the CPU cluster may needs to be temporarily
disabled. This is needed to retain the MMC controller in a functional state,
else it may loose its register-context in the middle of serving a request.

In this series, we are extending the generic PM domain (aka genpd) to be used
for also CPU devices. Hence the goal is to re-use much of its current code to
help us manage the last-man standing synchronization. Moreover, as we already
use genpd to model power domains for generic I/O devices, both 1) and 2) can be
address with its help.

Moreover, to address these problems for ARM64 DT based platforms, we are
deploying support for genpd and runtime PM to the PSCI FW driver - and finally
we make some updates to two ARM64 DTBs, as to deploy the new PSCI CPU topology
layout.

The series has been tested on a Qcom 410c dragonboard and on a Hisilicon Hikey
board. The first one uses PSCI OS-initiated mode, while the second uses the PSCI
Platform-Coordinated mode.

Kind regards
Ulf Hansson

[1]
git.linaro.org/people/ulf.hansson/linux-pm.git next_v11
[2]
git.linaro.org/people/ulf.hansson/linux-pm.git next_v12

Ulf Hansson (4):
  PM / Domains: Add a generic data pointer to the genpd_power_state
    struct
  PM / Domains: Add support for CPU devices to genpd
  cpuidle: Export the next timer/tick expiration for a CPU
  PM / Domains: Add genpd governor for CPUs

 drivers/base/power/domain.c          | 78 ++++++++++++++++++++++++++--
 drivers/base/power/domain_governor.c | 62 +++++++++++++++++++++-
 drivers/cpuidle/cpuidle.c            |  8 +++
 include/linux/cpuidle.h              |  1 +
 include/linux/pm_domain.h            | 20 ++++++-
 include/linux/tick.h                 |  7 ++-
 kernel/time/tick-sched.c             | 12 +++++
 7 files changed, 182 insertions(+), 6 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v12 0/4] PM / Domains: Support hierarchical CPU arrangement (PSCI/ARM)
@ 2019-02-27 19:58 ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Ulf Hansson, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, linux-arm-kernel,
	linux-arm-msm, linux-kernel

Changes in v12:
 - Drop the patches for restructuring tick_nohz_get_sleep_length(). Instead
replace them all with a new timer/cpuidle patch, according to suggestions by
Rafael.
 - The entire v12 series, including the PSCI/ARM changes are available in a
git branch [2].

Changes in v11:
 - This version contains only the infrastructure changes that is needed for
deployment. The PSCI/ARM changes have also been updated and tested, but I will
post them separately. Still, to provide completeness, I have published a branch
containing everything to a git tree [1], feel free to have a look and test.
 - The v10 series contained a patch, "timer: Export next wakeup time of a CPU",
which has been replaced by a couple of new patches, whom reworks the existing
tick_nohz_get_sleep_length() function, to provide the next timer expiration
instead of the duration.
 - More changelogs are available per patch.

Changes in v10:
 - Quite significant changes have been to the PSCI driver deployment. According
   to an agreement with Lorenzo, the hierarchical CPU layout for PSCI should be
   orthogonal to whether the PSCI FW supports OSI or not. This has been taken
   care of in this version.
 - Drop the generic attach/detach helpers of CPUs to genpd, instead make that
   related code internal to PSCI, for now.
 - Fix "BUG: sleeping for invalid context" for hotplug, as reported by Raju.
 - Addressed various comments from version 8 and 9.
 - Clarified changelogs and re-wrote the cover-letter to better explain the
   motivations behind these changes.

Background:

For ARM64/ARM based platforms CPUs are often arranged in a hierarchical manner.
From a CPU idle state perspective, this means some states may be shared among a
group of CPUs (aka CPU cluster).

To deal with idle management of a group of CPUs, sometimes the kernel needs to
be involved to manage the last-man standing algorithm, simply because it can't
rely solely on power management FWs to deal with this. Depending on the
platform, of course.

There are a couple of typical scenarios for when the kernel needs to be in
control, dealing with synchronization of when the last CPU in a cluster is about
to enter a deep idle state.

1)
The kernel needs to carry out so called last-man activities before the
CPU cluster can enter a deep idle state. This may for example involve to
configure external logics for wakeups, as the GIC may no longer be functional
once a deep cluster idle state have been entered. Likewise, these operations
may need to be restored, when the first CPU wakes up.

2)
Other more generic I/O devices, such as an MMC controller for example, may be a
part of the same power domain as the CPU cluster, due to a shared power-rail.
For these scenarios, when the MMC controller is in use dealing with an MMC
request, a deeper idle state of the CPU cluster may needs to be temporarily
disabled. This is needed to retain the MMC controller in a functional state,
else it may loose its register-context in the middle of serving a request.

In this series, we are extending the generic PM domain (aka genpd) to be used
for also CPU devices. Hence the goal is to re-use much of its current code to
help us manage the last-man standing synchronization. Moreover, as we already
use genpd to model power domains for generic I/O devices, both 1) and 2) can be
address with its help.

Moreover, to address these problems for ARM64 DT based platforms, we are
deploying support for genpd and runtime PM to the PSCI FW driver - and finally
we make some updates to two ARM64 DTBs, as to deploy the new PSCI CPU topology
layout.

The series has been tested on a Qcom 410c dragonboard and on a Hisilicon Hikey
board. The first one uses PSCI OS-initiated mode, while the second uses the PSCI
Platform-Coordinated mode.

Kind regards
Ulf Hansson

[1]
git.linaro.org/people/ulf.hansson/linux-pm.git next_v11
[2]
git.linaro.org/people/ulf.hansson/linux-pm.git next_v12

Ulf Hansson (4):
  PM / Domains: Add a generic data pointer to the genpd_power_state
    struct
  PM / Domains: Add support for CPU devices to genpd
  cpuidle: Export the next timer/tick expiration for a CPU
  PM / Domains: Add genpd governor for CPUs

 drivers/base/power/domain.c          | 78 ++++++++++++++++++++++++++--
 drivers/base/power/domain_governor.c | 62 +++++++++++++++++++++-
 drivers/cpuidle/cpuidle.c            |  8 +++
 include/linux/cpuidle.h              |  1 +
 include/linux/pm_domain.h            | 20 ++++++-
 include/linux/tick.h                 |  7 ++-
 kernel/time/tick-sched.c             | 12 +++++
 7 files changed, 182 insertions(+), 6 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v12 0/4] PM / Domains: Support hierarchical CPU arrangement (PSCI/ARM)
@ 2019-02-27 19:58 ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Ulf Hansson, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, Daniel Lezcano, linux-kernel,
	Lina Iyer, Tony Lindgren, Viresh Kumar, Sudeep Holla,
	Thomas Gleixner, Raju P . L . S . S . S . N, linux-arm-kernel

Changes in v12:
 - Drop the patches for restructuring tick_nohz_get_sleep_length(). Instead
replace them all with a new timer/cpuidle patch, according to suggestions by
Rafael.
 - The entire v12 series, including the PSCI/ARM changes are available in a
git branch [2].

Changes in v11:
 - This version contains only the infrastructure changes that is needed for
deployment. The PSCI/ARM changes have also been updated and tested, but I will
post them separately. Still, to provide completeness, I have published a branch
containing everything to a git tree [1], feel free to have a look and test.
 - The v10 series contained a patch, "timer: Export next wakeup time of a CPU",
which has been replaced by a couple of new patches, whom reworks the existing
tick_nohz_get_sleep_length() function, to provide the next timer expiration
instead of the duration.
 - More changelogs are available per patch.

Changes in v10:
 - Quite significant changes have been to the PSCI driver deployment. According
   to an agreement with Lorenzo, the hierarchical CPU layout for PSCI should be
   orthogonal to whether the PSCI FW supports OSI or not. This has been taken
   care of in this version.
 - Drop the generic attach/detach helpers of CPUs to genpd, instead make that
   related code internal to PSCI, for now.
 - Fix "BUG: sleeping for invalid context" for hotplug, as reported by Raju.
 - Addressed various comments from version 8 and 9.
 - Clarified changelogs and re-wrote the cover-letter to better explain the
   motivations behind these changes.

Background:

For ARM64/ARM based platforms CPUs are often arranged in a hierarchical manner.
From a CPU idle state perspective, this means some states may be shared among a
group of CPUs (aka CPU cluster).

To deal with idle management of a group of CPUs, sometimes the kernel needs to
be involved to manage the last-man standing algorithm, simply because it can't
rely solely on power management FWs to deal with this. Depending on the
platform, of course.

There are a couple of typical scenarios for when the kernel needs to be in
control, dealing with synchronization of when the last CPU in a cluster is about
to enter a deep idle state.

1)
The kernel needs to carry out so called last-man activities before the
CPU cluster can enter a deep idle state. This may for example involve to
configure external logics for wakeups, as the GIC may no longer be functional
once a deep cluster idle state have been entered. Likewise, these operations
may need to be restored, when the first CPU wakes up.

2)
Other more generic I/O devices, such as an MMC controller for example, may be a
part of the same power domain as the CPU cluster, due to a shared power-rail.
For these scenarios, when the MMC controller is in use dealing with an MMC
request, a deeper idle state of the CPU cluster may needs to be temporarily
disabled. This is needed to retain the MMC controller in a functional state,
else it may loose its register-context in the middle of serving a request.

In this series, we are extending the generic PM domain (aka genpd) to be used
for also CPU devices. Hence the goal is to re-use much of its current code to
help us manage the last-man standing synchronization. Moreover, as we already
use genpd to model power domains for generic I/O devices, both 1) and 2) can be
address with its help.

Moreover, to address these problems for ARM64 DT based platforms, we are
deploying support for genpd and runtime PM to the PSCI FW driver - and finally
we make some updates to two ARM64 DTBs, as to deploy the new PSCI CPU topology
layout.

The series has been tested on a Qcom 410c dragonboard and on a Hisilicon Hikey
board. The first one uses PSCI OS-initiated mode, while the second uses the PSCI
Platform-Coordinated mode.

Kind regards
Ulf Hansson

[1]
git.linaro.org/people/ulf.hansson/linux-pm.git next_v11
[2]
git.linaro.org/people/ulf.hansson/linux-pm.git next_v12

Ulf Hansson (4):
  PM / Domains: Add a generic data pointer to the genpd_power_state
    struct
  PM / Domains: Add support for CPU devices to genpd
  cpuidle: Export the next timer/tick expiration for a CPU
  PM / Domains: Add genpd governor for CPUs

 drivers/base/power/domain.c          | 78 ++++++++++++++++++++++++++--
 drivers/base/power/domain_governor.c | 62 +++++++++++++++++++++-
 drivers/cpuidle/cpuidle.c            |  8 +++
 include/linux/cpuidle.h              |  1 +
 include/linux/pm_domain.h            | 20 ++++++-
 include/linux/tick.h                 |  7 ++-
 kernel/time/tick-sched.c             | 12 +++++
 7 files changed, 182 insertions(+), 6 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
  2019-02-27 19:58 ` Ulf Hansson
@ 2019-02-27 19:58   ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Ulf Hansson, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, linux-arm-kernel,
	linux-arm-msm, linux-kernel

Let's add a data pointer to the genpd_power_state struct, to allow a genpd
backend driver to store per state specific data. To introduce the pointer,
we need to change the way genpd deals with freeing of the corresponding
allocated data.

More precisely, let's clarify the responsibility of whom that shall free
the data, by adding a ->free_states() callback to the struct
generic_pm_domain. The one allocating the data shall assign the callback,
to allow genpd to invoke it from genpd_remove().

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- None.

---
 drivers/base/power/domain.c | 12 ++++++++++--
 include/linux/pm_domain.h   |  4 +++-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 2c334c01fc43..03885c003c6a 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1685,6 +1685,12 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
 }
 EXPORT_SYMBOL_GPL(pm_genpd_remove_subdomain);
 
+static void genpd_free_default_power_state(struct genpd_power_state *states,
+					   unsigned int state_count)
+{
+	kfree(states);
+}
+
 static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
 {
 	struct genpd_power_state *state;
@@ -1695,7 +1701,7 @@ static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
 
 	genpd->states = state;
 	genpd->state_count = 1;
-	genpd->free = state;
+	genpd->free_states = genpd_free_default_power_state;
 
 	return 0;
 }
@@ -1811,7 +1817,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
 	cancel_work_sync(&genpd->power_off_work);
-	kfree(genpd->free);
+	if (genpd->free_states)
+		genpd->free_states(genpd->states, genpd->state_count);
+
 	pr_debug("%s: removed %s\n", __func__, genpd->name);
 
 	return 0;
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1ed5874bcee0..8e1399231753 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -69,6 +69,7 @@ struct genpd_power_state {
 	s64 residency_ns;
 	struct fwnode_handle *fwnode;
 	ktime_t idle_time;
+	void *data;
 };
 
 struct genpd_lock_ops;
@@ -110,9 +111,10 @@ struct generic_pm_domain {
 			   struct device *dev);
 	unsigned int flags;		/* Bit field of configs for genpd */
 	struct genpd_power_state *states;
+	void (*free_states)(struct genpd_power_state *states,
+			    unsigned int state_count);
 	unsigned int state_count; /* number of states */
 	unsigned int state_idx; /* state that genpd will go to when off */
-	void *free; /* Free the state that was allocated for default */
 	ktime_t on_time;
 	ktime_t accounting_time;
 	const struct genpd_lock_ops *lock_ops;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
@ 2019-02-27 19:58   ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Ulf Hansson, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, Daniel Lezcano, linux-kernel,
	Lina Iyer, Tony Lindgren, Viresh Kumar, Sudeep Holla,
	Thomas Gleixner, Raju P . L . S . S . S . N, linux-arm-kernel

Let's add a data pointer to the genpd_power_state struct, to allow a genpd
backend driver to store per state specific data. To introduce the pointer,
we need to change the way genpd deals with freeing of the corresponding
allocated data.

More precisely, let's clarify the responsibility of whom that shall free
the data, by adding a ->free_states() callback to the struct
generic_pm_domain. The one allocating the data shall assign the callback,
to allow genpd to invoke it from genpd_remove().

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- None.

---
 drivers/base/power/domain.c | 12 ++++++++++--
 include/linux/pm_domain.h   |  4 +++-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 2c334c01fc43..03885c003c6a 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1685,6 +1685,12 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
 }
 EXPORT_SYMBOL_GPL(pm_genpd_remove_subdomain);
 
+static void genpd_free_default_power_state(struct genpd_power_state *states,
+					   unsigned int state_count)
+{
+	kfree(states);
+}
+
 static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
 {
 	struct genpd_power_state *state;
@@ -1695,7 +1701,7 @@ static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
 
 	genpd->states = state;
 	genpd->state_count = 1;
-	genpd->free = state;
+	genpd->free_states = genpd_free_default_power_state;
 
 	return 0;
 }
@@ -1811,7 +1817,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
 	cancel_work_sync(&genpd->power_off_work);
-	kfree(genpd->free);
+	if (genpd->free_states)
+		genpd->free_states(genpd->states, genpd->state_count);
+
 	pr_debug("%s: removed %s\n", __func__, genpd->name);
 
 	return 0;
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1ed5874bcee0..8e1399231753 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -69,6 +69,7 @@ struct genpd_power_state {
 	s64 residency_ns;
 	struct fwnode_handle *fwnode;
 	ktime_t idle_time;
+	void *data;
 };
 
 struct genpd_lock_ops;
@@ -110,9 +111,10 @@ struct generic_pm_domain {
 			   struct device *dev);
 	unsigned int flags;		/* Bit field of configs for genpd */
 	struct genpd_power_state *states;
+	void (*free_states)(struct genpd_power_state *states,
+			    unsigned int state_count);
 	unsigned int state_count; /* number of states */
 	unsigned int state_idx; /* state that genpd will go to when off */
-	void *free; /* Free the state that was allocated for default */
 	ktime_t on_time;
 	ktime_t accounting_time;
 	const struct genpd_lock_ops *lock_ops;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 2/4] PM / Domains: Add support for CPU devices to genpd
  2019-02-27 19:58 ` Ulf Hansson
@ 2019-02-27 19:58   ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Ulf Hansson, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, linux-arm-kernel,
	linux-arm-msm, linux-kernel

To enable a device belonging to a CPU to be attached to a PM domain managed
by genpd, let's do a few changes to it, as to make it convenient to manage
the specifics around CPUs.

To be able to quickly find out what CPUs that are attached to a genpd,
which typically becomes useful from a genpd governor as following changes
is about to show, let's add a cpumask to the struct generic_pm_domain. At
the point when a CPU device gets attached to a genpd, let's update the
genpd's cpumask. Moreover, let's also propagate changes to the cpumask
upwards in the topology to the master PM domains. In this way, the cpumask
for a genpd hierarchically reflects all CPUs attached to the topology below
it.

Finally, let's make this an opt-in feature, to avoid having to manage CPUs
and the cpumask for a genpd that doesn't need it. For that reason, let's
add a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN.

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- None.

---
 drivers/base/power/domain.c | 66 ++++++++++++++++++++++++++++++++++++-
 include/linux/pm_domain.h   | 13 ++++++++
 2 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 03885c003c6a..da10052e6427 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/suspend.h>
 #include <linux/export.h>
+#include <linux/cpu.h>
 
 #include "power.h"
 
@@ -126,6 +127,7 @@ static const struct genpd_lock_ops genpd_spin_ops = {
 #define genpd_is_irq_safe(genpd)	(genpd->flags & GENPD_FLAG_IRQ_SAFE)
 #define genpd_is_always_on(genpd)	(genpd->flags & GENPD_FLAG_ALWAYS_ON)
 #define genpd_is_active_wakeup(genpd)	(genpd->flags & GENPD_FLAG_ACTIVE_WAKEUP)
+#define genpd_is_cpu_domain(genpd)	(genpd->flags & GENPD_FLAG_CPU_DOMAIN)
 
 static inline bool irq_safe_dev_in_no_sleep_domain(struct device *dev,
 		const struct generic_pm_domain *genpd)
@@ -1452,6 +1454,56 @@ static void genpd_free_dev_data(struct device *dev,
 	dev_pm_put_subsys_data(dev);
 }
 
+static void __genpd_update_cpumask(struct generic_pm_domain *genpd,
+				   int cpu, bool set, unsigned int depth)
+{
+	struct gpd_link *link;
+
+	if (!genpd_is_cpu_domain(genpd))
+		return;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		struct generic_pm_domain *master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+		__genpd_update_cpumask(master, cpu, set, depth + 1);
+		genpd_unlock(master);
+	}
+
+	if (set)
+		cpumask_set_cpu(cpu, genpd->cpus);
+	else
+		cpumask_clear_cpu(cpu, genpd->cpus);
+}
+
+static void genpd_update_cpumask(struct generic_pm_domain *genpd,
+				 struct device *dev, bool set)
+{
+	int cpu;
+
+	if (!genpd_is_cpu_domain(genpd))
+		return;
+
+	for_each_possible_cpu(cpu) {
+		if (get_cpu_device(cpu) == dev) {
+			__genpd_update_cpumask(genpd, cpu, set, 0);
+			return;
+		}
+	}
+}
+
+static void genpd_set_cpumask(struct generic_pm_domain *genpd,
+			      struct device *dev)
+{
+	genpd_update_cpumask(genpd, dev, true);
+}
+
+static void genpd_clear_cpumask(struct generic_pm_domain *genpd,
+				struct device *dev)
+{
+	genpd_update_cpumask(genpd, dev, false);
+}
+
 static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
 			    struct gpd_timing_data *td)
 {
@@ -1473,6 +1525,8 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
 	if (ret)
 		goto out;
 
+	genpd_set_cpumask(genpd, dev);
+
 	dev_pm_domain_set(dev, &genpd->domain);
 
 	genpd->device_count++;
@@ -1534,6 +1588,7 @@ static int genpd_remove_device(struct generic_pm_domain *genpd,
 	if (genpd->detach_dev)
 		genpd->detach_dev(genpd, dev);
 
+	genpd_clear_cpumask(genpd, dev);
 	dev_pm_domain_set(dev, NULL);
 
 	list_del_init(&pdd->list_node);
@@ -1767,11 +1822,18 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
 	if (genpd_is_always_on(genpd) && !genpd_status_on(genpd))
 		return -EINVAL;
 
+	if (genpd_is_cpu_domain(genpd) &&
+	    !zalloc_cpumask_var(&genpd->cpus, GFP_KERNEL))
+		return -ENOMEM;
+
 	/* Use only one "off" state if there were no states declared */
 	if (genpd->state_count == 0) {
 		ret = genpd_set_default_power_state(genpd);
-		if (ret)
+		if (ret) {
+			if (genpd_is_cpu_domain(genpd))
+				free_cpumask_var(genpd->cpus);
 			return ret;
+		}
 	} else if (!gov) {
 		pr_warn("%s : no governor for states\n", genpd->name);
 	}
@@ -1817,6 +1879,8 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
 	cancel_work_sync(&genpd->power_off_work);
+	if (genpd_is_cpu_domain(genpd))
+		free_cpumask_var(genpd->cpus);
 	if (genpd->free_states)
 		genpd->free_states(genpd->states, genpd->state_count);
 
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 8e1399231753..a6e251fe9deb 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -16,6 +16,7 @@
 #include <linux/of.h>
 #include <linux/notifier.h>
 #include <linux/spinlock.h>
+#include <linux/cpumask.h>
 
 /*
  * Flags to control the behaviour of a genpd.
@@ -42,11 +43,22 @@
  * GENPD_FLAG_ACTIVE_WAKEUP:	Instructs genpd to keep the PM domain powered
  *				on, in case any of its attached devices is used
  *				in the wakeup path to serve system wakeups.
+ *
+ * GENPD_FLAG_CPU_DOMAIN:	Instructs genpd that it should expect to get
+ *				devices attached, which may belong to CPUs or
+ *				possibly have subdomains with CPUs attached.
+ *				This flag enables the genpd backend driver to
+ *				deploy idle power management support for CPUs
+ *				and groups of CPUs. Note that, the backend
+ *				driver must then comply with the so called,
+ *				last-man-standing algorithm, for the CPUs in the
+ *				PM domain.
  */
 #define GENPD_FLAG_PM_CLK	 (1U << 0)
 #define GENPD_FLAG_IRQ_SAFE	 (1U << 1)
 #define GENPD_FLAG_ALWAYS_ON	 (1U << 2)
 #define GENPD_FLAG_ACTIVE_WAKEUP (1U << 3)
+#define GENPD_FLAG_CPU_DOMAIN	 (1U << 4)
 
 enum gpd_status {
 	GPD_STATE_ACTIVE = 0,	/* PM domain is active */
@@ -94,6 +106,7 @@ struct generic_pm_domain {
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
 	unsigned int performance_state;	/* Aggregated max performance state */
+	cpumask_var_t cpus;		/* A cpumask of the attached CPUs */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
 	struct opp_table *opp_table;	/* OPP table of the genpd */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 2/4] PM / Domains: Add support for CPU devices to genpd
@ 2019-02-27 19:58   ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Ulf Hansson, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, Daniel Lezcano, linux-kernel,
	Lina Iyer, Tony Lindgren, Viresh Kumar, Sudeep Holla,
	Thomas Gleixner, Raju P . L . S . S . S . N, linux-arm-kernel

To enable a device belonging to a CPU to be attached to a PM domain managed
by genpd, let's do a few changes to it, as to make it convenient to manage
the specifics around CPUs.

To be able to quickly find out what CPUs that are attached to a genpd,
which typically becomes useful from a genpd governor as following changes
is about to show, let's add a cpumask to the struct generic_pm_domain. At
the point when a CPU device gets attached to a genpd, let's update the
genpd's cpumask. Moreover, let's also propagate changes to the cpumask
upwards in the topology to the master PM domains. In this way, the cpumask
for a genpd hierarchically reflects all CPUs attached to the topology below
it.

Finally, let's make this an opt-in feature, to avoid having to manage CPUs
and the cpumask for a genpd that doesn't need it. For that reason, let's
add a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN.

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- None.

---
 drivers/base/power/domain.c | 66 ++++++++++++++++++++++++++++++++++++-
 include/linux/pm_domain.h   | 13 ++++++++
 2 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 03885c003c6a..da10052e6427 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/suspend.h>
 #include <linux/export.h>
+#include <linux/cpu.h>
 
 #include "power.h"
 
@@ -126,6 +127,7 @@ static const struct genpd_lock_ops genpd_spin_ops = {
 #define genpd_is_irq_safe(genpd)	(genpd->flags & GENPD_FLAG_IRQ_SAFE)
 #define genpd_is_always_on(genpd)	(genpd->flags & GENPD_FLAG_ALWAYS_ON)
 #define genpd_is_active_wakeup(genpd)	(genpd->flags & GENPD_FLAG_ACTIVE_WAKEUP)
+#define genpd_is_cpu_domain(genpd)	(genpd->flags & GENPD_FLAG_CPU_DOMAIN)
 
 static inline bool irq_safe_dev_in_no_sleep_domain(struct device *dev,
 		const struct generic_pm_domain *genpd)
@@ -1452,6 +1454,56 @@ static void genpd_free_dev_data(struct device *dev,
 	dev_pm_put_subsys_data(dev);
 }
 
+static void __genpd_update_cpumask(struct generic_pm_domain *genpd,
+				   int cpu, bool set, unsigned int depth)
+{
+	struct gpd_link *link;
+
+	if (!genpd_is_cpu_domain(genpd))
+		return;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		struct generic_pm_domain *master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+		__genpd_update_cpumask(master, cpu, set, depth + 1);
+		genpd_unlock(master);
+	}
+
+	if (set)
+		cpumask_set_cpu(cpu, genpd->cpus);
+	else
+		cpumask_clear_cpu(cpu, genpd->cpus);
+}
+
+static void genpd_update_cpumask(struct generic_pm_domain *genpd,
+				 struct device *dev, bool set)
+{
+	int cpu;
+
+	if (!genpd_is_cpu_domain(genpd))
+		return;
+
+	for_each_possible_cpu(cpu) {
+		if (get_cpu_device(cpu) == dev) {
+			__genpd_update_cpumask(genpd, cpu, set, 0);
+			return;
+		}
+	}
+}
+
+static void genpd_set_cpumask(struct generic_pm_domain *genpd,
+			      struct device *dev)
+{
+	genpd_update_cpumask(genpd, dev, true);
+}
+
+static void genpd_clear_cpumask(struct generic_pm_domain *genpd,
+				struct device *dev)
+{
+	genpd_update_cpumask(genpd, dev, false);
+}
+
 static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
 			    struct gpd_timing_data *td)
 {
@@ -1473,6 +1525,8 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
 	if (ret)
 		goto out;
 
+	genpd_set_cpumask(genpd, dev);
+
 	dev_pm_domain_set(dev, &genpd->domain);
 
 	genpd->device_count++;
@@ -1534,6 +1588,7 @@ static int genpd_remove_device(struct generic_pm_domain *genpd,
 	if (genpd->detach_dev)
 		genpd->detach_dev(genpd, dev);
 
+	genpd_clear_cpumask(genpd, dev);
 	dev_pm_domain_set(dev, NULL);
 
 	list_del_init(&pdd->list_node);
@@ -1767,11 +1822,18 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
 	if (genpd_is_always_on(genpd) && !genpd_status_on(genpd))
 		return -EINVAL;
 
+	if (genpd_is_cpu_domain(genpd) &&
+	    !zalloc_cpumask_var(&genpd->cpus, GFP_KERNEL))
+		return -ENOMEM;
+
 	/* Use only one "off" state if there were no states declared */
 	if (genpd->state_count == 0) {
 		ret = genpd_set_default_power_state(genpd);
-		if (ret)
+		if (ret) {
+			if (genpd_is_cpu_domain(genpd))
+				free_cpumask_var(genpd->cpus);
 			return ret;
+		}
 	} else if (!gov) {
 		pr_warn("%s : no governor for states\n", genpd->name);
 	}
@@ -1817,6 +1879,8 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
 	cancel_work_sync(&genpd->power_off_work);
+	if (genpd_is_cpu_domain(genpd))
+		free_cpumask_var(genpd->cpus);
 	if (genpd->free_states)
 		genpd->free_states(genpd->states, genpd->state_count);
 
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 8e1399231753..a6e251fe9deb 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -16,6 +16,7 @@
 #include <linux/of.h>
 #include <linux/notifier.h>
 #include <linux/spinlock.h>
+#include <linux/cpumask.h>
 
 /*
  * Flags to control the behaviour of a genpd.
@@ -42,11 +43,22 @@
  * GENPD_FLAG_ACTIVE_WAKEUP:	Instructs genpd to keep the PM domain powered
  *				on, in case any of its attached devices is used
  *				in the wakeup path to serve system wakeups.
+ *
+ * GENPD_FLAG_CPU_DOMAIN:	Instructs genpd that it should expect to get
+ *				devices attached, which may belong to CPUs or
+ *				possibly have subdomains with CPUs attached.
+ *				This flag enables the genpd backend driver to
+ *				deploy idle power management support for CPUs
+ *				and groups of CPUs. Note that, the backend
+ *				driver must then comply with the so called,
+ *				last-man-standing algorithm, for the CPUs in the
+ *				PM domain.
  */
 #define GENPD_FLAG_PM_CLK	 (1U << 0)
 #define GENPD_FLAG_IRQ_SAFE	 (1U << 1)
 #define GENPD_FLAG_ALWAYS_ON	 (1U << 2)
 #define GENPD_FLAG_ACTIVE_WAKEUP (1U << 3)
+#define GENPD_FLAG_CPU_DOMAIN	 (1U << 4)
 
 enum gpd_status {
 	GPD_STATE_ACTIVE = 0,	/* PM domain is active */
@@ -94,6 +106,7 @@ struct generic_pm_domain {
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
 	unsigned int performance_state;	/* Aggregated max performance state */
+	cpumask_var_t cpus;		/* A cpumask of the attached CPUs */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
 	struct opp_table *opp_table;	/* OPP table of the genpd */
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
  2019-02-27 19:58 ` Ulf Hansson
@ 2019-02-27 19:58   ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Ulf Hansson, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, linux-arm-kernel,
	linux-arm-msm, linux-kernel

To be able to predict the sleep duration for a CPU that is entering idle,
knowing when the next timer/tick is going to expire, is extremely useful.
Both the teo and the menu cpuidle governors already makes use of this
information, while selecting an idle state.

Moving forward, the similar prediction needs to be done, but for a group of
idle CPUs rather than for a single idle CPU. Following changes implements a
new genpd governor, which needs this.

Support this, by sharing a new function called
tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
tick, whatever that expires first.

Additionally, when cpuidle is about to invoke the ->enter() callback, then
call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
struct cpuidle_device, as to make it available outside cpuidle.

Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
governor's ->select() callback has already made a decision whether to stop
the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
the next timer expiration, whatever origin.

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- New patch.

---
 drivers/cpuidle/cpuidle.c |  8 ++++++++
 include/linux/cpuidle.h   |  1 +
 include/linux/tick.h      |  7 ++++++-
 kernel/time/tick-sched.c  | 12 ++++++++++++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 7f108309e871..255365b1a6ab 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		  int index)
 {
+	/*
+	 * Store the next hrtimer, which becomes either next tick or the next
+	 * timer event, whatever expires first. Additionally, to make this data
+	 * useful for consumers outside cpuidle, we rely on that the governor's
+	 * ->select() callback have decided, whether to stop the tick or not.
+	 */
+	dev->next_hrtimer = tick_nohz_get_next_hrtimer();
+
 	if (cpuidle_state_is_coupled(drv, index))
 		return cpuidle_enter_state_coupled(dev, drv, index);
 	return cpuidle_enter_state(dev, drv, index);
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3b39472324a3..bb9a0db89f1a 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -83,6 +83,7 @@ struct cpuidle_device {
 	unsigned int		use_deepest_state:1;
 	unsigned int		poll_time_limit:1;
 	unsigned int		cpu;
+	ktime_t			next_hrtimer;
 
 	int			last_residency;
 	struct cpuidle_state_usage	states_usage[CPUIDLE_STATE_MAX];
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 55388ab45fd4..8891b5ac3e40 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -122,6 +122,7 @@ extern void tick_nohz_idle_enter(void);
 extern void tick_nohz_idle_exit(void);
 extern void tick_nohz_irq_exit(void);
 extern bool tick_nohz_idle_got_tick(void);
+extern ktime_t tick_nohz_get_next_hrtimer(void);
 extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
 extern unsigned long tick_nohz_get_idle_calls(void);
 extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
@@ -145,7 +146,11 @@ static inline void tick_nohz_idle_restart_tick(void) { }
 static inline void tick_nohz_idle_enter(void) { }
 static inline void tick_nohz_idle_exit(void) { }
 static inline bool tick_nohz_idle_got_tick(void) { return false; }
-
+static inline ktime_t tick_nohz_get_next_hrtimer(void)
+{
+	/* Next wake up is the tick period, assume it starts now */
+	return ktime_add(ktime_get(), TICK_NSEC);
+}
 static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
 {
 	*delta_next = TICK_NSEC;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6fa52cd6df0b..8d18e03124ff 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1022,6 +1022,18 @@ bool tick_nohz_idle_got_tick(void)
 	return false;
 }
 
+/**
+ * tick_nohz_get_next_hrtimer - return the next expiration time for the hrtimer
+ * or the tick, whatever that expires first. Note that, if the tick has been
+ * stopped, it returns the next hrtimer.
+ *
+ * Called from power state control code with interrupts disabled
+ */
+ktime_t tick_nohz_get_next_hrtimer(void)
+{
+	return __this_cpu_read(tick_cpu_device.evtdev)->next_event;
+}
+
 /**
  * tick_nohz_get_sleep_length - return the expected length of the current sleep
  * @delta_next: duration until the next event if the tick cannot be stopped
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-02-27 19:58   ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Ulf Hansson, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, Daniel Lezcano, linux-kernel,
	Lina Iyer, Tony Lindgren, Viresh Kumar, Sudeep Holla,
	Thomas Gleixner, Raju P . L . S . S . S . N, linux-arm-kernel

To be able to predict the sleep duration for a CPU that is entering idle,
knowing when the next timer/tick is going to expire, is extremely useful.
Both the teo and the menu cpuidle governors already makes use of this
information, while selecting an idle state.

Moving forward, the similar prediction needs to be done, but for a group of
idle CPUs rather than for a single idle CPU. Following changes implements a
new genpd governor, which needs this.

Support this, by sharing a new function called
tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
tick, whatever that expires first.

Additionally, when cpuidle is about to invoke the ->enter() callback, then
call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
struct cpuidle_device, as to make it available outside cpuidle.

Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
governor's ->select() callback has already made a decision whether to stop
the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
the next timer expiration, whatever origin.

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- New patch.

---
 drivers/cpuidle/cpuidle.c |  8 ++++++++
 include/linux/cpuidle.h   |  1 +
 include/linux/tick.h      |  7 ++++++-
 kernel/time/tick-sched.c  | 12 ++++++++++++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 7f108309e871..255365b1a6ab 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		  int index)
 {
+	/*
+	 * Store the next hrtimer, which becomes either next tick or the next
+	 * timer event, whatever expires first. Additionally, to make this data
+	 * useful for consumers outside cpuidle, we rely on that the governor's
+	 * ->select() callback have decided, whether to stop the tick or not.
+	 */
+	dev->next_hrtimer = tick_nohz_get_next_hrtimer();
+
 	if (cpuidle_state_is_coupled(drv, index))
 		return cpuidle_enter_state_coupled(dev, drv, index);
 	return cpuidle_enter_state(dev, drv, index);
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3b39472324a3..bb9a0db89f1a 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -83,6 +83,7 @@ struct cpuidle_device {
 	unsigned int		use_deepest_state:1;
 	unsigned int		poll_time_limit:1;
 	unsigned int		cpu;
+	ktime_t			next_hrtimer;
 
 	int			last_residency;
 	struct cpuidle_state_usage	states_usage[CPUIDLE_STATE_MAX];
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 55388ab45fd4..8891b5ac3e40 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -122,6 +122,7 @@ extern void tick_nohz_idle_enter(void);
 extern void tick_nohz_idle_exit(void);
 extern void tick_nohz_irq_exit(void);
 extern bool tick_nohz_idle_got_tick(void);
+extern ktime_t tick_nohz_get_next_hrtimer(void);
 extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
 extern unsigned long tick_nohz_get_idle_calls(void);
 extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
@@ -145,7 +146,11 @@ static inline void tick_nohz_idle_restart_tick(void) { }
 static inline void tick_nohz_idle_enter(void) { }
 static inline void tick_nohz_idle_exit(void) { }
 static inline bool tick_nohz_idle_got_tick(void) { return false; }
-
+static inline ktime_t tick_nohz_get_next_hrtimer(void)
+{
+	/* Next wake up is the tick period, assume it starts now */
+	return ktime_add(ktime_get(), TICK_NSEC);
+}
 static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
 {
 	*delta_next = TICK_NSEC;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6fa52cd6df0b..8d18e03124ff 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1022,6 +1022,18 @@ bool tick_nohz_idle_got_tick(void)
 	return false;
 }
 
+/**
+ * tick_nohz_get_next_hrtimer - return the next expiration time for the hrtimer
+ * or the tick, whatever that expires first. Note that, if the tick has been
+ * stopped, it returns the next hrtimer.
+ *
+ * Called from power state control code with interrupts disabled
+ */
+ktime_t tick_nohz_get_next_hrtimer(void)
+{
+	return __this_cpu_read(tick_cpu_device.evtdev)->next_event;
+}
+
 /**
  * tick_nohz_get_sleep_length - return the expected length of the current sleep
  * @delta_next: duration until the next event if the tick cannot be stopped
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 4/4] PM / Domains: Add genpd governor for CPUs
  2019-02-27 19:58 ` Ulf Hansson
@ 2019-02-27 19:58   ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Ulf Hansson, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, linux-arm-kernel,
	linux-arm-msm, linux-kernel

As it's now perfectly possible that a PM domain managed by genpd contains
devices belonging to CPUs, we should start to take into account the
residency values for the idle states during the state selection process.
The residency value specifies the minimum duration of time, the CPU or a
group of CPUs, needs to spend in an idle state to not waste energy entering
it.

To deal with this, let's add a new genpd governor, pm_domain_cpu_gov, that
may be used for a PM domain that have CPU devices attached or if the CPUs
are attached through subdomains.

The new governor computes the minimum expected idle duration time for the
online CPUs being attached to the PM domain and its subdomains. Then in the
state selection process, trying the deepest state first, it verifies that
the idle duration time satisfies the state's residency value.

It should be noted that, when computing the minimum expected idle duration
time, we use the information about the next timer/tick that is stored in
the per CPU variable, cpuidle_devices, for the related CPUs. Future wise,
this deserves to be improved, as there are obviously more reasons to why a
CPU may be woken up from idle.

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- Rebased.

---
 drivers/base/power/domain_governor.c | 62 +++++++++++++++++++++++++++-
 include/linux/pm_domain.h            |  3 ++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
index 99896fbf18e4..fb8fd21e69a7 100644
--- a/drivers/base/power/domain_governor.c
+++ b/drivers/base/power/domain_governor.c
@@ -10,6 +10,9 @@
 #include <linux/pm_domain.h>
 #include <linux/pm_qos.h>
 #include <linux/hrtimer.h>
+#include <linux/cpuidle.h>
+#include <linux/cpumask.h>
+#include <linux/ktime.h>
 
 static int dev_update_qos_constraint(struct device *dev, void *data)
 {
@@ -211,8 +214,10 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
 	struct generic_pm_domain *genpd = pd_to_genpd(pd);
 	struct gpd_link *link;
 
-	if (!genpd->max_off_time_changed)
+	if (!genpd->max_off_time_changed) {
+		genpd->state_idx = genpd->cached_power_down_state_idx;
 		return genpd->cached_power_down_ok;
+	}
 
 	/*
 	 * We have to invalidate the cached results for the masters, so
@@ -237,6 +242,7 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
 		genpd->state_idx--;
 	}
 
+	genpd->cached_power_down_state_idx = genpd->state_idx;
 	return genpd->cached_power_down_ok;
 }
 
@@ -245,6 +251,55 @@ static bool always_on_power_down_ok(struct dev_pm_domain *domain)
 	return false;
 }
 
+static bool cpu_power_down_ok(struct dev_pm_domain *pd)
+{
+	struct generic_pm_domain *genpd = pd_to_genpd(pd);
+	struct cpuidle_device *dev;
+	ktime_t domain_wakeup;
+	s64 idle_duration_ns;
+	int cpu, i;
+
+	/* Validate dev PM QoS constraints. */
+	if (!default_power_down_ok(pd))
+		return false;
+
+	if (!(genpd->flags & GENPD_FLAG_CPU_DOMAIN))
+		return true;
+
+	/*
+	 * Find the next wakeup for any of the online CPUs within the PM domain
+	 * and its subdomains. Note, we only need the genpd->cpus, as it already
+	 * contains a mask of all CPUs from subdomains.
+	 */
+	domain_wakeup = ktime_set(KTIME_SEC_MAX, 0);
+	for_each_cpu_and(cpu, genpd->cpus, cpu_online_mask) {
+		dev = per_cpu(cpuidle_devices, cpu);
+		if (dev && ktime_before(dev->next_hrtimer, domain_wakeup))
+			domain_wakeup = dev->next_hrtimer;
+	}
+
+	/* The minimum idle duration is from now - until the next wakeup. */
+	idle_duration_ns = ktime_to_ns(ktime_sub(domain_wakeup, ktime_get()));
+	if (idle_duration_ns <= 0)
+		return false;
+
+	/*
+	 * Find the deepest idle state that has its residency value satisfied
+	 * and by also taking into account the power off latency for the state.
+	 * Start at the state picked by the dev PM QoS constraint validation.
+	 */
+	i = genpd->state_idx;
+	do {
+		if (idle_duration_ns >= (genpd->states[i].residency_ns +
+		    genpd->states[i].power_off_latency_ns)) {
+			genpd->state_idx = i;
+			return true;
+		}
+	} while (--i >= 0);
+
+	return false;
+}
+
 struct dev_power_governor simple_qos_governor = {
 	.suspend_ok = default_suspend_ok,
 	.power_down_ok = default_power_down_ok,
@@ -257,3 +312,8 @@ struct dev_power_governor pm_domain_always_on_gov = {
 	.power_down_ok = always_on_power_down_ok,
 	.suspend_ok = default_suspend_ok,
 };
+
+struct dev_power_governor pm_domain_cpu_gov = {
+	.suspend_ok = default_suspend_ok,
+	.power_down_ok = cpu_power_down_ok,
+};
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index a6e251fe9deb..ae7061556a26 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -118,6 +118,7 @@ struct generic_pm_domain {
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
+	bool cached_power_down_state_idx;
 	int (*attach_dev)(struct generic_pm_domain *domain,
 			  struct device *dev);
 	void (*detach_dev)(struct generic_pm_domain *domain,
@@ -202,6 +203,7 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
+extern struct dev_power_governor pm_domain_cpu_gov;
 #else
 
 static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
@@ -245,6 +247,7 @@ static inline int dev_pm_genpd_set_performance_state(struct device *dev,
 
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
+#define pm_domain_cpu_gov		(*(struct dev_power_governor *)(NULL))
 #endif
 
 #ifdef CONFIG_PM_GENERIC_DOMAINS_SLEEP
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v12 4/4] PM / Domains: Add genpd governor for CPUs
@ 2019-02-27 19:58   ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-02-27 19:58 UTC (permalink / raw)
  To: Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Ulf Hansson, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, Daniel Lezcano, linux-kernel,
	Lina Iyer, Tony Lindgren, Viresh Kumar, Sudeep Holla,
	Thomas Gleixner, Raju P . L . S . S . S . N, linux-arm-kernel

As it's now perfectly possible that a PM domain managed by genpd contains
devices belonging to CPUs, we should start to take into account the
residency values for the idle states during the state selection process.
The residency value specifies the minimum duration of time, the CPU or a
group of CPUs, needs to spend in an idle state to not waste energy entering
it.

To deal with this, let's add a new genpd governor, pm_domain_cpu_gov, that
may be used for a PM domain that have CPU devices attached or if the CPUs
are attached through subdomains.

The new governor computes the minimum expected idle duration time for the
online CPUs being attached to the PM domain and its subdomains. Then in the
state selection process, trying the deepest state first, it verifies that
the idle duration time satisfies the state's residency value.

It should be noted that, when computing the minimum expected idle duration
time, we use the information about the next timer/tick that is stored in
the per CPU variable, cpuidle_devices, for the related CPUs. Future wise,
this deserves to be improved, as there are obviously more reasons to why a
CPU may be woken up from idle.

Cc: Lina Iyer <ilina@codeaurora.org>
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v12:
	- Rebased.

---
 drivers/base/power/domain_governor.c | 62 +++++++++++++++++++++++++++-
 include/linux/pm_domain.h            |  3 ++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
index 99896fbf18e4..fb8fd21e69a7 100644
--- a/drivers/base/power/domain_governor.c
+++ b/drivers/base/power/domain_governor.c
@@ -10,6 +10,9 @@
 #include <linux/pm_domain.h>
 #include <linux/pm_qos.h>
 #include <linux/hrtimer.h>
+#include <linux/cpuidle.h>
+#include <linux/cpumask.h>
+#include <linux/ktime.h>
 
 static int dev_update_qos_constraint(struct device *dev, void *data)
 {
@@ -211,8 +214,10 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
 	struct generic_pm_domain *genpd = pd_to_genpd(pd);
 	struct gpd_link *link;
 
-	if (!genpd->max_off_time_changed)
+	if (!genpd->max_off_time_changed) {
+		genpd->state_idx = genpd->cached_power_down_state_idx;
 		return genpd->cached_power_down_ok;
+	}
 
 	/*
 	 * We have to invalidate the cached results for the masters, so
@@ -237,6 +242,7 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
 		genpd->state_idx--;
 	}
 
+	genpd->cached_power_down_state_idx = genpd->state_idx;
 	return genpd->cached_power_down_ok;
 }
 
@@ -245,6 +251,55 @@ static bool always_on_power_down_ok(struct dev_pm_domain *domain)
 	return false;
 }
 
+static bool cpu_power_down_ok(struct dev_pm_domain *pd)
+{
+	struct generic_pm_domain *genpd = pd_to_genpd(pd);
+	struct cpuidle_device *dev;
+	ktime_t domain_wakeup;
+	s64 idle_duration_ns;
+	int cpu, i;
+
+	/* Validate dev PM QoS constraints. */
+	if (!default_power_down_ok(pd))
+		return false;
+
+	if (!(genpd->flags & GENPD_FLAG_CPU_DOMAIN))
+		return true;
+
+	/*
+	 * Find the next wakeup for any of the online CPUs within the PM domain
+	 * and its subdomains. Note, we only need the genpd->cpus, as it already
+	 * contains a mask of all CPUs from subdomains.
+	 */
+	domain_wakeup = ktime_set(KTIME_SEC_MAX, 0);
+	for_each_cpu_and(cpu, genpd->cpus, cpu_online_mask) {
+		dev = per_cpu(cpuidle_devices, cpu);
+		if (dev && ktime_before(dev->next_hrtimer, domain_wakeup))
+			domain_wakeup = dev->next_hrtimer;
+	}
+
+	/* The minimum idle duration is from now - until the next wakeup. */
+	idle_duration_ns = ktime_to_ns(ktime_sub(domain_wakeup, ktime_get()));
+	if (idle_duration_ns <= 0)
+		return false;
+
+	/*
+	 * Find the deepest idle state that has its residency value satisfied
+	 * and by also taking into account the power off latency for the state.
+	 * Start at the state picked by the dev PM QoS constraint validation.
+	 */
+	i = genpd->state_idx;
+	do {
+		if (idle_duration_ns >= (genpd->states[i].residency_ns +
+		    genpd->states[i].power_off_latency_ns)) {
+			genpd->state_idx = i;
+			return true;
+		}
+	} while (--i >= 0);
+
+	return false;
+}
+
 struct dev_power_governor simple_qos_governor = {
 	.suspend_ok = default_suspend_ok,
 	.power_down_ok = default_power_down_ok,
@@ -257,3 +312,8 @@ struct dev_power_governor pm_domain_always_on_gov = {
 	.power_down_ok = always_on_power_down_ok,
 	.suspend_ok = default_suspend_ok,
 };
+
+struct dev_power_governor pm_domain_cpu_gov = {
+	.suspend_ok = default_suspend_ok,
+	.power_down_ok = cpu_power_down_ok,
+};
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index a6e251fe9deb..ae7061556a26 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -118,6 +118,7 @@ struct generic_pm_domain {
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
+	bool cached_power_down_state_idx;
 	int (*attach_dev)(struct generic_pm_domain *domain,
 			  struct device *dev);
 	void (*detach_dev)(struct generic_pm_domain *domain,
@@ -202,6 +203,7 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
+extern struct dev_power_governor pm_domain_cpu_gov;
 #else
 
 static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
@@ -245,6 +247,7 @@ static inline int dev_pm_genpd_set_performance_state(struct device *dev,
 
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
+#define pm_domain_cpu_gov		(*(struct dev_power_governor *)(NULL))
 #endif
 
 #ifdef CONFIG_PM_GENERIC_DOMAINS_SLEEP
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
  2019-02-27 19:58   ` Ulf Hansson
@ 2019-03-21  7:47     ` Daniel Lezcano
  -1 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21  7:47 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Raju P . L . S . S . S . N,
	Stephen Boyd, Tony Lindgren, Kevin Hilman, Lina Iyer,
	Viresh Kumar, Vincent Guittot, Geert Uytterhoeven,
	linux-arm-kernel, linux-arm-msm, linux-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> Let's add a data pointer to the genpd_power_state struct, to allow a genpd
> backend driver to store per state specific data. To introduce the pointer,
> we need to change the way genpd deals with freeing of the corresponding
> allocated data.
> 
> More precisely, let's clarify the responsibility of whom that shall free
> the data, by adding a ->free_states() callback to the struct
> generic_pm_domain. The one allocating the data shall assign the callback,
> to allow genpd to invoke it from genpd_remove().
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
> 
> Changes in v12:
> 	- None.
> 
> ---
>  drivers/base/power/domain.c | 12 ++++++++++--
>  include/linux/pm_domain.h   |  4 +++-
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 2c334c01fc43..03885c003c6a 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1685,6 +1685,12 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
>  }
>  EXPORT_SYMBOL_GPL(pm_genpd_remove_subdomain);
>  
> +static void genpd_free_default_power_state(struct genpd_power_state *states,
> +					   unsigned int state_count)
> +{
> +	kfree(states);
> +}
> +
>  static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
>  {
>  	struct genpd_power_state *state;
> @@ -1695,7 +1701,7 @@ static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
>  
>  	genpd->states = state;
>  	genpd->state_count = 1;
> -	genpd->free = state;
> +	genpd->free_states = genpd_free_default_power_state;
>  
>  	return 0;
>  }
> @@ -1811,7 +1817,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
>  	list_del(&genpd->gpd_list_node);
>  	genpd_unlock(genpd);
>  	cancel_work_sync(&genpd->power_off_work);
> -	kfree(genpd->free);
> +	if (genpd->free_states)

Is this test necessary as the free_states function is initialized with
the genpd_set_default_power_state() in any case?

> +		genpd->free_states(genpd->states, genpd->state_count);
> +
>  	pr_debug("%s: removed %s\n", __func__, genpd->name);
>  
>  	return 0;
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index 1ed5874bcee0..8e1399231753 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -69,6 +69,7 @@ struct genpd_power_state {
>  	s64 residency_ns;
>  	struct fwnode_handle *fwnode;
>  	ktime_t idle_time;
> +	void *data;
>  };
>  
>  struct genpd_lock_ops;
> @@ -110,9 +111,10 @@ struct generic_pm_domain {
>  			   struct device *dev);
>  	unsigned int flags;		/* Bit field of configs for genpd */
>  	struct genpd_power_state *states;
> +	void (*free_states)(struct genpd_power_state *states,
> +			    unsigned int state_count);
>  	unsigned int state_count; /* number of states */
>  	unsigned int state_idx; /* state that genpd will go to when off */
> -	void *free; /* Free the state that was allocated for default */
>  	ktime_t on_time;
>  	ktime_t accounting_time;
>  	const struct genpd_lock_ops *lock_ops;
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
@ 2019-03-21  7:47     ` Daniel Lezcano
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21  7:47 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, linux-kernel, Lina Iyer,
	Tony Lindgren, Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, linux-arm-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> Let's add a data pointer to the genpd_power_state struct, to allow a genpd
> backend driver to store per state specific data. To introduce the pointer,
> we need to change the way genpd deals with freeing of the corresponding
> allocated data.
> 
> More precisely, let's clarify the responsibility of whom that shall free
> the data, by adding a ->free_states() callback to the struct
> generic_pm_domain. The one allocating the data shall assign the callback,
> to allow genpd to invoke it from genpd_remove().
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
> 
> Changes in v12:
> 	- None.
> 
> ---
>  drivers/base/power/domain.c | 12 ++++++++++--
>  include/linux/pm_domain.h   |  4 +++-
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 2c334c01fc43..03885c003c6a 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1685,6 +1685,12 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
>  }
>  EXPORT_SYMBOL_GPL(pm_genpd_remove_subdomain);
>  
> +static void genpd_free_default_power_state(struct genpd_power_state *states,
> +					   unsigned int state_count)
> +{
> +	kfree(states);
> +}
> +
>  static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
>  {
>  	struct genpd_power_state *state;
> @@ -1695,7 +1701,7 @@ static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
>  
>  	genpd->states = state;
>  	genpd->state_count = 1;
> -	genpd->free = state;
> +	genpd->free_states = genpd_free_default_power_state;
>  
>  	return 0;
>  }
> @@ -1811,7 +1817,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
>  	list_del(&genpd->gpd_list_node);
>  	genpd_unlock(genpd);
>  	cancel_work_sync(&genpd->power_off_work);
> -	kfree(genpd->free);
> +	if (genpd->free_states)

Is this test necessary as the free_states function is initialized with
the genpd_set_default_power_state() in any case?

> +		genpd->free_states(genpd->states, genpd->state_count);
> +
>  	pr_debug("%s: removed %s\n", __func__, genpd->name);
>  
>  	return 0;
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index 1ed5874bcee0..8e1399231753 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -69,6 +69,7 @@ struct genpd_power_state {
>  	s64 residency_ns;
>  	struct fwnode_handle *fwnode;
>  	ktime_t idle_time;
> +	void *data;
>  };
>  
>  struct genpd_lock_ops;
> @@ -110,9 +111,10 @@ struct generic_pm_domain {
>  			   struct device *dev);
>  	unsigned int flags;		/* Bit field of configs for genpd */
>  	struct genpd_power_state *states;
> +	void (*free_states)(struct genpd_power_state *states,
> +			    unsigned int state_count);
>  	unsigned int state_count; /* number of states */
>  	unsigned int state_idx; /* state that genpd will go to when off */
> -	void *free; /* Free the state that was allocated for default */
>  	ktime_t on_time;
>  	ktime_t accounting_time;
>  	const struct genpd_lock_ops *lock_ops;
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
  2019-03-21  7:47     ` Daniel Lezcano
@ 2019-03-21  9:36       ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-21  9:36 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J . Wysocki, Linux PM, Frederic Weisbecker,
	Thomas Gleixner, Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Viresh Kumar, Vincent Guittot,
	Geert Uytterhoeven, Linux ARM, linux-arm-msm,
	Linux Kernel Mailing List

On Thu, 21 Mar 2019 at 08:47, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 27/02/2019 20:58, Ulf Hansson wrote:
> > Let's add a data pointer to the genpd_power_state struct, to allow a genpd
> > backend driver to store per state specific data. To introduce the pointer,
> > we need to change the way genpd deals with freeing of the corresponding
> > allocated data.
> >
> > More precisely, let's clarify the responsibility of whom that shall free
> > the data, by adding a ->free_states() callback to the struct
> > generic_pm_domain. The one allocating the data shall assign the callback,
> > to allow genpd to invoke it from genpd_remove().
> >
> > Cc: Lina Iyer <ilina@codeaurora.org>
> > Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v12:
> >       - None.
> >
> > ---
> >  drivers/base/power/domain.c | 12 ++++++++++--
> >  include/linux/pm_domain.h   |  4 +++-
> >  2 files changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > index 2c334c01fc43..03885c003c6a 100644
> > --- a/drivers/base/power/domain.c
> > +++ b/drivers/base/power/domain.c
> > @@ -1685,6 +1685,12 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
> >  }
> >  EXPORT_SYMBOL_GPL(pm_genpd_remove_subdomain);
> >
> > +static void genpd_free_default_power_state(struct genpd_power_state *states,
> > +                                        unsigned int state_count)
> > +{
> > +     kfree(states);
> > +}
> > +
> >  static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
> >  {
> >       struct genpd_power_state *state;
> > @@ -1695,7 +1701,7 @@ static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
> >
> >       genpd->states = state;
> >       genpd->state_count = 1;
> > -     genpd->free = state;
> > +     genpd->free_states = genpd_free_default_power_state;
> >
> >       return 0;
> >  }
> > @@ -1811,7 +1817,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
> >       list_del(&genpd->gpd_list_node);
> >       genpd_unlock(genpd);
> >       cancel_work_sync(&genpd->power_off_work);
> > -     kfree(genpd->free);
> > +     if (genpd->free_states)
>
> Is this test necessary as the free_states function is initialized with
> the genpd_set_default_power_state() in any case?

That's the case when the genpd provider did not allocate states, so
then we know genpd deals with this properly for us.

In the other case, when genpd provider has allocates states, then we
need to check that the provider has assigned the ->free_states()
callback before we invokes it, as there is no guarantees that it had.

I was initially tempted to do this check already at pm_genpd_init(),
as it would allow us to check for the error condition and return an
error code if it's not been assigned. However, that requires me to
change all providers that currently "allocates" their states, so that
isn't really a smooth way forward. Perhaps, we should simply print a
message to the log about this condition in pm_genpd_init(), as to
start with!? I can add a patch on top doing that.

>
> > +             genpd->free_states(genpd->states, genpd->state_count);
> > +
> >       pr_debug("%s: removed %s\n", __func__, genpd->name);
> >

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
@ 2019-03-21  9:36       ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-21  9:36 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Linux PM, Stephen Boyd, Frederic Weisbecker,
	linux-arm-msm, Kevin Hilman, Rafael J . Wysocki,
	Linux Kernel Mailing List, Lina Iyer, Tony Lindgren,
	Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, Linux ARM

On Thu, 21 Mar 2019 at 08:47, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 27/02/2019 20:58, Ulf Hansson wrote:
> > Let's add a data pointer to the genpd_power_state struct, to allow a genpd
> > backend driver to store per state specific data. To introduce the pointer,
> > we need to change the way genpd deals with freeing of the corresponding
> > allocated data.
> >
> > More precisely, let's clarify the responsibility of whom that shall free
> > the data, by adding a ->free_states() callback to the struct
> > generic_pm_domain. The one allocating the data shall assign the callback,
> > to allow genpd to invoke it from genpd_remove().
> >
> > Cc: Lina Iyer <ilina@codeaurora.org>
> > Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v12:
> >       - None.
> >
> > ---
> >  drivers/base/power/domain.c | 12 ++++++++++--
> >  include/linux/pm_domain.h   |  4 +++-
> >  2 files changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > index 2c334c01fc43..03885c003c6a 100644
> > --- a/drivers/base/power/domain.c
> > +++ b/drivers/base/power/domain.c
> > @@ -1685,6 +1685,12 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
> >  }
> >  EXPORT_SYMBOL_GPL(pm_genpd_remove_subdomain);
> >
> > +static void genpd_free_default_power_state(struct genpd_power_state *states,
> > +                                        unsigned int state_count)
> > +{
> > +     kfree(states);
> > +}
> > +
> >  static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
> >  {
> >       struct genpd_power_state *state;
> > @@ -1695,7 +1701,7 @@ static int genpd_set_default_power_state(struct generic_pm_domain *genpd)
> >
> >       genpd->states = state;
> >       genpd->state_count = 1;
> > -     genpd->free = state;
> > +     genpd->free_states = genpd_free_default_power_state;
> >
> >       return 0;
> >  }
> > @@ -1811,7 +1817,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
> >       list_del(&genpd->gpd_list_node);
> >       genpd_unlock(genpd);
> >       cancel_work_sync(&genpd->power_off_work);
> > -     kfree(genpd->free);
> > +     if (genpd->free_states)
>
> Is this test necessary as the free_states function is initialized with
> the genpd_set_default_power_state() in any case?

That's the case when the genpd provider did not allocate states, so
then we know genpd deals with this properly for us.

In the other case, when genpd provider has allocates states, then we
need to check that the provider has assigned the ->free_states()
callback before we invokes it, as there is no guarantees that it had.

I was initially tempted to do this check already at pm_genpd_init(),
as it would allow us to check for the error condition and return an
error code if it's not been assigned. However, that requires me to
change all providers that currently "allocates" their states, so that
isn't really a smooth way forward. Perhaps, we should simply print a
message to the log about this condition in pm_genpd_init(), as to
start with!? I can add a patch on top doing that.

>
> > +             genpd->free_states(genpd->states, genpd->state_count);
> > +
> >       pr_debug("%s: removed %s\n", __func__, genpd->name);
> >

[...]

Kind regards
Uffe

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
  2019-03-21  9:36       ` Ulf Hansson
@ 2019-03-21  9:52         ` Daniel Lezcano
  -1 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21  9:52 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J . Wysocki, Linux PM, Frederic Weisbecker,
	Thomas Gleixner, Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Viresh Kumar, Vincent Guittot,
	Geert Uytterhoeven, Linux ARM, linux-arm-msm,
	Linux Kernel Mailing List

On 21/03/2019 10:36, Ulf Hansson wrote:
> On Thu, 21 Mar 2019 at 08:47, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:

[ ... ]

>>>       cancel_work_sync(&genpd->power_off_work);
>>> -     kfree(genpd->free);
>>> +     if (genpd->free_states)
>>
>> Is this test necessary as the free_states function is initialized with
>> the genpd_set_default_power_state() in any case?
> 
> That's the case when the genpd provider did not allocate states, so
> then we know genpd deals with this properly for us.
> 
> In the other case, when genpd provider has allocates states, then we
> need to check that the provider has assigned the ->free_states()
> callback before we invokes it, as there is no guarantees that it had.
> 
> I was initially tempted to do this check already at pm_genpd_init(),
> as it would allow us to check for the error condition and return an
> error code if it's not been assigned. However, that requires me to
> change all providers that currently "allocates" their states, so that
> isn't really a smooth way forward. Perhaps, we should simply print a
> message to the log about this condition in pm_genpd_init(), as to
> start with!? I can add a patch on top doing that.

Yes, that would make sense to track those providers which do not
initialize the free field.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct
@ 2019-03-21  9:52         ` Daniel Lezcano
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21  9:52 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Linux PM, Stephen Boyd, Frederic Weisbecker,
	linux-arm-msm, Kevin Hilman, Rafael J . Wysocki,
	Linux Kernel Mailing List, Lina Iyer, Tony Lindgren,
	Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, Linux ARM

On 21/03/2019 10:36, Ulf Hansson wrote:
> On Thu, 21 Mar 2019 at 08:47, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:

[ ... ]

>>>       cancel_work_sync(&genpd->power_off_work);
>>> -     kfree(genpd->free);
>>> +     if (genpd->free_states)
>>
>> Is this test necessary as the free_states function is initialized with
>> the genpd_set_default_power_state() in any case?
> 
> That's the case when the genpd provider did not allocate states, so
> then we know genpd deals with this properly for us.
> 
> In the other case, when genpd provider has allocates states, then we
> need to check that the provider has assigned the ->free_states()
> callback before we invokes it, as there is no guarantees that it had.
> 
> I was initially tempted to do this check already at pm_genpd_init(),
> as it would allow us to check for the error condition and return an
> error code if it's not been assigned. However, that requires me to
> change all providers that currently "allocates" their states, so that
> isn't really a smooth way forward. Perhaps, we should simply print a
> message to the log about this condition in pm_genpd_init(), as to
> start with!? I can add a patch on top doing that.

Yes, that would make sense to track those providers which do not
initialize the free field.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 2/4] PM / Domains: Add support for CPU devices to genpd
  2019-02-27 19:58   ` Ulf Hansson
@ 2019-03-21  9:54     ` Daniel Lezcano
  -1 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21  9:54 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Raju P . L . S . S . S . N,
	Stephen Boyd, Tony Lindgren, Kevin Hilman, Lina Iyer,
	Viresh Kumar, Vincent Guittot, Geert Uytterhoeven,
	linux-arm-kernel, linux-arm-msm, linux-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> To enable a device belonging to a CPU to be attached to a PM domain managed
> by genpd, let's do a few changes to it, as to make it convenient to manage
> the specifics around CPUs.
> 
> To be able to quickly find out what CPUs that are attached to a genpd,
> which typically becomes useful from a genpd governor as following changes
> is about to show, let's add a cpumask to the struct generic_pm_domain. At
> the point when a CPU device gets attached to a genpd, let's update the
> genpd's cpumask. Moreover, let's also propagate changes to the cpumask
> upwards in the topology to the master PM domains. In this way, the cpumask
> for a genpd hierarchically reflects all CPUs attached to the topology below
> it.
> 
> Finally, let's make this an opt-in feature, to avoid having to manage CPUs
> and the cpumask for a genpd that doesn't need it. For that reason, let's
> add a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 2/4] PM / Domains: Add support for CPU devices to genpd
@ 2019-03-21  9:54     ` Daniel Lezcano
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21  9:54 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, linux-kernel, Lina Iyer,
	Tony Lindgren, Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, linux-arm-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> To enable a device belonging to a CPU to be attached to a PM domain managed
> by genpd, let's do a few changes to it, as to make it convenient to manage
> the specifics around CPUs.
> 
> To be able to quickly find out what CPUs that are attached to a genpd,
> which typically becomes useful from a genpd governor as following changes
> is about to show, let's add a cpumask to the struct generic_pm_domain. At
> the point when a CPU device gets attached to a genpd, let's update the
> genpd's cpumask. Moreover, let's also propagate changes to the cpumask
> upwards in the topology to the master PM domains. In this way, the cpumask
> for a genpd hierarchically reflects all CPUs attached to the topology below
> it.
> 
> Finally, let's make this an opt-in feature, to avoid having to manage CPUs
> and the cpumask for a genpd that doesn't need it. For that reason, let's
> add a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
  2019-02-27 19:58   ` Ulf Hansson
@ 2019-03-21 10:04     ` Daniel Lezcano
  -1 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21 10:04 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Raju P . L . S . S . S . N,
	Stephen Boyd, Tony Lindgren, Kevin Hilman, Lina Iyer,
	Viresh Kumar, Vincent Guittot, Geert Uytterhoeven,
	linux-arm-kernel, linux-arm-msm, linux-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> To be able to predict the sleep duration for a CPU that is entering idle,
> knowing when the next timer/tick is going to expire, is extremely useful.
> Both the teo and the menu cpuidle governors already makes use of this
> information, while selecting an idle state.
> 
> Moving forward, the similar prediction needs to be done, but for a group of
> idle CPUs rather than for a single idle CPU. Following changes implements a
> new genpd governor, which needs this.
> 
> Support this, by sharing a new function called
> tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> tick, whatever that expires first.
> 
> Additionally, when cpuidle is about to invoke the ->enter() callback, then
> call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> struct cpuidle_device, as to make it available outside cpuidle.
> 
> Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> governor's ->select() callback has already made a decision whether to stop
> the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> the next timer expiration, whatever origin.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-21 10:04     ` Daniel Lezcano
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21 10:04 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, linux-kernel, Lina Iyer,
	Tony Lindgren, Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, linux-arm-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> To be able to predict the sleep duration for a CPU that is entering idle,
> knowing when the next timer/tick is going to expire, is extremely useful.
> Both the teo and the menu cpuidle governors already makes use of this
> information, while selecting an idle state.
> 
> Moving forward, the similar prediction needs to be done, but for a group of
> idle CPUs rather than for a single idle CPU. Following changes implements a
> new genpd governor, which needs this.
> 
> Support this, by sharing a new function called
> tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> tick, whatever that expires first.
> 
> Additionally, when cpuidle is about to invoke the ->enter() callback, then
> call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> struct cpuidle_device, as to make it available outside cpuidle.
> 
> Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> governor's ->select() callback has already made a decision whether to stop
> the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> the next timer expiration, whatever origin.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 4/4] PM / Domains: Add genpd governor for CPUs
  2019-02-27 19:58   ` Ulf Hansson
@ 2019-03-21 11:41     ` Daniel Lezcano
  -1 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21 11:41 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Raju P . L . S . S . S . N,
	Stephen Boyd, Tony Lindgren, Kevin Hilman, Lina Iyer,
	Viresh Kumar, Vincent Guittot, Geert Uytterhoeven,
	linux-arm-kernel, linux-arm-msm, linux-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> As it's now perfectly possible that a PM domain managed by genpd contains
> devices belonging to CPUs, we should start to take into account the
> residency values for the idle states during the state selection process.
> The residency value specifies the minimum duration of time, the CPU or a
> group of CPUs, needs to spend in an idle state to not waste energy entering
> it.
> 
> To deal with this, let's add a new genpd governor, pm_domain_cpu_gov, that
> may be used for a PM domain that have CPU devices attached or if the CPUs
> are attached through subdomains.
> 
> The new governor computes the minimum expected idle duration time for the
> online CPUs being attached to the PM domain and its subdomains. Then in the
> state selection process, trying the deepest state first, it verifies that
> the idle duration time satisfies the state's residency value.
> 
> It should be noted that, when computing the minimum expected idle duration
> time, we use the information about the next timer/tick that is stored in
> the per CPU variable, cpuidle_devices, for the related CPUs. Future wise,
> this deserves to be improved, as there are obviously more reasons to why a
> CPU may be woken up from idle.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 4/4] PM / Domains: Add genpd governor for CPUs
@ 2019-03-21 11:41     ` Daniel Lezcano
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Lezcano @ 2019-03-21 11:41 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, linux-pm
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Kevin Hilman, Stephen Boyd,
	Frederic Weisbecker, linux-arm-msm, linux-kernel, Lina Iyer,
	Tony Lindgren, Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, linux-arm-kernel

On 27/02/2019 20:58, Ulf Hansson wrote:
> As it's now perfectly possible that a PM domain managed by genpd contains
> devices belonging to CPUs, we should start to take into account the
> residency values for the idle states during the state selection process.
> The residency value specifies the minimum duration of time, the CPU or a
> group of CPUs, needs to spend in an idle state to not waste energy entering
> it.
> 
> To deal with this, let's add a new genpd governor, pm_domain_cpu_gov, that
> may be used for a PM domain that have CPU devices attached or if the CPUs
> are attached through subdomains.
> 
> The new governor computes the minimum expected idle duration time for the
> online CPUs being attached to the PM domain and its subdomains. Then in the
> state selection process, trying the deepest state first, it verifies that
> the idle duration time satisfies the state's residency value.
> 
> It should be noted that, when computing the minimum expected idle duration
> time, we use the information about the next timer/tick that is stored in
> the per CPU variable, cpuidle_devices, for the related CPUs. Future wise,
> this deserves to be improved, as there are obviously more reasons to why a
> CPU may be woken up from idle.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
  2019-02-27 19:58   ` Ulf Hansson
@ 2019-03-25 12:19     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2019-03-25 12:19 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-pm, Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Viresh Kumar, Vincent Guittot,
	Geert Uytterhoeven, linux-arm-kernel, linux-arm-msm,
	linux-kernel

On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> To be able to predict the sleep duration for a CPU that is entering idle,
> knowing when the next timer/tick is going to expire, is extremely useful.
> Both the teo and the menu cpuidle governors already makes use of this
> information, while selecting an idle state.
> 
> Moving forward, the similar prediction needs to be done, but for a group of
> idle CPUs rather than for a single idle CPU. Following changes implements a
> new genpd governor, which needs this.
> 
> Support this, by sharing a new function called
> tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> tick, whatever that expires first.
> 
> Additionally, when cpuidle is about to invoke the ->enter() callback, then
> call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> struct cpuidle_device, as to make it available outside cpuidle.
> 
> Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> governor's ->select() callback has already made a decision whether to stop
> the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> the next timer expiration, whatever origin.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
> 
> Changes in v12:
> 	- New patch.
> 
> ---
>  drivers/cpuidle/cpuidle.c |  8 ++++++++
>  include/linux/cpuidle.h   |  1 +
>  include/linux/tick.h      |  7 ++++++-
>  kernel/time/tick-sched.c  | 12 ++++++++++++
>  4 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 7f108309e871..255365b1a6ab 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  		  int index)
>  {
> +	/*
> +	 * Store the next hrtimer, which becomes either next tick or the next
> +	 * timer event, whatever expires first. Additionally, to make this data
> +	 * useful for consumers outside cpuidle, we rely on that the governor's
> +	 * ->select() callback have decided, whether to stop the tick or not.
> +	 */
> +	dev->next_hrtimer = tick_nohz_get_next_hrtimer();

I would use WRITE_ONCE() to set next_hrtimer here and READ_ONCE() for
reading that value in the next patch, as a matter of annotation if
nothing else.

> +
>  	if (cpuidle_state_is_coupled(drv, index))
>  		return cpuidle_enter_state_coupled(dev, drv, index);
>  	return cpuidle_enter_state(dev, drv, index);

Also I would clear next_hrtimer here to avoid dragging stale values
around.

Apart from this the series LGTM.

Thanks!

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-25 12:19     ` Rafael J. Wysocki
  0 siblings, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2019-03-25 12:19 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, linux-pm, Stephen Boyd, Frederic Weisbecker,
	linux-arm-msm, Daniel Lezcano, Kevin Hilman, Lina Iyer,
	linux-kernel, Tony Lindgren, Viresh Kumar, Sudeep Holla,
	Thomas Gleixner, Raju P . L . S . S . S . N, linux-arm-kernel

On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> To be able to predict the sleep duration for a CPU that is entering idle,
> knowing when the next timer/tick is going to expire, is extremely useful.
> Both the teo and the menu cpuidle governors already makes use of this
> information, while selecting an idle state.
> 
> Moving forward, the similar prediction needs to be done, but for a group of
> idle CPUs rather than for a single idle CPU. Following changes implements a
> new genpd governor, which needs this.
> 
> Support this, by sharing a new function called
> tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> tick, whatever that expires first.
> 
> Additionally, when cpuidle is about to invoke the ->enter() callback, then
> call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> struct cpuidle_device, as to make it available outside cpuidle.
> 
> Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> governor's ->select() callback has already made a decision whether to stop
> the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> the next timer expiration, whatever origin.
> 
> Cc: Lina Iyer <ilina@codeaurora.org>
> Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
> 
> Changes in v12:
> 	- New patch.
> 
> ---
>  drivers/cpuidle/cpuidle.c |  8 ++++++++
>  include/linux/cpuidle.h   |  1 +
>  include/linux/tick.h      |  7 ++++++-
>  kernel/time/tick-sched.c  | 12 ++++++++++++
>  4 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 7f108309e871..255365b1a6ab 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  		  int index)
>  {
> +	/*
> +	 * Store the next hrtimer, which becomes either next tick or the next
> +	 * timer event, whatever expires first. Additionally, to make this data
> +	 * useful for consumers outside cpuidle, we rely on that the governor's
> +	 * ->select() callback have decided, whether to stop the tick or not.
> +	 */
> +	dev->next_hrtimer = tick_nohz_get_next_hrtimer();

I would use WRITE_ONCE() to set next_hrtimer here and READ_ONCE() for
reading that value in the next patch, as a matter of annotation if
nothing else.

> +
>  	if (cpuidle_state_is_coupled(drv, index))
>  		return cpuidle_enter_state_coupled(dev, drv, index);
>  	return cpuidle_enter_state(dev, drv, index);

Also I would clear next_hrtimer here to avoid dragging stale values
around.

Apart from this the series LGTM.

Thanks!


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
  2019-03-25 12:19     ` Rafael J. Wysocki
@ 2019-03-25 14:23       ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-25 14:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Frederic Weisbecker, Thomas Gleixner, Sudeep Holla,
	Lorenzo Pieralisi, Mark Rutland, Daniel Lezcano,
	Raju P . L . S . S . S . N, Stephen Boyd, Tony Lindgren,
	Kevin Hilman, Lina Iyer, Viresh Kumar, Vincent Guittot,
	Geert Uytterhoeven, Linux ARM, linux-arm-msm,
	Linux Kernel Mailing List

On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > To be able to predict the sleep duration for a CPU that is entering idle,
> > knowing when the next timer/tick is going to expire, is extremely useful.
> > Both the teo and the menu cpuidle governors already makes use of this
> > information, while selecting an idle state.
> >
> > Moving forward, the similar prediction needs to be done, but for a group of
> > idle CPUs rather than for a single idle CPU. Following changes implements a
> > new genpd governor, which needs this.
> >
> > Support this, by sharing a new function called
> > tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> > tick, whatever that expires first.
> >
> > Additionally, when cpuidle is about to invoke the ->enter() callback, then
> > call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> > struct cpuidle_device, as to make it available outside cpuidle.
> >
> > Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> > governor's ->select() callback has already made a decision whether to stop
> > the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> > the next timer expiration, whatever origin.
> >
> > Cc: Lina Iyer <ilina@codeaurora.org>
> > Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> > Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v12:
> >       - New patch.
> >
> > ---
> >  drivers/cpuidle/cpuidle.c |  8 ++++++++
> >  include/linux/cpuidle.h   |  1 +
> >  include/linux/tick.h      |  7 ++++++-
> >  kernel/time/tick-sched.c  | 12 ++++++++++++
> >  4 files changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> > index 7f108309e871..255365b1a6ab 100644
> > --- a/drivers/cpuidle/cpuidle.c
> > +++ b/drivers/cpuidle/cpuidle.c
> > @@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> >  int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> >                 int index)
> >  {
> > +     /*
> > +      * Store the next hrtimer, which becomes either next tick or the next
> > +      * timer event, whatever expires first. Additionally, to make this data
> > +      * useful for consumers outside cpuidle, we rely on that the governor's
> > +      * ->select() callback have decided, whether to stop the tick or not.
> > +      */
> > +     dev->next_hrtimer = tick_nohz_get_next_hrtimer();
>
> I would use WRITE_ONCE() to set next_hrtimer here and READ_ONCE() for
> reading that value in the next patch, as a matter of annotation if
> nothing else.

Okay!

>
> > +
> >       if (cpuidle_state_is_coupled(drv, index))
> >               return cpuidle_enter_state_coupled(dev, drv, index);
> >       return cpuidle_enter_state(dev, drv, index);
>
> Also I would clear next_hrtimer here to avoid dragging stale values
> around.

Right, I can do that.

However, at least in my case it would be an unnecessary update of the
variable, as I am never in a path where the value can be "stale". Even
if one theoretically could use a stale value, it's seems likely to not
be an issue, don't you think? Anyway, if I don't hear from you, I do
the change as you suggested.

>
> Apart from this the series LGTM.

Great, thanks. I re-spin a new version.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-25 14:23       ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-25 14:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Linux PM, Stephen Boyd, Frederic Weisbecker,
	linux-arm-msm, Daniel Lezcano, Kevin Hilman, Lina Iyer,
	Linux Kernel Mailing List, Tony Lindgren, Viresh Kumar,
	Sudeep Holla, Thomas Gleixner, Raju P . L . S . S . S . N,
	Linux ARM

On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > To be able to predict the sleep duration for a CPU that is entering idle,
> > knowing when the next timer/tick is going to expire, is extremely useful.
> > Both the teo and the menu cpuidle governors already makes use of this
> > information, while selecting an idle state.
> >
> > Moving forward, the similar prediction needs to be done, but for a group of
> > idle CPUs rather than for a single idle CPU. Following changes implements a
> > new genpd governor, which needs this.
> >
> > Support this, by sharing a new function called
> > tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> > tick, whatever that expires first.
> >
> > Additionally, when cpuidle is about to invoke the ->enter() callback, then
> > call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> > struct cpuidle_device, as to make it available outside cpuidle.
> >
> > Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> > governor's ->select() callback has already made a decision whether to stop
> > the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> > the next timer expiration, whatever origin.
> >
> > Cc: Lina Iyer <ilina@codeaurora.org>
> > Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
> > Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v12:
> >       - New patch.
> >
> > ---
> >  drivers/cpuidle/cpuidle.c |  8 ++++++++
> >  include/linux/cpuidle.h   |  1 +
> >  include/linux/tick.h      |  7 ++++++-
> >  kernel/time/tick-sched.c  | 12 ++++++++++++
> >  4 files changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> > index 7f108309e871..255365b1a6ab 100644
> > --- a/drivers/cpuidle/cpuidle.c
> > +++ b/drivers/cpuidle/cpuidle.c
> > @@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> >  int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> >                 int index)
> >  {
> > +     /*
> > +      * Store the next hrtimer, which becomes either next tick or the next
> > +      * timer event, whatever expires first. Additionally, to make this data
> > +      * useful for consumers outside cpuidle, we rely on that the governor's
> > +      * ->select() callback have decided, whether to stop the tick or not.
> > +      */
> > +     dev->next_hrtimer = tick_nohz_get_next_hrtimer();
>
> I would use WRITE_ONCE() to set next_hrtimer here and READ_ONCE() for
> reading that value in the next patch, as a matter of annotation if
> nothing else.

Okay!

>
> > +
> >       if (cpuidle_state_is_coupled(drv, index))
> >               return cpuidle_enter_state_coupled(dev, drv, index);
> >       return cpuidle_enter_state(dev, drv, index);
>
> Also I would clear next_hrtimer here to avoid dragging stale values
> around.

Right, I can do that.

However, at least in my case it would be an unnecessary update of the
variable, as I am never in a path where the value can be "stale". Even
if one theoretically could use a stale value, it's seems likely to not
be an issue, don't you think? Anyway, if I don't hear from you, I do
the change as you suggested.

>
> Apart from this the series LGTM.

Great, thanks. I re-spin a new version.

Kind regards
Uffe

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
  2019-03-25 14:23       ` Ulf Hansson
  (?)
@ 2019-03-26 10:36         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2019-03-26 10:36 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Linux PM, Frederic Weisbecker,
	Thomas Gleixner, Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
	Daniel Lezcano, Raju P . L . S . S . S . N, Stephen Boyd,
	Tony Lindgren, Kevin Hilman, Lina Iyer, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, Linux ARM, linux-arm-msm

On Mon, Mar 25, 2019 at 3:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > > To be able to predict the sleep duration for a CPU that is entering idle,
> > > knowing when the next timer/tick is going to expire, is extremely useful.
> > > Both the teo and the menu cpuidle governors already makes use of this
> > > information, while selecting an idle state.
> > >

[cut]

> >
> > > +
> > >       if (cpuidle_state_is_coupled(drv, index))
> > >               return cpuidle_enter_state_coupled(dev, drv, index);
> > >       return cpuidle_enter_state(dev, drv, index);
> >
> > Also I would clear next_hrtimer here to avoid dragging stale values
> > around.
>
> Right, I can do that.
>
> However, at least in my case it would be an unnecessary update of the
> variable, as I am never in a path where the value can be "stale".

It easily can AFAICS.  After all, cpu_power_down_ok() need not run on
the same CPU that is setting next_hrtimer here.

> Even if one theoretically could use a stale value, it's seems likely to not
> be an issue, don't you think?

That would be because of the locking in the ->enter() callback I
suppose?  But is it actually universally guaranteed that setting
next_hrtimer will never be reordered with acquiring the lock?

Also, there is some overhead to be avoided if cpu_power_down_ok()
checked the next_hrtimer of the other CPUs against 0 explicitly, isn't
it?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-26 10:36         ` Rafael J. Wysocki
  0 siblings, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2019-03-26 10:36 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Linux PM, Frederic Weisbecker,
	Thomas Gleixner, Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
	Daniel Lezcano, Raju P . L . S . S . S . N, Stephen Boyd,
	Tony Lindgren, Kevin Hilman, Lina Iyer, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, Linux ARM, linux-arm-msm,
	Linux Kernel Mailing List

On Mon, Mar 25, 2019 at 3:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > > To be able to predict the sleep duration for a CPU that is entering idle,
> > > knowing when the next timer/tick is going to expire, is extremely useful.
> > > Both the teo and the menu cpuidle governors already makes use of this
> > > information, while selecting an idle state.
> > >

[cut]

> >
> > > +
> > >       if (cpuidle_state_is_coupled(drv, index))
> > >               return cpuidle_enter_state_coupled(dev, drv, index);
> > >       return cpuidle_enter_state(dev, drv, index);
> >
> > Also I would clear next_hrtimer here to avoid dragging stale values
> > around.
>
> Right, I can do that.
>
> However, at least in my case it would be an unnecessary update of the
> variable, as I am never in a path where the value can be "stale".

It easily can AFAICS.  After all, cpu_power_down_ok() need not run on
the same CPU that is setting next_hrtimer here.

> Even if one theoretically could use a stale value, it's seems likely to not
> be an issue, don't you think?

That would be because of the locking in the ->enter() callback I
suppose?  But is it actually universally guaranteed that setting
next_hrtimer will never be reordered with acquiring the lock?

Also, there is some overhead to be avoided if cpu_power_down_ok()
checked the next_hrtimer of the other CPUs against 0 explicitly, isn't
it?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-26 10:36         ` Rafael J. Wysocki
  0 siblings, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2019-03-26 10:36 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Linux PM, Stephen Boyd, Frederic Weisbecker,
	linux-arm-msm, Daniel Lezcano, Rafael J. Wysocki, Kevin Hilman,
	Lina Iyer, Linux Kernel Mailing List, Tony Lindgren,
	Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, Linux ARM

On Mon, Mar 25, 2019 at 3:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > > To be able to predict the sleep duration for a CPU that is entering idle,
> > > knowing when the next timer/tick is going to expire, is extremely useful.
> > > Both the teo and the menu cpuidle governors already makes use of this
> > > information, while selecting an idle state.
> > >

[cut]

> >
> > > +
> > >       if (cpuidle_state_is_coupled(drv, index))
> > >               return cpuidle_enter_state_coupled(dev, drv, index);
> > >       return cpuidle_enter_state(dev, drv, index);
> >
> > Also I would clear next_hrtimer here to avoid dragging stale values
> > around.
>
> Right, I can do that.
>
> However, at least in my case it would be an unnecessary update of the
> variable, as I am never in a path where the value can be "stale".

It easily can AFAICS.  After all, cpu_power_down_ok() need not run on
the same CPU that is setting next_hrtimer here.

> Even if one theoretically could use a stale value, it's seems likely to not
> be an issue, don't you think?

That would be because of the locking in the ->enter() callback I
suppose?  But is it actually universally guaranteed that setting
next_hrtimer will never be reordered with acquiring the lock?

Also, there is some overhead to be avoided if cpu_power_down_ok()
checked the next_hrtimer of the other CPUs against 0 explicitly, isn't
it?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
  2019-03-26 10:36         ` Rafael J. Wysocki
  (?)
@ 2019-03-26 11:29           ` Ulf Hansson
  -1 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-26 11:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, Frederic Weisbecker,
	Thomas Gleixner, Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
	Daniel Lezcano, Raju P . L . S . S . S . N, Stephen Boyd,
	Tony Lindgren, Kevin Hilman, Lina Iyer, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, Linux ARM, linux-arm-msm

On Tue, 26 Mar 2019 at 11:36, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Mar 25, 2019 at 3:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > >
> > > On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > > > To be able to predict the sleep duration for a CPU that is entering idle,
> > > > knowing when the next timer/tick is going to expire, is extremely useful.
> > > > Both the teo and the menu cpuidle governors already makes use of this
> > > > information, while selecting an idle state.
> > > >
>
> [cut]
>
> > >
> > > > +
> > > >       if (cpuidle_state_is_coupled(drv, index))
> > > >               return cpuidle_enter_state_coupled(dev, drv, index);
> > > >       return cpuidle_enter_state(dev, drv, index);
> > >
> > > Also I would clear next_hrtimer here to avoid dragging stale values
> > > around.
> >
> > Right, I can do that.
> >
> > However, at least in my case it would be an unnecessary update of the
> > variable, as I am never in a path where the value can be "stale".
>
> It easily can AFAICS.  After all, cpu_power_down_ok() need not run on
> the same CPU that is setting next_hrtimer here.

That's correct.

>
> > Even if one theoretically could use a stale value, it's seems likely to not
> > be an issue, don't you think?
>
> That would be because of the locking in the ->enter() callback I
> suppose?  But is it actually universally guaranteed that setting
> next_hrtimer will never be reordered with acquiring the lock?

In the PSCI case and for those CPUs that shares the same genpd
governor (even hierarchically), then yes.

Unfortunate, I haven't been able to explore this in that great detail
for other legacy ARM32 platforms, so maybe it's just better to play
safe, as you suggest and avoid a stale value.

>
> Also, there is some overhead to be avoided if cpu_power_down_ok()
> checked the next_hrtimer of the other CPUs against 0 explicitly, isn't
> it?

In regards to overhead and when using genpd for CPUs, there are a
couple of things I have in mind that we could try to improve. Yes,
checking for next_hrtimer against 0 could be one thing to consider.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-26 11:29           ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-26 11:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, Frederic Weisbecker,
	Thomas Gleixner, Sudeep Holla, Lorenzo Pieralisi, Mark Rutland,
	Daniel Lezcano, Raju P . L . S . S . S . N, Stephen Boyd,
	Tony Lindgren, Kevin Hilman, Lina Iyer, Viresh Kumar,
	Vincent Guittot, Geert Uytterhoeven, Linux ARM, linux-arm-msm,
	Linux Kernel Mailing List

On Tue, 26 Mar 2019 at 11:36, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Mar 25, 2019 at 3:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > >
> > > On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > > > To be able to predict the sleep duration for a CPU that is entering idle,
> > > > knowing when the next timer/tick is going to expire, is extremely useful.
> > > > Both the teo and the menu cpuidle governors already makes use of this
> > > > information, while selecting an idle state.
> > > >
>
> [cut]
>
> > >
> > > > +
> > > >       if (cpuidle_state_is_coupled(drv, index))
> > > >               return cpuidle_enter_state_coupled(dev, drv, index);
> > > >       return cpuidle_enter_state(dev, drv, index);
> > >
> > > Also I would clear next_hrtimer here to avoid dragging stale values
> > > around.
> >
> > Right, I can do that.
> >
> > However, at least in my case it would be an unnecessary update of the
> > variable, as I am never in a path where the value can be "stale".
>
> It easily can AFAICS.  After all, cpu_power_down_ok() need not run on
> the same CPU that is setting next_hrtimer here.

That's correct.

>
> > Even if one theoretically could use a stale value, it's seems likely to not
> > be an issue, don't you think?
>
> That would be because of the locking in the ->enter() callback I
> suppose?  But is it actually universally guaranteed that setting
> next_hrtimer will never be reordered with acquiring the lock?

In the PSCI case and for those CPUs that shares the same genpd
governor (even hierarchically), then yes.

Unfortunate, I haven't been able to explore this in that great detail
for other legacy ARM32 platforms, so maybe it's just better to play
safe, as you suggest and avoid a stale value.

>
> Also, there is some overhead to be avoided if cpu_power_down_ok()
> checked the next_hrtimer of the other CPUs against 0 explicitly, isn't
> it?

In regards to overhead and when using genpd for CPUs, there are a
couple of things I have in mind that we could try to improve. Yes,
checking for next_hrtimer against 0 could be one thing to consider.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU
@ 2019-03-26 11:29           ` Ulf Hansson
  0 siblings, 0 replies; 33+ messages in thread
From: Ulf Hansson @ 2019-03-26 11:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Mark Rutland, Lorenzo Pieralisi, Vincent Guittot,
	Geert Uytterhoeven, Linux PM, Stephen Boyd, Frederic Weisbecker,
	linux-arm-msm, Daniel Lezcano, Rafael J. Wysocki, Kevin Hilman,
	Lina Iyer, Linux Kernel Mailing List, Tony Lindgren,
	Viresh Kumar, Sudeep Holla, Thomas Gleixner,
	Raju P . L . S . S . S . N, Linux ARM

On Tue, 26 Mar 2019 at 11:36, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Mar 25, 2019 at 3:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > >
> > > On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > > > To be able to predict the sleep duration for a CPU that is entering idle,
> > > > knowing when the next timer/tick is going to expire, is extremely useful.
> > > > Both the teo and the menu cpuidle governors already makes use of this
> > > > information, while selecting an idle state.
> > > >
>
> [cut]
>
> > >
> > > > +
> > > >       if (cpuidle_state_is_coupled(drv, index))
> > > >               return cpuidle_enter_state_coupled(dev, drv, index);
> > > >       return cpuidle_enter_state(dev, drv, index);
> > >
> > > Also I would clear next_hrtimer here to avoid dragging stale values
> > > around.
> >
> > Right, I can do that.
> >
> > However, at least in my case it would be an unnecessary update of the
> > variable, as I am never in a path where the value can be "stale".
>
> It easily can AFAICS.  After all, cpu_power_down_ok() need not run on
> the same CPU that is setting next_hrtimer here.

That's correct.

>
> > Even if one theoretically could use a stale value, it's seems likely to not
> > be an issue, don't you think?
>
> That would be because of the locking in the ->enter() callback I
> suppose?  But is it actually universally guaranteed that setting
> next_hrtimer will never be reordered with acquiring the lock?

In the PSCI case and for those CPUs that shares the same genpd
governor (even hierarchically), then yes.

Unfortunate, I haven't been able to explore this in that great detail
for other legacy ARM32 platforms, so maybe it's just better to play
safe, as you suggest and avoid a stale value.

>
> Also, there is some overhead to be avoided if cpu_power_down_ok()
> checked the next_hrtimer of the other CPUs against 0 explicitly, isn't
> it?

In regards to overhead and when using genpd for CPUs, there are a
couple of things I have in mind that we could try to improve. Yes,
checking for next_hrtimer against 0 could be one thing to consider.

Kind regards
Uffe

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2019-03-26 11:30 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-27 19:58 [PATCH v12 0/4] PM / Domains: Support hierarchical CPU arrangement (PSCI/ARM) Ulf Hansson
2019-02-27 19:58 ` Ulf Hansson
2019-02-27 19:58 ` Ulf Hansson
2019-02-27 19:58 ` [PATCH v12 1/4] PM / Domains: Add a generic data pointer to the genpd_power_state struct Ulf Hansson
2019-02-27 19:58   ` Ulf Hansson
2019-03-21  7:47   ` Daniel Lezcano
2019-03-21  7:47     ` Daniel Lezcano
2019-03-21  9:36     ` Ulf Hansson
2019-03-21  9:36       ` Ulf Hansson
2019-03-21  9:52       ` Daniel Lezcano
2019-03-21  9:52         ` Daniel Lezcano
2019-02-27 19:58 ` [PATCH v12 2/4] PM / Domains: Add support for CPU devices to genpd Ulf Hansson
2019-02-27 19:58   ` Ulf Hansson
2019-03-21  9:54   ` Daniel Lezcano
2019-03-21  9:54     ` Daniel Lezcano
2019-02-27 19:58 ` [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU Ulf Hansson
2019-02-27 19:58   ` Ulf Hansson
2019-03-21 10:04   ` Daniel Lezcano
2019-03-21 10:04     ` Daniel Lezcano
2019-03-25 12:19   ` Rafael J. Wysocki
2019-03-25 12:19     ` Rafael J. Wysocki
2019-03-25 14:23     ` Ulf Hansson
2019-03-25 14:23       ` Ulf Hansson
2019-03-26 10:36       ` Rafael J. Wysocki
2019-03-26 10:36         ` Rafael J. Wysocki
2019-03-26 10:36         ` Rafael J. Wysocki
2019-03-26 11:29         ` Ulf Hansson
2019-03-26 11:29           ` Ulf Hansson
2019-03-26 11:29           ` Ulf Hansson
2019-02-27 19:58 ` [PATCH v12 4/4] PM / Domains: Add genpd governor for CPUs Ulf Hansson
2019-02-27 19:58   ` Ulf Hansson
2019-03-21 11:41   ` Daniel Lezcano
2019-03-21 11:41     ` Daniel Lezcano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.