All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/6] ARM generic idle states
@ 2014-06-11 16:18 ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, devicetree
  Cc: Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Sebastian Capella, Tomasz Figa, Mark Brown,
	Paul Walmsley, Chander Kashyap

This patch is v4 of a previous posting:

http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/253693.html

Changes in v4:

- States sorting using exit-latency
- Added cosmetic review comments
- Dropped RFC tag
- Rebased against 3.15

Changes in v3:

- Streamlined the idle states bindings and added them to the series
  http://www.spinics.net/lists/arm-kernel/msg316299.html
- Sorting states through min-residency+exit-latency
- Added debug strings formatting
- Reworded min-residency-us idle state property
- Removed power-domain properties from idle states waiting for code
  examples requiring them to be defined

Changes in v2:

- Moved OF parsing code to drivers/cpuidle
- Improved states detection and sorting through linked list
- Split code in generic and ARM64 specific bits
- Moved idle enter function into ARM64 idle driver
- Refactored PSCI idle states register function
- Renamed suspend operations and moved detection to ARM64 idle driver
- Changed the way CPUIDLE_FLAG_TIMER_STOP is handled
- Simplified idle state nodes parsing since according to the latest
  bindings idle state nodes are a flat list, not hierarchical anymore
- Used min-residency-us to sort the states, to be further discussed

Idle states on most ARM platforms can be characterized by a set of
parameters that are platform agnostic and describe the HW idle states
features. So far, CPU idle drivers for ARM platforms required the definition
of parameters through static tables, duplicating control data for different
platforms. Moreover, the lack of standardization on firmware interfaces
hampered any standardization effort, resulting in CPU idle drivers for ARM
platforms containing duplicated code and platform specific power down routines.

The introduction of the PSCI firmware interface, and more in general, well
defined suspend back-ends, allows the definition of generic idle states and
the respective kernel infrastructure to support them.

Building on top of DT idle states bindings, that standardize idle states
parameters and corresponding suspend back-ends, this patchset provides code
that parses DT idle states nodes and builds at run-time the control data
infrastructure required by the ARM CPU idle drivers.

Idle states define an entry method (eg PSCI), that requires the respective
ARM64 kernel back-end to be invoked to initialize idle states parameters, so
that when the idle driver executes the back-end specific entry method a table
look-up can be carried out to retrieve the corresponding idle state parameter.

On legacy ARM platforms, the OF idle states are just used to initialize
states data.

The idle states bindings can be extended with new back-ends; the ARM64 CPUidle
driver must be updated accordingly so that the corresponding back
end initializer can be invoked at boot time for parameters initialization.

Patchset has been tested on AEM v8 models, on top of bootwrapper PSCI CPU
SUSPEND implementation which provides simulated core power gating.

[1] http://www.spinics.net/lists/arm-kernel/msg316299.html

Lorenzo Pieralisi (6):
  Documentation: arm: define DT idle states bindings
  Documentation: devicetree: psci: define CPU suspend parameter
  drivers: cpuidle: implement OF based idle states infrastructure
  arm64: add PSCI CPU_SUSPEND based cpu_suspend support
  drivers: cpuidle: CPU idle ARM64 driver
  arm64: boot: dts: update rtsm aemv8 dts with PSCI and idle states

 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 507 +++++++++++++++++++++
 Documentation/devicetree/bindings/arm/psci.txt     |  12 +-
 arch/arm64/boot/dts/rtsm_ve-aemv8a.dts             |  44 +-
 arch/arm64/include/asm/psci.h                      |   4 +
 arch/arm64/kernel/psci.c                           | 103 +++++
 drivers/cpuidle/Kconfig                            |  14 +
 drivers/cpuidle/Kconfig.arm64                      |  13 +
 drivers/cpuidle/Makefile                           |   5 +
 drivers/cpuidle/cpuidle-arm64.c                    | 168 +++++++
 drivers/cpuidle/of_idle_states.c                   | 282 ++++++++++++
 drivers/cpuidle/of_idle_states.h                   |   8 +
 12 files changed, 1159 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt
 create mode 100644 drivers/cpuidle/Kconfig.arm64
 create mode 100644 drivers/cpuidle/cpuidle-arm64.c
 create mode 100644 drivers/cpuidle/of_idle_states.c
 create mode 100644 drivers/cpuidle/of_idle_states.h

-- 
1.8.4



^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 0/6] ARM generic idle states
@ 2014-06-11 16:18 ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

This patch is v4 of a previous posting:

http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/253693.html

Changes in v4:

- States sorting using exit-latency
- Added cosmetic review comments
- Dropped RFC tag
- Rebased against 3.15

Changes in v3:

- Streamlined the idle states bindings and added them to the series
  http://www.spinics.net/lists/arm-kernel/msg316299.html
- Sorting states through min-residency+exit-latency
- Added debug strings formatting
- Reworded min-residency-us idle state property
- Removed power-domain properties from idle states waiting for code
  examples requiring them to be defined

Changes in v2:

- Moved OF parsing code to drivers/cpuidle
- Improved states detection and sorting through linked list
- Split code in generic and ARM64 specific bits
- Moved idle enter function into ARM64 idle driver
- Refactored PSCI idle states register function
- Renamed suspend operations and moved detection to ARM64 idle driver
- Changed the way CPUIDLE_FLAG_TIMER_STOP is handled
- Simplified idle state nodes parsing since according to the latest
  bindings idle state nodes are a flat list, not hierarchical anymore
- Used min-residency-us to sort the states, to be further discussed

Idle states on most ARM platforms can be characterized by a set of
parameters that are platform agnostic and describe the HW idle states
features. So far, CPU idle drivers for ARM platforms required the definition
of parameters through static tables, duplicating control data for different
platforms. Moreover, the lack of standardization on firmware interfaces
hampered any standardization effort, resulting in CPU idle drivers for ARM
platforms containing duplicated code and platform specific power down routines.

The introduction of the PSCI firmware interface, and more in general, well
defined suspend back-ends, allows the definition of generic idle states and
the respective kernel infrastructure to support them.

Building on top of DT idle states bindings, that standardize idle states
parameters and corresponding suspend back-ends, this patchset provides code
that parses DT idle states nodes and builds at run-time the control data
infrastructure required by the ARM CPU idle drivers.

Idle states define an entry method (eg PSCI), that requires the respective
ARM64 kernel back-end to be invoked to initialize idle states parameters, so
that when the idle driver executes the back-end specific entry method a table
look-up can be carried out to retrieve the corresponding idle state parameter.

On legacy ARM platforms, the OF idle states are just used to initialize
states data.

The idle states bindings can be extended with new back-ends; the ARM64 CPUidle
driver must be updated accordingly so that the corresponding back
end initializer can be invoked at boot time for parameters initialization.

Patchset has been tested on AEM v8 models, on top of bootwrapper PSCI CPU
SUSPEND implementation which provides simulated core power gating.

[1] http://www.spinics.net/lists/arm-kernel/msg316299.html

Lorenzo Pieralisi (6):
  Documentation: arm: define DT idle states bindings
  Documentation: devicetree: psci: define CPU suspend parameter
  drivers: cpuidle: implement OF based idle states infrastructure
  arm64: add PSCI CPU_SUSPEND based cpu_suspend support
  drivers: cpuidle: CPU idle ARM64 driver
  arm64: boot: dts: update rtsm aemv8 dts with PSCI and idle states

 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 507 +++++++++++++++++++++
 Documentation/devicetree/bindings/arm/psci.txt     |  12 +-
 arch/arm64/boot/dts/rtsm_ve-aemv8a.dts             |  44 +-
 arch/arm64/include/asm/psci.h                      |   4 +
 arch/arm64/kernel/psci.c                           | 103 +++++
 drivers/cpuidle/Kconfig                            |  14 +
 drivers/cpuidle/Kconfig.arm64                      |  13 +
 drivers/cpuidle/Makefile                           |   5 +
 drivers/cpuidle/cpuidle-arm64.c                    | 168 +++++++
 drivers/cpuidle/of_idle_states.c                   | 282 ++++++++++++
 drivers/cpuidle/of_idle_states.h                   |   8 +
 12 files changed, 1159 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt
 create mode 100644 drivers/cpuidle/Kconfig.arm64
 create mode 100644 drivers/cpuidle/cpuidle-arm64.c
 create mode 100644 drivers/cpuidle/of_idle_states.c
 create mode 100644 drivers/cpuidle/of_idle_states.h

-- 
1.8.4

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-11 16:18 ` Lorenzo Pieralisi
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, devicetree
  Cc: Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Sebastian Capella, Tomasz Figa, Mark Brown,
	Paul Walmsley, Chander Kashyap

ARM based platforms implement a variety of power management schemes that
allow processors to enter idle states at run-time.
The parameters defining these idle states vary on a per-platform basis forcing
the OS to hardcode the state parameters in platform specific static tables
whose size grows as the number of platforms supported in the kernel increases
and hampers device drivers standardization.

Therefore, this patch aims at standardizing idle state device tree bindings for
ARM platforms. Bindings define idle state parameters inclusive of entry methods
and state latencies, to allow operating systems to retrieve the configuration
entries from the device tree and initialize the related power management
drivers, paving the way for common code in the kernel to deal with idle
states and removing the need for static data in current and previous kernel
versions.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 507 +++++++++++++++++++++
 2 files changed, 515 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 1fe72a0..a44d4fd 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -215,6 +215,12 @@ nodes to be present and contain the properties described below.
 		Value type: <phandle>
 		Definition: Specifies the ACC[2] node associated with this CPU.
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to idle state nodes supported
+			  by this cpu [3].
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
@@ -411,3 +417,5 @@ cpus {
 --
 [1] arm/msm/qcom,saw2.txt
 [2] arm/msm/qcom,kpss-acc.txt
+[3] ARM Linux kernel documentation - idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
new file mode 100644
index 0000000..223c425
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
@@ -0,0 +1,507 @@
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OSPM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time, can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+- Running
+- Idle_standby
+- Idle_retention
+- Sleep
+- Off
+
+The power states described in the SBSA document define the basic CPU states on
+top of which ARM platforms implement power management schemes that allow an OS
+PM implementation to put the processor in different idle states (which include
+states listed above; "off" state is not an idle state since it does not have
+wake-up capabilities, hence it is not considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OSPM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - idle-states node
+===========================================
+
+ARM processor idle states are defined within the idle-states node, which is
+a direct child of the cpus node [1] and provides a container where the
+processor idle states, defined as device tree nodes, are listed.
+
+- idle-states node
+
+	Usage: Optional - On ARM systems, is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby an idle-states node is not
+			  required.
+
+	Description: idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "idle-states".
+
+	The idle-states node's parent node must be the cpus node.
+
+	The idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+	An idle-states node defines the following properties:
+
+	- entry-method
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Describes the method by which a CPU enters the
+			    idle states. This property is required and must be
+			    one of:
+
+			    - "arm,psci"
+			      ARM PSCI firmware interface [2].
+
+			    - "[vendor],[method]"
+			      An implementation dependent string with
+			      format "vendor,method", where vendor is a string
+			      denoting the name of the manufacturer and
+			      method is a string specifying the mechanism
+			      used to enter the idle state.
+
+The nodes describing the idle states (state) can only be defined within the
+idle-states node, any other configuration is considered invalid and therefore
+must be ignored.
+
+===========================================
+3 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of the idle-states node
+
+	The state node name shall follow standard device tree naming
+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
+	are siblings within a single common parent must be given a unique name.
+
+	The idle state entered by executing the wfi instruction (idle_standby
+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
+	must not be listed.
+
+	A state node defines the following properties:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,idle-state".
+
+	- logic-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present logic is retained on state entry,
+			    otherwise it is lost.
+
+	- cache-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present cache memory is retained on state entry,
+			    otherwise it is lost.
+
+	- entry-method-param
+		Usage: See definition.
+		Value type: <u32>
+		Definition: Depends on the idle-states node entry-method
+			    property value. Refer to the entry-method bindings
+			    for this property value definition.
+
+	- entry-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to enter the idle state.
+
+	- exit-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing duration in microseconds
+			    after which this state becomes more energy
+			    efficient than any shallower states.
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <30>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <350>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <30>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <100>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu@10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu@10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu@10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu@100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu@100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu@100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu@100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu@100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu@100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu@100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu@100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <400>;
+			exit-latency-us = <500>;
+			min-residency-us = <300>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <1000>;
+			exit-latency-us = <1500>;
+			min-residency-us = <1500>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			min-residency-us = <500>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			min-residency-us = <6500>;
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu@102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu@103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+};
+
+===========================================
+4 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/
-- 
1.8.4



^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

ARM based platforms implement a variety of power management schemes that
allow processors to enter idle states at run-time.
The parameters defining these idle states vary on a per-platform basis forcing
the OS to hardcode the state parameters in platform specific static tables
whose size grows as the number of platforms supported in the kernel increases
and hampers device drivers standardization.

Therefore, this patch aims at standardizing idle state device tree bindings for
ARM platforms. Bindings define idle state parameters inclusive of entry methods
and state latencies, to allow operating systems to retrieve the configuration
entries from the device tree and initialize the related power management
drivers, paving the way for common code in the kernel to deal with idle
states and removing the need for static data in current and previous kernel
versions.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 507 +++++++++++++++++++++
 2 files changed, 515 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 1fe72a0..a44d4fd 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -215,6 +215,12 @@ nodes to be present and contain the properties described below.
 		Value type: <phandle>
 		Definition: Specifies the ACC[2] node associated with this CPU.
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to idle state nodes supported
+			  by this cpu [3].
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
@@ -411,3 +417,5 @@ cpus {
 --
 [1] arm/msm/qcom,saw2.txt
 [2] arm/msm/qcom,kpss-acc.txt
+[3] ARM Linux kernel documentation - idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
new file mode 100644
index 0000000..223c425
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
@@ -0,0 +1,507 @@
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OSPM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time, can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+- Running
+- Idle_standby
+- Idle_retention
+- Sleep
+- Off
+
+The power states described in the SBSA document define the basic CPU states on
+top of which ARM platforms implement power management schemes that allow an OS
+PM implementation to put the processor in different idle states (which include
+states listed above; "off" state is not an idle state since it does not have
+wake-up capabilities, hence it is not considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OSPM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - idle-states node
+===========================================
+
+ARM processor idle states are defined within the idle-states node, which is
+a direct child of the cpus node [1] and provides a container where the
+processor idle states, defined as device tree nodes, are listed.
+
+- idle-states node
+
+	Usage: Optional - On ARM systems, is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby an idle-states node is not
+			  required.
+
+	Description: idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "idle-states".
+
+	The idle-states node's parent node must be the cpus node.
+
+	The idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+	An idle-states node defines the following properties:
+
+	- entry-method
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Describes the method by which a CPU enters the
+			    idle states. This property is required and must be
+			    one of:
+
+			    - "arm,psci"
+			      ARM PSCI firmware interface [2].
+
+			    - "[vendor],[method]"
+			      An implementation dependent string with
+			      format "vendor,method", where vendor is a string
+			      denoting the name of the manufacturer and
+			      method is a string specifying the mechanism
+			      used to enter the idle state.
+
+The nodes describing the idle states (state) can only be defined within the
+idle-states node, any other configuration is considered invalid and therefore
+must be ignored.
+
+===========================================
+3 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of the idle-states node
+
+	The state node name shall follow standard device tree naming
+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
+	are siblings within a single common parent must be given a unique name.
+
+	The idle state entered by executing the wfi instruction (idle_standby
+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
+	must not be listed.
+
+	A state node defines the following properties:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,idle-state".
+
+	- logic-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present logic is retained on state entry,
+			    otherwise it is lost.
+
+	- cache-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present cache memory is retained on state entry,
+			    otherwise it is lost.
+
+	- entry-method-param
+		Usage: See definition.
+		Value type: <u32>
+		Definition: Depends on the idle-states node entry-method
+			    property value. Refer to the entry-method bindings
+			    for this property value definition.
+
+	- entry-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to enter the idle state.
+
+	- exit-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing duration in microseconds
+			    after which this state becomes more energy
+			    efficient than any shallower states.
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <30>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <350>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <30>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <100>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+		};
+	};
+
+	CPU0: cpu at 0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu at 1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu at 100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu at 101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu at 10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu at 10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu at 10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu at 10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu at 100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu at 100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu at 100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu at 100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu at 100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu at 100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu at 100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu at 100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <400>;
+			exit-latency-us = <500>;
+			min-residency-us = <300>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <1000>;
+			exit-latency-us = <1500>;
+			min-residency-us = <1500>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			min-residency-us = <500>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			min-residency-us = <6500>;
+		};
+	};
+
+	CPU0: cpu at 0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu at 1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu at 2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu at 3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu at 100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu at 101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu at 102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu at 103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+};
+
+===========================================
+4 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 2/6] Documentation: devicetree: psci: define CPU suspend parameter
  2014-06-11 16:18 ` Lorenzo Pieralisi
@ 2014-06-11 16:18     ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pm-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA
  Cc: Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Sebastian Capella, Tomasz Figa, Mark Brown,
	Paul Walmsley, Chander Kashyap

OS layers built on top of PSCI to enter low-power states require the
power_state parameter to be passed to the PSCI CPU suspend method.

This parameter is specific to a power state and platform specific,
therefore must be provided by firmware to the OS in order to enable
proper call sequence.

This patch adds a property in the PSCI bindings that describes how
the CPU suspend power_state parameter should be defined in DT in
all device nodes that rely on PSCI CPU suspend method usage.

Reviewed-by: Sebastian Capella <sebcape-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org>
---
 Documentation/devicetree/bindings/arm/psci.txt | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/arm/psci.txt b/Documentation/devicetree/bindings/arm/psci.txt
index b4a58f3..fae3eed 100644
--- a/Documentation/devicetree/bindings/arm/psci.txt
+++ b/Documentation/devicetree/bindings/arm/psci.txt
@@ -50,6 +50,14 @@ Main node optional properties:
 
  - migrate       : Function ID for MIGRATE operation
 
+Device tree nodes that require usage of PSCI CPU_SUSPEND function (ie idle
+states bindings[1]) must specify the following properties:
+
+- entry-method-param
+		Usage: Required for idle states bindings [1].
+		Value type: <u32>
+		Definition: power_state parameter to pass to the PSCI
+			    suspend call.
 
 Example:
 
@@ -64,7 +72,6 @@ Case 1: PSCI v0.1 only.
 		migrate		= <0x95c10003>;
 	};
 
-
 Case 2: PSCI v0.2 only
 
 	psci {
@@ -88,3 +95,6 @@ Case 3: PSCI v0.2 and PSCI v0.1.
 
 		...
 	};
+
+[1] Kernel documentation - ARM idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
-- 
1.8.4


--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 2/6] Documentation: devicetree: psci: define CPU suspend parameter
@ 2014-06-11 16:18     ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

OS layers built on top of PSCI to enter low-power states require the
power_state parameter to be passed to the PSCI CPU suspend method.

This parameter is specific to a power state and platform specific,
therefore must be provided by firmware to the OS in order to enable
proper call sequence.

This patch adds a property in the PSCI bindings that describes how
the CPU suspend power_state parameter should be defined in DT in
all device nodes that rely on PSCI CPU suspend method usage.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/psci.txt | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/arm/psci.txt b/Documentation/devicetree/bindings/arm/psci.txt
index b4a58f3..fae3eed 100644
--- a/Documentation/devicetree/bindings/arm/psci.txt
+++ b/Documentation/devicetree/bindings/arm/psci.txt
@@ -50,6 +50,14 @@ Main node optional properties:
 
  - migrate       : Function ID for MIGRATE operation
 
+Device tree nodes that require usage of PSCI CPU_SUSPEND function (ie idle
+states bindings[1]) must specify the following properties:
+
+- entry-method-param
+		Usage: Required for idle states bindings [1].
+		Value type: <u32>
+		Definition: power_state parameter to pass to the PSCI
+			    suspend call.
 
 Example:
 
@@ -64,7 +72,6 @@ Case 1: PSCI v0.1 only.
 		migrate		= <0x95c10003>;
 	};
 
-
 Case 2: PSCI v0.2 only
 
 	psci {
@@ -88,3 +95,6 @@ Case 3: PSCI v0.2 and PSCI v0.1.
 
 		...
 	};
+
+[1] Kernel documentation - ARM idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 16:18 ` Lorenzo Pieralisi
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, devicetree
  Cc: Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Sebastian Capella, Tomasz Figa, Mark Brown,
	Paul Walmsley, Chander Kashyap

On most common ARM systems, the low-power states a CPU can be put into are
not discoverable in HW and require device tree bindings to describe
power down suspend operations and idle states parameters.

In order to enable DT based idle states and configure idle drivers, this
patch implements the bulk infrastructure required to parse the device tree
idle states bindings and initialize the corresponding CPUidle driver states
data.

Code that initializes idle states checks the CPU idle driver cpumask so
that multiple CPU idle drivers can be initialized through it in the
kernel. The CPU idle driver cpumask defines which idle states should be
considered valid for the driver, ie idle states that are valid on a set
of cpus the idle driver manages.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 drivers/cpuidle/Kconfig          |   9 ++
 drivers/cpuidle/Makefile         |   1 +
 drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
 drivers/cpuidle/of_idle_states.h |   8 ++
 4 files changed, 300 insertions(+)
 create mode 100644 drivers/cpuidle/of_idle_states.c
 create mode 100644 drivers/cpuidle/of_idle_states.h

diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 1b96fb9..760ce20 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
 	bool "Menu governor (for tickless system)"
 	default y
 
+config OF_IDLE_STATES
+        bool "Idle states DT support"
+	depends on ARM || ARM64
+	default n
+	help
+	 Allows the CPU idle framework to initialize CPU idle drivers
+	 state data by using DT provided nodes compliant with idle states
+	 device tree bindings.
+
 menu "ARM CPU Idle Drivers"
 depends on ARM
 source "drivers/cpuidle/Kconfig.arm"
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d8bb1ff..d5ebf4b 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -4,6 +4,7 @@
 
 obj-y += cpuidle.o driver.o governor.o sysfs.o governors/
 obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
+obj-$(CONFIG_OF_IDLE_STATES)		  += of_idle_states.o
 
 ##################################################################################
 # ARM SoC drivers
diff --git a/drivers/cpuidle/of_idle_states.c b/drivers/cpuidle/of_idle_states.c
new file mode 100644
index 0000000..acdbf45
--- /dev/null
+++ b/drivers/cpuidle/of_idle_states.c
@@ -0,0 +1,282 @@
+/*
+ * OF idle states parsing code.
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) "OF idle-states: " fmt
+
+#include <linux/cpuidle.h>
+#include <linux/cpumask.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/list_sort.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/slab.h>
+
+#include "of_idle_states.h"
+
+struct state_elem {
+	struct list_head list;
+	struct device_node *node;
+	u32 val;
+};
+
+static struct list_head head __initdata = LIST_HEAD_INIT(head);
+
+static bool __init state_cpu_valid(struct device_node *state_node,
+				   struct device_node *cpu_node)
+{
+	int i = 0;
+	struct device_node *cpu_state;
+
+	while ((cpu_state = of_parse_phandle(cpu_node,
+					     "cpu-idle-states", i++))) {
+		if (cpu_state && state_node == cpu_state) {
+			of_node_put(cpu_state);
+			return true;
+		}
+		of_node_put(cpu_state);
+	}
+	return false;
+}
+
+static bool __init state_cpus_valid(const cpumask_t *cpus,
+				    struct device_node *state_node)
+{
+	int cpu;
+	struct device_node *cpu_node;
+
+	/*
+	 * Check if state is valid on driver cpumask cpus
+	 */
+	for_each_cpu(cpu, cpus) {
+		cpu_node = of_get_cpu_node(cpu, NULL);
+
+		if (!cpu_node) {
+			pr_err("Missing device node for CPU %d\n", cpu);
+			return false;
+		}
+
+		if (!state_cpu_valid(state_node, cpu_node))
+			return false;
+	}
+
+	return true;
+}
+
+static int __init state_cmp(void *priv, struct list_head *a,
+			    struct list_head *b)
+{
+	struct state_elem *ela, *elb;
+
+	ela = container_of(a, struct state_elem, list);
+	elb = container_of(b, struct state_elem, list);
+
+	return ela->val - elb->val;
+}
+
+static int __init add_state_node(cpumask_t *cpumask,
+				 struct device_node *state_node)
+{
+	struct state_elem *el;
+	u32 val;
+
+	pr_debug(" * %s...\n", state_node->full_name);
+
+	if (!state_cpus_valid(cpumask, state_node))
+		return -EINVAL;
+	/*
+	 * Parse just the property required to sort the states.
+	 * Since we are missing a value defining the energy
+	 * efficiency of a state, for now the sorting code uses
+	 *
+	 * exit-latency-us
+	 *
+	 * as sorting rank.
+	 */
+	if (of_property_read_u32(state_node, "exit-latency-us",
+				 &val)) {
+		pr_debug(" * %s missing exit-latency-us property\n",
+			     state_node->full_name);
+		return -EINVAL;
+	}
+
+	el = kmalloc(sizeof(*el), GFP_KERNEL);
+	if (!el) {
+		pr_err("%s failed to allocate memory\n", __func__);
+		return -ENOMEM;
+	}
+
+	el->node = state_node;
+	el->val = val;
+	list_add_tail(&el->list, &head);
+
+	return 0;
+}
+
+static void __init init_state_node(struct cpuidle_driver *drv,
+				   struct device_node *state_node,
+				   int *cnt)
+{
+	struct cpuidle_state *idle_state;
+
+	pr_debug(" * %s...\n", state_node->full_name);
+
+	idle_state = &drv->states[*cnt];
+
+	if (of_property_read_u32(state_node, "exit-latency-us",
+				 &idle_state->exit_latency)) {
+		pr_debug(" * %s missing exit-latency-us property\n",
+			     state_node->full_name);
+		return;
+	}
+
+	if (of_property_read_u32(state_node, "min-residency-us",
+				 &idle_state->target_residency)) {
+		pr_debug(" * %s missing min-residency-us property\n",
+			     state_node->full_name);
+		return;
+	}
+	/*
+	 * It is unknown to the idle driver if and when the tick_device
+	 * loses context when the CPU enters the idle states. To solve
+	 * this issue the tick device must be linked to a power domain
+	 * so that the idle driver can check on which states the device
+	 * loses its context. Current code takes the conservative choice
+	 * of defining the idle state as one where the tick device always
+	 * loses its context. On platforms where tick device never loses
+	 * its context (ie it is not a C3STOP device) this turns into
+	 * a nop. On platforms where the tick device does lose context in some
+	 * states, this code can be optimized, when power domain specifications
+	 * for ARM CPUs are finalized.
+	 */
+	idle_state->flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP;
+
+	strncpy(idle_state->name, state_node->name, CPUIDLE_NAME_LEN);
+	strncpy(idle_state->desc, state_node->name, CPUIDLE_NAME_LEN);
+
+	(*cnt)++;
+}
+
+static int __init init_idle_states(struct cpuidle_driver *drv,
+				   struct device_node *state_nodes[],
+				   unsigned int start_idx, bool init_nodes)
+{
+	struct state_elem *el;
+	struct list_head *curr, *tmp;
+	unsigned int cnt = start_idx;
+
+	list_for_each_entry(el, &head, list) {
+		/*
+		 * Check if the init function has to fill the
+		 * state_nodes array on behalf of the CPUidle driver.
+		 */
+		if (init_nodes)
+			state_nodes[cnt] = el->node;
+		/*
+		 * cnt is updated on return if a state was added.
+		 */
+		init_state_node(drv, el->node, &cnt);
+
+		if (cnt == CPUIDLE_STATE_MAX) {
+			pr_warn("State index reached static CPU idle state limit\n");
+			break;
+		}
+	}
+
+	drv->state_count = cnt;
+
+	list_for_each_safe(curr, tmp, &head) {
+		list_del(curr);
+		kfree(container_of(curr, struct state_elem, list));
+	}
+
+	/*
+	 * If no idle states are detected, return an error and let the idle
+	 * driver initialization fail accordingly.
+	 */
+	return (cnt > start_idx) ? 0 : -ENODATA;
+}
+
+static void __init add_idle_states(struct cpuidle_driver *drv,
+				   struct device_node *idle_states)
+{
+	struct device_node *state_node;
+
+	for_each_child_of_node(idle_states, state_node) {
+		if ((!of_device_is_compatible(state_node, "arm,idle-state"))) {
+			pr_warn(" * %s: children of /cpus/idle-states must be \"arm,idle-state\" compatible\n",
+				     state_node->full_name);
+			continue;
+		}
+		/*
+		 * If memory allocation fails, better bail out.
+		 * Initialized nodes are freed at initialization
+		 * completion in of_init_idle_driver().
+		 */
+		if ((add_state_node(drv->cpumask, state_node) == -ENOMEM))
+			break;
+	}
+	/*
+	 * Sort the states list before initializing the CPUidle driver
+	 * states array.
+	 */
+	list_sort(NULL, &head, state_cmp);
+}
+
+/**
+ * of_init_idle_driver() - Parse the DT idle states and initialize the
+ *			   idle driver states array
+ *
+ * @drv:	  Pointer to CPU idle driver to be initialized
+ * @state_nodes:  Array of struct device_nodes to be initialized if
+ *		  init_nodes == true. Must be sized CPUIDLE_STATE_MAX
+ * @start_idx:    First idle state index to be initialized
+ * @init_nodes:   Boolean to request device nodes initialization
+ *
+ * On success the states array in the cpuidle driver contains
+ * initialized entries in the states array, starting from index start_idx.
+ * If init_nodes == true, on success the state_nodes array is initialized
+ * with idle state DT node pointers, starting from index start_idx,
+ * in a 1:1 relation with the idle driver states array.
+ *
+ * Return:
+ *	0 on success
+ *	<0 on failure
+ */
+int __init of_init_idle_driver(struct cpuidle_driver *drv,
+			       struct device_node *state_nodes[],
+			       unsigned int start_idx, bool init_nodes)
+{
+	struct device_node *idle_states_node;
+	int ret;
+
+	if (start_idx >= CPUIDLE_STATE_MAX) {
+		pr_warn("State index exceeds static CPU idle driver states array size\n");
+		return -EINVAL;
+	}
+
+	if (WARN(init_nodes && !state_nodes,
+		"Requested nodes stashing in an invalid nodes container\n"))
+		return -EINVAL;
+
+	idle_states_node = of_find_node_by_path("/cpus/idle-states");
+	if (!idle_states_node)
+		return -ENOENT;
+
+	add_idle_states(drv, idle_states_node);
+
+	ret = init_idle_states(drv, state_nodes, start_idx, init_nodes);
+
+	of_node_put(idle_states_node);
+
+	return ret;
+}
diff --git a/drivers/cpuidle/of_idle_states.h b/drivers/cpuidle/of_idle_states.h
new file mode 100644
index 0000000..049f94f
--- /dev/null
+++ b/drivers/cpuidle/of_idle_states.h
@@ -0,0 +1,8 @@
+#ifndef __OF_IDLE_STATES
+#define __OF_IDLE_STATES
+
+int __init of_init_idle_driver(struct cpuidle_driver *drv,
+			       struct device_node *state_nodes[],
+			       unsigned int start_idx,
+			       bool init_nodes);
+#endif
-- 
1.8.4



^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

On most common ARM systems, the low-power states a CPU can be put into are
not discoverable in HW and require device tree bindings to describe
power down suspend operations and idle states parameters.

In order to enable DT based idle states and configure idle drivers, this
patch implements the bulk infrastructure required to parse the device tree
idle states bindings and initialize the corresponding CPUidle driver states
data.

Code that initializes idle states checks the CPU idle driver cpumask so
that multiple CPU idle drivers can be initialized through it in the
kernel. The CPU idle driver cpumask defines which idle states should be
considered valid for the driver, ie idle states that are valid on a set
of cpus the idle driver manages.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 drivers/cpuidle/Kconfig          |   9 ++
 drivers/cpuidle/Makefile         |   1 +
 drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
 drivers/cpuidle/of_idle_states.h |   8 ++
 4 files changed, 300 insertions(+)
 create mode 100644 drivers/cpuidle/of_idle_states.c
 create mode 100644 drivers/cpuidle/of_idle_states.h

diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 1b96fb9..760ce20 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
 	bool "Menu governor (for tickless system)"
 	default y
 
+config OF_IDLE_STATES
+        bool "Idle states DT support"
+	depends on ARM || ARM64
+	default n
+	help
+	 Allows the CPU idle framework to initialize CPU idle drivers
+	 state data by using DT provided nodes compliant with idle states
+	 device tree bindings.
+
 menu "ARM CPU Idle Drivers"
 depends on ARM
 source "drivers/cpuidle/Kconfig.arm"
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d8bb1ff..d5ebf4b 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -4,6 +4,7 @@
 
 obj-y += cpuidle.o driver.o governor.o sysfs.o governors/
 obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
+obj-$(CONFIG_OF_IDLE_STATES)		  += of_idle_states.o
 
 ##################################################################################
 # ARM SoC drivers
diff --git a/drivers/cpuidle/of_idle_states.c b/drivers/cpuidle/of_idle_states.c
new file mode 100644
index 0000000..acdbf45
--- /dev/null
+++ b/drivers/cpuidle/of_idle_states.c
@@ -0,0 +1,282 @@
+/*
+ * OF idle states parsing code.
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) "OF idle-states: " fmt
+
+#include <linux/cpuidle.h>
+#include <linux/cpumask.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/list_sort.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/slab.h>
+
+#include "of_idle_states.h"
+
+struct state_elem {
+	struct list_head list;
+	struct device_node *node;
+	u32 val;
+};
+
+static struct list_head head __initdata = LIST_HEAD_INIT(head);
+
+static bool __init state_cpu_valid(struct device_node *state_node,
+				   struct device_node *cpu_node)
+{
+	int i = 0;
+	struct device_node *cpu_state;
+
+	while ((cpu_state = of_parse_phandle(cpu_node,
+					     "cpu-idle-states", i++))) {
+		if (cpu_state && state_node == cpu_state) {
+			of_node_put(cpu_state);
+			return true;
+		}
+		of_node_put(cpu_state);
+	}
+	return false;
+}
+
+static bool __init state_cpus_valid(const cpumask_t *cpus,
+				    struct device_node *state_node)
+{
+	int cpu;
+	struct device_node *cpu_node;
+
+	/*
+	 * Check if state is valid on driver cpumask cpus
+	 */
+	for_each_cpu(cpu, cpus) {
+		cpu_node = of_get_cpu_node(cpu, NULL);
+
+		if (!cpu_node) {
+			pr_err("Missing device node for CPU %d\n", cpu);
+			return false;
+		}
+
+		if (!state_cpu_valid(state_node, cpu_node))
+			return false;
+	}
+
+	return true;
+}
+
+static int __init state_cmp(void *priv, struct list_head *a,
+			    struct list_head *b)
+{
+	struct state_elem *ela, *elb;
+
+	ela = container_of(a, struct state_elem, list);
+	elb = container_of(b, struct state_elem, list);
+
+	return ela->val - elb->val;
+}
+
+static int __init add_state_node(cpumask_t *cpumask,
+				 struct device_node *state_node)
+{
+	struct state_elem *el;
+	u32 val;
+
+	pr_debug(" * %s...\n", state_node->full_name);
+
+	if (!state_cpus_valid(cpumask, state_node))
+		return -EINVAL;
+	/*
+	 * Parse just the property required to sort the states.
+	 * Since we are missing a value defining the energy
+	 * efficiency of a state, for now the sorting code uses
+	 *
+	 * exit-latency-us
+	 *
+	 * as sorting rank.
+	 */
+	if (of_property_read_u32(state_node, "exit-latency-us",
+				 &val)) {
+		pr_debug(" * %s missing exit-latency-us property\n",
+			     state_node->full_name);
+		return -EINVAL;
+	}
+
+	el = kmalloc(sizeof(*el), GFP_KERNEL);
+	if (!el) {
+		pr_err("%s failed to allocate memory\n", __func__);
+		return -ENOMEM;
+	}
+
+	el->node = state_node;
+	el->val = val;
+	list_add_tail(&el->list, &head);
+
+	return 0;
+}
+
+static void __init init_state_node(struct cpuidle_driver *drv,
+				   struct device_node *state_node,
+				   int *cnt)
+{
+	struct cpuidle_state *idle_state;
+
+	pr_debug(" * %s...\n", state_node->full_name);
+
+	idle_state = &drv->states[*cnt];
+
+	if (of_property_read_u32(state_node, "exit-latency-us",
+				 &idle_state->exit_latency)) {
+		pr_debug(" * %s missing exit-latency-us property\n",
+			     state_node->full_name);
+		return;
+	}
+
+	if (of_property_read_u32(state_node, "min-residency-us",
+				 &idle_state->target_residency)) {
+		pr_debug(" * %s missing min-residency-us property\n",
+			     state_node->full_name);
+		return;
+	}
+	/*
+	 * It is unknown to the idle driver if and when the tick_device
+	 * loses context when the CPU enters the idle states. To solve
+	 * this issue the tick device must be linked to a power domain
+	 * so that the idle driver can check on which states the device
+	 * loses its context. Current code takes the conservative choice
+	 * of defining the idle state as one where the tick device always
+	 * loses its context. On platforms where tick device never loses
+	 * its context (ie it is not a C3STOP device) this turns into
+	 * a nop. On platforms where the tick device does lose context in some
+	 * states, this code can be optimized, when power domain specifications
+	 * for ARM CPUs are finalized.
+	 */
+	idle_state->flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP;
+
+	strncpy(idle_state->name, state_node->name, CPUIDLE_NAME_LEN);
+	strncpy(idle_state->desc, state_node->name, CPUIDLE_NAME_LEN);
+
+	(*cnt)++;
+}
+
+static int __init init_idle_states(struct cpuidle_driver *drv,
+				   struct device_node *state_nodes[],
+				   unsigned int start_idx, bool init_nodes)
+{
+	struct state_elem *el;
+	struct list_head *curr, *tmp;
+	unsigned int cnt = start_idx;
+
+	list_for_each_entry(el, &head, list) {
+		/*
+		 * Check if the init function has to fill the
+		 * state_nodes array on behalf of the CPUidle driver.
+		 */
+		if (init_nodes)
+			state_nodes[cnt] = el->node;
+		/*
+		 * cnt is updated on return if a state was added.
+		 */
+		init_state_node(drv, el->node, &cnt);
+
+		if (cnt == CPUIDLE_STATE_MAX) {
+			pr_warn("State index reached static CPU idle state limit\n");
+			break;
+		}
+	}
+
+	drv->state_count = cnt;
+
+	list_for_each_safe(curr, tmp, &head) {
+		list_del(curr);
+		kfree(container_of(curr, struct state_elem, list));
+	}
+
+	/*
+	 * If no idle states are detected, return an error and let the idle
+	 * driver initialization fail accordingly.
+	 */
+	return (cnt > start_idx) ? 0 : -ENODATA;
+}
+
+static void __init add_idle_states(struct cpuidle_driver *drv,
+				   struct device_node *idle_states)
+{
+	struct device_node *state_node;
+
+	for_each_child_of_node(idle_states, state_node) {
+		if ((!of_device_is_compatible(state_node, "arm,idle-state"))) {
+			pr_warn(" * %s: children of /cpus/idle-states must be \"arm,idle-state\" compatible\n",
+				     state_node->full_name);
+			continue;
+		}
+		/*
+		 * If memory allocation fails, better bail out.
+		 * Initialized nodes are freed at initialization
+		 * completion in of_init_idle_driver().
+		 */
+		if ((add_state_node(drv->cpumask, state_node) == -ENOMEM))
+			break;
+	}
+	/*
+	 * Sort the states list before initializing the CPUidle driver
+	 * states array.
+	 */
+	list_sort(NULL, &head, state_cmp);
+}
+
+/**
+ * of_init_idle_driver() - Parse the DT idle states and initialize the
+ *			   idle driver states array
+ *
+ * @drv:	  Pointer to CPU idle driver to be initialized
+ * @state_nodes:  Array of struct device_nodes to be initialized if
+ *		  init_nodes == true. Must be sized CPUIDLE_STATE_MAX
+ * @start_idx:    First idle state index to be initialized
+ * @init_nodes:   Boolean to request device nodes initialization
+ *
+ * On success the states array in the cpuidle driver contains
+ * initialized entries in the states array, starting from index start_idx.
+ * If init_nodes == true, on success the state_nodes array is initialized
+ * with idle state DT node pointers, starting from index start_idx,
+ * in a 1:1 relation with the idle driver states array.
+ *
+ * Return:
+ *	0 on success
+ *	<0 on failure
+ */
+int __init of_init_idle_driver(struct cpuidle_driver *drv,
+			       struct device_node *state_nodes[],
+			       unsigned int start_idx, bool init_nodes)
+{
+	struct device_node *idle_states_node;
+	int ret;
+
+	if (start_idx >= CPUIDLE_STATE_MAX) {
+		pr_warn("State index exceeds static CPU idle driver states array size\n");
+		return -EINVAL;
+	}
+
+	if (WARN(init_nodes && !state_nodes,
+		"Requested nodes stashing in an invalid nodes container\n"))
+		return -EINVAL;
+
+	idle_states_node = of_find_node_by_path("/cpus/idle-states");
+	if (!idle_states_node)
+		return -ENOENT;
+
+	add_idle_states(drv, idle_states_node);
+
+	ret = init_idle_states(drv, state_nodes, start_idx, init_nodes);
+
+	of_node_put(idle_states_node);
+
+	return ret;
+}
diff --git a/drivers/cpuidle/of_idle_states.h b/drivers/cpuidle/of_idle_states.h
new file mode 100644
index 0000000..049f94f
--- /dev/null
+++ b/drivers/cpuidle/of_idle_states.h
@@ -0,0 +1,8 @@
+#ifndef __OF_IDLE_STATES
+#define __OF_IDLE_STATES
+
+int __init of_init_idle_driver(struct cpuidle_driver *drv,
+			       struct device_node *state_nodes[],
+			       unsigned int start_idx,
+			       bool init_nodes);
+#endif
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 4/6] arm64: add PSCI CPU_SUSPEND based cpu_suspend support
  2014-06-11 16:18 ` Lorenzo Pieralisi
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, devicetree
  Cc: Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Sebastian Capella, Tomasz Figa, Mark Brown,
	Paul Walmsley, Chander Kashyap

This patch implements the cpu_suspend cpu operations method through
the PSCI CPU_SUSPEND API. The PSCI implementation translates the idle state
index passed by the cpu_suspend core call into a valid PSCI state according to
the PSCI states initialized at boot by the PSCI suspend backend.

Entry point is set to cpu_resume physical address, that represents the
default kernel execution address following a CPU reset.

Idle state indices missing a DT node description are initialized to power
state standby WFI so that if called by the idle driver they provide the
default behaviour.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/include/asm/psci.h |   4 ++
 arch/arm64/kernel/psci.c      | 103 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/arch/arm64/include/asm/psci.h b/arch/arm64/include/asm/psci.h
index e5312ea..16c1351 100644
--- a/arch/arm64/include/asm/psci.h
+++ b/arch/arm64/include/asm/psci.h
@@ -14,6 +14,10 @@
 #ifndef __ASM_PSCI_H
 #define __ASM_PSCI_H
 
+struct cpuidle_driver;
 int psci_init(void);
 
+int __init psci_dt_register_idle_states(struct cpuidle_driver *,
+					struct device_node *[]);
+
 #endif /* __ASM_PSCI_H */
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 9e9798f..f708bcc 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -15,12 +15,14 @@
 
 #define pr_fmt(fmt) "psci: " fmt
 
+#include <linux/cpuidle.h>
 #include <linux/init.h>
 #include <linux/of.h>
 #include <linux/smp.h>
 #include <linux/reboot.h>
 #include <linux/pm.h>
 #include <linux/delay.h>
+#include <linux/slab.h>
 #include <uapi/linux/psci.h>
 
 #include <asm/compiler.h>
@@ -28,6 +30,7 @@
 #include <asm/errno.h>
 #include <asm/psci.h>
 #include <asm/smp_plat.h>
+#include <asm/suspend.h>
 #include <asm/system_misc.h>
 
 #define PSCI_POWER_STATE_TYPE_STANDBY		0
@@ -65,6 +68,8 @@ enum psci_function {
 	PSCI_FN_MAX,
 };
 
+static DEFINE_PER_CPU_READ_MOSTLY(struct psci_power_state *, psci_power_state);
+
 static u32 psci_function_id[PSCI_FN_MAX];
 
 static int psci_to_linux_errno(int errno)
@@ -93,6 +98,18 @@ static u32 psci_power_state_pack(struct psci_power_state state)
 		 & PSCI_0_2_POWER_STATE_AFFL_MASK);
 }
 
+static void psci_power_state_unpack(u32 power_state,
+				    struct psci_power_state *state)
+{
+	state->id = (power_state & PSCI_0_2_POWER_STATE_ID_MASK) >>
+			PSCI_0_2_POWER_STATE_ID_SHIFT;
+	state->type = (power_state & PSCI_0_2_POWER_STATE_TYPE_MASK) >>
+			PSCI_0_2_POWER_STATE_TYPE_SHIFT;
+	state->affinity_level =
+			(power_state & PSCI_0_2_POWER_STATE_AFFL_MASK) >>
+			PSCI_0_2_POWER_STATE_AFFL_SHIFT;
+}
+
 /*
  * The following two functions are invoked via the invoke_psci_fn pointer
  * and will not be inlined, allowing us to piggyback on the AAPCS.
@@ -199,6 +216,77 @@ static int psci_migrate_info_type(void)
 	return err;
 }
 
+int __init psci_dt_register_idle_states(struct cpuidle_driver *drv,
+					struct device_node *state_nodes[])
+{
+	int cpu, i;
+	struct psci_power_state *psci_states;
+	const struct cpu_operations *cpu_ops_ptr;
+
+	if (!state_nodes)
+		return -EINVAL;
+	/*
+	 * This is belt-and-braces: make sure that if the idle
+	 * specified protocol is psci, the cpu_ops have been
+	 * initialized to psci operations. Anything else is
+	 * a recipe for mayhem.
+	 */
+	for_each_cpu(cpu, drv->cpumask) {
+		cpu_ops_ptr = cpu_ops[cpu];
+		if (WARN_ON(!cpu_ops_ptr || strcmp(cpu_ops_ptr->name, "psci")))
+			return -EOPNOTSUPP;
+	}
+
+	psci_states = kcalloc(drv->state_count, sizeof(*psci_states),
+			      GFP_KERNEL);
+
+	if (!psci_states) {
+		pr_warn("psci idle state allocation failed\n");
+		return -ENOMEM;
+	}
+
+	for_each_cpu(cpu, drv->cpumask) {
+		if (per_cpu(psci_power_state, cpu)) {
+			pr_warn("idle states already initialized on cpu %u\n",
+				cpu);
+			continue;
+		}
+		per_cpu(psci_power_state, cpu) = psci_states;
+	}
+
+
+	for (i = 0; i < drv->state_count; i++) {
+		u32 psci_power_state;
+
+		if (!state_nodes[i]) {
+			/*
+			 * An index with a missing node pointer falls back to
+			 * simple STANDBYWFI
+			 */
+			psci_states[i].type = PSCI_POWER_STATE_TYPE_STANDBY;
+			continue;
+		}
+
+		if (of_property_read_u32(state_nodes[i], "entry-method-param",
+					 &psci_power_state)) {
+			pr_warn(" * %s missing entry-method-param property\n",
+				state_nodes[i]->full_name);
+			/*
+			 * If entry-method-param property is missing, fall
+			 * back to STANDBYWFI state
+			 */
+			psci_states[i].type = PSCI_POWER_STATE_TYPE_STANDBY;
+			continue;
+		}
+
+		pr_debug("psci-power-state %#x index %u\n", psci_power_state,
+							    i);
+		psci_power_state_unpack(psci_power_state, &psci_states[i]);
+	}
+
+	return 0;
+}
+
 static int get_set_conduit_method(struct device_node *np)
 {
 	const char *method;
@@ -435,6 +523,18 @@ static int cpu_psci_cpu_kill(unsigned int cpu)
 }
 #endif
 
+#ifdef CONFIG_ARM64_CPU_SUSPEND
+static int cpu_psci_cpu_suspend(unsigned long index)
+{
+	struct psci_power_state *state = __get_cpu_var(psci_power_state);
+
+	if (!state)
+		return -EOPNOTSUPP;
+
+	return psci_ops.cpu_suspend(state[index], virt_to_phys(cpu_resume));
+}
+#endif
+
 const struct cpu_operations cpu_psci_ops = {
 	.name		= "psci",
 	.cpu_init	= cpu_psci_cpu_init,
@@ -445,6 +545,9 @@ const struct cpu_operations cpu_psci_ops = {
 	.cpu_die	= cpu_psci_cpu_die,
 	.cpu_kill	= cpu_psci_cpu_kill,
 #endif
+#ifdef CONFIG_ARM64_CPU_SUSPEND
+	.cpu_suspend	= cpu_psci_cpu_suspend,
+#endif
 };
 
 #endif
-- 
1.8.4



^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 4/6] arm64: add PSCI CPU_SUSPEND based cpu_suspend support
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

This patch implements the cpu_suspend cpu operations method through
the PSCI CPU_SUSPEND API. The PSCI implementation translates the idle state
index passed by the cpu_suspend core call into a valid PSCI state according to
the PSCI states initialized at boot by the PSCI suspend backend.

Entry point is set to cpu_resume physical address, that represents the
default kernel execution address following a CPU reset.

Idle state indices missing a DT node description are initialized to power
state standby WFI so that if called by the idle driver they provide the
default behaviour.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/include/asm/psci.h |   4 ++
 arch/arm64/kernel/psci.c      | 103 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/arch/arm64/include/asm/psci.h b/arch/arm64/include/asm/psci.h
index e5312ea..16c1351 100644
--- a/arch/arm64/include/asm/psci.h
+++ b/arch/arm64/include/asm/psci.h
@@ -14,6 +14,10 @@
 #ifndef __ASM_PSCI_H
 #define __ASM_PSCI_H
 
+struct cpuidle_driver;
 int psci_init(void);
 
+int __init psci_dt_register_idle_states(struct cpuidle_driver *,
+					struct device_node *[]);
+
 #endif /* __ASM_PSCI_H */
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 9e9798f..f708bcc 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -15,12 +15,14 @@
 
 #define pr_fmt(fmt) "psci: " fmt
 
+#include <linux/cpuidle.h>
 #include <linux/init.h>
 #include <linux/of.h>
 #include <linux/smp.h>
 #include <linux/reboot.h>
 #include <linux/pm.h>
 #include <linux/delay.h>
+#include <linux/slab.h>
 #include <uapi/linux/psci.h>
 
 #include <asm/compiler.h>
@@ -28,6 +30,7 @@
 #include <asm/errno.h>
 #include <asm/psci.h>
 #include <asm/smp_plat.h>
+#include <asm/suspend.h>
 #include <asm/system_misc.h>
 
 #define PSCI_POWER_STATE_TYPE_STANDBY		0
@@ -65,6 +68,8 @@ enum psci_function {
 	PSCI_FN_MAX,
 };
 
+static DEFINE_PER_CPU_READ_MOSTLY(struct psci_power_state *, psci_power_state);
+
 static u32 psci_function_id[PSCI_FN_MAX];
 
 static int psci_to_linux_errno(int errno)
@@ -93,6 +98,18 @@ static u32 psci_power_state_pack(struct psci_power_state state)
 		 & PSCI_0_2_POWER_STATE_AFFL_MASK);
 }
 
+static void psci_power_state_unpack(u32 power_state,
+				    struct psci_power_state *state)
+{
+	state->id = (power_state & PSCI_0_2_POWER_STATE_ID_MASK) >>
+			PSCI_0_2_POWER_STATE_ID_SHIFT;
+	state->type = (power_state & PSCI_0_2_POWER_STATE_TYPE_MASK) >>
+			PSCI_0_2_POWER_STATE_TYPE_SHIFT;
+	state->affinity_level =
+			(power_state & PSCI_0_2_POWER_STATE_AFFL_MASK) >>
+			PSCI_0_2_POWER_STATE_AFFL_SHIFT;
+}
+
 /*
  * The following two functions are invoked via the invoke_psci_fn pointer
  * and will not be inlined, allowing us to piggyback on the AAPCS.
@@ -199,6 +216,77 @@ static int psci_migrate_info_type(void)
 	return err;
 }
 
+int __init psci_dt_register_idle_states(struct cpuidle_driver *drv,
+					struct device_node *state_nodes[])
+{
+	int cpu, i;
+	struct psci_power_state *psci_states;
+	const struct cpu_operations *cpu_ops_ptr;
+
+	if (!state_nodes)
+		return -EINVAL;
+	/*
+	 * This is belt-and-braces: make sure that if the idle
+	 * specified protocol is psci, the cpu_ops have been
+	 * initialized to psci operations. Anything else is
+	 * a recipe for mayhem.
+	 */
+	for_each_cpu(cpu, drv->cpumask) {
+		cpu_ops_ptr = cpu_ops[cpu];
+		if (WARN_ON(!cpu_ops_ptr || strcmp(cpu_ops_ptr->name, "psci")))
+			return -EOPNOTSUPP;
+	}
+
+	psci_states = kcalloc(drv->state_count, sizeof(*psci_states),
+			      GFP_KERNEL);
+
+	if (!psci_states) {
+		pr_warn("psci idle state allocation failed\n");
+		return -ENOMEM;
+	}
+
+	for_each_cpu(cpu, drv->cpumask) {
+		if (per_cpu(psci_power_state, cpu)) {
+			pr_warn("idle states already initialized on cpu %u\n",
+				cpu);
+			continue;
+		}
+		per_cpu(psci_power_state, cpu) = psci_states;
+	}
+
+
+	for (i = 0; i < drv->state_count; i++) {
+		u32 psci_power_state;
+
+		if (!state_nodes[i]) {
+			/*
+			 * An index with a missing node pointer falls back to
+			 * simple STANDBYWFI
+			 */
+			psci_states[i].type = PSCI_POWER_STATE_TYPE_STANDBY;
+			continue;
+		}
+
+		if (of_property_read_u32(state_nodes[i], "entry-method-param",
+					 &psci_power_state)) {
+			pr_warn(" * %s missing entry-method-param property\n",
+				state_nodes[i]->full_name);
+			/*
+			 * If entry-method-param property is missing, fall
+			 * back to STANDBYWFI state
+			 */
+			psci_states[i].type = PSCI_POWER_STATE_TYPE_STANDBY;
+			continue;
+		}
+
+		pr_debug("psci-power-state %#x index %u\n", psci_power_state,
+							    i);
+		psci_power_state_unpack(psci_power_state, &psci_states[i]);
+	}
+
+	return 0;
+}
+
 static int get_set_conduit_method(struct device_node *np)
 {
 	const char *method;
@@ -435,6 +523,18 @@ static int cpu_psci_cpu_kill(unsigned int cpu)
 }
 #endif
 
+#ifdef CONFIG_ARM64_CPU_SUSPEND
+static int cpu_psci_cpu_suspend(unsigned long index)
+{
+	struct psci_power_state *state = __get_cpu_var(psci_power_state);
+
+	if (!state)
+		return -EOPNOTSUPP;
+
+	return psci_ops.cpu_suspend(state[index], virt_to_phys(cpu_resume));
+}
+#endif
+
 const struct cpu_operations cpu_psci_ops = {
 	.name		= "psci",
 	.cpu_init	= cpu_psci_cpu_init,
@@ -445,6 +545,9 @@ const struct cpu_operations cpu_psci_ops = {
 	.cpu_die	= cpu_psci_cpu_die,
 	.cpu_kill	= cpu_psci_cpu_kill,
 #endif
+#ifdef CONFIG_ARM64_CPU_SUSPEND
+	.cpu_suspend	= cpu_psci_cpu_suspend,
+#endif
 };
 
 #endif
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
  2014-06-11 16:18 ` Lorenzo Pieralisi
@ 2014-06-11 16:18     ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-pm-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA
  Cc: Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Sebastian Capella, Tomasz Figa, Mark Brown,
	Paul Walmsley, Chander Kashyap

This patch implements a generic CPU idle driver for ARM64 machines.

It relies on the DT idle states infrastructure to initialize idle
states count and respective parameters. Current code assumes the driver
is managing idle states on all possible CPUs but can be easily
generalized to support heterogenous systems and build cpumasks at
runtime using MIDRs or DT cpu nodes compatible properties.

Suspend back-ends (eg PSCI) must register a suspend initializer with
the CPU idle driver so that the suspend backend call can be detected,
and the driver code can call the back-end infrastructure to complete the
suspend backend initialization.

Idle state index 0 is always initialized as a simple wfi state, ie always
considered present and functional on all ARM64 platforms.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org>
---
 drivers/cpuidle/Kconfig         |   5 ++
 drivers/cpuidle/Kconfig.arm64   |  13 ++++
 drivers/cpuidle/Makefile        |   4 +
 drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 190 insertions(+)
 create mode 100644 drivers/cpuidle/Kconfig.arm64
 create mode 100644 drivers/cpuidle/cpuidle-arm64.c

diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 760ce20..360c086 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -44,6 +44,11 @@ depends on ARM
 source "drivers/cpuidle/Kconfig.arm"
 endmenu
 
+menu "ARM64 CPU Idle Drivers"
+depends on ARM64
+source "drivers/cpuidle/Kconfig.arm64"
+endmenu
+
 menu "MIPS CPU Idle Drivers"
 depends on MIPS
 source "drivers/cpuidle/Kconfig.mips"
diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
new file mode 100644
index 0000000..b83612c
--- /dev/null
+++ b/drivers/cpuidle/Kconfig.arm64
@@ -0,0 +1,13 @@
+#
+# ARM64 CPU Idle drivers
+#
+
+config ARM64_CPUIDLE
+	bool "Generic ARM64 CPU idle Driver"
+	select OF_IDLE_STATES
+	help
+	  Select this to enable generic cpuidle driver for ARM v8.
+	  It provides a generic idle driver whose idle states are configured
+	  at run-time through DT nodes. The CPUidle suspend backend is
+	  initialized by the device tree parsing code on matching the entry
+	  method to the respective CPU operations.
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d5ebf4b..e496242 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
 obj-$(CONFIG_MIPS_CPS_CPUIDLE)		+= cpuidle-cps.o
 
 ###############################################################################
+# ARM64 drivers
+obj-$(CONFIG_ARM64_CPUIDLE)		+= cpuidle-arm64.o
+
+###############################################################################
 # POWERPC drivers
 obj-$(CONFIG_PSERIES_CPUIDLE)		+= cpuidle-pseries.o
 obj-$(CONFIG_POWERNV_CPUIDLE)		+= cpuidle-powernv.o
diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
new file mode 100644
index 0000000..4c932f8
--- /dev/null
+++ b/drivers/cpuidle/cpuidle-arm64.c
@@ -0,0 +1,168 @@
+/*
+ * ARM64 generic CPU idle driver.
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ * Author: Lorenzo Pieralisi <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) "CPUidle arm64: " fmt
+
+#include <linux/cpuidle.h>
+#include <linux/cpumask.h>
+#include <linux/cpu_pm.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+
+#include <asm/psci.h>
+#include <asm/suspend.h>
+
+#include "of_idle_states.h"
+
+typedef int (*suspend_init_fn)(struct cpuidle_driver *,
+			       struct device_node *[]);
+
+struct cpu_suspend_ops {
+	const char *id;
+	suspend_init_fn init_fn;
+};
+
+static const struct cpu_suspend_ops suspend_operations[] __initconst = {
+	{"arm,psci", psci_dt_register_idle_states},
+	{}
+};
+
+static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
+{
+	int i;
+
+	if (!str)
+		return NULL;
+
+	for (i = 0; suspend_operations[i].id; i++)
+		if (!strcmp(suspend_operations[i].id, str))
+			return &suspend_operations[i];
+
+	return NULL;
+}
+
+/*
+ * arm_enter_idle_state - Programs CPU to enter the specified state
+ *
+ * dev: cpuidle device
+ * drv: cpuidle driver
+ * idx: state index
+ *
+ * Called from the CPUidle framework to program the device to the
+ * specified target state selected by the governor.
+ */
+static int arm_enter_idle_state(struct cpuidle_device *dev,
+				struct cpuidle_driver *drv, int idx)
+{
+	int ret;
+
+	if (!idx) {
+		cpu_do_idle();
+		return idx;
+	}
+
+	cpu_pm_enter();
+	/*
+	 * Pass idle state index to cpu_suspend which in turn will call
+	 * the CPU ops suspend protocol with idle index as a parameter.
+	 *
+	 * Some states would not require context to be saved and flushed
+	 * to DRAM, so calling cpu_suspend would not be stricly necessary.
+	 * When power domains specifications for ARM CPUs are finalized then
+	 * this code can be optimized to prevent saving registers if not
+	 * needed.
+	 */
+	ret = cpu_suspend(idx);
+
+	cpu_pm_exit();
+
+	return ret ? -1 : idx;
+}
+
+struct cpuidle_driver arm64_idle_driver = {
+	.name = "arm64_idle",
+	.owner = THIS_MODULE,
+};
+
+static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
+
+/*
+ * arm64_idle_init
+ *
+ * Registers the arm64 specific cpuidle driver with the cpuidle
+ * framework. It relies on core code to parse the idle states
+ * and initialize them using driver data structures accordingly.
+ */
+static int __init arm64_idle_init(void)
+{
+	int i, ret;
+	const char *entry_method;
+	struct device_node *idle_states_node;
+	const struct cpu_suspend_ops *suspend_init;
+	struct cpuidle_driver *drv = &arm64_idle_driver;
+
+	idle_states_node = of_find_node_by_path("/cpus/idle-states");
+	if (!idle_states_node)
+		return -ENOENT;
+
+	if (of_property_read_string(idle_states_node, "entry-method",
+				    &entry_method)) {
+		pr_warn(" * %s missing entry-method property\n",
+			    idle_states_node->full_name);
+		of_node_put(idle_states_node);
+		ret = -EOPNOTSUPP;
+		goto put_node;
+	}
+
+	suspend_init = get_suspend_ops(entry_method);
+	if (!suspend_init) {
+		pr_warn("Missing suspend initializer\n");
+		ret = -EOPNOTSUPP;
+		goto put_node;
+	}
+
+	/*
+	 * State at index 0 is standby wfi and considered standard
+	 * on all ARM platforms. If in some platforms simple wfi
+	 * can't be used as "state 0", DT bindings must be implemented
+	 * to work around this issue and allow installing a special
+	 * handler for idle state index 0.
+	 */
+	drv->states[0].exit_latency = 1;
+	drv->states[0].target_residency = 1;
+	drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
+	strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
+	strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
+
+	drv->cpumask = (struct cpumask *) cpu_possible_mask;
+	/*
+	 * Start at index 1, request idle state nodes to be filled
+	 */
+	ret = of_init_idle_driver(drv, state_nodes, 1, true);
+	if (ret)
+		goto put_node;
+
+	if (suspend_init->init_fn(drv, state_nodes)) {
+		ret = -EOPNOTSUPP;
+		goto put_node;
+	}
+
+	for (i = 0; i < drv->state_count; i++)
+		drv->states[i].enter = arm_enter_idle_state;
+
+	ret = cpuidle_register(drv, NULL);
+
+put_node:
+	of_node_put(idle_states_node);
+	return ret;
+}
+device_initcall(arm64_idle_init);
-- 
1.8.4


--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
@ 2014-06-11 16:18     ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

This patch implements a generic CPU idle driver for ARM64 machines.

It relies on the DT idle states infrastructure to initialize idle
states count and respective parameters. Current code assumes the driver
is managing idle states on all possible CPUs but can be easily
generalized to support heterogenous systems and build cpumasks at
runtime using MIDRs or DT cpu nodes compatible properties.

Suspend back-ends (eg PSCI) must register a suspend initializer with
the CPU idle driver so that the suspend backend call can be detected,
and the driver code can call the back-end infrastructure to complete the
suspend backend initialization.

Idle state index 0 is always initialized as a simple wfi state, ie always
considered present and functional on all ARM64 platforms.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 drivers/cpuidle/Kconfig         |   5 ++
 drivers/cpuidle/Kconfig.arm64   |  13 ++++
 drivers/cpuidle/Makefile        |   4 +
 drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 190 insertions(+)
 create mode 100644 drivers/cpuidle/Kconfig.arm64
 create mode 100644 drivers/cpuidle/cpuidle-arm64.c

diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 760ce20..360c086 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -44,6 +44,11 @@ depends on ARM
 source "drivers/cpuidle/Kconfig.arm"
 endmenu
 
+menu "ARM64 CPU Idle Drivers"
+depends on ARM64
+source "drivers/cpuidle/Kconfig.arm64"
+endmenu
+
 menu "MIPS CPU Idle Drivers"
 depends on MIPS
 source "drivers/cpuidle/Kconfig.mips"
diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
new file mode 100644
index 0000000..b83612c
--- /dev/null
+++ b/drivers/cpuidle/Kconfig.arm64
@@ -0,0 +1,13 @@
+#
+# ARM64 CPU Idle drivers
+#
+
+config ARM64_CPUIDLE
+	bool "Generic ARM64 CPU idle Driver"
+	select OF_IDLE_STATES
+	help
+	  Select this to enable generic cpuidle driver for ARM v8.
+	  It provides a generic idle driver whose idle states are configured
+	  at run-time through DT nodes. The CPUidle suspend backend is
+	  initialized by the device tree parsing code on matching the entry
+	  method to the respective CPU operations.
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d5ebf4b..e496242 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
 obj-$(CONFIG_MIPS_CPS_CPUIDLE)		+= cpuidle-cps.o
 
 ###############################################################################
+# ARM64 drivers
+obj-$(CONFIG_ARM64_CPUIDLE)		+= cpuidle-arm64.o
+
+###############################################################################
 # POWERPC drivers
 obj-$(CONFIG_PSERIES_CPUIDLE)		+= cpuidle-pseries.o
 obj-$(CONFIG_POWERNV_CPUIDLE)		+= cpuidle-powernv.o
diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
new file mode 100644
index 0000000..4c932f8
--- /dev/null
+++ b/drivers/cpuidle/cpuidle-arm64.c
@@ -0,0 +1,168 @@
+/*
+ * ARM64 generic CPU idle driver.
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) "CPUidle arm64: " fmt
+
+#include <linux/cpuidle.h>
+#include <linux/cpumask.h>
+#include <linux/cpu_pm.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+
+#include <asm/psci.h>
+#include <asm/suspend.h>
+
+#include "of_idle_states.h"
+
+typedef int (*suspend_init_fn)(struct cpuidle_driver *,
+			       struct device_node *[]);
+
+struct cpu_suspend_ops {
+	const char *id;
+	suspend_init_fn init_fn;
+};
+
+static const struct cpu_suspend_ops suspend_operations[] __initconst = {
+	{"arm,psci", psci_dt_register_idle_states},
+	{}
+};
+
+static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
+{
+	int i;
+
+	if (!str)
+		return NULL;
+
+	for (i = 0; suspend_operations[i].id; i++)
+		if (!strcmp(suspend_operations[i].id, str))
+			return &suspend_operations[i];
+
+	return NULL;
+}
+
+/*
+ * arm_enter_idle_state - Programs CPU to enter the specified state
+ *
+ * dev: cpuidle device
+ * drv: cpuidle driver
+ * idx: state index
+ *
+ * Called from the CPUidle framework to program the device to the
+ * specified target state selected by the governor.
+ */
+static int arm_enter_idle_state(struct cpuidle_device *dev,
+				struct cpuidle_driver *drv, int idx)
+{
+	int ret;
+
+	if (!idx) {
+		cpu_do_idle();
+		return idx;
+	}
+
+	cpu_pm_enter();
+	/*
+	 * Pass idle state index to cpu_suspend which in turn will call
+	 * the CPU ops suspend protocol with idle index as a parameter.
+	 *
+	 * Some states would not require context to be saved and flushed
+	 * to DRAM, so calling cpu_suspend would not be stricly necessary.
+	 * When power domains specifications for ARM CPUs are finalized then
+	 * this code can be optimized to prevent saving registers if not
+	 * needed.
+	 */
+	ret = cpu_suspend(idx);
+
+	cpu_pm_exit();
+
+	return ret ? -1 : idx;
+}
+
+struct cpuidle_driver arm64_idle_driver = {
+	.name = "arm64_idle",
+	.owner = THIS_MODULE,
+};
+
+static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
+
+/*
+ * arm64_idle_init
+ *
+ * Registers the arm64 specific cpuidle driver with the cpuidle
+ * framework. It relies on core code to parse the idle states
+ * and initialize them using driver data structures accordingly.
+ */
+static int __init arm64_idle_init(void)
+{
+	int i, ret;
+	const char *entry_method;
+	struct device_node *idle_states_node;
+	const struct cpu_suspend_ops *suspend_init;
+	struct cpuidle_driver *drv = &arm64_idle_driver;
+
+	idle_states_node = of_find_node_by_path("/cpus/idle-states");
+	if (!idle_states_node)
+		return -ENOENT;
+
+	if (of_property_read_string(idle_states_node, "entry-method",
+				    &entry_method)) {
+		pr_warn(" * %s missing entry-method property\n",
+			    idle_states_node->full_name);
+		of_node_put(idle_states_node);
+		ret = -EOPNOTSUPP;
+		goto put_node;
+	}
+
+	suspend_init = get_suspend_ops(entry_method);
+	if (!suspend_init) {
+		pr_warn("Missing suspend initializer\n");
+		ret = -EOPNOTSUPP;
+		goto put_node;
+	}
+
+	/*
+	 * State at index 0 is standby wfi and considered standard
+	 * on all ARM platforms. If in some platforms simple wfi
+	 * can't be used as "state 0", DT bindings must be implemented
+	 * to work around this issue and allow installing a special
+	 * handler for idle state index 0.
+	 */
+	drv->states[0].exit_latency = 1;
+	drv->states[0].target_residency = 1;
+	drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
+	strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
+	strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
+
+	drv->cpumask = (struct cpumask *) cpu_possible_mask;
+	/*
+	 * Start at index 1, request idle state nodes to be filled
+	 */
+	ret = of_init_idle_driver(drv, state_nodes, 1, true);
+	if (ret)
+		goto put_node;
+
+	if (suspend_init->init_fn(drv, state_nodes)) {
+		ret = -EOPNOTSUPP;
+		goto put_node;
+	}
+
+	for (i = 0; i < drv->state_count; i++)
+		drv->states[i].enter = arm_enter_idle_state;
+
+	ret = cpuidle_register(drv, NULL);
+
+put_node:
+	of_node_put(idle_states_node);
+	return ret;
+}
+device_initcall(arm64_idle_init);
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 6/6] arm64: boot: dts: update rtsm aemv8 dts with PSCI and idle states
  2014-06-11 16:18 ` Lorenzo Pieralisi
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, devicetree
  Cc: Mark Rutland, Paul Walmsley, Lorenzo Pieralisi, Vincent Guittot,
	Kevin Hilman, Nicolas Pitre, Catalin Marinas, Peter De Schrijver,
	Daniel Lezcano, Stephen Boyd, Amit Kucheria, Chander Kashyap,
	Sebastian Capella, Rob Herring, Santosh Shilimkar, Mark Brown,
	Sudeep Holla, Grant Likely, Tomasz Figa, Antti Miettinen,
	Charles Garcia Tobin

This patch updates the RTSM dts file with PSCI bindings and nodes
describing the AEMv8 model idle states parameters.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/boot/dts/rtsm_ve-aemv8a.dts | 44 +++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts b/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts
index d79de9c..4051ffb 100644
--- a/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts
+++ b/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts
@@ -27,37 +27,65 @@
 		serial3 = &v2m_serial3;
 	};
 
+	psci {
+		compatible = "arm,psci-0.2";
+		method = "smc";
+		cpu_suspend = <0xc4000001>;
+		cpu_off = <0x84000002>;
+		cpu_on = <0xc4000003>;
+	};
+
 	cpus {
 		#address-cells = <2>;
 		#size-cells = <0>;
 
+		idle-states {
+			entry-method = "arm,psci";
+
+			CPU_SLEEP_0: cpu-sleep-0 {
+				compatible = "arm,idle-state";
+				entry-method-param = <0x0010000>;
+				entry-latency-us = <40>;
+				exit-latency-us = <100>;
+				min-residency-us = <150>;
+			};
+
+			CLUSTER_SLEEP_0: cluster-sleep-0 {
+				compatible = "arm,idle-state";
+				entry-method-param = <0x1010000>;
+				entry-latency-us = <500>;
+				exit-latency-us = <1000>;
+				min-residency-us = <2500>;
+			};
+		};
+
 		cpu@0 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x0>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 		cpu@1 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x1>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 		cpu@2 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x2>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 		cpu@3 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x3>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 	};
 
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 6/6] arm64: boot: dts: update rtsm aemv8 dts with PSCI and idle states
@ 2014-06-11 16:18   ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

This patch updates the RTSM dts file with PSCI bindings and nodes
describing the AEMv8 model idle states parameters.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/boot/dts/rtsm_ve-aemv8a.dts | 44 +++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts b/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts
index d79de9c..4051ffb 100644
--- a/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts
+++ b/arch/arm64/boot/dts/rtsm_ve-aemv8a.dts
@@ -27,37 +27,65 @@
 		serial3 = &v2m_serial3;
 	};
 
+	psci {
+		compatible = "arm,psci-0.2";
+		method = "smc";
+		cpu_suspend = <0xc4000001>;
+		cpu_off = <0x84000002>;
+		cpu_on = <0xc4000003>;
+	};
+
 	cpus {
 		#address-cells = <2>;
 		#size-cells = <0>;
 
+		idle-states {
+			entry-method = "arm,psci";
+
+			CPU_SLEEP_0: cpu-sleep-0 {
+				compatible = "arm,idle-state";
+				entry-method-param = <0x0010000>;
+				entry-latency-us = <40>;
+				exit-latency-us = <100>;
+				min-residency-us = <150>;
+			};
+
+			CLUSTER_SLEEP_0: cluster-sleep-0 {
+				compatible = "arm,idle-state";
+				entry-method-param = <0x1010000>;
+				entry-latency-us = <500>;
+				exit-latency-us = <1000>;
+				min-residency-us = <2500>;
+			};
+		};
+
 		cpu at 0 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x0>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 		cpu at 1 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x1>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 		cpu at 2 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x2>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 		cpu at 3 {
 			device_type = "cpu";
 			compatible = "arm,armv8";
 			reg = <0x0 0x3>;
-			enable-method = "spin-table";
-			cpu-release-addr = <0x0 0x8000fff8>;
+			enable-method = "psci";
+			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 		};
 	};
 
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-11 16:18   ` Lorenzo Pieralisi
@ 2014-06-11 18:15     ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-11 18:15 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia Tobin, Rob Herring,
	Grant Likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown, Paul Walmsley, Chander Kashyap

On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:

> ARM based platforms implement a variety of power management schemes that
> allow processors to enter idle states at run-time.
> The parameters defining these idle states vary on a per-platform basis forcing
> the OS to hardcode the state parameters in platform specific static tables
> whose size grows as the number of platforms supported in the kernel increases
> and hampers device drivers standardization.
> 
> Therefore, this patch aims at standardizing idle state device tree bindings for
> ARM platforms. Bindings define idle state parameters inclusive of entry methods
> and state latencies, to allow operating systems to retrieve the configuration
> entries from the device tree and initialize the related power management
> drivers, paving the way for common code in the kernel to deal with idle
> states and removing the need for static data in current and previous kernel
> versions.

Following the offline discussion with Charles, I've some comments.

[...]

> +Idle state parameters (eg entry latency) are platform specific and 
need to be
> +characterized with bindings that provide the required information to OSPM
> +code so that it can build the required tables and use them at runtime.

[...]

> +	- entry-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to enter the idle state.
> +
> +	- exit-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to exit the idle state.
> +
> +	- min-residency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing duration in microseconds
> +			    after which this state becomes more energy
> +			    efficient than any shallower states.

I think this would benefit from a clearer definition.  For example, 
should the min-residency-us value include or exclude the entry and exit 
delays?  I think it should since that's what the cpuidle code will have 
to use when testing against expected delay before next wakeup event in 
any case.  Some of your examples don't assume it is the case though, as 
the min-residency-us is smaller than entry+exit delays.

Also I think we'd need a 4th value to fully characterize a state: worst 
case wake-up latency for QoS purposes.

Let's illustrate the different periods on a time line to make it clearer
(hmmm let's see how this can be managed on a braille display :-O ):

EXEC:	Normal CPU execution.

PREP:	Preparation phase before committing the hardware to idle mode
	like cache flushing. This is abortable on pending wake-up 
	event conditions. The abort latency is assumed to be negligible 
	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
	back to EXEC. This phase is optional. If not abortable, this 
	should be included in the ENTRY phase instead.

ENTRY:	The hardware is committed to idle mode. This period must run to
	completion up to IDLE before anything else can happen.

IDLE:	This is the actual power-saving idle period. This may last 
	between 0 and infinite time, until a wake-up event occurs.

EXIT:	Period during which the CPU is brought back to operational
	mode (EXEC).

...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
             |          |           |          |            |

             |<-- entry-latency --->|

                                               |<- exit-  ->|
                                               |  latency   |

             |<-------------- min-residency --------------->|

                        |<----- worst_wakeup_latency ------>|

entry-latency: Worst case latency required to enter the idle state.  The 
exit_latency may be guaranteed only after entry-latency has passed.

min-residency: Minimum period, including preparation, entry and exit, 
for a given power mode to be worthwhile energy wise.  It must be at 
least equal to entry_latency + exit_latency.

worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
event and the CPU being able to execute normal code again. If not 
specified, this is assumed to be entry-latency + exit_latency.

Notes:

The cpuidle code would only care about min-residency to select the most 
appropriate mode based on the expected delay before the next event.

The scheduler will care about the following in the near future:

wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)

In other words, the scheduler would wake up the CPU with the shortest 
wake-up latency.  This wake-up latency must take into account the entry 
latency if that period has not expired.  Here the abortable nature of 
the PREP period is ignored on purpose because it cannot be relied upon 
(e.g. if the cache is mostly clean then the PREP deadline may occur much 
sooner than expected).

And pmqos would only care about worst_wakeup_latency.

So... I hope this is useful.  I think the above ascii art could be part 
of your documentation to explain it all.



Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-11 18:15     ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-11 18:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:

> ARM based platforms implement a variety of power management schemes that
> allow processors to enter idle states at run-time.
> The parameters defining these idle states vary on a per-platform basis forcing
> the OS to hardcode the state parameters in platform specific static tables
> whose size grows as the number of platforms supported in the kernel increases
> and hampers device drivers standardization.
> 
> Therefore, this patch aims at standardizing idle state device tree bindings for
> ARM platforms. Bindings define idle state parameters inclusive of entry methods
> and state latencies, to allow operating systems to retrieve the configuration
> entries from the device tree and initialize the related power management
> drivers, paving the way for common code in the kernel to deal with idle
> states and removing the need for static data in current and previous kernel
> versions.

Following the offline discussion with Charles, I've some comments.

[...]

> +Idle state parameters (eg entry latency) are platform specific and 
need to be
> +characterized with bindings that provide the required information to OSPM
> +code so that it can build the required tables and use them at runtime.

[...]

> +	- entry-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to enter the idle state.
> +
> +	- exit-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to exit the idle state.
> +
> +	- min-residency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing duration in microseconds
> +			    after which this state becomes more energy
> +			    efficient than any shallower states.

I think this would benefit from a clearer definition.  For example, 
should the min-residency-us value include or exclude the entry and exit 
delays?  I think it should since that's what the cpuidle code will have 
to use when testing against expected delay before next wakeup event in 
any case.  Some of your examples don't assume it is the case though, as 
the min-residency-us is smaller than entry+exit delays.

Also I think we'd need a 4th value to fully characterize a state: worst 
case wake-up latency for QoS purposes.

Let's illustrate the different periods on a time line to make it clearer
(hmmm let's see how this can be managed on a braille display :-O ):

EXEC:	Normal CPU execution.

PREP:	Preparation phase before committing the hardware to idle mode
	like cache flushing. This is abortable on pending wake-up 
	event conditions. The abort latency is assumed to be negligible 
	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
	back to EXEC. This phase is optional. If not abortable, this 
	should be included in the ENTRY phase instead.

ENTRY:	The hardware is committed to idle mode. This period must run to
	completion up to IDLE before anything else can happen.

IDLE:	This is the actual power-saving idle period. This may last 
	between 0 and infinite time, until a wake-up event occurs.

EXIT:	Period during which the CPU is brought back to operational
	mode (EXEC).

...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
             |          |           |          |            |

             |<-- entry-latency --->|

                                               |<- exit-  ->|
                                               |  latency   |

             |<-------------- min-residency --------------->|

                        |<----- worst_wakeup_latency ------>|

entry-latency: Worst case latency required to enter the idle state.  The 
exit_latency may be guaranteed only after entry-latency has passed.

min-residency: Minimum period, including preparation, entry and exit, 
for a given power mode to be worthwhile energy wise.  It must be at 
least equal to entry_latency + exit_latency.

worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
event and the CPU being able to execute normal code again. If not 
specified, this is assumed to be entry-latency + exit_latency.

Notes:

The cpuidle code would only care about min-residency to select the most 
appropriate mode based on the expected delay before the next event.

The scheduler will care about the following in the near future:

wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)

In other words, the scheduler would wake up the CPU with the shortest 
wake-up latency.  This wake-up latency must take into account the entry 
latency if that period has not expired.  Here the abortable nature of 
the PREP period is ignored on purpose because it cannot be relied upon 
(e.g. if the cache is mostly clean then the PREP deadline may occur much 
sooner than expected).

And pmqos would only care about worst_wakeup_latency.

So... I hope this is useful.  I think the above ascii art could be part 
of your documentation to explain it all.



Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 16:18   ` Lorenzo Pieralisi
@ 2014-06-11 18:24     ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-11 18:24 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia Tobin, Rob Herring,
	Grant Likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown, Paul Walmsley, Chander Kashyap

On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:

> +config OF_IDLE_STATES
> +        bool "Idle states DT support"
> +	depends on ARM || ARM64
> +	default n

The default for default is n already, so you don't have to default to 
the default's default.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-11 18:24     ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-11 18:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:

> +config OF_IDLE_STATES
> +        bool "Idle states DT support"
> +	depends on ARM || ARM64
> +	default n

The default for default is n already, so you don't have to default to 
the default's default.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 16:18   ` Lorenzo Pieralisi
@ 2014-06-11 18:25     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 74+ messages in thread
From: Rafael J. Wysocki @ 2014-06-11 18:25 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia Tobin,
	Nicolas Pitre, Rob Herring, Grant Likely, Peter De Schrijver,
	Santosh Shilimkar, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown, Paul Walmsley,
	Chander Kashyap

On Wed, Jun 11, 2014 at 6:18 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On most common ARM systems, the low-power states a CPU can be put into are
> not discoverable in HW and require device tree bindings to describe
> power down suspend operations and idle states parameters.
>
> In order to enable DT based idle states and configure idle drivers, this
> patch implements the bulk infrastructure required to parse the device tree
> idle states bindings and initialize the corresponding CPUidle driver states
> data.
>
> Code that initializes idle states checks the CPU idle driver cpumask so
> that multiple CPU idle drivers can be initialized through it in the
> kernel. The CPU idle driver cpumask defines which idle states should be
> considered valid for the driver, ie idle states that are valid on a set
> of cpus the idle driver manages.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  drivers/cpuidle/Kconfig          |   9 ++
>  drivers/cpuidle/Makefile         |   1 +
>  drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
>  drivers/cpuidle/of_idle_states.h |   8 ++
>  4 files changed, 300 insertions(+)
>  create mode 100644 drivers/cpuidle/of_idle_states.c
>  create mode 100644 drivers/cpuidle/of_idle_states.h
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 1b96fb9..760ce20 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
>         bool "Menu governor (for tickless system)"
>         default y
>
> +config OF_IDLE_STATES

One question here.

Do you want this to be generally useful or is it just ARM-specific?

Rafael

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-11 18:25     ` Rafael J. Wysocki
  0 siblings, 0 replies; 74+ messages in thread
From: Rafael J. Wysocki @ 2014-06-11 18:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 11, 2014 at 6:18 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On most common ARM systems, the low-power states a CPU can be put into are
> not discoverable in HW and require device tree bindings to describe
> power down suspend operations and idle states parameters.
>
> In order to enable DT based idle states and configure idle drivers, this
> patch implements the bulk infrastructure required to parse the device tree
> idle states bindings and initialize the corresponding CPUidle driver states
> data.
>
> Code that initializes idle states checks the CPU idle driver cpumask so
> that multiple CPU idle drivers can be initialized through it in the
> kernel. The CPU idle driver cpumask defines which idle states should be
> considered valid for the driver, ie idle states that are valid on a set
> of cpus the idle driver manages.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  drivers/cpuidle/Kconfig          |   9 ++
>  drivers/cpuidle/Makefile         |   1 +
>  drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
>  drivers/cpuidle/of_idle_states.h |   8 ++
>  4 files changed, 300 insertions(+)
>  create mode 100644 drivers/cpuidle/of_idle_states.c
>  create mode 100644 drivers/cpuidle/of_idle_states.h
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 1b96fb9..760ce20 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
>         bool "Menu governor (for tickless system)"
>         default y
>
> +config OF_IDLE_STATES

One question here.

Do you want this to be generally useful or is it just ARM-specific?

Rafael

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 16:18   ` Lorenzo Pieralisi
@ 2014-06-11 18:38     ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-11 18:38 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia Tobin, Rob Herring,
	Grant Likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown, Paul Walmsley, Chander Kashyap

On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:

> On most common ARM systems, the low-power states a CPU can be put into are
> not discoverable in HW and require device tree bindings to describe
> power down suspend operations and idle states parameters.
> 
> In order to enable DT based idle states and configure idle drivers, this
> patch implements the bulk infrastructure required to parse the device tree
> idle states bindings and initialize the corresponding CPUidle driver states
> data.

Oh and another pet peeve of mine: given we always talk about "device 
tree" all the time, could you s/OF/DT/ in the subject?  It's been a 
while that DT has outgrown its OF origins.

>  create mode 100644 drivers/cpuidle/of_idle_states.c
>  create mode 100644 drivers/cpuidle/of_idle_states.h

Ditto here, including any new symbols you introduced.

Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-11 18:38     ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-11 18:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:

> On most common ARM systems, the low-power states a CPU can be put into are
> not discoverable in HW and require device tree bindings to describe
> power down suspend operations and idle states parameters.
> 
> In order to enable DT based idle states and configure idle drivers, this
> patch implements the bulk infrastructure required to parse the device tree
> idle states bindings and initialize the corresponding CPUidle driver states
> data.

Oh and another pet peeve of mine: given we always talk about "device 
tree" all the time, could you s/OF/DT/ in the subject?  It's been a 
while that DT has outgrown its OF origins.

>  create mode 100644 drivers/cpuidle/of_idle_states.c
>  create mode 100644 drivers/cpuidle/of_idle_states.h

Ditto here, including any new symbols you introduced.

Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 18:24     ` Nicolas Pitre
@ 2014-06-12  8:46       ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-12  8:46 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Wed, Jun 11, 2014 at 07:24:12PM +0100, Nicolas Pitre wrote:
> On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > +config OF_IDLE_STATES
> > +        bool "Idle states DT support"
> > +	depends on ARM || ARM64
> > +	default n
> 
> The default for default is n already, so you don't have to default to 
> the default's default.

Ok, thanks for spotting that.

Lorenzo


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-12  8:46       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-12  8:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 11, 2014 at 07:24:12PM +0100, Nicolas Pitre wrote:
> On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > +config OF_IDLE_STATES
> > +        bool "Idle states DT support"
> > +	depends on ARM || ARM64
> > +	default n
> 
> The default for default is n already, so you don't have to default to 
> the default's default.

Ok, thanks for spotting that.

Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 18:25     ` Rafael J. Wysocki
@ 2014-06-12  9:03       ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-12  9:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Mark Rutland, Catalin Marinas, Tomasz Figa, Chander Kashyap,
	Vincent Guittot, Nicolas Pitre, Daniel Lezcano, linux-arm-kernel,
	grant.likely, Charles Garcia-Tobin, devicetree, Kevin Hilman,
	linux-pm, Sebastian Capella, Mark Brown, Antti Miettinen,
	Paul Walmsley, paul.burton, Peter De Schrijver, Stephen Boyd,
	Amit Kucheria

[CC'ing Preeti and Paul to check their opinions]

Hi Rafael,

On Wed, Jun 11, 2014 at 07:25:49PM +0100, Rafael J. Wysocki wrote:
> On Wed, Jun 11, 2014 at 6:18 PM, Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com> wrote:
> > On most common ARM systems, the low-power states a CPU can be put into are
> > not discoverable in HW and require device tree bindings to describe
> > power down suspend operations and idle states parameters.
> >
> > In order to enable DT based idle states and configure idle drivers, this
> > patch implements the bulk infrastructure required to parse the device tree
> > idle states bindings and initialize the corresponding CPUidle driver states
> > data.
> >
> > Code that initializes idle states checks the CPU idle driver cpumask so
> > that multiple CPU idle drivers can be initialized through it in the
> > kernel. The CPU idle driver cpumask defines which idle states should be
> > considered valid for the driver, ie idle states that are valid on a set
> > of cpus the idle driver manages.
> >
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> >  drivers/cpuidle/Kconfig          |   9 ++
> >  drivers/cpuidle/Makefile         |   1 +
> >  drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
> >  drivers/cpuidle/of_idle_states.h |   8 ++
> >  4 files changed, 300 insertions(+)
> >  create mode 100644 drivers/cpuidle/of_idle_states.c
> >  create mode 100644 drivers/cpuidle/of_idle_states.h
> >
> > diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> > index 1b96fb9..760ce20 100644
> > --- a/drivers/cpuidle/Kconfig
> > +++ b/drivers/cpuidle/Kconfig
> > @@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
> >         bool "Menu governor (for tickless system)"
> >         default y
> >
> > +config OF_IDLE_STATES
> 
> One question here.
> 
> Do you want this to be generally useful or is it just ARM-specific?

The first series was targeting ARM64, then I noticed that it might be
used for ARM too (Daniel is working on that). Actually, I discovered
that Power and MIPS can reuse at least the code that initializes the
states data too, but I have to point out three things:

1) state enter function method: in my bindings it is common for all
   idle states, need to check if it applies to Power and MIPS too.
2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
   add code that detects what idle states lose the tick device context.
   At the moment I am adding the flag by default to all idle states
   apart from standbywfi on ARM, but that can be optimised. Unless we
   resort to power domains (but that's not trivial), we can add a flag
   to the idle states in DT (ie local-timer-stop or suchlike) to support
   that. I think that it will be frowned upon but it is worth trying, would
   like to know what other people think about this.
3) idle states bindings should be reviewed, I expect them to be valid
   on other architectures too, but I need acknowledgments.

I think this series is not far from being ready to be upstreamed, I
would be certainly happy if it can be reused for other archs too so
just let me know.

Thanks !
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-12  9:03       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-12  9:03 UTC (permalink / raw)
  To: linux-arm-kernel

[CC'ing Preeti and Paul to check their opinions]

Hi Rafael,

On Wed, Jun 11, 2014 at 07:25:49PM +0100, Rafael J. Wysocki wrote:
> On Wed, Jun 11, 2014 at 6:18 PM, Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com> wrote:
> > On most common ARM systems, the low-power states a CPU can be put into are
> > not discoverable in HW and require device tree bindings to describe
> > power down suspend operations and idle states parameters.
> >
> > In order to enable DT based idle states and configure idle drivers, this
> > patch implements the bulk infrastructure required to parse the device tree
> > idle states bindings and initialize the corresponding CPUidle driver states
> > data.
> >
> > Code that initializes idle states checks the CPU idle driver cpumask so
> > that multiple CPU idle drivers can be initialized through it in the
> > kernel. The CPU idle driver cpumask defines which idle states should be
> > considered valid for the driver, ie idle states that are valid on a set
> > of cpus the idle driver manages.
> >
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> >  drivers/cpuidle/Kconfig          |   9 ++
> >  drivers/cpuidle/Makefile         |   1 +
> >  drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
> >  drivers/cpuidle/of_idle_states.h |   8 ++
> >  4 files changed, 300 insertions(+)
> >  create mode 100644 drivers/cpuidle/of_idle_states.c
> >  create mode 100644 drivers/cpuidle/of_idle_states.h
> >
> > diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> > index 1b96fb9..760ce20 100644
> > --- a/drivers/cpuidle/Kconfig
> > +++ b/drivers/cpuidle/Kconfig
> > @@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
> >         bool "Menu governor (for tickless system)"
> >         default y
> >
> > +config OF_IDLE_STATES
> 
> One question here.
> 
> Do you want this to be generally useful or is it just ARM-specific?

The first series was targeting ARM64, then I noticed that it might be
used for ARM too (Daniel is working on that). Actually, I discovered
that Power and MIPS can reuse at least the code that initializes the
states data too, but I have to point out three things:

1) state enter function method: in my bindings it is common for all
   idle states, need to check if it applies to Power and MIPS too.
2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
   add code that detects what idle states lose the tick device context.
   At the moment I am adding the flag by default to all idle states
   apart from standbywfi on ARM, but that can be optimised. Unless we
   resort to power domains (but that's not trivial), we can add a flag
   to the idle states in DT (ie local-timer-stop or suchlike) to support
   that. I think that it will be frowned upon but it is worth trying, would
   like to know what other people think about this.
3) idle states bindings should be reviewed, I expect them to be valid
   on other architectures too, but I need acknowledgments.

I think this series is not far from being ready to be upstreamed, I
would be certainly happy if it can be reused for other archs too so
just let me know.

Thanks !
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-11 18:38     ` Nicolas Pitre
@ 2014-06-12  9:19       ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-12  9:19 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Wed, Jun 11, 2014 at 07:38:51PM +0100, Nicolas Pitre wrote:
> On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > On most common ARM systems, the low-power states a CPU can be put into are
> > not discoverable in HW and require device tree bindings to describe
> > power down suspend operations and idle states parameters.
> > 
> > In order to enable DT based idle states and configure idle drivers, this
> > patch implements the bulk infrastructure required to parse the device tree
> > idle states bindings and initialize the corresponding CPUidle driver states
> > data.
> 
> Oh and another pet peeve of mine: given we always talk about "device 
> tree" all the time, could you s/OF/DT/ in the subject?  It's been a 
> while that DT has outgrown its OF origins.

> >  create mode 100644 drivers/cpuidle/of_idle_states.c
> >  create mode 100644 drivers/cpuidle/of_idle_states.h
> 
> Ditto here, including any new symbols you introduced.

Yes you have a point, I will do.

Thanks,
Lorenzo


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-12  9:19       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-12  9:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 11, 2014 at 07:38:51PM +0100, Nicolas Pitre wrote:
> On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > On most common ARM systems, the low-power states a CPU can be put into are
> > not discoverable in HW and require device tree bindings to describe
> > power down suspend operations and idle states parameters.
> > 
> > In order to enable DT based idle states and configure idle drivers, this
> > patch implements the bulk infrastructure required to parse the device tree
> > idle states bindings and initialize the corresponding CPUidle driver states
> > data.
> 
> Oh and another pet peeve of mine: given we always talk about "device 
> tree" all the time, could you s/OF/DT/ in the subject?  It's been a 
> while that DT has outgrown its OF origins.

> >  create mode 100644 drivers/cpuidle/of_idle_states.c
> >  create mode 100644 drivers/cpuidle/of_idle_states.h
> 
> Ditto here, including any new symbols you introduced.

Yes you have a point, I will do.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-12  9:03       ` Lorenzo Pieralisi
@ 2014-06-13  3:48         ` Preeti U Murthy
  -1 siblings, 0 replies; 74+ messages in thread
From: Preeti U Murthy @ 2014-06-13  3:48 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Mark Rutland, Rafael J. Wysocki, Catalin Marinas, Tomasz Figa,
	Chander Kashyap, Vincent Guittot, Nicolas Pitre, Daniel Lezcano,
	linux-arm-kernel, grant.likely, Charles Garcia-Tobin, devicetree,
	Kevin Hilman, linux-pm, Sebastian Capella, Mark Brown,
	Antti Miettinen, Paul Walmsley, paul.burton, Peter De Schrijver

Hi Lorenzo,

On 06/12/2014 02:33 PM, Lorenzo Pieralisi wrote:
> [CC'ing Preeti and Paul to check their opinions]
> 
> Hi Rafael,
> 
> On Wed, Jun 11, 2014 at 07:25:49PM +0100, Rafael J. Wysocki wrote:
>> On Wed, Jun 11, 2014 at 6:18 PM, Lorenzo Pieralisi
>> <lorenzo.pieralisi@arm.com> wrote:
>>> On most common ARM systems, the low-power states a CPU can be put into are
>>> not discoverable in HW and require device tree bindings to describe
>>> power down suspend operations and idle states parameters.
>>>
>>> In order to enable DT based idle states and configure idle drivers, this
>>> patch implements the bulk infrastructure required to parse the device tree
>>> idle states bindings and initialize the corresponding CPUidle driver states
>>> data.
>>>
>>> Code that initializes idle states checks the CPU idle driver cpumask so
>>> that multiple CPU idle drivers can be initialized through it in the
>>> kernel. The CPU idle driver cpumask defines which idle states should be
>>> considered valid for the driver, ie idle states that are valid on a set
>>> of cpus the idle driver manages.
>>>
>>> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>>> ---
>>>  drivers/cpuidle/Kconfig          |   9 ++
>>>  drivers/cpuidle/Makefile         |   1 +
>>>  drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
>>>  drivers/cpuidle/of_idle_states.h |   8 ++
>>>  4 files changed, 300 insertions(+)
>>>  create mode 100644 drivers/cpuidle/of_idle_states.c
>>>  create mode 100644 drivers/cpuidle/of_idle_states.h
>>>
>>> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
>>> index 1b96fb9..760ce20 100644
>>> --- a/drivers/cpuidle/Kconfig
>>> +++ b/drivers/cpuidle/Kconfig
>>> @@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
>>>         bool "Menu governor (for tickless system)"
>>>         default y
>>>
>>> +config OF_IDLE_STATES
>>
>> One question here.
>>
>> Do you want this to be generally useful or is it just ARM-specific?
> 
> The first series was targeting ARM64, then I noticed that it might be
> used for ARM too (Daniel is working on that). Actually, I discovered
> that Power and MIPS can reuse at least the code that initializes the
> states data too, but I have to point out three things:
> 
> 1) state enter function method: in my bindings it is common for all
>    idle states, need to check if it applies to Power and MIPS too.

On PowerPC, we have a state enter function for each idle state. It will
not be too difficult to consolidate them into one, but that is not on
the cards right now.

> 2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
>    add code that detects what idle states lose the tick device context.
>    At the moment I am adding the flag by default to all idle states
>    apart from standbywfi on ARM, but that can be optimised. Unless we
>    resort to power domains (but that's not trivial), we can add a flag
>    to the idle states in DT (ie local-timer-stop or suchlike) to support
>    that. I think that it will be frowned upon but it is worth trying, would
>    like to know what other people think about this.

On PowerPC we have a bit in flag property of the idle state device node,
which determines if timers will stop. Yes, maybe we can fix it at a
specific bit but it may be messy.

> 3) idle states bindings should be reviewed, I expect them to be valid
>    on other architectures too, but I need acknowledgments.

The major difference as I see it is the idle state bindings. In your
patch there is a device node for each idle state. On PowerPC however,
currently we have a single node with the property values of this node
determining the idle states' name, desc etc..

Besides this, the names of the device tree nodes for idle states could
be arch specific to meet some hierarchical requirements in the device
tree. This would make it difficult for this driver to parse the idle
states based on a generic idle state node name.
> 
> I think this series is not far from being ready to be upstreamed, I
> would be certainly happy if it can be reused for other archs too so
> just let me know.

On PowerPC there are a couple of other sanity checks that we ought to do
before initializing the driver.

So IMO, although we can press out the above mentioned differences in one
way or the other to make way for a generic idle driver which reads from
the device tree, I am not in favour of it since it has to concern itself
with quite a bit of arch-specific stuff. This would anyway make it less
and less of a generic idle driver. So its best to push this patch to be
ARM specific.

Regards
Preeti U Murthy
> 
> Thanks !
> Lorenzo
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-13  3:48         ` Preeti U Murthy
  0 siblings, 0 replies; 74+ messages in thread
From: Preeti U Murthy @ 2014-06-13  3:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Lorenzo,

On 06/12/2014 02:33 PM, Lorenzo Pieralisi wrote:
> [CC'ing Preeti and Paul to check their opinions]
> 
> Hi Rafael,
> 
> On Wed, Jun 11, 2014 at 07:25:49PM +0100, Rafael J. Wysocki wrote:
>> On Wed, Jun 11, 2014 at 6:18 PM, Lorenzo Pieralisi
>> <lorenzo.pieralisi@arm.com> wrote:
>>> On most common ARM systems, the low-power states a CPU can be put into are
>>> not discoverable in HW and require device tree bindings to describe
>>> power down suspend operations and idle states parameters.
>>>
>>> In order to enable DT based idle states and configure idle drivers, this
>>> patch implements the bulk infrastructure required to parse the device tree
>>> idle states bindings and initialize the corresponding CPUidle driver states
>>> data.
>>>
>>> Code that initializes idle states checks the CPU idle driver cpumask so
>>> that multiple CPU idle drivers can be initialized through it in the
>>> kernel. The CPU idle driver cpumask defines which idle states should be
>>> considered valid for the driver, ie idle states that are valid on a set
>>> of cpus the idle driver manages.
>>>
>>> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>>> ---
>>>  drivers/cpuidle/Kconfig          |   9 ++
>>>  drivers/cpuidle/Makefile         |   1 +
>>>  drivers/cpuidle/of_idle_states.c | 282 +++++++++++++++++++++++++++++++++++++++
>>>  drivers/cpuidle/of_idle_states.h |   8 ++
>>>  4 files changed, 300 insertions(+)
>>>  create mode 100644 drivers/cpuidle/of_idle_states.c
>>>  create mode 100644 drivers/cpuidle/of_idle_states.h
>>>
>>> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
>>> index 1b96fb9..760ce20 100644
>>> --- a/drivers/cpuidle/Kconfig
>>> +++ b/drivers/cpuidle/Kconfig
>>> @@ -30,6 +30,15 @@ config CPU_IDLE_GOV_MENU
>>>         bool "Menu governor (for tickless system)"
>>>         default y
>>>
>>> +config OF_IDLE_STATES
>>
>> One question here.
>>
>> Do you want this to be generally useful or is it just ARM-specific?
> 
> The first series was targeting ARM64, then I noticed that it might be
> used for ARM too (Daniel is working on that). Actually, I discovered
> that Power and MIPS can reuse at least the code that initializes the
> states data too, but I have to point out three things:
> 
> 1) state enter function method: in my bindings it is common for all
>    idle states, need to check if it applies to Power and MIPS too.

On PowerPC, we have a state enter function for each idle state. It will
not be too difficult to consolidate them into one, but that is not on
the cards right now.

> 2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
>    add code that detects what idle states lose the tick device context.
>    At the moment I am adding the flag by default to all idle states
>    apart from standbywfi on ARM, but that can be optimised. Unless we
>    resort to power domains (but that's not trivial), we can add a flag
>    to the idle states in DT (ie local-timer-stop or suchlike) to support
>    that. I think that it will be frowned upon but it is worth trying, would
>    like to know what other people think about this.

On PowerPC we have a bit in flag property of the idle state device node,
which determines if timers will stop. Yes, maybe we can fix it at a
specific bit but it may be messy.

> 3) idle states bindings should be reviewed, I expect them to be valid
>    on other architectures too, but I need acknowledgments.

The major difference as I see it is the idle state bindings. In your
patch there is a device node for each idle state. On PowerPC however,
currently we have a single node with the property values of this node
determining the idle states' name, desc etc..

Besides this, the names of the device tree nodes for idle states could
be arch specific to meet some hierarchical requirements in the device
tree. This would make it difficult for this driver to parse the idle
states based on a generic idle state node name.
> 
> I think this series is not far from being ready to be upstreamed, I
> would be certainly happy if it can be reused for other archs too so
> just let me know.

On PowerPC there are a couple of other sanity checks that we ought to do
before initializing the driver.

So IMO, although we can press out the above mentioned differences in one
way or the other to make way for a generic idle driver which reads from
the device tree, I am not in favour of it since it has to concern itself
with quite a bit of arch-specific stuff. This would anyway make it less
and less of a generic idle driver. So its best to push this patch to be
ARM specific.

Regards
Preeti U Murthy
> 
> Thanks !
> Lorenzo
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-11 18:15     ` Nicolas Pitre
@ 2014-06-13 16:49       ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-13 16:49 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > ARM based platforms implement a variety of power management schemes that
> > allow processors to enter idle states at run-time.
> > The parameters defining these idle states vary on a per-platform basis forcing
> > the OS to hardcode the state parameters in platform specific static tables
> > whose size grows as the number of platforms supported in the kernel increases
> > and hampers device drivers standardization.
> > 
> > Therefore, this patch aims at standardizing idle state device tree bindings for
> > ARM platforms. Bindings define idle state parameters inclusive of entry methods
> > and state latencies, to allow operating systems to retrieve the configuration
> > entries from the device tree and initialize the related power management
> > drivers, paving the way for common code in the kernel to deal with idle
> > states and removing the need for static data in current and previous kernel
> > versions.
> 
> Following the offline discussion with Charles, I've some comments.
> 
> [...]

Thank you for summing that discussion up.

> > +Idle state parameters (eg entry latency) are platform specific and 
> need to be
> > +characterized with bindings that provide the required information to OSPM
> > +code so that it can build the required tables and use them at runtime.
> 
> [...]
> 
> > +	- entry-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to enter the idle state.
> > +
> > +	- exit-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to exit the idle state.
> > +
> > +	- min-residency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing duration in microseconds
> > +			    after which this state becomes more energy
> > +			    efficient than any shallower states.
> 
> I think this would benefit from a clearer definition.  For example, 
> should the min-residency-us value include or exclude the entry and exit 
> delays?  I think it should since that's what the cpuidle code will have 
> to use when testing against expected delay before next wakeup event in 
> any case.  Some of your examples don't assume it is the case though, as 
> the min-residency-us is smaller than entry+exit delays.
> 
> Also I think we'd need a 4th value to fully characterize a state: worst 
> case wake-up latency for QoS purposes.
> 
> Let's illustrate the different periods on a time line to make it clearer
> (hmmm let's see how this can be managed on a braille display :-O ):
> 
> EXEC:	Normal CPU execution.
> 
> PREP:	Preparation phase before committing the hardware to idle mode
> 	like cache flushing. This is abortable on pending wake-up 
> 	event conditions. The abort latency is assumed to be negligible 
> 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> 	back to EXEC. This phase is optional. If not abortable, this 
> 	should be included in the ENTRY phase instead.
> 
> ENTRY:	The hardware is committed to idle mode. This period must run to
> 	completion up to IDLE before anything else can happen.
> 
> IDLE:	This is the actual power-saving idle period. This may last 
> 	between 0 and infinite time, until a wake-up event occurs.
> 
> EXIT:	Period during which the CPU is brought back to operational
> 	mode (EXEC).
> 
> ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
>              |          |           |          |            |
> 
>              |<-- entry-latency --->|
> 
>                                                |<- exit-  ->|
>                                                |  latency   |
> 
>              |<-------------- min-residency --------------->|
> 
>                         |<----- worst_wakeup_latency ------>|
> 
> entry-latency: Worst case latency required to enter the idle state.  The 
> exit_latency may be guaranteed only after entry-latency has passed.
> 
> min-residency: Minimum period, including preparation, entry and exit, 
> for a given power mode to be worthwhile energy wise.  It must be at 
> least equal to entry_latency + exit_latency.
> 
> worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
> event and the CPU being able to execute normal code again. If not 
> specified, this is assumed to be entry-latency + exit_latency.
> 
> Notes:
> 
> The cpuidle code would only care about min-residency to select the most 
> appropriate mode based on the expected delay before the next event.
> 
> The scheduler will care about the following in the near future:
> 
> wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)
> 
> In other words, the scheduler would wake up the CPU with the shortest 
> wake-up latency.  This wake-up latency must take into account the entry 
> latency if that period has not expired.  Here the abortable nature of 
> the PREP period is ignored on purpose because it cannot be relied upon 
> (e.g. if the cache is mostly clean then the PREP deadline may occur much 
> sooner than expected).
> 
> And pmqos would only care about worst_wakeup_latency.
> 
> So... I hope this is useful.  I think the above ascii art could be part 
> of your documentation to explain it all.

I will, it makes perfect sense, let me point out a couple of things:

1) we need 4 properties, 1 optional (worst_wakeup_latency, if not
   present defaults to entry+exit)
2) is everyone ok, given these definitions, in sorting idle states using
   min-residency-us as a rank ?
3) CPUidle:
   idle_state.exit_latency = worst-wakeup-latency
   idle_state.target_residency = min-residency-us
4) PREP (longest period) can be obtained from the other properties, IF it is
   needed
   PREP = (entry + exit) - worst_wakeup (if worst_wakeup omitted, PREP = 0)

If everyone agrees I think these bindings updated with Nico's diagram
and definitions (I will tweak them, not change them because they make
perfect sense to me) are ready to go, if anyone has concerns please
drop a comment.

Thank you Nico !
Lorenzo


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-13 16:49       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-13 16:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> On Wed, 11 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > ARM based platforms implement a variety of power management schemes that
> > allow processors to enter idle states at run-time.
> > The parameters defining these idle states vary on a per-platform basis forcing
> > the OS to hardcode the state parameters in platform specific static tables
> > whose size grows as the number of platforms supported in the kernel increases
> > and hampers device drivers standardization.
> > 
> > Therefore, this patch aims at standardizing idle state device tree bindings for
> > ARM platforms. Bindings define idle state parameters inclusive of entry methods
> > and state latencies, to allow operating systems to retrieve the configuration
> > entries from the device tree and initialize the related power management
> > drivers, paving the way for common code in the kernel to deal with idle
> > states and removing the need for static data in current and previous kernel
> > versions.
> 
> Following the offline discussion with Charles, I've some comments.
> 
> [...]

Thank you for summing that discussion up.

> > +Idle state parameters (eg entry latency) are platform specific and 
> need to be
> > +characterized with bindings that provide the required information to OSPM
> > +code so that it can build the required tables and use them at runtime.
> 
> [...]
> 
> > +	- entry-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to enter the idle state.
> > +
> > +	- exit-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to exit the idle state.
> > +
> > +	- min-residency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing duration in microseconds
> > +			    after which this state becomes more energy
> > +			    efficient than any shallower states.
> 
> I think this would benefit from a clearer definition.  For example, 
> should the min-residency-us value include or exclude the entry and exit 
> delays?  I think it should since that's what the cpuidle code will have 
> to use when testing against expected delay before next wakeup event in 
> any case.  Some of your examples don't assume it is the case though, as 
> the min-residency-us is smaller than entry+exit delays.
> 
> Also I think we'd need a 4th value to fully characterize a state: worst 
> case wake-up latency for QoS purposes.
> 
> Let's illustrate the different periods on a time line to make it clearer
> (hmmm let's see how this can be managed on a braille display :-O ):
> 
> EXEC:	Normal CPU execution.
> 
> PREP:	Preparation phase before committing the hardware to idle mode
> 	like cache flushing. This is abortable on pending wake-up 
> 	event conditions. The abort latency is assumed to be negligible 
> 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> 	back to EXEC. This phase is optional. If not abortable, this 
> 	should be included in the ENTRY phase instead.
> 
> ENTRY:	The hardware is committed to idle mode. This period must run to
> 	completion up to IDLE before anything else can happen.
> 
> IDLE:	This is the actual power-saving idle period. This may last 
> 	between 0 and infinite time, until a wake-up event occurs.
> 
> EXIT:	Period during which the CPU is brought back to operational
> 	mode (EXEC).
> 
> ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
>              |          |           |          |            |
> 
>              |<-- entry-latency --->|
> 
>                                                |<- exit-  ->|
>                                                |  latency   |
> 
>              |<-------------- min-residency --------------->|
> 
>                         |<----- worst_wakeup_latency ------>|
> 
> entry-latency: Worst case latency required to enter the idle state.  The 
> exit_latency may be guaranteed only after entry-latency has passed.
> 
> min-residency: Minimum period, including preparation, entry and exit, 
> for a given power mode to be worthwhile energy wise.  It must be at 
> least equal to entry_latency + exit_latency.
> 
> worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
> event and the CPU being able to execute normal code again. If not 
> specified, this is assumed to be entry-latency + exit_latency.
> 
> Notes:
> 
> The cpuidle code would only care about min-residency to select the most 
> appropriate mode based on the expected delay before the next event.
> 
> The scheduler will care about the following in the near future:
> 
> wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)
> 
> In other words, the scheduler would wake up the CPU with the shortest 
> wake-up latency.  This wake-up latency must take into account the entry 
> latency if that period has not expired.  Here the abortable nature of 
> the PREP period is ignored on purpose because it cannot be relied upon 
> (e.g. if the cache is mostly clean then the PREP deadline may occur much 
> sooner than expected).
> 
> And pmqos would only care about worst_wakeup_latency.
> 
> So... I hope this is useful.  I think the above ascii art could be part 
> of your documentation to explain it all.

I will, it makes perfect sense, let me point out a couple of things:

1) we need 4 properties, 1 optional (worst_wakeup_latency, if not
   present defaults to entry+exit)
2) is everyone ok, given these definitions, in sorting idle states using
   min-residency-us as a rank ?
3) CPUidle:
   idle_state.exit_latency = worst-wakeup-latency
   idle_state.target_residency = min-residency-us
4) PREP (longest period) can be obtained from the other properties, IF it is
   needed
   PREP = (entry + exit) - worst_wakeup (if worst_wakeup omitted, PREP = 0)

If everyone agrees I think these bindings updated with Nico's diagram
and definitions (I will tweak them, not change them because they make
perfect sense to me) are ready to go, if anyone has concerns please
drop a comment.

Thank you Nico !
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-13  3:48         ` Preeti U Murthy
@ 2014-06-13 17:16           ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-13 17:16 UTC (permalink / raw)
  To: Preeti U Murthy
  Cc: Mark Rutland, Rafael J. Wysocki, Catalin Marinas, Tomasz Figa,
	Chander Kashyap, Vincent Guittot, Nicolas Pitre, Daniel Lezcano,
	linux-arm-kernel, grant.likely, Charles Garcia-Tobin, devicetree,
	Kevin Hilman, linux-pm, Sebastian Capella, Mark Brown,
	Antti Miettinen, Paul Walmsley, paul.burton, Peter De Schrijver

Hi Preeti,

On Fri, Jun 13, 2014 at 04:48:16AM +0100, Preeti U Murthy wrote:

[...]

> >> Do you want this to be generally useful or is it just ARM-specific?
> > 
> > The first series was targeting ARM64, then I noticed that it might be
> > used for ARM too (Daniel is working on that). Actually, I discovered
> > that Power and MIPS can reuse at least the code that initializes the
> > states data too, but I have to point out three things:
> > 
> > 1) state enter function method: in my bindings it is common for all
> >    idle states, need to check if it applies to Power and MIPS too.
> 
> On PowerPC, we have a state enter function for each idle state. It will
> not be too difficult to consolidate them into one, but that is not on
> the cards right now.

Ok, understood, it can become trickier to do when DT bindings for ARM are
merged though, but I understand it is not your priority now.

> > 2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
> >    add code that detects what idle states lose the tick device context.
> >    At the moment I am adding the flag by default to all idle states
> >    apart from standbywfi on ARM, but that can be optimised. Unless we
> >    resort to power domains (but that's not trivial), we can add a flag
> >    to the idle states in DT (ie local-timer-stop or suchlike) to support
> >    that. I think that it will be frowned upon but it is worth trying, would
> >    like to know what other people think about this.
> 
> On PowerPC we have a bit in flag property of the idle state device node,
> which determines if timers will stop. Yes, maybe we can fix it at a
> specific bit but it may be messy.

It is the same information defined differently, and TIMER_STOP on Power
is inferred from the entry method. I think that with a bit of work we
could make ends meet, not sure it has to be done now though.

> > 3) idle states bindings should be reviewed, I expect them to be valid
> >    on other architectures too, but I need acknowledgments.
> 
> The major difference as I see it is the idle state bindings. In your
> patch there is a device node for each idle state. On PowerPC however,
> currently we have a single node with the property values of this node
> determining the idle states' name, desc etc..
> 
> Besides this, the names of the device tree nodes for idle states could
> be arch specific to meet some hierarchical requirements in the device
> tree. This would make it difficult for this driver to parse the idle
> states based on a generic idle state node name.
> > 
> > I think this series is not far from being ready to be upstreamed, I
> > would be certainly happy if it can be reused for other archs too so
> > just let me know.
> 
> On PowerPC there are a couple of other sanity checks that we ought to do
> before initializing the driver.
> 
> So IMO, although we can press out the above mentioned differences in one
> way or the other to make way for a generic idle driver which reads from
> the device tree, I am not in favour of it since it has to concern itself
> with quite a bit of arch-specific stuff. This would anyway make it less
> and less of a generic idle driver. So its best to push this patch to be
> ARM specific.

We are not talking about having a common idle driver for all archs, we are
talking about having a common way to initialize idle states data and I think,
as you mentioned that it could be done (from what I read in your driver).
I understand it is not a priority so I will go ahead and leave it ARM
specific for now.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-06-13 17:16           ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-13 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Preeti,

On Fri, Jun 13, 2014 at 04:48:16AM +0100, Preeti U Murthy wrote:

[...]

> >> Do you want this to be generally useful or is it just ARM-specific?
> > 
> > The first series was targeting ARM64, then I noticed that it might be
> > used for ARM too (Daniel is working on that). Actually, I discovered
> > that Power and MIPS can reuse at least the code that initializes the
> > states data too, but I have to point out three things:
> > 
> > 1) state enter function method: in my bindings it is common for all
> >    idle states, need to check if it applies to Power and MIPS too.
> 
> On PowerPC, we have a state enter function for each idle state. It will
> not be too difficult to consolidate them into one, but that is not on
> the cards right now.

Ok, understood, it can become trickier to do when DT bindings for ARM are
merged though, but I understand it is not your priority now.

> > 2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
> >    add code that detects what idle states lose the tick device context.
> >    At the moment I am adding the flag by default to all idle states
> >    apart from standbywfi on ARM, but that can be optimised. Unless we
> >    resort to power domains (but that's not trivial), we can add a flag
> >    to the idle states in DT (ie local-timer-stop or suchlike) to support
> >    that. I think that it will be frowned upon but it is worth trying, would
> >    like to know what other people think about this.
> 
> On PowerPC we have a bit in flag property of the idle state device node,
> which determines if timers will stop. Yes, maybe we can fix it at a
> specific bit but it may be messy.

It is the same information defined differently, and TIMER_STOP on Power
is inferred from the entry method. I think that with a bit of work we
could make ends meet, not sure it has to be done now though.

> > 3) idle states bindings should be reviewed, I expect them to be valid
> >    on other architectures too, but I need acknowledgments.
> 
> The major difference as I see it is the idle state bindings. In your
> patch there is a device node for each idle state. On PowerPC however,
> currently we have a single node with the property values of this node
> determining the idle states' name, desc etc..
> 
> Besides this, the names of the device tree nodes for idle states could
> be arch specific to meet some hierarchical requirements in the device
> tree. This would make it difficult for this driver to parse the idle
> states based on a generic idle state node name.
> > 
> > I think this series is not far from being ready to be upstreamed, I
> > would be certainly happy if it can be reused for other archs too so
> > just let me know.
> 
> On PowerPC there are a couple of other sanity checks that we ought to do
> before initializing the driver.
> 
> So IMO, although we can press out the above mentioned differences in one
> way or the other to make way for a generic idle driver which reads from
> the device tree, I am not in favour of it since it has to concern itself
> with quite a bit of arch-specific stuff. This would anyway make it less
> and less of a generic idle driver. So its best to push this patch to be
> ARM specific.

We are not talking about having a common idle driver for all archs, we are
talking about having a common way to initialize idle states data and I think,
as you mentioned that it could be done (from what I read in your driver).
I understand it is not a priority so I will go ahead and leave it ARM
specific for now.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-13 16:49       ` Lorenzo Pieralisi
@ 2014-06-13 17:33         ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-13 17:33 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:

> On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > Let's illustrate the different periods on a time line to make it clearer
> > (hmmm let's see how this can be managed on a braille display :-O ):
> > 
> > EXEC:	Normal CPU execution.
> > 
> > PREP:	Preparation phase before committing the hardware to idle mode
> > 	like cache flushing. This is abortable on pending wake-up 
> > 	event conditions. The abort latency is assumed to be negligible 
> > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > 	back to EXEC. This phase is optional. If not abortable, this 
> > 	should be included in the ENTRY phase instead.
> > 
> > ENTRY:	The hardware is committed to idle mode. This period must run to
> > 	completion up to IDLE before anything else can happen.
> > 
> > IDLE:	This is the actual power-saving idle period. This may last 
> > 	between 0 and infinite time, until a wake-up event occurs.
> > 
> > EXIT:	Period during which the CPU is brought back to operational
> > 	mode (EXEC).
> > 
> > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> >              |          |           |          |            |
> > 
> >              |<-- entry-latency --->|
> > 
> >                                                |<- exit-  ->|
> >                                                |  latency   |
> > 
> >              |<-------------- min-residency --------------->|
> > 
> >                         |<----- worst_wakeup_latency ------>|
> > 
> > entry-latency: Worst case latency required to enter the idle state.  The 
> > exit_latency may be guaranteed only after entry-latency has passed.
> > 
> > min-residency: Minimum period, including preparation, entry and exit, 
> > for a given power mode to be worthwhile energy wise.  It must be at 
> > least equal to entry_latency + exit_latency.
> > 
> > worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
> > event and the CPU being able to execute normal code again. If not 
> > specified, this is assumed to be entry-latency + exit_latency.
> > 
> > Notes:
> > 
> > The cpuidle code would only care about min-residency to select the most 
> > appropriate mode based on the expected delay before the next event.
> > 
> > The scheduler will care about the following in the near future:
> > 
> > wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)
> > 
> > In other words, the scheduler would wake up the CPU with the shortest 
> > wake-up latency.  This wake-up latency must take into account the entry 
> > latency if that period has not expired.  Here the abortable nature of 
> > the PREP period is ignored on purpose because it cannot be relied upon 
> > (e.g. if the cache is mostly clean then the PREP deadline may occur much 
> > sooner than expected).
> > 
> > And pmqos would only care about worst_wakeup_latency.
> > 
> > So... I hope this is useful.  I think the above ascii art could be part 
> > of your documentation to explain it all.
> 
> I will, it makes perfect sense, let me point out a couple of things:
> 
> 1) we need 4 properties, 1 optional (worst_wakeup_latency, if not
>    present defaults to entry+exit)
> 2) is everyone ok, given these definitions, in sorting idle states using
>    min-residency-us as a rank ?

Yes.

> 3) CPUidle:
>    idle_state.exit_latency = worst-wakeup-latency
>    idle_state.target_residency = min-residency-us

But exit_latency is not necessarily equal to worst-wakeup-latency.  
We'll need any of those 4 values depending on the context.  So I'd add 
entry_latency and worst_wakeup_latency to struct cpuidle_state.  If a 
driver doesn't initialize entry_latency then it can be left to 0, and if 
worst_wakeup_latency is 0 then it should be set to entry_latency + 
exit_latency by the core code.

> 4) PREP (longest period) can be obtained from the other properties, IF it is
>    needed
>    PREP = (entry + exit) - worst_wakeup (if worst_wakeup omitted, PREP = 0)

Sure.  However I'd avoid documenting it.  As I said this period cannot 
be relied upon because it can vary a lot and if you miss its deadline 
you're up for a much longer delay than expected.  It is useful if a 
wake-up event happens during that period and then the latency can be cut 
short opportunistically. But if we get to the point we need to rely on 
this period to improve things then it would be a good idea to question 
why we need to request and immediately abort a state so often to start 
with.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-13 17:33         ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-13 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:

> On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > Let's illustrate the different periods on a time line to make it clearer
> > (hmmm let's see how this can be managed on a braille display :-O ):
> > 
> > EXEC:	Normal CPU execution.
> > 
> > PREP:	Preparation phase before committing the hardware to idle mode
> > 	like cache flushing. This is abortable on pending wake-up 
> > 	event conditions. The abort latency is assumed to be negligible 
> > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > 	back to EXEC. This phase is optional. If not abortable, this 
> > 	should be included in the ENTRY phase instead.
> > 
> > ENTRY:	The hardware is committed to idle mode. This period must run to
> > 	completion up to IDLE before anything else can happen.
> > 
> > IDLE:	This is the actual power-saving idle period. This may last 
> > 	between 0 and infinite time, until a wake-up event occurs.
> > 
> > EXIT:	Period during which the CPU is brought back to operational
> > 	mode (EXEC).
> > 
> > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> >              |          |           |          |            |
> > 
> >              |<-- entry-latency --->|
> > 
> >                                                |<- exit-  ->|
> >                                                |  latency   |
> > 
> >              |<-------------- min-residency --------------->|
> > 
> >                         |<----- worst_wakeup_latency ------>|
> > 
> > entry-latency: Worst case latency required to enter the idle state.  The 
> > exit_latency may be guaranteed only after entry-latency has passed.
> > 
> > min-residency: Minimum period, including preparation, entry and exit, 
> > for a given power mode to be worthwhile energy wise.  It must be at 
> > least equal to entry_latency + exit_latency.
> > 
> > worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
> > event and the CPU being able to execute normal code again. If not 
> > specified, this is assumed to be entry-latency + exit_latency.
> > 
> > Notes:
> > 
> > The cpuidle code would only care about min-residency to select the most 
> > appropriate mode based on the expected delay before the next event.
> > 
> > The scheduler will care about the following in the near future:
> > 
> > wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)
> > 
> > In other words, the scheduler would wake up the CPU with the shortest 
> > wake-up latency.  This wake-up latency must take into account the entry 
> > latency if that period has not expired.  Here the abortable nature of 
> > the PREP period is ignored on purpose because it cannot be relied upon 
> > (e.g. if the cache is mostly clean then the PREP deadline may occur much 
> > sooner than expected).
> > 
> > And pmqos would only care about worst_wakeup_latency.
> > 
> > So... I hope this is useful.  I think the above ascii art could be part 
> > of your documentation to explain it all.
> 
> I will, it makes perfect sense, let me point out a couple of things:
> 
> 1) we need 4 properties, 1 optional (worst_wakeup_latency, if not
>    present defaults to entry+exit)
> 2) is everyone ok, given these definitions, in sorting idle states using
>    min-residency-us as a rank ?

Yes.

> 3) CPUidle:
>    idle_state.exit_latency = worst-wakeup-latency
>    idle_state.target_residency = min-residency-us

But exit_latency is not necessarily equal to worst-wakeup-latency.  
We'll need any of those 4 values depending on the context.  So I'd add 
entry_latency and worst_wakeup_latency to struct cpuidle_state.  If a 
driver doesn't initialize entry_latency then it can be left to 0, and if 
worst_wakeup_latency is 0 then it should be set to entry_latency + 
exit_latency by the core code.

> 4) PREP (longest period) can be obtained from the other properties, IF it is
>    needed
>    PREP = (entry + exit) - worst_wakeup (if worst_wakeup omitted, PREP = 0)

Sure.  However I'd avoid documenting it.  As I said this period cannot 
be relied upon because it can vary a lot and if you miss its deadline 
you're up for a much longer delay than expected.  It is useful if a 
wake-up event happens during that period and then the latency can be cut 
short opportunistically. But if we get to the point we need to rely on 
this period to improve things then it would be a good idea to question 
why we need to request and immediately abort a state so often to start 
with.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-13 16:49       ` Lorenzo Pieralisi
@ 2014-06-13 17:40         ` Sebastian Capella
  -1 siblings, 0 replies; 74+ messages in thread
From: Sebastian Capella @ 2014-06-13 17:40 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Nicolas Pitre, linux-arm-kernel, linux-pm, devicetree,
	Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia-Tobin, Rob Herring, grant.likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Tomasz Figa, Mark Brown

I like these too!  Nice job!

Thanks!

Sebastian

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-13 17:40         ` Sebastian Capella
  0 siblings, 0 replies; 74+ messages in thread
From: Sebastian Capella @ 2014-06-13 17:40 UTC (permalink / raw)
  To: linux-arm-kernel

I like these too!  Nice job!

Thanks!

Sebastian

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-13 17:33         ` Nicolas Pitre
@ 2014-06-16 14:23           ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-16 14:23 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > > Let's illustrate the different periods on a time line to make it clearer
> > > (hmmm let's see how this can be managed on a braille display :-O ):
> > > 
> > > EXEC:	Normal CPU execution.
> > > 
> > > PREP:	Preparation phase before committing the hardware to idle mode
> > > 	like cache flushing. This is abortable on pending wake-up 
> > > 	event conditions. The abort latency is assumed to be negligible 
> > > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > > 	back to EXEC. This phase is optional. If not abortable, this 
> > > 	should be included in the ENTRY phase instead.
> > > 
> > > ENTRY:	The hardware is committed to idle mode. This period must run to
> > > 	completion up to IDLE before anything else can happen.
> > > 
> > > IDLE:	This is the actual power-saving idle period. This may last 
> > > 	between 0 and infinite time, until a wake-up event occurs.
> > > 
> > > EXIT:	Period during which the CPU is brought back to operational
> > > 	mode (EXEC).
> > > 
> > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> > >              |          |           |          |            |
> > > 
> > >              |<-- entry-latency --->|
> > > 
> > >                                                |<- exit-  ->|
> > >                                                |  latency   |
> > > 
> > >              |<-------------- min-residency --------------->|
> > > 
> > >                         |<----- worst_wakeup_latency ------>|
> > > 
> > > entry-latency: Worst case latency required to enter the idle state.  The 
> > > exit_latency may be guaranteed only after entry-latency has passed.
> > > 
> > > min-residency: Minimum period, including preparation, entry and exit, 
> > > for a given power mode to be worthwhile energy wise.  It must be at 
> > > least equal to entry_latency + exit_latency.
> > > 
> > > worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
> > > event and the CPU being able to execute normal code again. If not 
> > > specified, this is assumed to be entry-latency + exit_latency.
> > > 
> > > Notes:
> > > 
> > > The cpuidle code would only care about min-residency to select the most 
> > > appropriate mode based on the expected delay before the next event.
> > > 
> > > The scheduler will care about the following in the near future:
> > > 
> > > wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)
> > > 
> > > In other words, the scheduler would wake up the CPU with the shortest 
> > > wake-up latency.  This wake-up latency must take into account the entry 
> > > latency if that period has not expired.  Here the abortable nature of 
> > > the PREP period is ignored on purpose because it cannot be relied upon 
> > > (e.g. if the cache is mostly clean then the PREP deadline may occur much 
> > > sooner than expected).
> > > 
> > > And pmqos would only care about worst_wakeup_latency.
> > > 
> > > So... I hope this is useful.  I think the above ascii art could be part 
> > > of your documentation to explain it all.
> > 
> > I will, it makes perfect sense, let me point out a couple of things:
> > 
> > 1) we need 4 properties, 1 optional (worst_wakeup_latency, if not
> >    present defaults to entry+exit)
> > 2) is everyone ok, given these definitions, in sorting idle states using
> >    min-residency-us as a rank ?
> 
> Yes.
> 
> > 3) CPUidle:
> >    idle_state.exit_latency = worst-wakeup-latency
> >    idle_state.target_residency = min-residency-us
> 
> But exit_latency is not necessarily equal to worst-wakeup-latency.  
> We'll need any of those 4 values depending on the context.  So I'd add 
> entry_latency and worst_wakeup_latency to struct cpuidle_state.  If a 
> driver doesn't initialize entry_latency then it can be left to 0, and if 
> worst_wakeup_latency is 0 then it should be set to entry_latency + 
> exit_latency by the core code.

Well, that's why I mentioned idle_state.exit_latency, because in CPUidle
today, the struct cpuidle_state.exit_latency field corresponds to our
worst-wakeup-latency property, not to the exit_latency property; I know
it is confusing but at least by defining proper bindings the kernel
structures can be updated with clear semantics (I would not rename them
for the time being though). Fields required by the scheduler (ie
entry_latency) can be added in the patches that rely on them, when we agreed
on the bindings, adding the variables is no big deal.

> > 4) PREP (longest period) can be obtained from the other properties, IF it is
> >    needed
> >    PREP = (entry + exit) - worst_wakeup (if worst_wakeup omitted, PREP = 0)
> 
> Sure.  However I'd avoid documenting it.  As I said this period cannot 
> be relied upon because it can vary a lot and if you miss its deadline 
> you're up for a much longer delay than expected.  It is useful if a 
> wake-up event happens during that period and then the latency can be cut 
> short opportunistically. But if we get to the point we need to rely on 
> this period to improve things then it would be a good idea to question 
> why we need to request and immediately abort a state so often to start 
> with.

Agreed.

Thanks,
Lorenzo


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-16 14:23           ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-16 14:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > > Let's illustrate the different periods on a time line to make it clearer
> > > (hmmm let's see how this can be managed on a braille display :-O ):
> > > 
> > > EXEC:	Normal CPU execution.
> > > 
> > > PREP:	Preparation phase before committing the hardware to idle mode
> > > 	like cache flushing. This is abortable on pending wake-up 
> > > 	event conditions. The abort latency is assumed to be negligible 
> > > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > > 	back to EXEC. This phase is optional. If not abortable, this 
> > > 	should be included in the ENTRY phase instead.
> > > 
> > > ENTRY:	The hardware is committed to idle mode. This period must run to
> > > 	completion up to IDLE before anything else can happen.
> > > 
> > > IDLE:	This is the actual power-saving idle period. This may last 
> > > 	between 0 and infinite time, until a wake-up event occurs.
> > > 
> > > EXIT:	Period during which the CPU is brought back to operational
> > > 	mode (EXEC).
> > > 
> > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> > >              |          |           |          |            |
> > > 
> > >              |<-- entry-latency --->|
> > > 
> > >                                                |<- exit-  ->|
> > >                                                |  latency   |
> > > 
> > >              |<-------------- min-residency --------------->|
> > > 
> > >                         |<----- worst_wakeup_latency ------>|
> > > 
> > > entry-latency: Worst case latency required to enter the idle state.  The 
> > > exit_latency may be guaranteed only after entry-latency has passed.
> > > 
> > > min-residency: Minimum period, including preparation, entry and exit, 
> > > for a given power mode to be worthwhile energy wise.  It must be at 
> > > least equal to entry_latency + exit_latency.
> > > 
> > > worst_wakeup_latency: Maximum delay between the signaling of a wake-up 
> > > event and the CPU being able to execute normal code again. If not 
> > > specified, this is assumed to be entry-latency + exit_latency.
> > > 
> > > Notes:
> > > 
> > > The cpuidle code would only care about min-residency to select the most 
> > > appropriate mode based on the expected delay before the next event.
> > > 
> > > The scheduler will care about the following in the near future:
> > > 
> > > wakeup_delay = exit_latency + max(entry_latency - (now - entry_timestamp), 0)
> > > 
> > > In other words, the scheduler would wake up the CPU with the shortest 
> > > wake-up latency.  This wake-up latency must take into account the entry 
> > > latency if that period has not expired.  Here the abortable nature of 
> > > the PREP period is ignored on purpose because it cannot be relied upon 
> > > (e.g. if the cache is mostly clean then the PREP deadline may occur much 
> > > sooner than expected).
> > > 
> > > And pmqos would only care about worst_wakeup_latency.
> > > 
> > > So... I hope this is useful.  I think the above ascii art could be part 
> > > of your documentation to explain it all.
> > 
> > I will, it makes perfect sense, let me point out a couple of things:
> > 
> > 1) we need 4 properties, 1 optional (worst_wakeup_latency, if not
> >    present defaults to entry+exit)
> > 2) is everyone ok, given these definitions, in sorting idle states using
> >    min-residency-us as a rank ?
> 
> Yes.
> 
> > 3) CPUidle:
> >    idle_state.exit_latency = worst-wakeup-latency
> >    idle_state.target_residency = min-residency-us
> 
> But exit_latency is not necessarily equal to worst-wakeup-latency.  
> We'll need any of those 4 values depending on the context.  So I'd add 
> entry_latency and worst_wakeup_latency to struct cpuidle_state.  If a 
> driver doesn't initialize entry_latency then it can be left to 0, and if 
> worst_wakeup_latency is 0 then it should be set to entry_latency + 
> exit_latency by the core code.

Well, that's why I mentioned idle_state.exit_latency, because in CPUidle
today, the struct cpuidle_state.exit_latency field corresponds to our
worst-wakeup-latency property, not to the exit_latency property; I know
it is confusing but at least by defining proper bindings the kernel
structures can be updated with clear semantics (I would not rename them
for the time being though). Fields required by the scheduler (ie
entry_latency) can be added in the patches that rely on them, when we agreed
on the bindings, adding the variables is no big deal.

> > 4) PREP (longest period) can be obtained from the other properties, IF it is
> >    needed
> >    PREP = (entry + exit) - worst_wakeup (if worst_wakeup omitted, PREP = 0)
> 
> Sure.  However I'd avoid documenting it.  As I said this period cannot 
> be relied upon because it can vary a lot and if you miss its deadline 
> you're up for a much longer delay than expected.  It is useful if a 
> wake-up event happens during that period and then the latency can be cut 
> short opportunistically. But if we get to the point we need to rely on 
> this period to improve things then it would be a good idea to question 
> why we need to request and immediately abort a state so often to start 
> with.

Agreed.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-16 14:23           ` Lorenzo Pieralisi
@ 2014-06-16 14:48             ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-16 14:48 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Mon, 16 Jun 2014, Lorenzo Pieralisi wrote:

> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> > >    idle_state.exit_latency = worst-wakeup-latency
> > >    idle_state.target_residency = min-residency-us
> > 
> > But exit_latency is not necessarily equal to worst-wakeup-latency.  
> > We'll need any of those 4 values depending on the context.  So I'd add 
> > entry_latency and worst_wakeup_latency to struct cpuidle_state.  If a 
> > driver doesn't initialize entry_latency then it can be left to 0, and if 
> > worst_wakeup_latency is 0 then it should be set to entry_latency + 
> > exit_latency by the core code.
> 
> Well, that's why I mentioned idle_state.exit_latency, because in CPUidle
> today, the struct cpuidle_state.exit_latency field corresponds to our
> worst-wakeup-latency property, not to the exit_latency property; I know
> it is confusing but at least by defining proper bindings the kernel
> structures can be updated with clear semantics (I would not rename them
> for the time being though).

Why not?  Adding more confusion or even simply keeping the existing one, 
even if it is temporary, doesn't benefit anyone.

> Fields required by the scheduler (ie entry_latency) can be added in 
> the patches that rely on them, when we agreed on the bindings, adding 
> the variables is no big deal.

Sure.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-16 14:48             ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-16 14:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 16 Jun 2014, Lorenzo Pieralisi wrote:

> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> > >    idle_state.exit_latency = worst-wakeup-latency
> > >    idle_state.target_residency = min-residency-us
> > 
> > But exit_latency is not necessarily equal to worst-wakeup-latency.  
> > We'll need any of those 4 values depending on the context.  So I'd add 
> > entry_latency and worst_wakeup_latency to struct cpuidle_state.  If a 
> > driver doesn't initialize entry_latency then it can be left to 0, and if 
> > worst_wakeup_latency is 0 then it should be set to entry_latency + 
> > exit_latency by the core code.
> 
> Well, that's why I mentioned idle_state.exit_latency, because in CPUidle
> today, the struct cpuidle_state.exit_latency field corresponds to our
> worst-wakeup-latency property, not to the exit_latency property; I know
> it is confusing but at least by defining proper bindings the kernel
> structures can be updated with clear semantics (I would not rename them
> for the time being though).

Why not?  Adding more confusion or even simply keeping the existing one, 
even if it is temporary, doesn't benefit anyone.

> Fields required by the scheduler (ie entry_latency) can be added in 
> the patches that rely on them, when we agreed on the bindings, adding 
> the variables is no big deal.

Sure.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-13 17:33         ` Nicolas Pitre
@ 2014-06-18 17:36           ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-18 17:36 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > > Let's illustrate the different periods on a time line to make it clearer
> > > (hmmm let's see how this can be managed on a braille display :-O ):
> > > 
> > > EXEC:	Normal CPU execution.
> > > 
> > > PREP:	Preparation phase before committing the hardware to idle mode
> > > 	like cache flushing. This is abortable on pending wake-up 
> > > 	event conditions. The abort latency is assumed to be negligible 
> > > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > > 	back to EXEC. This phase is optional. If not abortable, this 
> > > 	should be included in the ENTRY phase instead.
> > > 
> > > ENTRY:	The hardware is committed to idle mode. This period must run to
> > > 	completion up to IDLE before anything else can happen.
> > > 
> > > IDLE:	This is the actual power-saving idle period. This may last 
> > > 	between 0 and infinite time, until a wake-up event occurs.
> > > 
> > > EXIT:	Period during which the CPU is brought back to operational
> > > 	mode (EXEC).
> > > 
> > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> > >              |          |           |          |            |
> > > 
> > >              |<-- entry-latency --->|
> > > 
> > >                                                |<- exit-  ->|
> > >                                                |  latency   |
> > > 
> > >              |<-------------- min-residency --------------->|
> > > 
> > >                         |<----- worst_wakeup_latency ------>|
> > > 
> > > entry-latency: Worst case latency required to enter the idle state.  The 
> > > exit_latency may be guaranteed only after entry-latency has passed.
> > > 
> > > min-residency: Minimum period, including preparation, entry and exit, 
> > > for a given power mode to be worthwhile energy wise.  It must be at 
> > > least equal to entry_latency + exit_latency.

Ok, a minor tweak to the diagram above, min-residency should include
energy costs related to idle entry and exit, but not the exit-latency
itself, as long as the energy costs implied by exiting the state are
factored out in the min-residency-us property.

Hence, to sum it up, I attached below the updated bindings patch:

I think we are close to an agreement, if anyone disagrees please shout
as soon as possible so that we can still integrate changes.

Thanks,
Lorenzo

-- >8 --
Subject: [PATCH] Documentation: arm: define DT idle states bindings

ARM based platforms implement a variety of power management schemes that
allow processors to enter idle states at run-time.
The parameters defining these idle states vary on a per-platform basis forcing
the OS to hardcode the state parameters in platform specific static tables
whose size grows as the number of platforms supported in the kernel increases
and hampers device drivers standardization.

Therefore, this patch aims at standardizing idle state device tree bindings for
ARM platforms. Bindings define idle state parameters inclusive of entry methods
and state latencies, to allow operating systems to retrieve the configuration
entries from the device tree and initialize the related power management
drivers, paving the way for common code in the kernel to deal with idle
states and removing the need for static data in current and previous kernel
versions.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 561 +++++++++++++++++++++
 2 files changed, 569 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 1fe72a0..a44d4fd 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -215,6 +215,12 @@ nodes to be present and contain the properties described below.
 		Value type: <phandle>
 		Definition: Specifies the ACC[2] node associated with this CPU.
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to idle state nodes supported
+			  by this cpu [3].
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
@@ -411,3 +417,5 @@ cpus {
 --
 [1] arm/msm/qcom,saw2.txt
 [2] arm/msm/qcom,kpss-acc.txt
+[3] ARM Linux kernel documentation - idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
new file mode 100644
index 0000000..c9e1ec6
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
@@ -0,0 +1,561 @@
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OSPM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time, can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+- Running
+- Idle_standby
+- Idle_retention
+- Sleep
+- Off
+
+The power states described in the SBSA document define the basic CPU states on
+top of which ARM platforms implement power management schemes that allow an OS
+PM implementation to put the processor in different idle states (which include
+states listed above; "off" state is not an idle state since it does not have
+wake-up capabilities, hence it is not considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OSPM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - idle-states node
+===========================================
+
+ARM processor idle states are defined within the idle-states node, which is
+a direct child of the cpus node [1] and provides a container where the
+processor idle states, defined as device tree nodes, are listed.
+
+- idle-states node
+
+	Usage: Optional - On ARM systems, is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby an idle-states node is not
+			  required.
+
+	Description: idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "idle-states".
+
+	The idle-states node's parent node must be the cpus node.
+
+	The idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+	An idle-states node defines the following properties:
+
+	- entry-method
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Describes the method by which a CPU enters the
+			    idle states. This property is required and must be
+			    one of:
+
+			    - "arm,psci"
+			      ARM PSCI firmware interface [2].
+
+			    - "[vendor],[method]"
+			      An implementation dependent string with
+			      format "vendor,method", where vendor is a string
+			      denoting the name of the manufacturer and
+			      method is a string specifying the mechanism
+			      used to enter the idle state.
+
+The nodes describing the idle states (state) can only be defined within the
+idle-states node, any other configuration is considered invalid and therefore
+must be ignored.
+
+===========================================
+3 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of the idle-states node
+
+	The state node name shall follow standard device tree naming
+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
+	are siblings within a single common parent must be given a unique name.
+
+	The idle state entered by executing the wfi instruction (idle_standby
+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
+	must not be listed.
+
+	To correctly specify idle states timing and energy related properties,
+	the following definitions identify the different execution phases
+	a CPU goes through to enter and exit idle states and the implied
+	energy metrics:
+
+	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
+		    |          |           |          |          |
+
+		    |<------ entry ------->|
+		    |       latency        |
+						      |<- exit ->|
+						      |  latency |
+		    |<-------- min-residency -------->|
+			       |<-------  wakeup-latency ------->|
+
+	EXEC:	Normal CPU execution.
+
+	PREP:	Preparation phase before committing the hardware to idle mode
+		like cache flushing. This is abortable on pending wake-up
+		event conditions. The abort latency is assumed to be negligible
+		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
+		goes back to EXEC. This phase is optional. If not abortable,
+		this should be included in the ENTRY phase instead.
+
+	ENTRY:	The hardware is committed to idle mode. This period must run
+		to completion up to IDLE before anything else can happen.
+
+	IDLE:	This is the actual energy-saving idle period. This may last
+		between 0 and infinite time, until a wake-up event occurs.
+
+	EXIT:	Period during which the CPU is brought back to operational
+		mode (EXEC).
+
+	With the definitions provided above, the following list represents
+	the valid properties for a state node:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,idle-state".
+
+	- logic-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present logic is retained on state entry,
+			    otherwise it is lost.
+
+	- cache-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present cache memory is retained on state entry,
+			    otherwise it is lost.
+
+	- entry-method-param
+		Usage: See definition.
+		Value type: <u32>
+		Definition: Depends on the idle-states node entry-method
+			    property value. Refer to the entry-method bindings
+			    for this property value definition.
+
+	- entry-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency in
+			    microseconds required to enter the idle state.
+			    The exit-latency-us duration may be guaranteed
+			    only after entry-latency-us has passed.
+
+	- exit-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing minimum residency duration
+			    in microseconds, inclusive of preparation and
+			    entry, for this idle state to be considered
+			    worthwhile energy wise.
+			    The residency time must take into account the
+			    energy consumed while entering and exiting the
+			    idle state and is therefore expected to be
+			    longer than entry-latency-us.
+
+	- wakeup-latency-us:
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing maximum delay between the
+			    signaling of a wake-up event and the CPU being
+			    able to execute normal code again. If omitted,
+			    this is assumed to be equal to:
+				entry-latency-us + exit-latency-us
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <80>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <950>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <90>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <300>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu@10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu@10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu@10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu@100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu@100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu@100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu@100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu@100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu@100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu@100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu@100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <200>;
+			exit-latency-us = <100>;
+			wakeup-latency-us = <250>;
+			min-residency-us = <400>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1500>;
+			wakeup-latency-us = <1700>;
+			min-residency-us = <2500>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			wakeup-latency-us = <600>;
+			min-residency-us = <900>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			wakeup-latency-us = <2300>;
+			min-residency-us = <6500>;
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu@102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu@103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+};
+
+===========================================
+4 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 17:36           ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-18 17:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
> 
> > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > > Let's illustrate the different periods on a time line to make it clearer
> > > (hmmm let's see how this can be managed on a braille display :-O ):
> > > 
> > > EXEC:	Normal CPU execution.
> > > 
> > > PREP:	Preparation phase before committing the hardware to idle mode
> > > 	like cache flushing. This is abortable on pending wake-up 
> > > 	event conditions. The abort latency is assumed to be negligible 
> > > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > > 	back to EXEC. This phase is optional. If not abortable, this 
> > > 	should be included in the ENTRY phase instead.
> > > 
> > > ENTRY:	The hardware is committed to idle mode. This period must run to
> > > 	completion up to IDLE before anything else can happen.
> > > 
> > > IDLE:	This is the actual power-saving idle period. This may last 
> > > 	between 0 and infinite time, until a wake-up event occurs.
> > > 
> > > EXIT:	Period during which the CPU is brought back to operational
> > > 	mode (EXEC).
> > > 
> > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> > >              |          |           |          |            |
> > > 
> > >              |<-- entry-latency --->|
> > > 
> > >                                                |<- exit-  ->|
> > >                                                |  latency   |
> > > 
> > >              |<-------------- min-residency --------------->|
> > > 
> > >                         |<----- worst_wakeup_latency ------>|
> > > 
> > > entry-latency: Worst case latency required to enter the idle state.  The 
> > > exit_latency may be guaranteed only after entry-latency has passed.
> > > 
> > > min-residency: Minimum period, including preparation, entry and exit, 
> > > for a given power mode to be worthwhile energy wise.  It must be at 
> > > least equal to entry_latency + exit_latency.

Ok, a minor tweak to the diagram above, min-residency should include
energy costs related to idle entry and exit, but not the exit-latency
itself, as long as the energy costs implied by exiting the state are
factored out in the min-residency-us property.

Hence, to sum it up, I attached below the updated bindings patch:

I think we are close to an agreement, if anyone disagrees please shout
as soon as possible so that we can still integrate changes.

Thanks,
Lorenzo

-- >8 --
Subject: [PATCH] Documentation: arm: define DT idle states bindings

ARM based platforms implement a variety of power management schemes that
allow processors to enter idle states at run-time.
The parameters defining these idle states vary on a per-platform basis forcing
the OS to hardcode the state parameters in platform specific static tables
whose size grows as the number of platforms supported in the kernel increases
and hampers device drivers standardization.

Therefore, this patch aims at standardizing idle state device tree bindings for
ARM platforms. Bindings define idle state parameters inclusive of entry methods
and state latencies, to allow operating systems to retrieve the configuration
entries from the device tree and initialize the related power management
drivers, paving the way for common code in the kernel to deal with idle
states and removing the need for static data in current and previous kernel
versions.

Reviewed-by: Sebastian Capella <sebcape@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
 .../devicetree/bindings/arm/idle-states.txt        | 561 +++++++++++++++++++++
 2 files changed, 569 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 1fe72a0..a44d4fd 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -215,6 +215,12 @@ nodes to be present and contain the properties described below.
 		Value type: <phandle>
 		Definition: Specifies the ACC[2] node associated with this CPU.
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to idle state nodes supported
+			  by this cpu [3].
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
@@ -411,3 +417,5 @@ cpus {
 --
 [1] arm/msm/qcom,saw2.txt
 [2] arm/msm/qcom,kpss-acc.txt
+[3] ARM Linux kernel documentation - idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
new file mode 100644
index 0000000..c9e1ec6
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
@@ -0,0 +1,561 @@
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OSPM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time, can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+- Running
+- Idle_standby
+- Idle_retention
+- Sleep
+- Off
+
+The power states described in the SBSA document define the basic CPU states on
+top of which ARM platforms implement power management schemes that allow an OS
+PM implementation to put the processor in different idle states (which include
+states listed above; "off" state is not an idle state since it does not have
+wake-up capabilities, hence it is not considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OSPM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - idle-states node
+===========================================
+
+ARM processor idle states are defined within the idle-states node, which is
+a direct child of the cpus node [1] and provides a container where the
+processor idle states, defined as device tree nodes, are listed.
+
+- idle-states node
+
+	Usage: Optional - On ARM systems, is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby an idle-states node is not
+			  required.
+
+	Description: idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "idle-states".
+
+	The idle-states node's parent node must be the cpus node.
+
+	The idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+	An idle-states node defines the following properties:
+
+	- entry-method
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Describes the method by which a CPU enters the
+			    idle states. This property is required and must be
+			    one of:
+
+			    - "arm,psci"
+			      ARM PSCI firmware interface [2].
+
+			    - "[vendor],[method]"
+			      An implementation dependent string with
+			      format "vendor,method", where vendor is a string
+			      denoting the name of the manufacturer and
+			      method is a string specifying the mechanism
+			      used to enter the idle state.
+
+The nodes describing the idle states (state) can only be defined within the
+idle-states node, any other configuration is considered invalid and therefore
+must be ignored.
+
+===========================================
+3 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of the idle-states node
+
+	The state node name shall follow standard device tree naming
+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
+	are siblings within a single common parent must be given a unique name.
+
+	The idle state entered by executing the wfi instruction (idle_standby
+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
+	must not be listed.
+
+	To correctly specify idle states timing and energy related properties,
+	the following definitions identify the different execution phases
+	a CPU goes through to enter and exit idle states and the implied
+	energy metrics:
+
+	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
+		    |          |           |          |          |
+
+		    |<------ entry ------->|
+		    |       latency        |
+						      |<- exit ->|
+						      |  latency |
+		    |<-------- min-residency -------->|
+			       |<-------  wakeup-latency ------->|
+
+	EXEC:	Normal CPU execution.
+
+	PREP:	Preparation phase before committing the hardware to idle mode
+		like cache flushing. This is abortable on pending wake-up
+		event conditions. The abort latency is assumed to be negligible
+		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
+		goes back to EXEC. This phase is optional. If not abortable,
+		this should be included in the ENTRY phase instead.
+
+	ENTRY:	The hardware is committed to idle mode. This period must run
+		to completion up to IDLE before anything else can happen.
+
+	IDLE:	This is the actual energy-saving idle period. This may last
+		between 0 and infinite time, until a wake-up event occurs.
+
+	EXIT:	Period during which the CPU is brought back to operational
+		mode (EXEC).
+
+	With the definitions provided above, the following list represents
+	the valid properties for a state node:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,idle-state".
+
+	- logic-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present logic is retained on state entry,
+			    otherwise it is lost.
+
+	- cache-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present cache memory is retained on state entry,
+			    otherwise it is lost.
+
+	- entry-method-param
+		Usage: See definition.
+		Value type: <u32>
+		Definition: Depends on the idle-states node entry-method
+			    property value. Refer to the entry-method bindings
+			    for this property value definition.
+
+	- entry-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency in
+			    microseconds required to enter the idle state.
+			    The exit-latency-us duration may be guaranteed
+			    only after entry-latency-us has passed.
+
+	- exit-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing minimum residency duration
+			    in microseconds, inclusive of preparation and
+			    entry, for this idle state to be considered
+			    worthwhile energy wise.
+			    The residency time must take into account the
+			    energy consumed while entering and exiting the
+			    idle state and is therefore expected to be
+			    longer than entry-latency-us.
+
+	- wakeup-latency-us:
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing maximum delay between the
+			    signaling of a wake-up event and the CPU being
+			    able to execute normal code again. If omitted,
+			    this is assumed to be equal to:
+				entry-latency-us + exit-latency-us
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <80>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <950>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <90>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <300>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+		};
+	};
+
+	CPU0: cpu at 0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu at 1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu at 100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu at 101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu at 10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu at 10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu at 10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu at 10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu at 100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu at 100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu at 100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu at 100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu at 100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu at 100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu at 100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu at 100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <200>;
+			exit-latency-us = <100>;
+			wakeup-latency-us = <250>;
+			min-residency-us = <400>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1500>;
+			wakeup-latency-us = <1700>;
+			min-residency-us = <2500>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			wakeup-latency-us = <600>;
+			min-residency-us = <900>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			wakeup-latency-us = <2300>;
+			min-residency-us = <6500>;
+		};
+	};
+
+	CPU0: cpu at 0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu at 1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu at 2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu at 3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu at 100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu at 101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu at 102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu at 103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+};
+
+===========================================
+4 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 17:36           ` Lorenzo Pieralisi
@ 2014-06-18 18:20             ` Sebastian Capella
  -1 siblings, 0 replies; 74+ messages in thread
From: Sebastian Capella @ 2014-06-18 18:20 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Nicolas Pitre, linux-arm-kernel, linux-pm, devicetree,
	Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia-Tobin, Rob Herring, grant.likely,
	Peter De Schrijver, Santosh Shilimkar, Daniel Lezcano,
	Amit Kucheria, Vincent Guittot, Antti Miettinen, Stephen Boyd,
	Kevin Hilman, Tomasz Figa, Mark Brown

On Wed, Jun 18, 2014 at 10:36 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
>> On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
>>
>> > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
>> > >
>> > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
>> > >              |          |           |          |            |
>> > >              |<-- entry-latency --->|
>> > >                                                |<- exit-  ->|
>> > >                                                |  latency   |
>> > >              |<-------------- min-residency --------------->|
>> > >                         |<----- worst_wakeup_latency ------>|
>> > >
>> > > entry-latency: Worst case latency required to enter the idle state.  The
>> > > exit_latency may be guaranteed only after entry-latency has passed.
>> > >
>> > > min-residency: Minimum period, including preparation, entry and exit,
>> > > for a given power mode to be worthwhile energy wise.  It must be at
>> > > least equal to entry_latency + exit_latency.
>
> Ok, a minor tweak to the diagram above, min-residency should include
> energy costs related to idle entry and exit, but not the exit-latency
> itself, as long as the energy costs implied by exiting the state are
> factored out in the min-residency-us property.

This makes sense to me..

It includes accounting for the energy cost vs WFI of prep/entry/exit,
but timing is from the end of the previous exec, until the event is
expected to trigger.

Thanks!

Sebastian

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 18:20             ` Sebastian Capella
  0 siblings, 0 replies; 74+ messages in thread
From: Sebastian Capella @ 2014-06-18 18:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 18, 2014 at 10:36 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
>> On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
>>
>> > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
>> > >
>> > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
>> > >              |          |           |          |            |
>> > >              |<-- entry-latency --->|
>> > >                                                |<- exit-  ->|
>> > >                                                |  latency   |
>> > >              |<-------------- min-residency --------------->|
>> > >                         |<----- worst_wakeup_latency ------>|
>> > >
>> > > entry-latency: Worst case latency required to enter the idle state.  The
>> > > exit_latency may be guaranteed only after entry-latency has passed.
>> > >
>> > > min-residency: Minimum period, including preparation, entry and exit,
>> > > for a given power mode to be worthwhile energy wise.  It must be at
>> > > least equal to entry_latency + exit_latency.
>
> Ok, a minor tweak to the diagram above, min-residency should include
> energy costs related to idle entry and exit, but not the exit-latency
> itself, as long as the energy costs implied by exiting the state are
> factored out in the min-residency-us property.

This makes sense to me..

It includes accounting for the energy cost vs WFI of prep/entry/exit,
but timing is from the end of the previous exec, until the event is
expected to trigger.

Thanks!

Sebastian

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 17:36           ` Lorenzo Pieralisi
@ 2014-06-18 19:27             ` Santosh Shilimkar
  -1 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-18 19:27 UTC (permalink / raw)
  To: Lorenzo Pieralisi, Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown, Paul Walmsley

On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:

[..]
> Ok, a minor tweak to the diagram above, min-residency should include
> energy costs related to idle entry and exit, but not the exit-latency
> itself, as long as the energy costs implied by exiting the state are
> factored out in the min-residency-us property.
> 
> Hence, to sum it up, I attached below the updated bindings patch:
> 
> I think we are close to an agreement, if anyone disagrees please shout
> as soon as possible so that we can still integrate changes.
> 

[..]

> 
> -- >8 --
> Subject: [PATCH] Documentation: arm: define DT idle states bindings
> 
> ARM based platforms implement a variety of power management schemes that
> allow processors to enter idle states at run-time.
> The parameters defining these idle states vary on a per-platform basis forcing
> the OS to hardcode the state parameters in platform specific static tables
> whose size grows as the number of platforms supported in the kernel increases
> and hampers device drivers standardization.
> 
> Therefore, this patch aims at standardizing idle state device tree bindings for
> ARM platforms. Bindings define idle state parameters inclusive of entry methods
> and state latencies, to allow operating systems to retrieve the configuration
> entries from the device tree and initialize the related power management
> drivers, paving the way for common code in the kernel to deal with idle
> states and removing the need for static data in current and previous kernel
> versions.
> 
> Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
Nice work Lorenzo !!
I have few comments/questions.

>  Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
>  .../devicetree/bindings/arm/idle-states.txt        | 561 +++++++++++++++++++++
>  2 files changed, 569 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
> index 1fe72a0..a44d4fd 100644
> --- a/Documentation/devicetree/bindings/arm/cpus.txt
> +++ b/Documentation/devicetree/bindings/arm/cpus.txt
> @@ -215,6 +215,12 @@ nodes to be present and contain the properties described below.
>  		Value type: <phandle>
>  		Definition: Specifies the ACC[2] node associated with this CPU.
>  
> +	- cpu-idle-states
> +		Usage: Optional
> +		Value type: <prop-encoded-array>
> +		Definition:
> +			# List of phandles to idle state nodes supported
> +			  by this cpu [3].
>  
>  Example 1 (dual-cluster big.LITTLE system 32-bit):
>  
> @@ -411,3 +417,5 @@ cpus {
>  --
>  [1] arm/msm/qcom,saw2.txt
>  [2] arm/msm/qcom,kpss-acc.txt
> +[3] ARM Linux kernel documentation - idle states bindings
> +    Documentation/devicetree/bindings/arm/idle-states.txt
> diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
> new file mode 100644
> index 0000000..c9e1ec6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/idle-states.txt
> @@ -0,0 +1,561 @@
> +==========================================
> +ARM idle states binding description
> +==========================================
> +
> +==========================================
> +1 - Introduction
> +==========================================
> +
> +ARM systems contain HW capable of managing power consumption dynamically,
> +where cores can be put in different low-power states (ranging from simple
> +wfi to power gating) according to OSPM policies. The CPU states representing
s/OSPM/OS PM ?
> +the range of dynamic idle states that a processor can enter at run-time, can be
> +specified through device tree bindings representing the parameters required
> +to enter/exit specific idle states on a given processor.
> +
> +According to the Server Base System Architecture document (SBSA, [3]), the
> +power states an ARM CPU can be put into are identified by the following list:
> +
> +- Running
> +- Idle_standby
> +- Idle_retention
> +- Sleep
> +- Off
> +
> +The power states described in the SBSA document define the basic CPU states on
> +top of which ARM platforms implement power management schemes that allow an OS
> +PM implementation to put the processor in different idle states (which include
> +states listed above; "off" state is not an idle state since it does not have
> +wake-up capabilities, hence it is not considered in this document).
> +
> +Idle state parameters (eg entry latency) are platform specific and need to be
> +characterized with bindings that provide the required information to OSPM
Ditto
> +code so that it can build the required tables and use them at runtime.
> +
> +The device tree binding definition for ARM idle states is the subject of this
> +document.
> +
> +===========================================
> +2 - idle-states node
> +===========================================
> +
> +ARM processor idle states are defined within the idle-states node, which is
> +a direct child of the cpus node [1] and provides a container where the
> +processor idle states, defined as device tree nodes, are listed.
> +
> +- idle-states node
> +
> +	Usage: Optional - On ARM systems, is a container of processor idle
s/is/it is ?
> +			  states nodes. If the system does not provide CPU
> +			  power management capabilities or the processor just
> +			  supports idle_standby an idle-states node is not
> +			  required.
> +
> +	Description: idle-states node is a container node, where its
> +		     subnodes describe the CPU idle states.
> +
> +	Node name must be "idle-states".
> +
> +	The idle-states node's parent node must be the cpus node.
> +
> +	The idle-states node's child nodes can be:
s/idle-states/idle-state
> +
> +	- one or more state nodes
> +
> +	Any other configuration is considered invalid.
> +
> +	An idle-states node defines the following properties:
> +
> +	- entry-method
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Describes the method by which a CPU enters the
> +			    idle states. This property is required and must be
> +			    one of:
> +
> +			    - "arm,psci"
> +			      ARM PSCI firmware interface [2].
> +
> +			    - "[vendor],[method]"
> +			      An implementation dependent string with
> +			      format "vendor,method", where vendor is a string
> +			      denoting the name of the manufacturer and
> +			      method is a string specifying the mechanism
> +			      used to enter the idle state.
> +
> +The nodes describing the idle states (state) can only be defined within the
> +idle-states node, any other configuration is considered invalid and therefore
> +must be ignored.
> +
> +===========================================
> +3 - state node
> +===========================================
> +
> +A state node represents an idle state description and must be defined as
> +follows:
> +
> +- state node
> +
> +	Description: must be child of the idle-states node
> +
> +	The state node name shall follow standard device tree naming
> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> +	are siblings within a single common parent must be given a unique name.
> +
> +	The idle state entered by executing the wfi instruction (idle_standby
> +	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
> +	must not be listed.
> +
> +	To correctly specify idle states timing and energy related properties,
> +	the following definitions identify the different execution phases
> +	a CPU goes through to enter and exit idle states and the implied
> +	energy metrics:
> +
> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> +		    |          |           |          |          |
> +
> +		    |<------ entry ------->|
> +		    |       latency        |
> +						      |<- exit ->|
> +						      |  latency |
> +		    |<-------- min-residency -------->|
> +			       |<-------  wakeup-latency ------->|
> +
I don't know the wakeup latency makes much sense and also correct.
Hardware wakeup latency is actually exit latency. Is it for failed
or abort-able ilde case ? We are adding this as a new parameter
at least from idle states perspective. I think we should just
avoid it.

> +	EXEC:	Normal CPU execution.
> +
> +	PREP:	Preparation phase before committing the hardware to idle mode
> +		like cache flushing. This is abortable on pending wake-up
> +		event conditions. The abort latency is assumed to be negligible
> +		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
> +		goes back to EXEC. This phase is optional. If not abortable,
> +		this should be included in the ENTRY phase instead.
> +
> +	ENTRY:	The hardware is committed to idle mode. This period must run
> +		to completion up to IDLE before anything else can happen.
> +
> +	IDLE:	This is the actual energy-saving idle period. This may last
> +		between 0 and infinite time, until a wake-up event occurs.
> +
> +	EXIT:	Period during which the CPU is brought back to operational
> +		mode (EXEC).
> +
> +	With the definitions provided above, the following list represents
> +	the valid properties for a state node:
> +
> +	- compatible
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Must be "arm,idle-state".
> +
> +	- logic-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present logic is retained on state entry,
> +			    otherwise it is lost.
> +
> +	- cache-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present cache memory is retained on state entry,
> +			    otherwise it is lost.
> +
> +	- entry-method-param
> +		Usage: See definition.
> +		Value type: <u32>
> +		Definition: Depends on the idle-states node entry-method
> +			    property value. Refer to the entry-method bindings
> +			    for this property value definition.
> +
> +	- entry-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency in
> +			    microseconds required to enter the idle state.
> +			    The exit-latency-us duration may be guaranteed
> +			    only after entry-latency-us has passed.
> +
> +	- exit-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to exit the idle state.
> +
> +	- min-residency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing minimum residency duration
> +			    in microseconds, inclusive of preparation and
> +			    entry, for this idle state to be considered
> +			    worthwhile energy wise.
> +			    The residency time must take into account the
> +			    energy consumed while entering and exiting the
> +			    idle state and is therefore expected to be
> +			    longer than entry-latency-us.
> +
> +	- wakeup-latency-us:
> +		Usage: Optional
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing maximum delay between the
> +			    signaling of a wake-up event and the CPU being
> +			    able to execute normal code again. If omitted,
> +			    this is assumed to be equal to:
> +				entry-latency-us + exit-latency-us
> +
Rest of the patch looks fine by to me.

regards,
Santosh


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 19:27             ` Santosh Shilimkar
  0 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-18 19:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:

[..]
> Ok, a minor tweak to the diagram above, min-residency should include
> energy costs related to idle entry and exit, but not the exit-latency
> itself, as long as the energy costs implied by exiting the state are
> factored out in the min-residency-us property.
> 
> Hence, to sum it up, I attached below the updated bindings patch:
> 
> I think we are close to an agreement, if anyone disagrees please shout
> as soon as possible so that we can still integrate changes.
> 

[..]

> 
> -- >8 --
> Subject: [PATCH] Documentation: arm: define DT idle states bindings
> 
> ARM based platforms implement a variety of power management schemes that
> allow processors to enter idle states at run-time.
> The parameters defining these idle states vary on a per-platform basis forcing
> the OS to hardcode the state parameters in platform specific static tables
> whose size grows as the number of platforms supported in the kernel increases
> and hampers device drivers standardization.
> 
> Therefore, this patch aims at standardizing idle state device tree bindings for
> ARM platforms. Bindings define idle state parameters inclusive of entry methods
> and state latencies, to allow operating systems to retrieve the configuration
> entries from the device tree and initialize the related power management
> drivers, paving the way for common code in the kernel to deal with idle
> states and removing the need for static data in current and previous kernel
> versions.
> 
> Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
Nice work Lorenzo !!
I have few comments/questions.

>  Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
>  .../devicetree/bindings/arm/idle-states.txt        | 561 +++++++++++++++++++++
>  2 files changed, 569 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
> index 1fe72a0..a44d4fd 100644
> --- a/Documentation/devicetree/bindings/arm/cpus.txt
> +++ b/Documentation/devicetree/bindings/arm/cpus.txt
> @@ -215,6 +215,12 @@ nodes to be present and contain the properties described below.
>  		Value type: <phandle>
>  		Definition: Specifies the ACC[2] node associated with this CPU.
>  
> +	- cpu-idle-states
> +		Usage: Optional
> +		Value type: <prop-encoded-array>
> +		Definition:
> +			# List of phandles to idle state nodes supported
> +			  by this cpu [3].
>  
>  Example 1 (dual-cluster big.LITTLE system 32-bit):
>  
> @@ -411,3 +417,5 @@ cpus {
>  --
>  [1] arm/msm/qcom,saw2.txt
>  [2] arm/msm/qcom,kpss-acc.txt
> +[3] ARM Linux kernel documentation - idle states bindings
> +    Documentation/devicetree/bindings/arm/idle-states.txt
> diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
> new file mode 100644
> index 0000000..c9e1ec6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/idle-states.txt
> @@ -0,0 +1,561 @@
> +==========================================
> +ARM idle states binding description
> +==========================================
> +
> +==========================================
> +1 - Introduction
> +==========================================
> +
> +ARM systems contain HW capable of managing power consumption dynamically,
> +where cores can be put in different low-power states (ranging from simple
> +wfi to power gating) according to OSPM policies. The CPU states representing
s/OSPM/OS PM ?
> +the range of dynamic idle states that a processor can enter at run-time, can be
> +specified through device tree bindings representing the parameters required
> +to enter/exit specific idle states on a given processor.
> +
> +According to the Server Base System Architecture document (SBSA, [3]), the
> +power states an ARM CPU can be put into are identified by the following list:
> +
> +- Running
> +- Idle_standby
> +- Idle_retention
> +- Sleep
> +- Off
> +
> +The power states described in the SBSA document define the basic CPU states on
> +top of which ARM platforms implement power management schemes that allow an OS
> +PM implementation to put the processor in different idle states (which include
> +states listed above; "off" state is not an idle state since it does not have
> +wake-up capabilities, hence it is not considered in this document).
> +
> +Idle state parameters (eg entry latency) are platform specific and need to be
> +characterized with bindings that provide the required information to OSPM
Ditto
> +code so that it can build the required tables and use them at runtime.
> +
> +The device tree binding definition for ARM idle states is the subject of this
> +document.
> +
> +===========================================
> +2 - idle-states node
> +===========================================
> +
> +ARM processor idle states are defined within the idle-states node, which is
> +a direct child of the cpus node [1] and provides a container where the
> +processor idle states, defined as device tree nodes, are listed.
> +
> +- idle-states node
> +
> +	Usage: Optional - On ARM systems, is a container of processor idle
s/is/it is ?
> +			  states nodes. If the system does not provide CPU
> +			  power management capabilities or the processor just
> +			  supports idle_standby an idle-states node is not
> +			  required.
> +
> +	Description: idle-states node is a container node, where its
> +		     subnodes describe the CPU idle states.
> +
> +	Node name must be "idle-states".
> +
> +	The idle-states node's parent node must be the cpus node.
> +
> +	The idle-states node's child nodes can be:
s/idle-states/idle-state
> +
> +	- one or more state nodes
> +
> +	Any other configuration is considered invalid.
> +
> +	An idle-states node defines the following properties:
> +
> +	- entry-method
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Describes the method by which a CPU enters the
> +			    idle states. This property is required and must be
> +			    one of:
> +
> +			    - "arm,psci"
> +			      ARM PSCI firmware interface [2].
> +
> +			    - "[vendor],[method]"
> +			      An implementation dependent string with
> +			      format "vendor,method", where vendor is a string
> +			      denoting the name of the manufacturer and
> +			      method is a string specifying the mechanism
> +			      used to enter the idle state.
> +
> +The nodes describing the idle states (state) can only be defined within the
> +idle-states node, any other configuration is considered invalid and therefore
> +must be ignored.
> +
> +===========================================
> +3 - state node
> +===========================================
> +
> +A state node represents an idle state description and must be defined as
> +follows:
> +
> +- state node
> +
> +	Description: must be child of the idle-states node
> +
> +	The state node name shall follow standard device tree naming
> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> +	are siblings within a single common parent must be given a unique name.
> +
> +	The idle state entered by executing the wfi instruction (idle_standby
> +	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
> +	must not be listed.
> +
> +	To correctly specify idle states timing and energy related properties,
> +	the following definitions identify the different execution phases
> +	a CPU goes through to enter and exit idle states and the implied
> +	energy metrics:
> +
> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> +		    |          |           |          |          |
> +
> +		    |<------ entry ------->|
> +		    |       latency        |
> +						      |<- exit ->|
> +						      |  latency |
> +		    |<-------- min-residency -------->|
> +			       |<-------  wakeup-latency ------->|
> +
I don't know the wakeup latency makes much sense and also correct.
Hardware wakeup latency is actually exit latency. Is it for failed
or abort-able ilde case ? We are adding this as a new parameter
at least from idle states perspective. I think we should just
avoid it.

> +	EXEC:	Normal CPU execution.
> +
> +	PREP:	Preparation phase before committing the hardware to idle mode
> +		like cache flushing. This is abortable on pending wake-up
> +		event conditions. The abort latency is assumed to be negligible
> +		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
> +		goes back to EXEC. This phase is optional. If not abortable,
> +		this should be included in the ENTRY phase instead.
> +
> +	ENTRY:	The hardware is committed to idle mode. This period must run
> +		to completion up to IDLE before anything else can happen.
> +
> +	IDLE:	This is the actual energy-saving idle period. This may last
> +		between 0 and infinite time, until a wake-up event occurs.
> +
> +	EXIT:	Period during which the CPU is brought back to operational
> +		mode (EXEC).
> +
> +	With the definitions provided above, the following list represents
> +	the valid properties for a state node:
> +
> +	- compatible
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Must be "arm,idle-state".
> +
> +	- logic-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present logic is retained on state entry,
> +			    otherwise it is lost.
> +
> +	- cache-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present cache memory is retained on state entry,
> +			    otherwise it is lost.
> +
> +	- entry-method-param
> +		Usage: See definition.
> +		Value type: <u32>
> +		Definition: Depends on the idle-states node entry-method
> +			    property value. Refer to the entry-method bindings
> +			    for this property value definition.
> +
> +	- entry-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency in
> +			    microseconds required to enter the idle state.
> +			    The exit-latency-us duration may be guaranteed
> +			    only after entry-latency-us has passed.
> +
> +	- exit-latency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to exit the idle state.
> +
> +	- min-residency-us
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing minimum residency duration
> +			    in microseconds, inclusive of preparation and
> +			    entry, for this idle state to be considered
> +			    worthwhile energy wise.
> +			    The residency time must take into account the
> +			    energy consumed while entering and exiting the
> +			    idle state and is therefore expected to be
> +			    longer than entry-latency-us.
> +
> +	- wakeup-latency-us:
> +		Usage: Optional
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing maximum delay between the
> +			    signaling of a wake-up event and the CPU being
> +			    able to execute normal code again. If omitted,
> +			    this is assumed to be equal to:
> +				entry-latency-us + exit-latency-us
> +
Rest of the patch looks fine by to me.

regards,
Santosh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 19:27             ` Santosh Shilimkar
@ 2014-06-18 20:51               ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-18 20:51 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Lorenzo Pieralisi, linux-arm-kernel, linux-pm, devicetree,
	Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia-Tobin, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown

On Wed, 18 Jun 2014, Santosh Shilimkar wrote:

> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> [..]
> > +	To correctly specify idle states timing and energy related properties,
> > +	the following definitions identify the different execution phases
> > +	a CPU goes through to enter and exit idle states and the implied
> > +	energy metrics:
> > +
> > +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> > +		    |          |           |          |          |
> > +
> > +		    |<------ entry ------->|
> > +		    |       latency        |
> > +						      |<- exit ->|
> > +						      |  latency |
> > +		    |<-------- min-residency -------->|
> > +			       |<-------  wakeup-latency ------->|
> > +
> I don't know the wakeup latency makes much sense and also correct.
> Hardware wakeup latency is actually exit latency. Is it for failed
> or abort-able ilde case ? We are adding this as a new parameter
> at least from idle states perspective. I think we should just
> avoid it.

I explained the rationale for this parameter in a previous email but 
Lorenzo didn't carry it over. To be clearer, this should be "worst case 
wake-up latency".  It is of interest for PMQOS.  This is the maximum 
delay that can be expected from the moment a wake-up event is signaled 
and the moment the CPU is back operational.  This is more than just exit 
latency.  By default this is entry_latency + exit_latency but when there 
is an abortable PREP phase then it may be shorter than that.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 20:51               ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-18 20:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 18 Jun 2014, Santosh Shilimkar wrote:

> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> [..]
> > +	To correctly specify idle states timing and energy related properties,
> > +	the following definitions identify the different execution phases
> > +	a CPU goes through to enter and exit idle states and the implied
> > +	energy metrics:
> > +
> > +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> > +		    |          |           |          |          |
> > +
> > +		    |<------ entry ------->|
> > +		    |       latency        |
> > +						      |<- exit ->|
> > +						      |  latency |
> > +		    |<-------- min-residency -------->|
> > +			       |<-------  wakeup-latency ------->|
> > +
> I don't know the wakeup latency makes much sense and also correct.
> Hardware wakeup latency is actually exit latency. Is it for failed
> or abort-able ilde case ? We are adding this as a new parameter
> at least from idle states perspective. I think we should just
> avoid it.

I explained the rationale for this parameter in a previous email but 
Lorenzo didn't carry it over. To be clearer, this should be "worst case 
wake-up latency".  It is of interest for PMQOS.  This is the maximum 
delay that can be expected from the moment a wake-up event is signaled 
and the moment the CPU is back operational.  This is more than just exit 
latency.  By default this is entry_latency + exit_latency but when there 
is an abortable PREP phase then it may be shorter than that.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 20:51               ` Nicolas Pitre
@ 2014-06-18 20:55                 ` Santosh Shilimkar
  -1 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-18 20:55 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Lorenzo Pieralisi, linux-arm-kernel, linux-pm, devicetree,
	Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia-Tobin, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown

On Wednesday 18 June 2014 04:51 PM, Nicolas Pitre wrote:
> On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
> 
>> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
>> [..]
>>> +	To correctly specify idle states timing and energy related properties,
>>> +	the following definitions identify the different execution phases
>>> +	a CPU goes through to enter and exit idle states and the implied
>>> +	energy metrics:
>>> +
>>> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
>>> +		    |          |           |          |          |
>>> +
>>> +		    |<------ entry ------->|
>>> +		    |       latency        |
>>> +						      |<- exit ->|
>>> +						      |  latency |
>>> +		    |<-------- min-residency -------->|
>>> +			       |<-------  wakeup-latency ------->|
>>> +
>> I don't know the wakeup latency makes much sense and also correct.
>> Hardware wakeup latency is actually exit latency. Is it for failed
>> or abort-able ilde case ? We are adding this as a new parameter
>> at least from idle states perspective. I think we should just
>> avoid it.
> 
> I explained the rationale for this parameter in a previous email but 
> Lorenzo didn't carry it over. To be clearer, this should be "worst case 
> wake-up latency".  It is of interest for PMQOS.  This is the maximum 
> delay that can be expected from the moment a wake-up event is signaled 
> and the moment the CPU is back operational.  This is more than just exit 
> latency.  By default this is entry_latency + exit_latency but when there 
> is an abortable PREP phase then it may be shorter than that.
> 
PMQOS angle is right. It is just that the idle code is not
going to do anything with this value. But I see a value adding it
instead of some one doing calculation.

Thanks for clarity Nico !!

regards,
Santosh


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 20:55                 ` Santosh Shilimkar
  0 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-18 20:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 18 June 2014 04:51 PM, Nicolas Pitre wrote:
> On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
> 
>> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
>> [..]
>>> +	To correctly specify idle states timing and energy related properties,
>>> +	the following definitions identify the different execution phases
>>> +	a CPU goes through to enter and exit idle states and the implied
>>> +	energy metrics:
>>> +
>>> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
>>> +		    |          |           |          |          |
>>> +
>>> +		    |<------ entry ------->|
>>> +		    |       latency        |
>>> +						      |<- exit ->|
>>> +						      |  latency |
>>> +		    |<-------- min-residency -------->|
>>> +			       |<-------  wakeup-latency ------->|
>>> +
>> I don't know the wakeup latency makes much sense and also correct.
>> Hardware wakeup latency is actually exit latency. Is it for failed
>> or abort-able ilde case ? We are adding this as a new parameter
>> at least from idle states perspective. I think we should just
>> avoid it.
> 
> I explained the rationale for this parameter in a previous email but 
> Lorenzo didn't carry it over. To be clearer, this should be "worst case 
> wake-up latency".  It is of interest for PMQOS.  This is the maximum 
> delay that can be expected from the moment a wake-up event is signaled 
> and the moment the CPU is back operational.  This is more than just exit 
> latency.  By default this is entry_latency + exit_latency but when there 
> is an abortable PREP phase then it may be shorter than that.
> 
PMQOS angle is right. It is just that the idle code is not
going to do anything with this value. But I see a value adding it
instead of some one doing calculation.

Thanks for clarity Nico !!

regards,
Santosh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 17:36           ` Lorenzo Pieralisi
@ 2014-06-18 21:03             ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-18 21:03 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin, Rob Herring,
	grant.likely, Peter De Schrijver, Santosh Shilimkar,
	Daniel Lezcano, Amit Kucheria, Vincent Guittot, Antti Miettinen,
	Stephen Boyd, Kevin Hilman, Sebastian Capella, Tomasz Figa,
	Mark Brown

On Wed, 18 Jun 2014, Lorenzo Pieralisi wrote:

> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> > On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
> > 
> > > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > > > Let's illustrate the different periods on a time line to make it clearer
> > > > (hmmm let's see how this can be managed on a braille display :-O ):
> > > > 
> > > > EXEC:	Normal CPU execution.
> > > > 
> > > > PREP:	Preparation phase before committing the hardware to idle mode
> > > > 	like cache flushing. This is abortable on pending wake-up 
> > > > 	event conditions. The abort latency is assumed to be negligible 
> > > > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > > > 	back to EXEC. This phase is optional. If not abortable, this 
> > > > 	should be included in the ENTRY phase instead.
> > > > 
> > > > ENTRY:	The hardware is committed to idle mode. This period must run to
> > > > 	completion up to IDLE before anything else can happen.
> > > > 
> > > > IDLE:	This is the actual power-saving idle period. This may last 
> > > > 	between 0 and infinite time, until a wake-up event occurs.
> > > > 
> > > > EXIT:	Period during which the CPU is brought back to operational
> > > > 	mode (EXEC).
> > > > 
> > > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> > > >              |          |           |          |            |
> > > > 
> > > >              |<-- entry-latency --->|
> > > > 
> > > >                                                |<- exit-  ->|
> > > >                                                |  latency   |
> > > > 
> > > >              |<-------------- min-residency --------------->|
> > > > 
> > > >                         |<----- worst_wakeup_latency ------>|
> > > > 
> > > > entry-latency: Worst case latency required to enter the idle state.  The 
> > > > exit_latency may be guaranteed only after entry-latency has passed.
> > > > 
> > > > min-residency: Minimum period, including preparation, entry and exit, 
> > > > for a given power mode to be worthwhile energy wise.  It must be at 
> > > > least equal to entry_latency + exit_latency.
> 
> Ok, a minor tweak to the diagram above, min-residency should include
> energy costs related to idle entry and exit, but not the exit-latency
> itself, as long as the energy costs implied by exiting the state are
> factored out in the min-residency-us property.

s/factored out /factored in/

> Hence, to sum it up, I attached below the updated bindings patch:
> 
> I think we are close to an agreement, if anyone disagrees please shout
> as soon as possible so that we can still integrate changes.

Comments:

[...]

> +- state node
> +
> +	Description: must be child of the idle-states node
> +
> +	The state node name shall follow standard device tree naming
> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> +	are siblings within a single common parent must be given a unique name.
> +
> +	The idle state entered by executing the wfi instruction (idle_standby
> +	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
> +	must not be listed.
> +
> +	To correctly specify idle states timing and energy related properties,
> +	the following definitions identify the different execution phases
> +	a CPU goes through to enter and exit idle states and the implied
> +	energy metrics:
> +
> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> +		    |          |           |          |          |
> +
> +		    |<------ entry ------->|
> +		    |       latency        |
> +						      |<- exit ->|
> +						      |  latency |
> +		    |<-------- min-residency -------->|
> +			       |<-------  wakeup-latency ------->|
> +
> +	EXEC:	Normal CPU execution.
> +
> +	PREP:	Preparation phase before committing the hardware to idle mode
> +		like cache flushing. This is abortable on pending wake-up
> +		event conditions. The abort latency is assumed to be negligible
> +		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
> +		goes back to EXEC. This phase is optional. If not abortable,
> +		this should be included in the ENTRY phase instead.
> +
> +	ENTRY:	The hardware is committed to idle mode. This period must run
> +		to completion up to IDLE before anything else can happen.
> +
> +	IDLE:	This is the actual energy-saving idle period. This may last
> +		between 0 and infinite time, until a wake-up event occurs.
> +
> +	EXIT:	Period during which the CPU is brought back to operational
> +		mode (EXEC).
> +
> +	With the definitions provided above, the following list represents
> +	the valid properties for a state node:
[...]

I really think the definitions and timing diagram ought to be 
prominently presented at the beginning of the document in a separate 
section for it rather than being burried in a binding section.  Extra 
discussion points from this thread could go there as well, i.e. the 
reason for each timing parameter, etc.  The latest comment from Santosh 
shows that this is never too clear.

For example, your explanation of what the min residency represents at 
the top of this email should belong to such a section.  Similarly for 
the worst wake-up latency.

Then we could refer to it when defining binding parameters.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 21:03             ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-18 21:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 18 Jun 2014, Lorenzo Pieralisi wrote:

> On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> > On Fri, 13 Jun 2014, Lorenzo Pieralisi wrote:
> > 
> > > On Wed, Jun 11, 2014 at 07:15:16PM +0100, Nicolas Pitre wrote:
> > > > Let's illustrate the different periods on a time line to make it clearer
> > > > (hmmm let's see how this can be managed on a braille display :-O ):
> > > > 
> > > > EXEC:	Normal CPU execution.
> > > > 
> > > > PREP:	Preparation phase before committing the hardware to idle mode
> > > > 	like cache flushing. This is abortable on pending wake-up 
> > > > 	event conditions. The abort latency is assumed to be negligible 
> > > > 	(i.e. less than the ENTRY + EXIT duration). If aborted, we go 
> > > > 	back to EXEC. This phase is optional. If not abortable, this 
> > > > 	should be included in the ENTRY phase instead.
> > > > 
> > > > ENTRY:	The hardware is committed to idle mode. This period must run to
> > > > 	completion up to IDLE before anything else can happen.
> > > > 
> > > > IDLE:	This is the actual power-saving idle period. This may last 
> > > > 	between 0 and infinite time, until a wake-up event occurs.
> > > > 
> > > > EXIT:	Period during which the CPU is brought back to operational
> > > > 	mode (EXEC).
> > > > 
> > > > ...__[EXEC]__|__[PREP]--|__[ENTRY]__|__[IDLE]__|___[EXIT]_--|__[EXEC]__...
> > > >              |          |           |          |            |
> > > > 
> > > >              |<-- entry-latency --->|
> > > > 
> > > >                                                |<- exit-  ->|
> > > >                                                |  latency   |
> > > > 
> > > >              |<-------------- min-residency --------------->|
> > > > 
> > > >                         |<----- worst_wakeup_latency ------>|
> > > > 
> > > > entry-latency: Worst case latency required to enter the idle state.  The 
> > > > exit_latency may be guaranteed only after entry-latency has passed.
> > > > 
> > > > min-residency: Minimum period, including preparation, entry and exit, 
> > > > for a given power mode to be worthwhile energy wise.  It must be at 
> > > > least equal to entry_latency + exit_latency.
> 
> Ok, a minor tweak to the diagram above, min-residency should include
> energy costs related to idle entry and exit, but not the exit-latency
> itself, as long as the energy costs implied by exiting the state are
> factored out in the min-residency-us property.

s/factored out /factored in/

> Hence, to sum it up, I attached below the updated bindings patch:
> 
> I think we are close to an agreement, if anyone disagrees please shout
> as soon as possible so that we can still integrate changes.

Comments:

[...]

> +- state node
> +
> +	Description: must be child of the idle-states node
> +
> +	The state node name shall follow standard device tree naming
> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> +	are siblings within a single common parent must be given a unique name.
> +
> +	The idle state entered by executing the wfi instruction (idle_standby
> +	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
> +	must not be listed.
> +
> +	To correctly specify idle states timing and energy related properties,
> +	the following definitions identify the different execution phases
> +	a CPU goes through to enter and exit idle states and the implied
> +	energy metrics:
> +
> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> +		    |          |           |          |          |
> +
> +		    |<------ entry ------->|
> +		    |       latency        |
> +						      |<- exit ->|
> +						      |  latency |
> +		    |<-------- min-residency -------->|
> +			       |<-------  wakeup-latency ------->|
> +
> +	EXEC:	Normal CPU execution.
> +
> +	PREP:	Preparation phase before committing the hardware to idle mode
> +		like cache flushing. This is abortable on pending wake-up
> +		event conditions. The abort latency is assumed to be negligible
> +		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
> +		goes back to EXEC. This phase is optional. If not abortable,
> +		this should be included in the ENTRY phase instead.
> +
> +	ENTRY:	The hardware is committed to idle mode. This period must run
> +		to completion up to IDLE before anything else can happen.
> +
> +	IDLE:	This is the actual energy-saving idle period. This may last
> +		between 0 and infinite time, until a wake-up event occurs.
> +
> +	EXIT:	Period during which the CPU is brought back to operational
> +		mode (EXEC).
> +
> +	With the definitions provided above, the following list represents
> +	the valid properties for a state node:
[...]

I really think the definitions and timing diagram ought to be 
prominently presented at the beginning of the document in a separate 
section for it rather than being burried in a binding section.  Extra 
discussion points from this thread could go there as well, i.e. the 
reason for each timing parameter, etc.  The latest comment from Santosh 
shows that this is never too clear.

For example, your explanation of what the min residency represents at 
the top of this email should belong to such a section.  Similarly for 
the worst wake-up latency.

Then we could refer to it when defining binding parameters.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 20:55                 ` Santosh Shilimkar
@ 2014-06-18 21:09                   ` Nicolas Pitre
  -1 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-18 21:09 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Lorenzo Pieralisi, linux-arm-kernel, linux-pm, devicetree,
	Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia-Tobin, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown

On Wed, 18 Jun 2014, Santosh Shilimkar wrote:

> On Wednesday 18 June 2014 04:51 PM, Nicolas Pitre wrote:
> > On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
> > 
> >> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> >> [..]
> >>> +	To correctly specify idle states timing and energy related properties,
> >>> +	the following definitions identify the different execution phases
> >>> +	a CPU goes through to enter and exit idle states and the implied
> >>> +	energy metrics:
> >>> +
> >>> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> >>> +		    |          |           |          |          |
> >>> +
> >>> +		    |<------ entry ------->|
> >>> +		    |       latency        |
> >>> +						      |<- exit ->|
> >>> +						      |  latency |
> >>> +		    |<-------- min-residency -------->|
> >>> +			       |<-------  wakeup-latency ------->|
> >>> +
> >> I don't know the wakeup latency makes much sense and also correct.
> >> Hardware wakeup latency is actually exit latency. Is it for failed
> >> or abort-able ilde case ? We are adding this as a new parameter
> >> at least from idle states perspective. I think we should just
> >> avoid it.
> > 
> > I explained the rationale for this parameter in a previous email but 
> > Lorenzo didn't carry it over. To be clearer, this should be "worst case 
> > wake-up latency".  It is of interest for PMQOS.  This is the maximum 
> > delay that can be expected from the moment a wake-up event is signaled 
> > and the moment the CPU is back operational.  This is more than just exit 
> > latency.  By default this is entry_latency + exit_latency but when there 
> > is an abortable PREP phase then it may be shorter than that.
> > 
> PMQOS angle is right. It is just that the idle code is not
> going to do anything with this value. But I see a value adding it
> instead of some one doing calculation.

The idle code should take it into account when a PMQOS restriction is in 
effect i.e. avoid using those modes whose worst case wake-up latency is 
too large.

And cpuidle is being migrated into the scheduler as we speak.  So some 
of the values there, namely entry_latency and exit_latency (taken 
separately for timing purposes) will be directly used by the scheduler 
to decide which CPU to wake up for example.

So there is fundamentally 4 parameters if we want to comprehensively 
support all pertinent use cases.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 21:09                   ` Nicolas Pitre
  0 siblings, 0 replies; 74+ messages in thread
From: Nicolas Pitre @ 2014-06-18 21:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 18 Jun 2014, Santosh Shilimkar wrote:

> On Wednesday 18 June 2014 04:51 PM, Nicolas Pitre wrote:
> > On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
> > 
> >> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> >> [..]
> >>> +	To correctly specify idle states timing and energy related properties,
> >>> +	the following definitions identify the different execution phases
> >>> +	a CPU goes through to enter and exit idle states and the implied
> >>> +	energy metrics:
> >>> +
> >>> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
> >>> +		    |          |           |          |          |
> >>> +
> >>> +		    |<------ entry ------->|
> >>> +		    |       latency        |
> >>> +						      |<- exit ->|
> >>> +						      |  latency |
> >>> +		    |<-------- min-residency -------->|
> >>> +			       |<-------  wakeup-latency ------->|
> >>> +
> >> I don't know the wakeup latency makes much sense and also correct.
> >> Hardware wakeup latency is actually exit latency. Is it for failed
> >> or abort-able ilde case ? We are adding this as a new parameter
> >> at least from idle states perspective. I think we should just
> >> avoid it.
> > 
> > I explained the rationale for this parameter in a previous email but 
> > Lorenzo didn't carry it over. To be clearer, this should be "worst case 
> > wake-up latency".  It is of interest for PMQOS.  This is the maximum 
> > delay that can be expected from the moment a wake-up event is signaled 
> > and the moment the CPU is back operational.  This is more than just exit 
> > latency.  By default this is entry_latency + exit_latency but when there 
> > is an abortable PREP phase then it may be shorter than that.
> > 
> PMQOS angle is right. It is just that the idle code is not
> going to do anything with this value. But I see a value adding it
> instead of some one doing calculation.

The idle code should take it into account when a PMQOS restriction is in 
effect i.e. avoid using those modes whose worst case wake-up latency is 
too large.

And cpuidle is being migrated into the scheduler as we speak.  So some 
of the values there, namely entry_latency and exit_latency (taken 
separately for timing purposes) will be directly used by the scheduler 
to decide which CPU to wake up for example.

So there is fundamentally 4 parameters if we want to comprehensively 
support all pertinent use cases.


Nicolas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
  2014-06-11 16:18     ` Lorenzo Pieralisi
@ 2014-06-18 21:34       ` Daniel Lezcano
  -1 siblings, 0 replies; 74+ messages in thread
From: Daniel Lezcano @ 2014-06-18 21:34 UTC (permalink / raw)
  To: Lorenzo Pieralisi, linux-arm-kernel, linux-pm, devicetree
  Cc: Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia Tobin, Nicolas Pitre, Rob Herring, Grant Likely,
	Peter De Schrijver, Santosh Shilimkar, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown, Paul Walmsley,
	Chander Kashyap

On 06/11/2014 06:18 PM, Lorenzo Pieralisi wrote:
> This patch implements a generic CPU idle driver for ARM64 machines.
>
> It relies on the DT idle states infrastructure to initialize idle
> states count and respective parameters. Current code assumes the driver
> is managing idle states on all possible CPUs but can be easily
> generalized to support heterogenous systems and build cpumasks at
> runtime using MIDRs or DT cpu nodes compatible properties.
>
> Suspend back-ends (eg PSCI) must register a suspend initializer with
> the CPU idle driver so that the suspend backend call can be detected,
> and the driver code can call the back-end infrastructure to complete the
> suspend backend initialization.
>
> Idle state index 0 is always initialized as a simple wfi state, ie always
> considered present and functional on all ARM64 platforms.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>   drivers/cpuidle/Kconfig         |   5 ++
>   drivers/cpuidle/Kconfig.arm64   |  13 ++++
>   drivers/cpuidle/Makefile        |   4 +
>   drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 190 insertions(+)
>   create mode 100644 drivers/cpuidle/Kconfig.arm64
>   create mode 100644 drivers/cpuidle/cpuidle-arm64.c
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 760ce20..360c086 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -44,6 +44,11 @@ depends on ARM
>   source "drivers/cpuidle/Kconfig.arm"
>   endmenu
>
> +menu "ARM64 CPU Idle Drivers"
> +depends on ARM64
> +source "drivers/cpuidle/Kconfig.arm64"
> +endmenu
> +
>   menu "MIPS CPU Idle Drivers"
>   depends on MIPS
>   source "drivers/cpuidle/Kconfig.mips"
> diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
> new file mode 100644
> index 0000000..b83612c
> --- /dev/null
> +++ b/drivers/cpuidle/Kconfig.arm64
> @@ -0,0 +1,13 @@
> +#
> +# ARM64 CPU Idle drivers
> +#
> +
> +config ARM64_CPUIDLE
> +	bool "Generic ARM64 CPU idle Driver"
> +	select OF_IDLE_STATES
> +	help
> +	  Select this to enable generic cpuidle driver for ARM v8.
> +	  It provides a generic idle driver whose idle states are configured
> +	  at run-time through DT nodes. The CPUidle suspend backend is
> +	  initialized by the device tree parsing code on matching the entry
> +	  method to the respective CPU operations.
> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> index d5ebf4b..e496242 100644
> --- a/drivers/cpuidle/Makefile
> +++ b/drivers/cpuidle/Makefile
> @@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
>   obj-$(CONFIG_MIPS_CPS_CPUIDLE)		+= cpuidle-cps.o
>
>   ###############################################################################
> +# ARM64 drivers
> +obj-$(CONFIG_ARM64_CPUIDLE)		+= cpuidle-arm64.o
> +
> +###############################################################################
>   # POWERPC drivers
>   obj-$(CONFIG_PSERIES_CPUIDLE)		+= cpuidle-pseries.o
>   obj-$(CONFIG_POWERNV_CPUIDLE)		+= cpuidle-powernv.o
> diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
> new file mode 100644
> index 0000000..4c932f8
> --- /dev/null
> +++ b/drivers/cpuidle/cpuidle-arm64.c
> @@ -0,0 +1,168 @@
> +/*
> + * ARM64 generic CPU idle driver.
> + *
> + * Copyright (C) 2014 ARM Ltd.
> + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#define pr_fmt(fmt) "CPUidle arm64: " fmt
> +
> +#include <linux/cpuidle.h>
> +#include <linux/cpumask.h>
> +#include <linux/cpu_pm.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +
> +#include <asm/psci.h>
> +#include <asm/suspend.h>
> +
> +#include "of_idle_states.h"
> +
> +typedef int (*suspend_init_fn)(struct cpuidle_driver *,
> +			       struct device_node *[]);
> +
> +struct cpu_suspend_ops {
> +	const char *id;
> +	suspend_init_fn init_fn;
> +};
> +
> +static const struct cpu_suspend_ops suspend_operations[] __initconst = {
> +	{"arm,psci", psci_dt_register_idle_states},
> +	{}
> +};
> +
> +static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
> +{
> +	int i;
> +
> +	if (!str)
> +		return NULL;
> +
> +	for (i = 0; suspend_operations[i].id; i++)
> +		if (!strcmp(suspend_operations[i].id, str))
> +			return &suspend_operations[i];
> +
> +	return NULL;
> +}
> +
> +/*
> + * arm_enter_idle_state - Programs CPU to enter the specified state
> + *
> + * dev: cpuidle device
> + * drv: cpuidle driver
> + * idx: state index
> + *
> + * Called from the CPUidle framework to program the device to the
> + * specified target state selected by the governor.
> + */
> +static int arm_enter_idle_state(struct cpuidle_device *dev,
> +				struct cpuidle_driver *drv, int idx)
> +{
> +	int ret;
> +
> +	if (!idx) {
> +		cpu_do_idle();
> +		return idx;
> +	}
> +
> +	cpu_pm_enter();
> +	/*
> +	 * Pass idle state index to cpu_suspend which in turn will call
> +	 * the CPU ops suspend protocol with idle index as a parameter.
> +	 *
> +	 * Some states would not require context to be saved and flushed
> +	 * to DRAM, so calling cpu_suspend would not be stricly necessary.
> +	 * When power domains specifications for ARM CPUs are finalized then
> +	 * this code can be optimized to prevent saving registers if not
> +	 * needed.
> +	 */
> +	ret = cpu_suspend(idx);
> +
> +	cpu_pm_exit();
> +
> +	return ret ? -1 : idx;

Is it sure cpu_suspend will return always 0 on success ?

> +}
> +
> +struct cpuidle_driver arm64_idle_driver = {
> +	.name = "arm64_idle",
> +	.owner = THIS_MODULE,
> +};
> +
> +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> +
> +/*
> + * arm64_idle_init
> + *
> + * Registers the arm64 specific cpuidle driver with the cpuidle
> + * framework. It relies on core code to parse the idle states
> + * and initialize them using driver data structures accordingly.
> + */
> +static int __init arm64_idle_init(void)
> +{
> +	int i, ret;
> +	const char *entry_method;
> +	struct device_node *idle_states_node;
> +	const struct cpu_suspend_ops *suspend_init;
> +	struct cpuidle_driver *drv = &arm64_idle_driver;
> +
> +	idle_states_node = of_find_node_by_path("/cpus/idle-states");
> +	if (!idle_states_node)
> +		return -ENOENT;
> +
> +	if (of_property_read_string(idle_states_node, "entry-method",
> +				    &entry_method)) {
> +		pr_warn(" * %s missing entry-method property\n",
> +			    idle_states_node->full_name);
> +		of_node_put(idle_states_node);
> +		ret = -EOPNOTSUPP;
> +		goto put_node;
> +	}
> +
> +	suspend_init = get_suspend_ops(entry_method);
> +	if (!suspend_init) {
> +		pr_warn("Missing suspend initializer\n");
> +		ret = -EOPNOTSUPP;
> +		goto put_node;
> +	}
> +
> +	/*
> +	 * State at index 0 is standby wfi and considered standard
> +	 * on all ARM platforms. If in some platforms simple wfi
> +	 * can't be used as "state 0", DT bindings must be implemented
> +	 * to work around this issue and allow installing a special
> +	 * handler for idle state index 0.
> +	 */
> +	drv->states[0].exit_latency = 1;
> +	drv->states[0].target_residency = 1;
> +	drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> +	strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> +	strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);

Please do not copy the state name and desc strings, they will be 
converted to 'const char *'.

> +	drv->cpumask = (struct cpumask *) cpu_possible_mask;
> +	/*
> +	 * Start at index 1, request idle state nodes to be filled
> +	 */
> +	ret = of_init_idle_driver(drv, state_nodes, 1, true);
> +	if (ret)
> +		goto put_node;
> +
> +	if (suspend_init->init_fn(drv, state_nodes)) {
> +		ret = -EOPNOTSUPP;
> +		goto put_node;
> +	}
> +
> +	for (i = 0; i < drv->state_count; i++)
> +		drv->states[i].enter = arm_enter_idle_state;

May be s/arm/arm64/ ?

> +
> +	ret = cpuidle_register(drv, NULL);
> +
> +put_node:
> +	of_node_put(idle_states_node);
> +	return ret;
> +}
> +device_initcall(arm64_idle_init);
>


-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
@ 2014-06-18 21:34       ` Daniel Lezcano
  0 siblings, 0 replies; 74+ messages in thread
From: Daniel Lezcano @ 2014-06-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/11/2014 06:18 PM, Lorenzo Pieralisi wrote:
> This patch implements a generic CPU idle driver for ARM64 machines.
>
> It relies on the DT idle states infrastructure to initialize idle
> states count and respective parameters. Current code assumes the driver
> is managing idle states on all possible CPUs but can be easily
> generalized to support heterogenous systems and build cpumasks at
> runtime using MIDRs or DT cpu nodes compatible properties.
>
> Suspend back-ends (eg PSCI) must register a suspend initializer with
> the CPU idle driver so that the suspend backend call can be detected,
> and the driver code can call the back-end infrastructure to complete the
> suspend backend initialization.
>
> Idle state index 0 is always initialized as a simple wfi state, ie always
> considered present and functional on all ARM64 platforms.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>   drivers/cpuidle/Kconfig         |   5 ++
>   drivers/cpuidle/Kconfig.arm64   |  13 ++++
>   drivers/cpuidle/Makefile        |   4 +
>   drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 190 insertions(+)
>   create mode 100644 drivers/cpuidle/Kconfig.arm64
>   create mode 100644 drivers/cpuidle/cpuidle-arm64.c
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 760ce20..360c086 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -44,6 +44,11 @@ depends on ARM
>   source "drivers/cpuidle/Kconfig.arm"
>   endmenu
>
> +menu "ARM64 CPU Idle Drivers"
> +depends on ARM64
> +source "drivers/cpuidle/Kconfig.arm64"
> +endmenu
> +
>   menu "MIPS CPU Idle Drivers"
>   depends on MIPS
>   source "drivers/cpuidle/Kconfig.mips"
> diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
> new file mode 100644
> index 0000000..b83612c
> --- /dev/null
> +++ b/drivers/cpuidle/Kconfig.arm64
> @@ -0,0 +1,13 @@
> +#
> +# ARM64 CPU Idle drivers
> +#
> +
> +config ARM64_CPUIDLE
> +	bool "Generic ARM64 CPU idle Driver"
> +	select OF_IDLE_STATES
> +	help
> +	  Select this to enable generic cpuidle driver for ARM v8.
> +	  It provides a generic idle driver whose idle states are configured
> +	  at run-time through DT nodes. The CPUidle suspend backend is
> +	  initialized by the device tree parsing code on matching the entry
> +	  method to the respective CPU operations.
> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> index d5ebf4b..e496242 100644
> --- a/drivers/cpuidle/Makefile
> +++ b/drivers/cpuidle/Makefile
> @@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
>   obj-$(CONFIG_MIPS_CPS_CPUIDLE)		+= cpuidle-cps.o
>
>   ###############################################################################
> +# ARM64 drivers
> +obj-$(CONFIG_ARM64_CPUIDLE)		+= cpuidle-arm64.o
> +
> +###############################################################################
>   # POWERPC drivers
>   obj-$(CONFIG_PSERIES_CPUIDLE)		+= cpuidle-pseries.o
>   obj-$(CONFIG_POWERNV_CPUIDLE)		+= cpuidle-powernv.o
> diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
> new file mode 100644
> index 0000000..4c932f8
> --- /dev/null
> +++ b/drivers/cpuidle/cpuidle-arm64.c
> @@ -0,0 +1,168 @@
> +/*
> + * ARM64 generic CPU idle driver.
> + *
> + * Copyright (C) 2014 ARM Ltd.
> + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#define pr_fmt(fmt) "CPUidle arm64: " fmt
> +
> +#include <linux/cpuidle.h>
> +#include <linux/cpumask.h>
> +#include <linux/cpu_pm.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +
> +#include <asm/psci.h>
> +#include <asm/suspend.h>
> +
> +#include "of_idle_states.h"
> +
> +typedef int (*suspend_init_fn)(struct cpuidle_driver *,
> +			       struct device_node *[]);
> +
> +struct cpu_suspend_ops {
> +	const char *id;
> +	suspend_init_fn init_fn;
> +};
> +
> +static const struct cpu_suspend_ops suspend_operations[] __initconst = {
> +	{"arm,psci", psci_dt_register_idle_states},
> +	{}
> +};
> +
> +static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
> +{
> +	int i;
> +
> +	if (!str)
> +		return NULL;
> +
> +	for (i = 0; suspend_operations[i].id; i++)
> +		if (!strcmp(suspend_operations[i].id, str))
> +			return &suspend_operations[i];
> +
> +	return NULL;
> +}
> +
> +/*
> + * arm_enter_idle_state - Programs CPU to enter the specified state
> + *
> + * dev: cpuidle device
> + * drv: cpuidle driver
> + * idx: state index
> + *
> + * Called from the CPUidle framework to program the device to the
> + * specified target state selected by the governor.
> + */
> +static int arm_enter_idle_state(struct cpuidle_device *dev,
> +				struct cpuidle_driver *drv, int idx)
> +{
> +	int ret;
> +
> +	if (!idx) {
> +		cpu_do_idle();
> +		return idx;
> +	}
> +
> +	cpu_pm_enter();
> +	/*
> +	 * Pass idle state index to cpu_suspend which in turn will call
> +	 * the CPU ops suspend protocol with idle index as a parameter.
> +	 *
> +	 * Some states would not require context to be saved and flushed
> +	 * to DRAM, so calling cpu_suspend would not be stricly necessary.
> +	 * When power domains specifications for ARM CPUs are finalized then
> +	 * this code can be optimized to prevent saving registers if not
> +	 * needed.
> +	 */
> +	ret = cpu_suspend(idx);
> +
> +	cpu_pm_exit();
> +
> +	return ret ? -1 : idx;

Is it sure cpu_suspend will return always 0 on success ?

> +}
> +
> +struct cpuidle_driver arm64_idle_driver = {
> +	.name = "arm64_idle",
> +	.owner = THIS_MODULE,
> +};
> +
> +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> +
> +/*
> + * arm64_idle_init
> + *
> + * Registers the arm64 specific cpuidle driver with the cpuidle
> + * framework. It relies on core code to parse the idle states
> + * and initialize them using driver data structures accordingly.
> + */
> +static int __init arm64_idle_init(void)
> +{
> +	int i, ret;
> +	const char *entry_method;
> +	struct device_node *idle_states_node;
> +	const struct cpu_suspend_ops *suspend_init;
> +	struct cpuidle_driver *drv = &arm64_idle_driver;
> +
> +	idle_states_node = of_find_node_by_path("/cpus/idle-states");
> +	if (!idle_states_node)
> +		return -ENOENT;
> +
> +	if (of_property_read_string(idle_states_node, "entry-method",
> +				    &entry_method)) {
> +		pr_warn(" * %s missing entry-method property\n",
> +			    idle_states_node->full_name);
> +		of_node_put(idle_states_node);
> +		ret = -EOPNOTSUPP;
> +		goto put_node;
> +	}
> +
> +	suspend_init = get_suspend_ops(entry_method);
> +	if (!suspend_init) {
> +		pr_warn("Missing suspend initializer\n");
> +		ret = -EOPNOTSUPP;
> +		goto put_node;
> +	}
> +
> +	/*
> +	 * State at index 0 is standby wfi and considered standard
> +	 * on all ARM platforms. If in some platforms simple wfi
> +	 * can't be used as "state 0", DT bindings must be implemented
> +	 * to work around this issue and allow installing a special
> +	 * handler for idle state index 0.
> +	 */
> +	drv->states[0].exit_latency = 1;
> +	drv->states[0].target_residency = 1;
> +	drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> +	strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> +	strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);

Please do not copy the state name and desc strings, they will be 
converted to 'const char *'.

> +	drv->cpumask = (struct cpumask *) cpu_possible_mask;
> +	/*
> +	 * Start at index 1, request idle state nodes to be filled
> +	 */
> +	ret = of_init_idle_driver(drv, state_nodes, 1, true);
> +	if (ret)
> +		goto put_node;
> +
> +	if (suspend_init->init_fn(drv, state_nodes)) {
> +		ret = -EOPNOTSUPP;
> +		goto put_node;
> +	}
> +
> +	for (i = 0; i < drv->state_count; i++)
> +		drv->states[i].enter = arm_enter_idle_state;

May be s/arm/arm64/ ?

> +
> +	ret = cpuidle_register(drv, NULL);
> +
> +put_node:
> +	of_node_put(idle_states_node);
> +	return ret;
> +}
> +device_initcall(arm64_idle_init);
>


-- 
  <http://www.linaro.org/> Linaro.org ? Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 21:09                   ` Nicolas Pitre
@ 2014-06-18 23:13                     ` Santosh Shilimkar
  -1 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-18 23:13 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Lorenzo Pieralisi, linux-arm-kernel, linux-pm, devicetree,
	Mark Rutland, Sudeep Holla, Catalin Marinas,
	Charles Garcia-Tobin, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown

On Wednesday 18 June 2014 05:09 PM, Nicolas Pitre wrote:
> On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
> 
>> On Wednesday 18 June 2014 04:51 PM, Nicolas Pitre wrote:
>>> On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
>>>
>>>> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
>>>> [..]
>>>>> +	To correctly specify idle states timing and energy related properties,
>>>>> +	the following definitions identify the different execution phases
>>>>> +	a CPU goes through to enter and exit idle states and the implied
>>>>> +	energy metrics:
>>>>> +
>>>>> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
>>>>> +		    |          |           |          |          |
>>>>> +
>>>>> +		    |<------ entry ------->|
>>>>> +		    |       latency        |
>>>>> +						      |<- exit ->|
>>>>> +						      |  latency |
>>>>> +		    |<-------- min-residency -------->|
>>>>> +			       |<-------  wakeup-latency ------->|
>>>>> +
>>>> I don't know the wakeup latency makes much sense and also correct.
>>>> Hardware wakeup latency is actually exit latency. Is it for failed
>>>> or abort-able ilde case ? We are adding this as a new parameter
>>>> at least from idle states perspective. I think we should just
>>>> avoid it.
>>>
>>> I explained the rationale for this parameter in a previous email but 
>>> Lorenzo didn't carry it over. To be clearer, this should be "worst case 
>>> wake-up latency".  It is of interest for PMQOS.  This is the maximum 
>>> delay that can be expected from the moment a wake-up event is signaled 
>>> and the moment the CPU is back operational.  This is more than just exit 
>>> latency.  By default this is entry_latency + exit_latency but when there 
>>> is an abortable PREP phase then it may be shorter than that.
>>>
>> PMQOS angle is right. It is just that the idle code is not
>> going to do anything with this value. But I see a value adding it
>> instead of some one doing calculation.
> 
> The idle code should take it into account when a PMQOS restriction is in 
> effect i.e. avoid using those modes whose worst case wake-up latency is 
> too large.
> 
> And cpuidle is being migrated into the scheduler as we speak.  So some 
> of the values there, namely entry_latency and exit_latency (taken 
> separately for timing purposes) will be directly used by the scheduler 
> to decide which CPU to wake up for example.
> 
> So there is fundamentally 4 parameters if we want to comprehensively 
> support all pertinent use cases.
> 
Fair enough.

regards,
Santosh


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-18 23:13                     ` Santosh Shilimkar
  0 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-18 23:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 18 June 2014 05:09 PM, Nicolas Pitre wrote:
> On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
> 
>> On Wednesday 18 June 2014 04:51 PM, Nicolas Pitre wrote:
>>> On Wed, 18 Jun 2014, Santosh Shilimkar wrote:
>>>
>>>> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
>>>> [..]
>>>>> +	To correctly specify idle states timing and energy related properties,
>>>>> +	the following definitions identify the different execution phases
>>>>> +	a CPU goes through to enter and exit idle states and the implied
>>>>> +	energy metrics:
>>>>> +
>>>>> +	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
>>>>> +		    |          |           |          |          |
>>>>> +
>>>>> +		    |<------ entry ------->|
>>>>> +		    |       latency        |
>>>>> +						      |<- exit ->|
>>>>> +						      |  latency |
>>>>> +		    |<-------- min-residency -------->|
>>>>> +			       |<-------  wakeup-latency ------->|
>>>>> +
>>>> I don't know the wakeup latency makes much sense and also correct.
>>>> Hardware wakeup latency is actually exit latency. Is it for failed
>>>> or abort-able ilde case ? We are adding this as a new parameter
>>>> at least from idle states perspective. I think we should just
>>>> avoid it.
>>>
>>> I explained the rationale for this parameter in a previous email but 
>>> Lorenzo didn't carry it over. To be clearer, this should be "worst case 
>>> wake-up latency".  It is of interest for PMQOS.  This is the maximum 
>>> delay that can be expected from the moment a wake-up event is signaled 
>>> and the moment the CPU is back operational.  This is more than just exit 
>>> latency.  By default this is entry_latency + exit_latency but when there 
>>> is an abortable PREP phase then it may be shorter than that.
>>>
>> PMQOS angle is right. It is just that the idle code is not
>> going to do anything with this value. But I see a value adding it
>> instead of some one doing calculation.
> 
> The idle code should take it into account when a PMQOS restriction is in 
> effect i.e. avoid using those modes whose worst case wake-up latency is 
> too large.
> 
> And cpuidle is being migrated into the scheduler as we speak.  So some 
> of the values there, namely entry_latency and exit_latency (taken 
> separately for timing purposes) will be directly used by the scheduler 
> to decide which CPU to wake up for example.
> 
> So there is fundamentally 4 parameters if we want to comprehensively 
> support all pertinent use cases.
> 
Fair enough.

regards,
Santosh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
  2014-06-11 16:18     ` Lorenzo Pieralisi
@ 2014-06-19  3:02       ` Rob Herring
  -1 siblings, 0 replies; 74+ messages in thread
From: Rob Herring @ 2014-06-19  3:02 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia Tobin,
	Nicolas Pitre, Rob Herring, Grant Likely, Peter De Schrijver,
	Santosh Shilimkar, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa

On Wed, Jun 11, 2014 at 11:18 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> This patch implements a generic CPU idle driver for ARM64 machines.

I fail to see anything arm64 specific here. The idle states binding is
for both arm32 and arm64, right? If not, please make it for both.
Otherwise, I'm okay with the binding for the most part. I need to take
another pass at it though.

Rob

> It relies on the DT idle states infrastructure to initialize idle
> states count and respective parameters. Current code assumes the driver
> is managing idle states on all possible CPUs but can be easily
> generalized to support heterogenous systems and build cpumasks at
> runtime using MIDRs or DT cpu nodes compatible properties.
>
> Suspend back-ends (eg PSCI) must register a suspend initializer with
> the CPU idle driver so that the suspend backend call can be detected,
> and the driver code can call the back-end infrastructure to complete the
> suspend backend initialization.
>
> Idle state index 0 is always initialized as a simple wfi state, ie always
> considered present and functional on all ARM64 platforms.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  drivers/cpuidle/Kconfig         |   5 ++
>  drivers/cpuidle/Kconfig.arm64   |  13 ++++
>  drivers/cpuidle/Makefile        |   4 +
>  drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 190 insertions(+)
>  create mode 100644 drivers/cpuidle/Kconfig.arm64
>  create mode 100644 drivers/cpuidle/cpuidle-arm64.c
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 760ce20..360c086 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -44,6 +44,11 @@ depends on ARM
>  source "drivers/cpuidle/Kconfig.arm"
>  endmenu
>
> +menu "ARM64 CPU Idle Drivers"
> +depends on ARM64
> +source "drivers/cpuidle/Kconfig.arm64"
> +endmenu
> +
>  menu "MIPS CPU Idle Drivers"
>  depends on MIPS
>  source "drivers/cpuidle/Kconfig.mips"
> diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
> new file mode 100644
> index 0000000..b83612c
> --- /dev/null
> +++ b/drivers/cpuidle/Kconfig.arm64
> @@ -0,0 +1,13 @@
> +#
> +# ARM64 CPU Idle drivers
> +#
> +
> +config ARM64_CPUIDLE
> +       bool "Generic ARM64 CPU idle Driver"
> +       select OF_IDLE_STATES
> +       help
> +         Select this to enable generic cpuidle driver for ARM v8.
> +         It provides a generic idle driver whose idle states are configured
> +         at run-time through DT nodes. The CPUidle suspend backend is
> +         initialized by the device tree parsing code on matching the entry
> +         method to the respective CPU operations.
> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> index d5ebf4b..e496242 100644
> --- a/drivers/cpuidle/Makefile
> +++ b/drivers/cpuidle/Makefile
> @@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
>  obj-$(CONFIG_MIPS_CPS_CPUIDLE)         += cpuidle-cps.o
>
>  ###############################################################################
> +# ARM64 drivers
> +obj-$(CONFIG_ARM64_CPUIDLE)            += cpuidle-arm64.o
> +
> +###############################################################################
>  # POWERPC drivers
>  obj-$(CONFIG_PSERIES_CPUIDLE)          += cpuidle-pseries.o
>  obj-$(CONFIG_POWERNV_CPUIDLE)          += cpuidle-powernv.o
> diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
> new file mode 100644
> index 0000000..4c932f8
> --- /dev/null
> +++ b/drivers/cpuidle/cpuidle-arm64.c
> @@ -0,0 +1,168 @@
> +/*
> + * ARM64 generic CPU idle driver.
> + *
> + * Copyright (C) 2014 ARM Ltd.
> + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#define pr_fmt(fmt) "CPUidle arm64: " fmt
> +
> +#include <linux/cpuidle.h>
> +#include <linux/cpumask.h>
> +#include <linux/cpu_pm.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +
> +#include <asm/psci.h>
> +#include <asm/suspend.h>
> +
> +#include "of_idle_states.h"
> +
> +typedef int (*suspend_init_fn)(struct cpuidle_driver *,
> +                              struct device_node *[]);
> +
> +struct cpu_suspend_ops {
> +       const char *id;
> +       suspend_init_fn init_fn;
> +};
> +
> +static const struct cpu_suspend_ops suspend_operations[] __initconst = {
> +       {"arm,psci", psci_dt_register_idle_states},
> +       {}
> +};
> +
> +static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
> +{
> +       int i;
> +
> +       if (!str)
> +               return NULL;
> +
> +       for (i = 0; suspend_operations[i].id; i++)
> +               if (!strcmp(suspend_operations[i].id, str))
> +                       return &suspend_operations[i];
> +
> +       return NULL;
> +}
> +
> +/*
> + * arm_enter_idle_state - Programs CPU to enter the specified state
> + *
> + * dev: cpuidle device
> + * drv: cpuidle driver
> + * idx: state index
> + *
> + * Called from the CPUidle framework to program the device to the
> + * specified target state selected by the governor.
> + */
> +static int arm_enter_idle_state(struct cpuidle_device *dev,
> +                               struct cpuidle_driver *drv, int idx)
> +{
> +       int ret;
> +
> +       if (!idx) {
> +               cpu_do_idle();
> +               return idx;
> +       }
> +
> +       cpu_pm_enter();
> +       /*
> +        * Pass idle state index to cpu_suspend which in turn will call
> +        * the CPU ops suspend protocol with idle index as a parameter.
> +        *
> +        * Some states would not require context to be saved and flushed
> +        * to DRAM, so calling cpu_suspend would not be stricly necessary.
> +        * When power domains specifications for ARM CPUs are finalized then
> +        * this code can be optimized to prevent saving registers if not
> +        * needed.
> +        */
> +       ret = cpu_suspend(idx);
> +
> +       cpu_pm_exit();
> +
> +       return ret ? -1 : idx;
> +}
> +
> +struct cpuidle_driver arm64_idle_driver = {
> +       .name = "arm64_idle",
> +       .owner = THIS_MODULE,
> +};
> +
> +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> +
> +/*
> + * arm64_idle_init
> + *
> + * Registers the arm64 specific cpuidle driver with the cpuidle
> + * framework. It relies on core code to parse the idle states
> + * and initialize them using driver data structures accordingly.
> + */
> +static int __init arm64_idle_init(void)
> +{
> +       int i, ret;
> +       const char *entry_method;
> +       struct device_node *idle_states_node;
> +       const struct cpu_suspend_ops *suspend_init;
> +       struct cpuidle_driver *drv = &arm64_idle_driver;
> +
> +       idle_states_node = of_find_node_by_path("/cpus/idle-states");
> +       if (!idle_states_node)
> +               return -ENOENT;
> +
> +       if (of_property_read_string(idle_states_node, "entry-method",
> +                                   &entry_method)) {
> +               pr_warn(" * %s missing entry-method property\n",
> +                           idle_states_node->full_name);
> +               of_node_put(idle_states_node);
> +               ret = -EOPNOTSUPP;
> +               goto put_node;
> +       }
> +
> +       suspend_init = get_suspend_ops(entry_method);
> +       if (!suspend_init) {
> +               pr_warn("Missing suspend initializer\n");
> +               ret = -EOPNOTSUPP;
> +               goto put_node;
> +       }
> +
> +       /*
> +        * State at index 0 is standby wfi and considered standard
> +        * on all ARM platforms. If in some platforms simple wfi
> +        * can't be used as "state 0", DT bindings must be implemented
> +        * to work around this issue and allow installing a special
> +        * handler for idle state index 0.
> +        */
> +       drv->states[0].exit_latency = 1;
> +       drv->states[0].target_residency = 1;
> +       drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> +       strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> +       strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
> +
> +       drv->cpumask = (struct cpumask *) cpu_possible_mask;
> +       /*
> +        * Start at index 1, request idle state nodes to be filled
> +        */
> +       ret = of_init_idle_driver(drv, state_nodes, 1, true);
> +       if (ret)
> +               goto put_node;
> +
> +       if (suspend_init->init_fn(drv, state_nodes)) {
> +               ret = -EOPNOTSUPP;
> +               goto put_node;
> +       }
> +
> +       for (i = 0; i < drv->state_count; i++)
> +               drv->states[i].enter = arm_enter_idle_state;
> +
> +       ret = cpuidle_register(drv, NULL);
> +
> +put_node:
> +       of_node_put(idle_states_node);
> +       return ret;
> +}
> +device_initcall(arm64_idle_init);
> --
> 1.8.4
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
@ 2014-06-19  3:02       ` Rob Herring
  0 siblings, 0 replies; 74+ messages in thread
From: Rob Herring @ 2014-06-19  3:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 11, 2014 at 11:18 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> This patch implements a generic CPU idle driver for ARM64 machines.

I fail to see anything arm64 specific here. The idle states binding is
for both arm32 and arm64, right? If not, please make it for both.
Otherwise, I'm okay with the binding for the most part. I need to take
another pass at it though.

Rob

> It relies on the DT idle states infrastructure to initialize idle
> states count and respective parameters. Current code assumes the driver
> is managing idle states on all possible CPUs but can be easily
> generalized to support heterogenous systems and build cpumasks at
> runtime using MIDRs or DT cpu nodes compatible properties.
>
> Suspend back-ends (eg PSCI) must register a suspend initializer with
> the CPU idle driver so that the suspend backend call can be detected,
> and the driver code can call the back-end infrastructure to complete the
> suspend backend initialization.
>
> Idle state index 0 is always initialized as a simple wfi state, ie always
> considered present and functional on all ARM64 platforms.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  drivers/cpuidle/Kconfig         |   5 ++
>  drivers/cpuidle/Kconfig.arm64   |  13 ++++
>  drivers/cpuidle/Makefile        |   4 +
>  drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 190 insertions(+)
>  create mode 100644 drivers/cpuidle/Kconfig.arm64
>  create mode 100644 drivers/cpuidle/cpuidle-arm64.c
>
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 760ce20..360c086 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -44,6 +44,11 @@ depends on ARM
>  source "drivers/cpuidle/Kconfig.arm"
>  endmenu
>
> +menu "ARM64 CPU Idle Drivers"
> +depends on ARM64
> +source "drivers/cpuidle/Kconfig.arm64"
> +endmenu
> +
>  menu "MIPS CPU Idle Drivers"
>  depends on MIPS
>  source "drivers/cpuidle/Kconfig.mips"
> diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
> new file mode 100644
> index 0000000..b83612c
> --- /dev/null
> +++ b/drivers/cpuidle/Kconfig.arm64
> @@ -0,0 +1,13 @@
> +#
> +# ARM64 CPU Idle drivers
> +#
> +
> +config ARM64_CPUIDLE
> +       bool "Generic ARM64 CPU idle Driver"
> +       select OF_IDLE_STATES
> +       help
> +         Select this to enable generic cpuidle driver for ARM v8.
> +         It provides a generic idle driver whose idle states are configured
> +         at run-time through DT nodes. The CPUidle suspend backend is
> +         initialized by the device tree parsing code on matching the entry
> +         method to the respective CPU operations.
> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> index d5ebf4b..e496242 100644
> --- a/drivers/cpuidle/Makefile
> +++ b/drivers/cpuidle/Makefile
> @@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
>  obj-$(CONFIG_MIPS_CPS_CPUIDLE)         += cpuidle-cps.o
>
>  ###############################################################################
> +# ARM64 drivers
> +obj-$(CONFIG_ARM64_CPUIDLE)            += cpuidle-arm64.o
> +
> +###############################################################################
>  # POWERPC drivers
>  obj-$(CONFIG_PSERIES_CPUIDLE)          += cpuidle-pseries.o
>  obj-$(CONFIG_POWERNV_CPUIDLE)          += cpuidle-powernv.o
> diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
> new file mode 100644
> index 0000000..4c932f8
> --- /dev/null
> +++ b/drivers/cpuidle/cpuidle-arm64.c
> @@ -0,0 +1,168 @@
> +/*
> + * ARM64 generic CPU idle driver.
> + *
> + * Copyright (C) 2014 ARM Ltd.
> + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#define pr_fmt(fmt) "CPUidle arm64: " fmt
> +
> +#include <linux/cpuidle.h>
> +#include <linux/cpumask.h>
> +#include <linux/cpu_pm.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +
> +#include <asm/psci.h>
> +#include <asm/suspend.h>
> +
> +#include "of_idle_states.h"
> +
> +typedef int (*suspend_init_fn)(struct cpuidle_driver *,
> +                              struct device_node *[]);
> +
> +struct cpu_suspend_ops {
> +       const char *id;
> +       suspend_init_fn init_fn;
> +};
> +
> +static const struct cpu_suspend_ops suspend_operations[] __initconst = {
> +       {"arm,psci", psci_dt_register_idle_states},
> +       {}
> +};
> +
> +static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
> +{
> +       int i;
> +
> +       if (!str)
> +               return NULL;
> +
> +       for (i = 0; suspend_operations[i].id; i++)
> +               if (!strcmp(suspend_operations[i].id, str))
> +                       return &suspend_operations[i];
> +
> +       return NULL;
> +}
> +
> +/*
> + * arm_enter_idle_state - Programs CPU to enter the specified state
> + *
> + * dev: cpuidle device
> + * drv: cpuidle driver
> + * idx: state index
> + *
> + * Called from the CPUidle framework to program the device to the
> + * specified target state selected by the governor.
> + */
> +static int arm_enter_idle_state(struct cpuidle_device *dev,
> +                               struct cpuidle_driver *drv, int idx)
> +{
> +       int ret;
> +
> +       if (!idx) {
> +               cpu_do_idle();
> +               return idx;
> +       }
> +
> +       cpu_pm_enter();
> +       /*
> +        * Pass idle state index to cpu_suspend which in turn will call
> +        * the CPU ops suspend protocol with idle index as a parameter.
> +        *
> +        * Some states would not require context to be saved and flushed
> +        * to DRAM, so calling cpu_suspend would not be stricly necessary.
> +        * When power domains specifications for ARM CPUs are finalized then
> +        * this code can be optimized to prevent saving registers if not
> +        * needed.
> +        */
> +       ret = cpu_suspend(idx);
> +
> +       cpu_pm_exit();
> +
> +       return ret ? -1 : idx;
> +}
> +
> +struct cpuidle_driver arm64_idle_driver = {
> +       .name = "arm64_idle",
> +       .owner = THIS_MODULE,
> +};
> +
> +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> +
> +/*
> + * arm64_idle_init
> + *
> + * Registers the arm64 specific cpuidle driver with the cpuidle
> + * framework. It relies on core code to parse the idle states
> + * and initialize them using driver data structures accordingly.
> + */
> +static int __init arm64_idle_init(void)
> +{
> +       int i, ret;
> +       const char *entry_method;
> +       struct device_node *idle_states_node;
> +       const struct cpu_suspend_ops *suspend_init;
> +       struct cpuidle_driver *drv = &arm64_idle_driver;
> +
> +       idle_states_node = of_find_node_by_path("/cpus/idle-states");
> +       if (!idle_states_node)
> +               return -ENOENT;
> +
> +       if (of_property_read_string(idle_states_node, "entry-method",
> +                                   &entry_method)) {
> +               pr_warn(" * %s missing entry-method property\n",
> +                           idle_states_node->full_name);
> +               of_node_put(idle_states_node);
> +               ret = -EOPNOTSUPP;
> +               goto put_node;
> +       }
> +
> +       suspend_init = get_suspend_ops(entry_method);
> +       if (!suspend_init) {
> +               pr_warn("Missing suspend initializer\n");
> +               ret = -EOPNOTSUPP;
> +               goto put_node;
> +       }
> +
> +       /*
> +        * State at index 0 is standby wfi and considered standard
> +        * on all ARM platforms. If in some platforms simple wfi
> +        * can't be used as "state 0", DT bindings must be implemented
> +        * to work around this issue and allow installing a special
> +        * handler for idle state index 0.
> +        */
> +       drv->states[0].exit_latency = 1;
> +       drv->states[0].target_residency = 1;
> +       drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> +       strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> +       strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
> +
> +       drv->cpumask = (struct cpumask *) cpu_possible_mask;
> +       /*
> +        * Start at index 1, request idle state nodes to be filled
> +        */
> +       ret = of_init_idle_driver(drv, state_nodes, 1, true);
> +       if (ret)
> +               goto put_node;
> +
> +       if (suspend_init->init_fn(drv, state_nodes)) {
> +               ret = -EOPNOTSUPP;
> +               goto put_node;
> +       }
> +
> +       for (i = 0; i < drv->state_count; i++)
> +               drv->states[i].enter = arm_enter_idle_state;
> +
> +       ret = cpuidle_register(drv, NULL);
> +
> +put_node:
> +       of_node_put(idle_states_node);
> +       return ret;
> +}
> +device_initcall(arm64_idle_init);
> --
> 1.8.4
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-18 19:27             ` Santosh Shilimkar
@ 2014-06-19  7:33               ` Charles Garcia-Tobin
  -1 siblings, 0 replies; 74+ messages in thread
From: Charles Garcia-Tobin @ 2014-06-19  7:33 UTC (permalink / raw)
  To: 'Santosh Shilimkar', Lorenzo Pieralisi, Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown, Paul Walmsley,
	Chander Kashyap



> -----Original Message-----
> From: Santosh Shilimkar [mailto:santosh.shilimkar@ti.com]
> Sent: 18 June 2014 20:27
> To: Lorenzo Pieralisi; Nicolas Pitre
> Cc: linux-arm-kernel@lists.infradead.org; linux-pm@vger.kernel.org;
> devicetree@vger.kernel.org; Mark Rutland; Sudeep Holla; Catalin
> Marinas; Charles Garcia-Tobin; Rob Herring; grant.likely@linaro.org;
> Peter De Schrijver; Daniel Lezcano; Amit Kucheria; Vincent Guittot;
> Antti Miettinen; Stephen Boyd; Kevin Hilman; Sebastian Capella; Tomasz
> Figa; Mark Brown; Paul Walmsley; Chander Kashyap
> Subject: Re: [PATCH v4 1/6] Documentation: arm: define DT idle states
> bindings
> 
> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> > On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> 
> [..]
> > Ok, a minor tweak to the diagram above, min-residency should include
> > energy costs related to idle entry and exit, but not the exit-latency
> > itself, as long as the energy costs implied by exiting the state are
> > factored out in the min-residency-us property.
> >
> > Hence, to sum it up, I attached below the updated bindings patch:
> >
> > I think we are close to an agreement, if anyone disagrees please
> shout
> > as soon as possible so that we can still integrate changes.
> >
> 
> [..]
> 
> >
> > -- >8 --
> > Subject: [PATCH] Documentation: arm: define DT idle states bindings
> >
> > ARM based platforms implement a variety of power management schemes
> that
> > allow processors to enter idle states at run-time.
> > The parameters defining these idle states vary on a per-platform
> basis forcing
> > the OS to hardcode the state parameters in platform specific static
> tables
> > whose size grows as the number of platforms supported in the kernel
> increases
> > and hampers device drivers standardization.
> >
> > Therefore, this patch aims at standardizing idle state device tree
> bindings for
> > ARM platforms. Bindings define idle state parameters inclusive of
> entry methods
> > and state latencies, to allow operating systems to retrieve the
> configuration
> > entries from the device tree and initialize the related power
> management
> > drivers, paving the way for common code in the kernel to deal with
> idle
> > states and removing the need for static data in current and previous
> kernel
> > versions.
> >
> > Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> Nice work Lorenzo !!
> I have few comments/questions.
> 
> >  Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
> >  .../devicetree/bindings/arm/idle-states.txt        | 561
> +++++++++++++++++++++
> >  2 files changed, 569 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/arm/idle-
> states.txt
> >
> > diff --git a/Documentation/devicetree/bindings/arm/cpus.txt
> b/Documentation/devicetree/bindings/arm/cpus.txt
> > index 1fe72a0..a44d4fd 100644
> > --- a/Documentation/devicetree/bindings/arm/cpus.txt
> > +++ b/Documentation/devicetree/bindings/arm/cpus.txt
> > @@ -215,6 +215,12 @@ nodes to be present and contain the properties
> described below.
> >  		Value type: <phandle>
> >  		Definition: Specifies the ACC[2] node associated with this
> CPU.
> >
> > +	- cpu-idle-states
> > +		Usage: Optional
> > +		Value type: <prop-encoded-array>
> > +		Definition:
> > +			# List of phandles to idle state nodes supported
> > +			  by this cpu [3].
> >
> >  Example 1 (dual-cluster big.LITTLE system 32-bit):
> >
> > @@ -411,3 +417,5 @@ cpus {
> >  --
> >  [1] arm/msm/qcom,saw2.txt
> >  [2] arm/msm/qcom,kpss-acc.txt
> > +[3] ARM Linux kernel documentation - idle states bindings
> > +    Documentation/devicetree/bindings/arm/idle-states.txt
> > diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt
> b/Documentation/devicetree/bindings/arm/idle-states.txt
> > new file mode 100644
> > index 0000000..c9e1ec6
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/arm/idle-states.txt
> > @@ -0,0 +1,561 @@
> > +==========================================
> > +ARM idle states binding description
> > +==========================================
> > +
> > +==========================================
> > +1 - Introduction
> > +==========================================
> > +
> > +ARM systems contain HW capable of managing power consumption
> dynamically,
> > +where cores can be put in different low-power states (ranging from
> simple
> > +wfi to power gating) according to OSPM policies. The CPU states
> representing
> s/OSPM/OS PM ?
> > +the range of dynamic idle states that a processor can enter at run-
> time, can be
> > +specified through device tree bindings representing the parameters
> required
> > +to enter/exit specific idle states on a given processor.
> > +
> > +According to the Server Base System Architecture document (SBSA,
> [3]), the
> > +power states an ARM CPU can be put into are identified by the
> following list:
> > +
> > +- Running
> > +- Idle_standby
> > +- Idle_retention
> > +- Sleep
> > +- Off
> > +
> > +The power states described in the SBSA document define the basic CPU
> states on
> > +top of which ARM platforms implement power management schemes that
> allow an OS
> > +PM implementation to put the processor in different idle states
> (which include
> > +states listed above; "off" state is not an idle state since it does
> not have
> > +wake-up capabilities, hence it is not considered in this document).
> > +
> > +Idle state parameters (eg entry latency) are platform specific and
> need to be
> > +characterized with bindings that provide the required information to
> OSPM
> Ditto
> > +code so that it can build the required tables and use them at
> runtime.
> > +
> > +The device tree binding definition for ARM idle states is the
> subject of this
> > +document.
> > +
> > +===========================================
> > +2 - idle-states node
> > +===========================================
> > +
> > +ARM processor idle states are defined within the idle-states node,
> which is
> > +a direct child of the cpus node [1] and provides a container where
> the
> > +processor idle states, defined as device tree nodes, are listed.
> > +
> > +- idle-states node
> > +
> > +	Usage: Optional - On ARM systems, is a container of processor idle
> s/is/it is ?
> > +			  states nodes. If the system does not provide CPU
> > +			  power management capabilities or the processor
just
> > +			  supports idle_standby an idle-states node is not
> > +			  required.
> > +
> > +	Description: idle-states node is a container node, where its
> > +		     subnodes describe the CPU idle states.
> > +
> > +	Node name must be "idle-states".
> > +
> > +	The idle-states node's parent node must be the cpus node.
> > +
> > +	The idle-states node's child nodes can be:
> s/idle-states/idle-state
> > +
> > +	- one or more state nodes
> > +
> > +	Any other configuration is considered invalid.
> > +
> > +	An idle-states node defines the following properties:
> > +
> > +	- entry-method
> > +		Usage: Required
> > +		Value type: <stringlist>
> > +		Definition: Describes the method by which a CPU enters the
> > +			    idle states. This property is required and must
be
> > +			    one of:
> > +
> > +			    - "arm,psci"
> > +			      ARM PSCI firmware interface [2].
> > +
> > +			    - "[vendor],[method]"
> > +			      An implementation dependent string with
> > +			      format "vendor,method", where vendor is a
string
> > +			      denoting the name of the manufacturer and
> > +			      method is a string specifying the mechanism
> > +			      used to enter the idle state.
> > +
> > +The nodes describing the idle states (state) can only be defined
> within the
> > +idle-states node, any other configuration is considered invalid and
> therefore
> > +must be ignored.
> > +
> > +===========================================
> > +3 - state node
> > +===========================================
> > +
> > +A state node represents an idle state description and must be
> defined as
> > +follows:
> > +
> > +- state node
> > +
> > +	Description: must be child of the idle-states node
> > +
> > +	The state node name shall follow standard device tree naming
> > +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> > +	are siblings within a single common parent must be given a unique
> name.
> > +
> > +	The idle state entered by executing the wfi instruction
> (idle_standby
> > +	SBSA,[3][4]) is considered standard on all ARM platforms and
> therefore
> > +	must not be listed.
> > +
> > +	To correctly specify idle states timing and energy related
> properties,
> > +	the following definitions identify the different execution phases
> > +	a CPU goes through to enter and exit idle states and the implied
> > +	energy metrics:
> > +
> > +
> 	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]
> __..
> > +		    |          |           |          |          |
> > +
> > +		    |<------ entry ------->|
> > +		    |       latency        |
> > +						      |<- exit ->|
> > +						      |  latency |
> > +		    |<-------- min-residency -------->|
> > +			       |<-------  wakeup-latency ------->|
> > +
> I don't know the wakeup latency makes much sense and also correct.
> Hardware wakeup latency is actually exit latency. Is it for failed
> or abort-able ilde case ? We are adding this as a new parameter
> at least from idle states perspective. I think we should just
> avoid it.
> 

Hi Santosh, 

To me wake up latency makes up a lot of sense. It is not always the same as
exit latency, it will depend on your system, and just how smart it is. In
some cases the [ENTRY] period may not be negligible in which case exit
latency will be less than the wake up latency. 
In addition, it will generally always be shorter than entry+exit which is
the default value if omitted, this assumes the PREP time is not abortable,
but this is the safer assumption to make.
Wake up latency is really the number that folk have in their head for what
you'd stick into the pm_qos to veto entry into states when you are latency
constrained. 
The one thing that really is an optimisation here is having a separate exit
latency, which is being proposed for use in core selection for the
scheduler.
So if anything was going to be made optional pending new scheduler patches
should that not be entry/exit latency? 
 
Cheers

Charles



> > +	EXEC:	Normal CPU execution.
> > +
> > +	PREP:	Preparation phase before committing the hardware to idle
mode
> > +		like cache flushing. This is abortable on pending wake-up
> > +		event conditions. The abort latency is assumed to be
> negligible
> > +		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
> > +		goes back to EXEC. This phase is optional. If not abortable,
> > +		this should be included in the ENTRY phase instead.
> > +
> > +	ENTRY:	The hardware is committed to idle mode. This period must
> run
> > +		to completion up to IDLE before anything else can happen.
> > +
> > +	IDLE:	This is the actual energy-saving idle period. This may last
> > +		between 0 and infinite time, until a wake-up event occurs.
> > +
> > +	EXIT:	Period during which the CPU is brought back to operational
> > +		mode (EXEC).
> > +
> > +	With the definitions provided above, the following list represents
> > +	the valid properties for a state node:
> > +
> > +	- compatible
> > +		Usage: Required
> > +		Value type: <stringlist>
> > +		Definition: Must be "arm,idle-state".
> > +
> > +	- logic-state-retained
> > +		Usage: See definition
> > +		Value type: <none>
> > +		Definition: if present logic is retained on state entry,
> > +			    otherwise it is lost.
> > +
> > +	- cache-state-retained
> > +		Usage: See definition
> > +		Value type: <none>
> > +		Definition: if present cache memory is retained on state
> entry,
> > +			    otherwise it is lost.
> > +
> > +	- entry-method-param
> > +		Usage: See definition.
> > +		Value type: <u32>
> > +		Definition: Depends on the idle-states node entry-method
> > +			    property value. Refer to the entry-method
bindings
> > +			    for this property value definition.
> > +
> > +	- entry-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency in
> > +			    microseconds required to enter the idle state.
> > +			    The exit-latency-us duration may be guaranteed
> > +			    only after entry-latency-us has passed.
> > +
> > +	- exit-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to exit the idle state.
> > +
> > +	- min-residency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing minimum residency
duration
> > +			    in microseconds, inclusive of preparation and
> > +			    entry, for this idle state to be considered
> > +			    worthwhile energy wise.
> > +			    The residency time must take into account the
> > +			    energy consumed while entering and exiting the
> > +			    idle state and is therefore expected to be
> > +			    longer than entry-latency-us.
> > +
> > +	- wakeup-latency-us:
> > +		Usage: Optional
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing maximum delay between the
> > +			    signaling of a wake-up event and the CPU being
> > +			    able to execute normal code again. If omitted,
> > +			    this is assumed to be equal to:
> > +				entry-latency-us + exit-latency-us
> > +
> Rest of the patch looks fine by to me.
> 
> regards,
> Santosh
> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-19  7:33               ` Charles Garcia-Tobin
  0 siblings, 0 replies; 74+ messages in thread
From: Charles Garcia-Tobin @ 2014-06-19  7:33 UTC (permalink / raw)
  To: linux-arm-kernel



> -----Original Message-----
> From: Santosh Shilimkar [mailto:santosh.shilimkar at ti.com]
> Sent: 18 June 2014 20:27
> To: Lorenzo Pieralisi; Nicolas Pitre
> Cc: linux-arm-kernel at lists.infradead.org; linux-pm at vger.kernel.org;
> devicetree at vger.kernel.org; Mark Rutland; Sudeep Holla; Catalin
> Marinas; Charles Garcia-Tobin; Rob Herring; grant.likely at linaro.org;
> Peter De Schrijver; Daniel Lezcano; Amit Kucheria; Vincent Guittot;
> Antti Miettinen; Stephen Boyd; Kevin Hilman; Sebastian Capella; Tomasz
> Figa; Mark Brown; Paul Walmsley; Chander Kashyap
> Subject: Re: [PATCH v4 1/6] Documentation: arm: define DT idle states
> bindings
> 
> On Wednesday 18 June 2014 01:36 PM, Lorenzo Pieralisi wrote:
> > On Fri, Jun 13, 2014 at 06:33:35PM +0100, Nicolas Pitre wrote:
> 
> [..]
> > Ok, a minor tweak to the diagram above, min-residency should include
> > energy costs related to idle entry and exit, but not the exit-latency
> > itself, as long as the energy costs implied by exiting the state are
> > factored out in the min-residency-us property.
> >
> > Hence, to sum it up, I attached below the updated bindings patch:
> >
> > I think we are close to an agreement, if anyone disagrees please
> shout
> > as soon as possible so that we can still integrate changes.
> >
> 
> [..]
> 
> >
> > -- >8 --
> > Subject: [PATCH] Documentation: arm: define DT idle states bindings
> >
> > ARM based platforms implement a variety of power management schemes
> that
> > allow processors to enter idle states at run-time.
> > The parameters defining these idle states vary on a per-platform
> basis forcing
> > the OS to hardcode the state parameters in platform specific static
> tables
> > whose size grows as the number of platforms supported in the kernel
> increases
> > and hampers device drivers standardization.
> >
> > Therefore, this patch aims at standardizing idle state device tree
> bindings for
> > ARM platforms. Bindings define idle state parameters inclusive of
> entry methods
> > and state latencies, to allow operating systems to retrieve the
> configuration
> > entries from the device tree and initialize the related power
> management
> > drivers, paving the way for common code in the kernel to deal with
> idle
> > states and removing the need for static data in current and previous
> kernel
> > versions.
> >
> > Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> Nice work Lorenzo !!
> I have few comments/questions.
> 
> >  Documentation/devicetree/bindings/arm/cpus.txt     |   8 +
> >  .../devicetree/bindings/arm/idle-states.txt        | 561
> +++++++++++++++++++++
> >  2 files changed, 569 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/arm/idle-
> states.txt
> >
> > diff --git a/Documentation/devicetree/bindings/arm/cpus.txt
> b/Documentation/devicetree/bindings/arm/cpus.txt
> > index 1fe72a0..a44d4fd 100644
> > --- a/Documentation/devicetree/bindings/arm/cpus.txt
> > +++ b/Documentation/devicetree/bindings/arm/cpus.txt
> > @@ -215,6 +215,12 @@ nodes to be present and contain the properties
> described below.
> >  		Value type: <phandle>
> >  		Definition: Specifies the ACC[2] node associated with this
> CPU.
> >
> > +	- cpu-idle-states
> > +		Usage: Optional
> > +		Value type: <prop-encoded-array>
> > +		Definition:
> > +			# List of phandles to idle state nodes supported
> > +			  by this cpu [3].
> >
> >  Example 1 (dual-cluster big.LITTLE system 32-bit):
> >
> > @@ -411,3 +417,5 @@ cpus {
> >  --
> >  [1] arm/msm/qcom,saw2.txt
> >  [2] arm/msm/qcom,kpss-acc.txt
> > +[3] ARM Linux kernel documentation - idle states bindings
> > +    Documentation/devicetree/bindings/arm/idle-states.txt
> > diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt
> b/Documentation/devicetree/bindings/arm/idle-states.txt
> > new file mode 100644
> > index 0000000..c9e1ec6
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/arm/idle-states.txt
> > @@ -0,0 +1,561 @@
> > +==========================================
> > +ARM idle states binding description
> > +==========================================
> > +
> > +==========================================
> > +1 - Introduction
> > +==========================================
> > +
> > +ARM systems contain HW capable of managing power consumption
> dynamically,
> > +where cores can be put in different low-power states (ranging from
> simple
> > +wfi to power gating) according to OSPM policies. The CPU states
> representing
> s/OSPM/OS PM ?
> > +the range of dynamic idle states that a processor can enter at run-
> time, can be
> > +specified through device tree bindings representing the parameters
> required
> > +to enter/exit specific idle states on a given processor.
> > +
> > +According to the Server Base System Architecture document (SBSA,
> [3]), the
> > +power states an ARM CPU can be put into are identified by the
> following list:
> > +
> > +- Running
> > +- Idle_standby
> > +- Idle_retention
> > +- Sleep
> > +- Off
> > +
> > +The power states described in the SBSA document define the basic CPU
> states on
> > +top of which ARM platforms implement power management schemes that
> allow an OS
> > +PM implementation to put the processor in different idle states
> (which include
> > +states listed above; "off" state is not an idle state since it does
> not have
> > +wake-up capabilities, hence it is not considered in this document).
> > +
> > +Idle state parameters (eg entry latency) are platform specific and
> need to be
> > +characterized with bindings that provide the required information to
> OSPM
> Ditto
> > +code so that it can build the required tables and use them at
> runtime.
> > +
> > +The device tree binding definition for ARM idle states is the
> subject of this
> > +document.
> > +
> > +===========================================
> > +2 - idle-states node
> > +===========================================
> > +
> > +ARM processor idle states are defined within the idle-states node,
> which is
> > +a direct child of the cpus node [1] and provides a container where
> the
> > +processor idle states, defined as device tree nodes, are listed.
> > +
> > +- idle-states node
> > +
> > +	Usage: Optional - On ARM systems, is a container of processor idle
> s/is/it is ?
> > +			  states nodes. If the system does not provide CPU
> > +			  power management capabilities or the processor
just
> > +			  supports idle_standby an idle-states node is not
> > +			  required.
> > +
> > +	Description: idle-states node is a container node, where its
> > +		     subnodes describe the CPU idle states.
> > +
> > +	Node name must be "idle-states".
> > +
> > +	The idle-states node's parent node must be the cpus node.
> > +
> > +	The idle-states node's child nodes can be:
> s/idle-states/idle-state
> > +
> > +	- one or more state nodes
> > +
> > +	Any other configuration is considered invalid.
> > +
> > +	An idle-states node defines the following properties:
> > +
> > +	- entry-method
> > +		Usage: Required
> > +		Value type: <stringlist>
> > +		Definition: Describes the method by which a CPU enters the
> > +			    idle states. This property is required and must
be
> > +			    one of:
> > +
> > +			    - "arm,psci"
> > +			      ARM PSCI firmware interface [2].
> > +
> > +			    - "[vendor],[method]"
> > +			      An implementation dependent string with
> > +			      format "vendor,method", where vendor is a
string
> > +			      denoting the name of the manufacturer and
> > +			      method is a string specifying the mechanism
> > +			      used to enter the idle state.
> > +
> > +The nodes describing the idle states (state) can only be defined
> within the
> > +idle-states node, any other configuration is considered invalid and
> therefore
> > +must be ignored.
> > +
> > +===========================================
> > +3 - state node
> > +===========================================
> > +
> > +A state node represents an idle state description and must be
> defined as
> > +follows:
> > +
> > +- state node
> > +
> > +	Description: must be child of the idle-states node
> > +
> > +	The state node name shall follow standard device tree naming
> > +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> > +	are siblings within a single common parent must be given a unique
> name.
> > +
> > +	The idle state entered by executing the wfi instruction
> (idle_standby
> > +	SBSA,[3][4]) is considered standard on all ARM platforms and
> therefore
> > +	must not be listed.
> > +
> > +	To correctly specify idle states timing and energy related
> properties,
> > +	the following definitions identify the different execution phases
> > +	a CPU goes through to enter and exit idle states and the implied
> > +	energy metrics:
> > +
> > +
> 	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]
> __..
> > +		    |          |           |          |          |
> > +
> > +		    |<------ entry ------->|
> > +		    |       latency        |
> > +						      |<- exit ->|
> > +						      |  latency |
> > +		    |<-------- min-residency -------->|
> > +			       |<-------  wakeup-latency ------->|
> > +
> I don't know the wakeup latency makes much sense and also correct.
> Hardware wakeup latency is actually exit latency. Is it for failed
> or abort-able ilde case ? We are adding this as a new parameter
> at least from idle states perspective. I think we should just
> avoid it.
> 

Hi Santosh, 

To me wake up latency makes up a lot of sense. It is not always the same as
exit latency, it will depend on your system, and just how smart it is. In
some cases the [ENTRY] period may not be negligible in which case exit
latency will be less than the wake up latency. 
In addition, it will generally always be shorter than entry+exit which is
the default value if omitted, this assumes the PREP time is not abortable,
but this is the safer assumption to make.
Wake up latency is really the number that folk have in their head for what
you'd stick into the pm_qos to veto entry into states when you are latency
constrained. 
The one thing that really is an optimisation here is having a separate exit
latency, which is being proposed for use in core selection for the
scheduler.
So if anything was going to be made optional pending new scheduler patches
should that not be entry/exit latency? 
 
Cheers

Charles



> > +	EXEC:	Normal CPU execution.
> > +
> > +	PREP:	Preparation phase before committing the hardware to idle
mode
> > +		like cache flushing. This is abortable on pending wake-up
> > +		event conditions. The abort latency is assumed to be
> negligible
> > +		(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
> > +		goes back to EXEC. This phase is optional. If not abortable,
> > +		this should be included in the ENTRY phase instead.
> > +
> > +	ENTRY:	The hardware is committed to idle mode. This period must
> run
> > +		to completion up to IDLE before anything else can happen.
> > +
> > +	IDLE:	This is the actual energy-saving idle period. This may last
> > +		between 0 and infinite time, until a wake-up event occurs.
> > +
> > +	EXIT:	Period during which the CPU is brought back to operational
> > +		mode (EXEC).
> > +
> > +	With the definitions provided above, the following list represents
> > +	the valid properties for a state node:
> > +
> > +	- compatible
> > +		Usage: Required
> > +		Value type: <stringlist>
> > +		Definition: Must be "arm,idle-state".
> > +
> > +	- logic-state-retained
> > +		Usage: See definition
> > +		Value type: <none>
> > +		Definition: if present logic is retained on state entry,
> > +			    otherwise it is lost.
> > +
> > +	- cache-state-retained
> > +		Usage: See definition
> > +		Value type: <none>
> > +		Definition: if present cache memory is retained on state
> entry,
> > +			    otherwise it is lost.
> > +
> > +	- entry-method-param
> > +		Usage: See definition.
> > +		Value type: <u32>
> > +		Definition: Depends on the idle-states node entry-method
> > +			    property value. Refer to the entry-method
bindings
> > +			    for this property value definition.
> > +
> > +	- entry-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency in
> > +			    microseconds required to enter the idle state.
> > +			    The exit-latency-us duration may be guaranteed
> > +			    only after entry-latency-us has passed.
> > +
> > +	- exit-latency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to exit the idle state.
> > +
> > +	- min-residency-us
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing minimum residency
duration
> > +			    in microseconds, inclusive of preparation and
> > +			    entry, for this idle state to be considered
> > +			    worthwhile energy wise.
> > +			    The residency time must take into account the
> > +			    energy consumed while entering and exiting the
> > +			    idle state and is therefore expected to be
> > +			    longer than entry-latency-us.
> > +
> > +	- wakeup-latency-us:
> > +		Usage: Optional
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing maximum delay between the
> > +			    signaling of a wake-up event and the CPU being
> > +			    able to execute normal code again. If omitted,
> > +			    this is assumed to be equal to:
> > +				entry-latency-us + exit-latency-us
> > +
> Rest of the patch looks fine by to me.
> 
> regards,
> Santosh
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
  2014-06-19  3:02       ` Rob Herring
@ 2014-06-19  9:08         ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-19  9:08 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, Catalin Marinas, Tomasz Figa, Chander Kashyap,
	Vincent Guittot, Nicolas Pitre, Daniel Lezcano, linux-arm-kernel,
	grant.likely, Charles Garcia-Tobin, devicetree, Kevin Hilman,
	linux-pm, Sebastian Capella, Mark Brown, Antti Miettinen,
	Paul Walmsley, Peter De Schrijver, Stephen Boyd, Amit Kucheria

On Thu, Jun 19, 2014 at 04:02:18AM +0100, Rob Herring wrote:
> On Wed, Jun 11, 2014 at 11:18 AM, Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com> wrote:
> > This patch implements a generic CPU idle driver for ARM64 machines.
> 
> I fail to see anything arm64 specific here. The idle states binding is
> for both arm32 and arm64, right? If not, please make it for both.
> Otherwise, I'm okay with the binding for the most part. I need to take
> another pass at it though.

cpu_suspend prototypes differ and that has implications, in particular
related to how to initialize the cpu_suspend back-end. I think that with some
effort we might have a single common driver for arm32/64, I have to check if
it is worth the additional ifdeffery that is likely to be needed.

Certainly the DT bindings are generic for arm32/arm64 and we are trying
to make code using them generic too, the best we can.

I will post a v5 soon, please let me know if that's ok to go when time
comes.

Thank you,
Lorenzo

> 
> Rob
> 
> > It relies on the DT idle states infrastructure to initialize idle
> > states count and respective parameters. Current code assumes the driver
> > is managing idle states on all possible CPUs but can be easily
> > generalized to support heterogenous systems and build cpumasks at
> > runtime using MIDRs or DT cpu nodes compatible properties.
> >
> > Suspend back-ends (eg PSCI) must register a suspend initializer with
> > the CPU idle driver so that the suspend backend call can be detected,
> > and the driver code can call the back-end infrastructure to complete the
> > suspend backend initialization.
> >
> > Idle state index 0 is always initialized as a simple wfi state, ie always
> > considered present and functional on all ARM64 platforms.
> >
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> >  drivers/cpuidle/Kconfig         |   5 ++
> >  drivers/cpuidle/Kconfig.arm64   |  13 ++++
> >  drivers/cpuidle/Makefile        |   4 +
> >  drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 190 insertions(+)
> >  create mode 100644 drivers/cpuidle/Kconfig.arm64
> >  create mode 100644 drivers/cpuidle/cpuidle-arm64.c
> >
> > diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> > index 760ce20..360c086 100644
> > --- a/drivers/cpuidle/Kconfig
> > +++ b/drivers/cpuidle/Kconfig
> > @@ -44,6 +44,11 @@ depends on ARM
> >  source "drivers/cpuidle/Kconfig.arm"
> >  endmenu
> >
> > +menu "ARM64 CPU Idle Drivers"
> > +depends on ARM64
> > +source "drivers/cpuidle/Kconfig.arm64"
> > +endmenu
> > +
> >  menu "MIPS CPU Idle Drivers"
> >  depends on MIPS
> >  source "drivers/cpuidle/Kconfig.mips"
> > diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
> > new file mode 100644
> > index 0000000..b83612c
> > --- /dev/null
> > +++ b/drivers/cpuidle/Kconfig.arm64
> > @@ -0,0 +1,13 @@
> > +#
> > +# ARM64 CPU Idle drivers
> > +#
> > +
> > +config ARM64_CPUIDLE
> > +       bool "Generic ARM64 CPU idle Driver"
> > +       select OF_IDLE_STATES
> > +       help
> > +         Select this to enable generic cpuidle driver for ARM v8.
> > +         It provides a generic idle driver whose idle states are configured
> > +         at run-time through DT nodes. The CPUidle suspend backend is
> > +         initialized by the device tree parsing code on matching the entry
> > +         method to the respective CPU operations.
> > diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> > index d5ebf4b..e496242 100644
> > --- a/drivers/cpuidle/Makefile
> > +++ b/drivers/cpuidle/Makefile
> > @@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
> >  obj-$(CONFIG_MIPS_CPS_CPUIDLE)         += cpuidle-cps.o
> >
> >  ###############################################################################
> > +# ARM64 drivers
> > +obj-$(CONFIG_ARM64_CPUIDLE)            += cpuidle-arm64.o
> > +
> > +###############################################################################
> >  # POWERPC drivers
> >  obj-$(CONFIG_PSERIES_CPUIDLE)          += cpuidle-pseries.o
> >  obj-$(CONFIG_POWERNV_CPUIDLE)          += cpuidle-powernv.o
> > diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
> > new file mode 100644
> > index 0000000..4c932f8
> > --- /dev/null
> > +++ b/drivers/cpuidle/cpuidle-arm64.c
> > @@ -0,0 +1,168 @@
> > +/*
> > + * ARM64 generic CPU idle driver.
> > + *
> > + * Copyright (C) 2014 ARM Ltd.
> > + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#define pr_fmt(fmt) "CPUidle arm64: " fmt
> > +
> > +#include <linux/cpuidle.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/cpu_pm.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +
> > +#include <asm/psci.h>
> > +#include <asm/suspend.h>
> > +
> > +#include "of_idle_states.h"
> > +
> > +typedef int (*suspend_init_fn)(struct cpuidle_driver *,
> > +                              struct device_node *[]);
> > +
> > +struct cpu_suspend_ops {
> > +       const char *id;
> > +       suspend_init_fn init_fn;
> > +};
> > +
> > +static const struct cpu_suspend_ops suspend_operations[] __initconst = {
> > +       {"arm,psci", psci_dt_register_idle_states},
> > +       {}
> > +};
> > +
> > +static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
> > +{
> > +       int i;
> > +
> > +       if (!str)
> > +               return NULL;
> > +
> > +       for (i = 0; suspend_operations[i].id; i++)
> > +               if (!strcmp(suspend_operations[i].id, str))
> > +                       return &suspend_operations[i];
> > +
> > +       return NULL;
> > +}
> > +
> > +/*
> > + * arm_enter_idle_state - Programs CPU to enter the specified state
> > + *
> > + * dev: cpuidle device
> > + * drv: cpuidle driver
> > + * idx: state index
> > + *
> > + * Called from the CPUidle framework to program the device to the
> > + * specified target state selected by the governor.
> > + */
> > +static int arm_enter_idle_state(struct cpuidle_device *dev,
> > +                               struct cpuidle_driver *drv, int idx)
> > +{
> > +       int ret;
> > +
> > +       if (!idx) {
> > +               cpu_do_idle();
> > +               return idx;
> > +       }
> > +
> > +       cpu_pm_enter();
> > +       /*
> > +        * Pass idle state index to cpu_suspend which in turn will call
> > +        * the CPU ops suspend protocol with idle index as a parameter.
> > +        *
> > +        * Some states would not require context to be saved and flushed
> > +        * to DRAM, so calling cpu_suspend would not be stricly necessary.
> > +        * When power domains specifications for ARM CPUs are finalized then
> > +        * this code can be optimized to prevent saving registers if not
> > +        * needed.
> > +        */
> > +       ret = cpu_suspend(idx);
> > +
> > +       cpu_pm_exit();
> > +
> > +       return ret ? -1 : idx;
> > +}
> > +
> > +struct cpuidle_driver arm64_idle_driver = {
> > +       .name = "arm64_idle",
> > +       .owner = THIS_MODULE,
> > +};
> > +
> > +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> > +
> > +/*
> > + * arm64_idle_init
> > + *
> > + * Registers the arm64 specific cpuidle driver with the cpuidle
> > + * framework. It relies on core code to parse the idle states
> > + * and initialize them using driver data structures accordingly.
> > + */
> > +static int __init arm64_idle_init(void)
> > +{
> > +       int i, ret;
> > +       const char *entry_method;
> > +       struct device_node *idle_states_node;
> > +       const struct cpu_suspend_ops *suspend_init;
> > +       struct cpuidle_driver *drv = &arm64_idle_driver;
> > +
> > +       idle_states_node = of_find_node_by_path("/cpus/idle-states");
> > +       if (!idle_states_node)
> > +               return -ENOENT;
> > +
> > +       if (of_property_read_string(idle_states_node, "entry-method",
> > +                                   &entry_method)) {
> > +               pr_warn(" * %s missing entry-method property\n",
> > +                           idle_states_node->full_name);
> > +               of_node_put(idle_states_node);
> > +               ret = -EOPNOTSUPP;
> > +               goto put_node;
> > +       }
> > +
> > +       suspend_init = get_suspend_ops(entry_method);
> > +       if (!suspend_init) {
> > +               pr_warn("Missing suspend initializer\n");
> > +               ret = -EOPNOTSUPP;
> > +               goto put_node;
> > +       }
> > +
> > +       /*
> > +        * State at index 0 is standby wfi and considered standard
> > +        * on all ARM platforms. If in some platforms simple wfi
> > +        * can't be used as "state 0", DT bindings must be implemented
> > +        * to work around this issue and allow installing a special
> > +        * handler for idle state index 0.
> > +        */
> > +       drv->states[0].exit_latency = 1;
> > +       drv->states[0].target_residency = 1;
> > +       drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> > +       strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> > +       strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
> > +
> > +       drv->cpumask = (struct cpumask *) cpu_possible_mask;
> > +       /*
> > +        * Start at index 1, request idle state nodes to be filled
> > +        */
> > +       ret = of_init_idle_driver(drv, state_nodes, 1, true);
> > +       if (ret)
> > +               goto put_node;
> > +
> > +       if (suspend_init->init_fn(drv, state_nodes)) {
> > +               ret = -EOPNOTSUPP;
> > +               goto put_node;
> > +       }
> > +
> > +       for (i = 0; i < drv->state_count; i++)
> > +               drv->states[i].enter = arm_enter_idle_state;
> > +
> > +       ret = cpuidle_register(drv, NULL);
> > +
> > +put_node:
> > +       of_node_put(idle_states_node);
> > +       return ret;
> > +}
> > +device_initcall(arm64_idle_init);
> > --
> > 1.8.4
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe devicetree" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
@ 2014-06-19  9:08         ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-19  9:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 19, 2014 at 04:02:18AM +0100, Rob Herring wrote:
> On Wed, Jun 11, 2014 at 11:18 AM, Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com> wrote:
> > This patch implements a generic CPU idle driver for ARM64 machines.
> 
> I fail to see anything arm64 specific here. The idle states binding is
> for both arm32 and arm64, right? If not, please make it for both.
> Otherwise, I'm okay with the binding for the most part. I need to take
> another pass at it though.

cpu_suspend prototypes differ and that has implications, in particular
related to how to initialize the cpu_suspend back-end. I think that with some
effort we might have a single common driver for arm32/64, I have to check if
it is worth the additional ifdeffery that is likely to be needed.

Certainly the DT bindings are generic for arm32/arm64 and we are trying
to make code using them generic too, the best we can.

I will post a v5 soon, please let me know if that's ok to go when time
comes.

Thank you,
Lorenzo

> 
> Rob
> 
> > It relies on the DT idle states infrastructure to initialize idle
> > states count and respective parameters. Current code assumes the driver
> > is managing idle states on all possible CPUs but can be easily
> > generalized to support heterogenous systems and build cpumasks at
> > runtime using MIDRs or DT cpu nodes compatible properties.
> >
> > Suspend back-ends (eg PSCI) must register a suspend initializer with
> > the CPU idle driver so that the suspend backend call can be detected,
> > and the driver code can call the back-end infrastructure to complete the
> > suspend backend initialization.
> >
> > Idle state index 0 is always initialized as a simple wfi state, ie always
> > considered present and functional on all ARM64 platforms.
> >
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> >  drivers/cpuidle/Kconfig         |   5 ++
> >  drivers/cpuidle/Kconfig.arm64   |  13 ++++
> >  drivers/cpuidle/Makefile        |   4 +
> >  drivers/cpuidle/cpuidle-arm64.c | 168 ++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 190 insertions(+)
> >  create mode 100644 drivers/cpuidle/Kconfig.arm64
> >  create mode 100644 drivers/cpuidle/cpuidle-arm64.c
> >
> > diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> > index 760ce20..360c086 100644
> > --- a/drivers/cpuidle/Kconfig
> > +++ b/drivers/cpuidle/Kconfig
> > @@ -44,6 +44,11 @@ depends on ARM
> >  source "drivers/cpuidle/Kconfig.arm"
> >  endmenu
> >
> > +menu "ARM64 CPU Idle Drivers"
> > +depends on ARM64
> > +source "drivers/cpuidle/Kconfig.arm64"
> > +endmenu
> > +
> >  menu "MIPS CPU Idle Drivers"
> >  depends on MIPS
> >  source "drivers/cpuidle/Kconfig.mips"
> > diff --git a/drivers/cpuidle/Kconfig.arm64 b/drivers/cpuidle/Kconfig.arm64
> > new file mode 100644
> > index 0000000..b83612c
> > --- /dev/null
> > +++ b/drivers/cpuidle/Kconfig.arm64
> > @@ -0,0 +1,13 @@
> > +#
> > +# ARM64 CPU Idle drivers
> > +#
> > +
> > +config ARM64_CPUIDLE
> > +       bool "Generic ARM64 CPU idle Driver"
> > +       select OF_IDLE_STATES
> > +       help
> > +         Select this to enable generic cpuidle driver for ARM v8.
> > +         It provides a generic idle driver whose idle states are configured
> > +         at run-time through DT nodes. The CPUidle suspend backend is
> > +         initialized by the device tree parsing code on matching the entry
> > +         method to the respective CPU operations.
> > diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> > index d5ebf4b..e496242 100644
> > --- a/drivers/cpuidle/Makefile
> > +++ b/drivers/cpuidle/Makefile
> > @@ -23,6 +23,10 @@ obj-$(CONFIG_ARM_EXYNOS_CPUIDLE)        += cpuidle-exynos.o
> >  obj-$(CONFIG_MIPS_CPS_CPUIDLE)         += cpuidle-cps.o
> >
> >  ###############################################################################
> > +# ARM64 drivers
> > +obj-$(CONFIG_ARM64_CPUIDLE)            += cpuidle-arm64.o
> > +
> > +###############################################################################
> >  # POWERPC drivers
> >  obj-$(CONFIG_PSERIES_CPUIDLE)          += cpuidle-pseries.o
> >  obj-$(CONFIG_POWERNV_CPUIDLE)          += cpuidle-powernv.o
> > diff --git a/drivers/cpuidle/cpuidle-arm64.c b/drivers/cpuidle/cpuidle-arm64.c
> > new file mode 100644
> > index 0000000..4c932f8
> > --- /dev/null
> > +++ b/drivers/cpuidle/cpuidle-arm64.c
> > @@ -0,0 +1,168 @@
> > +/*
> > + * ARM64 generic CPU idle driver.
> > + *
> > + * Copyright (C) 2014 ARM Ltd.
> > + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#define pr_fmt(fmt) "CPUidle arm64: " fmt
> > +
> > +#include <linux/cpuidle.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/cpu_pm.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +
> > +#include <asm/psci.h>
> > +#include <asm/suspend.h>
> > +
> > +#include "of_idle_states.h"
> > +
> > +typedef int (*suspend_init_fn)(struct cpuidle_driver *,
> > +                              struct device_node *[]);
> > +
> > +struct cpu_suspend_ops {
> > +       const char *id;
> > +       suspend_init_fn init_fn;
> > +};
> > +
> > +static const struct cpu_suspend_ops suspend_operations[] __initconst = {
> > +       {"arm,psci", psci_dt_register_idle_states},
> > +       {}
> > +};
> > +
> > +static __init const struct cpu_suspend_ops *get_suspend_ops(const char *str)
> > +{
> > +       int i;
> > +
> > +       if (!str)
> > +               return NULL;
> > +
> > +       for (i = 0; suspend_operations[i].id; i++)
> > +               if (!strcmp(suspend_operations[i].id, str))
> > +                       return &suspend_operations[i];
> > +
> > +       return NULL;
> > +}
> > +
> > +/*
> > + * arm_enter_idle_state - Programs CPU to enter the specified state
> > + *
> > + * dev: cpuidle device
> > + * drv: cpuidle driver
> > + * idx: state index
> > + *
> > + * Called from the CPUidle framework to program the device to the
> > + * specified target state selected by the governor.
> > + */
> > +static int arm_enter_idle_state(struct cpuidle_device *dev,
> > +                               struct cpuidle_driver *drv, int idx)
> > +{
> > +       int ret;
> > +
> > +       if (!idx) {
> > +               cpu_do_idle();
> > +               return idx;
> > +       }
> > +
> > +       cpu_pm_enter();
> > +       /*
> > +        * Pass idle state index to cpu_suspend which in turn will call
> > +        * the CPU ops suspend protocol with idle index as a parameter.
> > +        *
> > +        * Some states would not require context to be saved and flushed
> > +        * to DRAM, so calling cpu_suspend would not be stricly necessary.
> > +        * When power domains specifications for ARM CPUs are finalized then
> > +        * this code can be optimized to prevent saving registers if not
> > +        * needed.
> > +        */
> > +       ret = cpu_suspend(idx);
> > +
> > +       cpu_pm_exit();
> > +
> > +       return ret ? -1 : idx;
> > +}
> > +
> > +struct cpuidle_driver arm64_idle_driver = {
> > +       .name = "arm64_idle",
> > +       .owner = THIS_MODULE,
> > +};
> > +
> > +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> > +
> > +/*
> > + * arm64_idle_init
> > + *
> > + * Registers the arm64 specific cpuidle driver with the cpuidle
> > + * framework. It relies on core code to parse the idle states
> > + * and initialize them using driver data structures accordingly.
> > + */
> > +static int __init arm64_idle_init(void)
> > +{
> > +       int i, ret;
> > +       const char *entry_method;
> > +       struct device_node *idle_states_node;
> > +       const struct cpu_suspend_ops *suspend_init;
> > +       struct cpuidle_driver *drv = &arm64_idle_driver;
> > +
> > +       idle_states_node = of_find_node_by_path("/cpus/idle-states");
> > +       if (!idle_states_node)
> > +               return -ENOENT;
> > +
> > +       if (of_property_read_string(idle_states_node, "entry-method",
> > +                                   &entry_method)) {
> > +               pr_warn(" * %s missing entry-method property\n",
> > +                           idle_states_node->full_name);
> > +               of_node_put(idle_states_node);
> > +               ret = -EOPNOTSUPP;
> > +               goto put_node;
> > +       }
> > +
> > +       suspend_init = get_suspend_ops(entry_method);
> > +       if (!suspend_init) {
> > +               pr_warn("Missing suspend initializer\n");
> > +               ret = -EOPNOTSUPP;
> > +               goto put_node;
> > +       }
> > +
> > +       /*
> > +        * State at index 0 is standby wfi and considered standard
> > +        * on all ARM platforms. If in some platforms simple wfi
> > +        * can't be used as "state 0", DT bindings must be implemented
> > +        * to work around this issue and allow installing a special
> > +        * handler for idle state index 0.
> > +        */
> > +       drv->states[0].exit_latency = 1;
> > +       drv->states[0].target_residency = 1;
> > +       drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> > +       strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> > +       strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
> > +
> > +       drv->cpumask = (struct cpumask *) cpu_possible_mask;
> > +       /*
> > +        * Start at index 1, request idle state nodes to be filled
> > +        */
> > +       ret = of_init_idle_driver(drv, state_nodes, 1, true);
> > +       if (ret)
> > +               goto put_node;
> > +
> > +       if (suspend_init->init_fn(drv, state_nodes)) {
> > +               ret = -EOPNOTSUPP;
> > +               goto put_node;
> > +       }
> > +
> > +       for (i = 0; i < drv->state_count; i++)
> > +               drv->states[i].enter = arm_enter_idle_state;
> > +
> > +       ret = cpuidle_register(drv, NULL);
> > +
> > +put_node:
> > +       of_node_put(idle_states_node);
> > +       return ret;
> > +}
> > +device_initcall(arm64_idle_init);
> > --
> > 1.8.4
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe devicetree" in
> > the body of a message to majordomo at vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
  2014-06-18 21:34       ` Daniel Lezcano
@ 2014-06-19  9:30         ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-19  9:30 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Charles Garcia-Tobin,
	Nicolas Pitre, Rob Herring, grant.likely, Peter De Schrijver,
	Santosh Shilimkar, Amit Kucheria, Vincent Guittot,
	Antti Miettinen, Stephen Boyd, Kevin Hilman, Sebastian Capella,
	Tomasz Figa, Mark Brown

On Wed, Jun 18, 2014 at 10:34:06PM +0100, Daniel Lezcano wrote:

[...]

> > +/*
> > + * arm_enter_idle_state - Programs CPU to enter the specified state
> > + *
> > + * dev: cpuidle device
> > + * drv: cpuidle driver
> > + * idx: state index
> > + *
> > + * Called from the CPUidle framework to program the device to the
> > + * specified target state selected by the governor.
> > + */
> > +static int arm_enter_idle_state(struct cpuidle_device *dev,
> > +				struct cpuidle_driver *drv, int idx)
> > +{
> > +	int ret;
> > +
> > +	if (!idx) {
> > +		cpu_do_idle();
> > +		return idx;
> > +	}
> > +
> > +	cpu_pm_enter();
> > +	/*
> > +	 * Pass idle state index to cpu_suspend which in turn will call
> > +	 * the CPU ops suspend protocol with idle index as a parameter.
> > +	 *
> > +	 * Some states would not require context to be saved and flushed
> > +	 * to DRAM, so calling cpu_suspend would not be stricly necessary.
> > +	 * When power domains specifications for ARM CPUs are finalized then
> > +	 * this code can be optimized to prevent saving registers if not
> > +	 * needed.
> > +	 */
> > +	ret = cpu_suspend(idx);
> > +
> > +	cpu_pm_exit();
> > +
> > +	return ret ? -1 : idx;
> 
> Is it sure cpu_suspend will return always 0 on success ?

Yes. Now, we have to define "success". On ARM32/64 success means
returning through cpu_resume, which can also happen if a CPU is soft
rebooted following a power down failure. It depends on how the
cpu_suspend back-end behaves on power down failure, if it just returns
or it soft-reboots the CPU. It is an implementation detail, do not think
it is a major problem at the moment.

> > +}
> > +
> > +struct cpuidle_driver arm64_idle_driver = {
> > +	.name = "arm64_idle",
> > +	.owner = THIS_MODULE,
> > +};
> > +
> > +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> > +
> > +/*
> > + * arm64_idle_init
> > + *
> > + * Registers the arm64 specific cpuidle driver with the cpuidle
> > + * framework. It relies on core code to parse the idle states
> > + * and initialize them using driver data structures accordingly.
> > + */
> > +static int __init arm64_idle_init(void)
> > +{
> > +	int i, ret;
> > +	const char *entry_method;
> > +	struct device_node *idle_states_node;
> > +	const struct cpu_suspend_ops *suspend_init;
> > +	struct cpuidle_driver *drv = &arm64_idle_driver;
> > +
> > +	idle_states_node = of_find_node_by_path("/cpus/idle-states");
> > +	if (!idle_states_node)
> > +		return -ENOENT;
> > +
> > +	if (of_property_read_string(idle_states_node, "entry-method",
> > +				    &entry_method)) {
> > +		pr_warn(" * %s missing entry-method property\n",
> > +			    idle_states_node->full_name);
> > +		of_node_put(idle_states_node);
> > +		ret = -EOPNOTSUPP;
> > +		goto put_node;
> > +	}
> > +
> > +	suspend_init = get_suspend_ops(entry_method);
> > +	if (!suspend_init) {
> > +		pr_warn("Missing suspend initializer\n");
> > +		ret = -EOPNOTSUPP;
> > +		goto put_node;
> > +	}
> > +
> > +	/*
> > +	 * State at index 0 is standby wfi and considered standard
> > +	 * on all ARM platforms. If in some platforms simple wfi
> > +	 * can't be used as "state 0", DT bindings must be implemented
> > +	 * to work around this issue and allow installing a special
> > +	 * handler for idle state index 0.
> > +	 */
> > +	drv->states[0].exit_latency = 1;
> > +	drv->states[0].target_residency = 1;
> > +	drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> > +	strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> > +	strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
> 
> Please do not copy the state name and desc strings, they will be 
> converted to 'const char *'.

Ok, I need to sync this code with those changes though.

> > +	drv->cpumask = (struct cpumask *) cpu_possible_mask;
> > +	/*
> > +	 * Start at index 1, request idle state nodes to be filled
> > +	 */
> > +	ret = of_init_idle_driver(drv, state_nodes, 1, true);
> > +	if (ret)
> > +		goto put_node;
> > +
> > +	if (suspend_init->init_fn(drv, state_nodes)) {
> > +		ret = -EOPNOTSUPP;
> > +		goto put_node;
> > +	}
> > +
> > +	for (i = 0; i < drv->state_count; i++)
> > +		drv->states[i].enter = arm_enter_idle_state;
> 
> May be s/arm/arm64/ ?

Well, yes, unless we go for a common arm/arm64 driver (see Rob's email),
with related pros and cons.

Let's make a decision on this asap, I do not think we are that far from
a common solution.

Thanks a lot,
Lorenzo


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver
@ 2014-06-19  9:30         ` Lorenzo Pieralisi
  0 siblings, 0 replies; 74+ messages in thread
From: Lorenzo Pieralisi @ 2014-06-19  9:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 18, 2014 at 10:34:06PM +0100, Daniel Lezcano wrote:

[...]

> > +/*
> > + * arm_enter_idle_state - Programs CPU to enter the specified state
> > + *
> > + * dev: cpuidle device
> > + * drv: cpuidle driver
> > + * idx: state index
> > + *
> > + * Called from the CPUidle framework to program the device to the
> > + * specified target state selected by the governor.
> > + */
> > +static int arm_enter_idle_state(struct cpuidle_device *dev,
> > +				struct cpuidle_driver *drv, int idx)
> > +{
> > +	int ret;
> > +
> > +	if (!idx) {
> > +		cpu_do_idle();
> > +		return idx;
> > +	}
> > +
> > +	cpu_pm_enter();
> > +	/*
> > +	 * Pass idle state index to cpu_suspend which in turn will call
> > +	 * the CPU ops suspend protocol with idle index as a parameter.
> > +	 *
> > +	 * Some states would not require context to be saved and flushed
> > +	 * to DRAM, so calling cpu_suspend would not be stricly necessary.
> > +	 * When power domains specifications for ARM CPUs are finalized then
> > +	 * this code can be optimized to prevent saving registers if not
> > +	 * needed.
> > +	 */
> > +	ret = cpu_suspend(idx);
> > +
> > +	cpu_pm_exit();
> > +
> > +	return ret ? -1 : idx;
> 
> Is it sure cpu_suspend will return always 0 on success ?

Yes. Now, we have to define "success". On ARM32/64 success means
returning through cpu_resume, which can also happen if a CPU is soft
rebooted following a power down failure. It depends on how the
cpu_suspend back-end behaves on power down failure, if it just returns
or it soft-reboots the CPU. It is an implementation detail, do not think
it is a major problem at the moment.

> > +}
> > +
> > +struct cpuidle_driver arm64_idle_driver = {
> > +	.name = "arm64_idle",
> > +	.owner = THIS_MODULE,
> > +};
> > +
> > +static struct device_node *state_nodes[CPUIDLE_STATE_MAX] __initdata;
> > +
> > +/*
> > + * arm64_idle_init
> > + *
> > + * Registers the arm64 specific cpuidle driver with the cpuidle
> > + * framework. It relies on core code to parse the idle states
> > + * and initialize them using driver data structures accordingly.
> > + */
> > +static int __init arm64_idle_init(void)
> > +{
> > +	int i, ret;
> > +	const char *entry_method;
> > +	struct device_node *idle_states_node;
> > +	const struct cpu_suspend_ops *suspend_init;
> > +	struct cpuidle_driver *drv = &arm64_idle_driver;
> > +
> > +	idle_states_node = of_find_node_by_path("/cpus/idle-states");
> > +	if (!idle_states_node)
> > +		return -ENOENT;
> > +
> > +	if (of_property_read_string(idle_states_node, "entry-method",
> > +				    &entry_method)) {
> > +		pr_warn(" * %s missing entry-method property\n",
> > +			    idle_states_node->full_name);
> > +		of_node_put(idle_states_node);
> > +		ret = -EOPNOTSUPP;
> > +		goto put_node;
> > +	}
> > +
> > +	suspend_init = get_suspend_ops(entry_method);
> > +	if (!suspend_init) {
> > +		pr_warn("Missing suspend initializer\n");
> > +		ret = -EOPNOTSUPP;
> > +		goto put_node;
> > +	}
> > +
> > +	/*
> > +	 * State at index 0 is standby wfi and considered standard
> > +	 * on all ARM platforms. If in some platforms simple wfi
> > +	 * can't be used as "state 0", DT bindings must be implemented
> > +	 * to work around this issue and allow installing a special
> > +	 * handler for idle state index 0.
> > +	 */
> > +	drv->states[0].exit_latency = 1;
> > +	drv->states[0].target_residency = 1;
> > +	drv->states[0].flags = CPUIDLE_FLAG_TIME_VALID;
> > +	strncpy(drv->states[0].name, "ARM WFI", CPUIDLE_NAME_LEN);
> > +	strncpy(drv->states[0].desc, "ARM WFI", CPUIDLE_DESC_LEN);
> 
> Please do not copy the state name and desc strings, they will be 
> converted to 'const char *'.

Ok, I need to sync this code with those changes though.

> > +	drv->cpumask = (struct cpumask *) cpu_possible_mask;
> > +	/*
> > +	 * Start at index 1, request idle state nodes to be filled
> > +	 */
> > +	ret = of_init_idle_driver(drv, state_nodes, 1, true);
> > +	if (ret)
> > +		goto put_node;
> > +
> > +	if (suspend_init->init_fn(drv, state_nodes)) {
> > +		ret = -EOPNOTSUPP;
> > +		goto put_node;
> > +	}
> > +
> > +	for (i = 0; i < drv->state_count; i++)
> > +		drv->states[i].enter = arm_enter_idle_state;
> 
> May be s/arm/arm64/ ?

Well, yes, unless we go for a common arm/arm64 driver (see Rob's email),
with related pros and cons.

Let's make a decision on this asap, I do not think we are that far from
a common solution.

Thanks a lot,
Lorenzo

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-19  7:33               ` Charles Garcia-Tobin
@ 2014-06-19 14:08                 ` Santosh Shilimkar
  -1 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-19 14:08 UTC (permalink / raw)
  To: Charles Garcia-Tobin, Lorenzo Pieralisi, Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown, Paul Walmsley,
	Chander Kashyap

Charles,

On Thursday 19 June 2014 03:33 AM, Charles Garcia-Tobin wrote:
> 
> 
>> -----Original Message-----
>> From: Santosh Shilimkar [mailto:santosh.shilimkar@ti.com]
>> Sent: 18 June 2014 20:27
>> To: Lorenzo Pieralisi; Nicolas Pitre

[..]

>>> +===========================================
>>> +3 - state node
>>> +===========================================
>>> +
>>> +A state node represents an idle state description and must be
>> defined as
>>> +follows:
>>> +
>>> +- state node
>>> +
>>> +	Description: must be child of the idle-states node
>>> +
>>> +	The state node name shall follow standard device tree naming
>>> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
>>> +	are siblings within a single common parent must be given a unique
>> name.
>>> +
>>> +	The idle state entered by executing the wfi instruction
>> (idle_standby
>>> +	SBSA,[3][4]) is considered standard on all ARM platforms and
>> therefore
>>> +	must not be listed.
>>> +
>>> +	To correctly specify idle states timing and energy related
>> properties,
>>> +	the following definitions identify the different execution phases
>>> +	a CPU goes through to enter and exit idle states and the implied
>>> +	energy metrics:
>>> +
>>> +
>> 	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]
>> __..
>>> +		    |          |           |          |          |
>>> +
>>> +		    |<------ entry ------->|
>>> +		    |       latency        |
>>> +						      |<- exit ->|
>>> +						      |  latency |
>>> +		    |<-------- min-residency -------->|
>>> +			       |<-------  wakeup-latency ------->|
>>> +
>> I don't know the wakeup latency makes much sense and also correct.
>> Hardware wakeup latency is actually exit latency. Is it for failed
>> or abort-able ilde case ? We are adding this as a new parameter
>> at least from idle states perspective. I think we should just
>> avoid it.
>>
> 
> Hi Santosh, 
> 
> To me wake up latency makes up a lot of sense. It is not always the same as
> exit latency, it will depend on your system, and just how smart it is. In
> some cases the [ENTRY] period may not be negligible in which case exit
> latency will be less than the wake up latency. 
> In addition, it will generally always be shorter than entry+exit which is
> the default value if omitted, this assumes the PREP time is not abortable,
> but this is the safer assumption to make.
> Wake up latency is really the number that folk have in their head for what
> you'd stick into the pm_qos to veto entry into states when you are latency
> constrained. 
> The one thing that really is an optimisation here is having a separate exit
> latency, which is being proposed for use in core selection for the
> scheduler.
> So if anything was going to be made optional pending new scheduler patches
> should that not be entry/exit latency? 
>  
PM QOS angle Nico pointed out and its clear. The wakeup latency as such is a
worst case wakeup latency from QOS perspective so considering the aborted idle
case it makes sense to have conservative number which includes entry + exit.

If you look at current idle governors, only exit latency and target residency
is being used. No matter how we represent it, as long idle governor or idle
C-state selection logic gets that information, things should be fine. So
from that view your point of entry/exit optional makes sense considering
wakeup latency can convey that information indirectly.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-19 14:08                 ` Santosh Shilimkar
  0 siblings, 0 replies; 74+ messages in thread
From: Santosh Shilimkar @ 2014-06-19 14:08 UTC (permalink / raw)
  To: linux-arm-kernel

Charles,

On Thursday 19 June 2014 03:33 AM, Charles Garcia-Tobin wrote:
> 
> 
>> -----Original Message-----
>> From: Santosh Shilimkar [mailto:santosh.shilimkar at ti.com]
>> Sent: 18 June 2014 20:27
>> To: Lorenzo Pieralisi; Nicolas Pitre

[..]

>>> +===========================================
>>> +3 - state node
>>> +===========================================
>>> +
>>> +A state node represents an idle state description and must be
>> defined as
>>> +follows:
>>> +
>>> +- state node
>>> +
>>> +	Description: must be child of the idle-states node
>>> +
>>> +	The state node name shall follow standard device tree naming
>>> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
>>> +	are siblings within a single common parent must be given a unique
>> name.
>>> +
>>> +	The idle state entered by executing the wfi instruction
>> (idle_standby
>>> +	SBSA,[3][4]) is considered standard on all ARM platforms and
>> therefore
>>> +	must not be listed.
>>> +
>>> +	To correctly specify idle states timing and energy related
>> properties,
>>> +	the following definitions identify the different execution phases
>>> +	a CPU goes through to enter and exit idle states and the implied
>>> +	energy metrics:
>>> +
>>> +
>> 	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]
>> __..
>>> +		    |          |           |          |          |
>>> +
>>> +		    |<------ entry ------->|
>>> +		    |       latency        |
>>> +						      |<- exit ->|
>>> +						      |  latency |
>>> +		    |<-------- min-residency -------->|
>>> +			       |<-------  wakeup-latency ------->|
>>> +
>> I don't know the wakeup latency makes much sense and also correct.
>> Hardware wakeup latency is actually exit latency. Is it for failed
>> or abort-able ilde case ? We are adding this as a new parameter
>> at least from idle states perspective. I think we should just
>> avoid it.
>>
> 
> Hi Santosh, 
> 
> To me wake up latency makes up a lot of sense. It is not always the same as
> exit latency, it will depend on your system, and just how smart it is. In
> some cases the [ENTRY] period may not be negligible in which case exit
> latency will be less than the wake up latency. 
> In addition, it will generally always be shorter than entry+exit which is
> the default value if omitted, this assumes the PREP time is not abortable,
> but this is the safer assumption to make.
> Wake up latency is really the number that folk have in their head for what
> you'd stick into the pm_qos to veto entry into states when you are latency
> constrained. 
> The one thing that really is an optimisation here is having a separate exit
> latency, which is being proposed for use in core selection for the
> scheduler.
> So if anything was going to be made optional pending new scheduler patches
> should that not be entry/exit latency? 
>  
PM QOS angle Nico pointed out and its clear. The wakeup latency as such is a
worst case wakeup latency from QOS perspective so considering the aborted idle
case it makes sense to have conservative number which includes entry + exit.

If you look at current idle governors, only exit latency and target residency
is being used. No matter how we represent it, as long idle governor or idle
C-state selection logic gets that information, things should be fine. So
from that view your point of entry/exit optional makes sense considering
wakeup latency can convey that information indirectly.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
  2014-06-19 14:08                 ` Santosh Shilimkar
@ 2014-06-19 15:09                   ` Charles Garcia-Tobin
  -1 siblings, 0 replies; 74+ messages in thread
From: Charles Garcia-Tobin @ 2014-06-19 15:09 UTC (permalink / raw)
  To: 'Santosh Shilimkar', Lorenzo Pieralisi, Nicolas Pitre
  Cc: linux-arm-kernel, linux-pm, devicetree, Mark Rutland,
	Sudeep Holla, Catalin Marinas, Rob Herring, grant.likely,
	Peter De Schrijver, Daniel Lezcano, Amit Kucheria,
	Vincent Guittot, Antti Miettinen, Stephen Boyd, Kevin Hilman,
	Sebastian Capella, Tomasz Figa, Mark Brown, Paul Walmsley,
	Chander Kashyap


Hi

Looks we are pretty much agreed on the number now.
In my e-mail though I was questioning what should be optional and what
shouldn't. The current proposal is that wakeup-latency-us is the optional
one, I was thinking that it's make more sense making entry/exit (given the
use is much more specific and yet to be proven) but frankly it is not great
shakes either way, so for me it's fine as it is. The only thing that I think
would be worth clarifying is that the text around wakeup-latency-us, to make
it clear when it makes sense to provide it. So I was thinking something
like:

        - wakeup-latency-us:
                Usage: Optional
                Value type: <prop-encoded-array>
                Definition: u32 value representing maximum delay between the
                            signalling of a wake-up event and the CPU being
                            able to execute normal code again. If omitted,
                            this is assumed to be equal to:
                                entry-latency-us + exit-latency-us
                            It is important to supply this value on systems 
                            where the duration of PREP phase is 
                            non-neglibigle. In such systems 
                            entry-latency-us + exit-latency-us 
                            will exceed wakeup-latency-us by this duration.

The other thing that may be worth adding is some graphs to help explain what
is meant by min-residency. Lorenzo feel free to take this or leave this. But
something like:

The energy consumption of a cpu when it enters a power state can be roughly
characterised by the following graph:

               |
               |
               |
           e   |
           n   |                                      /---
           e   |                               /------
           r   |                        /------
           g   |                  /-----
           y   |           /------
               |       ----
               |      /|
               |     / |
               |    /  |
               |   /   |
               |  /    |
               | /     |                          
               |/      |                          
          -----|-------+----------------------------------
              0|       1                      time


The graph starts with a steep slope and then a shallower one. The first part
denotes the energy costs incurred whilst entering and leaving the power
state. The shallower slope is essentially representing the power consumption
of the state. 
We are defining min-residency for a given state as the period of time after
which choosing that state become the most energy efficient option. A good
way to visualise this, is if we take the same graph above and compare some
states. Due to the limitations of ascii art we are only showing two made up
states C1, and C2:


          |
          |
          |
          |                                                  /-- C1
       e  |                                              /---      
       n  |                                         /----          
       e  |                                     /---               
       r  |                                /----     /----------- C2
       g  |                    /-------/-------------               
       y  |        ------------    /---|
          |       /           /----    |
          |      /        /---         |
          |     /    /----             |
          |    / /---                  |
          |   ---                      |
          |  /                         |
          | /                          |
          |/                           |                  time
       ---/----------------------------+------------------------
          |  better off with C1        | better off with C2
                                       |
                                   min-residency    
                                   for C2               



As you can see, having taken into account entry/exit costs there is period
were C1 is the better choice of state. This is mainly down to the fact that
entry/exit costs are low. However the lower power consumption of C2 means
that after a suitable time, C2 is the better choice. This interval of time
is what we want to call min-residency
                     
Cheers

Charles

> -----Original Message-----
> From: Santosh Shilimkar [mailto:santosh.shilimkar@ti.com]
> Sent: 19 June 2014 15:09
> To: Charles Garcia-Tobin; Lorenzo Pieralisi; Nicolas Pitre
> Cc: linux-arm-kernel@lists.infradead.org; linux-pm@vger.kernel.org;
> devicetree@vger.kernel.org; Mark Rutland; Sudeep Holla; Catalin
> Marinas; Rob Herring; grant.likely@linaro.org; Peter De Schrijver;
> Daniel Lezcano; Amit Kucheria; Vincent Guittot; Antti Miettinen;
> Stephen Boyd; Kevin Hilman; Sebastian Capella; Tomasz Figa; Mark Brown;
> Paul Walmsley; Chander Kashyap
> Subject: Re: [PATCH v4 1/6] Documentation: arm: define DT idle states
> bindings
> 
> Charles,
> 
> On Thursday 19 June 2014 03:33 AM, Charles Garcia-Tobin wrote:
> >
> >
> >> -----Original Message-----
> >> From: Santosh Shilimkar [mailto:santosh.shilimkar@ti.com]
> >> Sent: 18 June 2014 20:27
> >> To: Lorenzo Pieralisi; Nicolas Pitre
> 
> [..]
> 
> >>> +===========================================
> >>> +3 - state node
> >>> +===========================================
> >>> +
> >>> +A state node represents an idle state description and must be
> >> defined as
> >>> +follows:
> >>> +
> >>> +- state node
> >>> +
> >>> +	Description: must be child of the idle-states node
> >>> +
> >>> +	The state node name shall follow standard device tree naming
> >>> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> >>> +	are siblings within a single common parent must be given a unique
> >> name.
> >>> +
> >>> +	The idle state entered by executing the wfi instruction
> >> (idle_standby
> >>> +	SBSA,[3][4]) is considered standard on all ARM platforms and
> >> therefore
> >>> +	must not be listed.
> >>> +
> >>> +	To correctly specify idle states timing and energy related
> >> properties,
> >>> +	the following definitions identify the different execution phases
> >>> +	a CPU goes through to enter and exit idle states and the implied
> >>> +	energy metrics:
> >>> +
> >>> +
> >> 	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]
> >> __..
> >>> +		    |          |           |          |          |
> >>> +
> >>> +		    |<------ entry ------->|
> >>> +		    |       latency        |
> >>> +						      |<- exit ->|
> >>> +						      |  latency |
> >>> +		    |<-------- min-residency -------->|
> >>> +			       |<-------  wakeup-latency ------->|
> >>> +
> >> I don't know the wakeup latency makes much sense and also correct.
> >> Hardware wakeup latency is actually exit latency. Is it for failed
> >> or abort-able ilde case ? We are adding this as a new parameter
> >> at least from idle states perspective. I think we should just
> >> avoid it.
> >>
> >
> > Hi Santosh,
> >
> > To me wake up latency makes up a lot of sense. It is not always the
> same as
> > exit latency, it will depend on your system, and just how smart it
> is. In
> > some cases the [ENTRY] period may not be negligible in which case
> exit
> > latency will be less than the wake up latency.
> > In addition, it will generally always be shorter than entry+exit
> which is
> > the default value if omitted, this assumes the PREP time is not
> abortable,
> > but this is the safer assumption to make.
> > Wake up latency is really the number that folk have in their head for
> what
> > you'd stick into the pm_qos to veto entry into states when you are
> latency
> > constrained.
> > The one thing that really is an optimisation here is having a
> separate exit
> > latency, which is being proposed for use in core selection for the
> > scheduler.
> > So if anything was going to be made optional pending new scheduler
> patches
> > should that not be entry/exit latency?
> >
> PM QOS angle Nico pointed out and its clear. The wakeup latency as such
> is a
> worst case wakeup latency from QOS perspective so considering the
> aborted idle
> case it makes sense to have conservative number which includes entry +
> exit.
> 
> If you look at current idle governors, only exit latency and target
> residency
> is being used. No matter how we represent it, as long idle governor or
> idle
> C-state selection logic gets that information, things should be fine.
> So
> from that view your point of entry/exit optional makes sense
> considering
> wakeup latency can convey that information indirectly.
> 
> Regards,
> Santosh



^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 1/6] Documentation: arm: define DT idle states bindings
@ 2014-06-19 15:09                   ` Charles Garcia-Tobin
  0 siblings, 0 replies; 74+ messages in thread
From: Charles Garcia-Tobin @ 2014-06-19 15:09 UTC (permalink / raw)
  To: linux-arm-kernel


Hi

Looks we are pretty much agreed on the number now.
In my e-mail though I was questioning what should be optional and what
shouldn't. The current proposal is that wakeup-latency-us is the optional
one, I was thinking that it's make more sense making entry/exit (given the
use is much more specific and yet to be proven) but frankly it is not great
shakes either way, so for me it's fine as it is. The only thing that I think
would be worth clarifying is that the text around wakeup-latency-us, to make
it clear when it makes sense to provide it. So I was thinking something
like:

        - wakeup-latency-us:
                Usage: Optional
                Value type: <prop-encoded-array>
                Definition: u32 value representing maximum delay between the
                            signalling of a wake-up event and the CPU being
                            able to execute normal code again. If omitted,
                            this is assumed to be equal to:
                                entry-latency-us + exit-latency-us
                            It is important to supply this value on systems 
                            where the duration of PREP phase is 
                            non-neglibigle. In such systems 
                            entry-latency-us + exit-latency-us 
                            will exceed wakeup-latency-us by this duration.

The other thing that may be worth adding is some graphs to help explain what
is meant by min-residency. Lorenzo feel free to take this or leave this. But
something like:

The energy consumption of a cpu when it enters a power state can be roughly
characterised by the following graph:

               |
               |
               |
           e   |
           n   |                                      /---
           e   |                               /------
           r   |                        /------
           g   |                  /-----
           y   |           /------
               |       ----
               |      /|
               |     / |
               |    /  |
               |   /   |
               |  /    |
               | /     |                          
               |/      |                          
          -----|-------+----------------------------------
              0|       1                      time


The graph starts with a steep slope and then a shallower one. The first part
denotes the energy costs incurred whilst entering and leaving the power
state. The shallower slope is essentially representing the power consumption
of the state. 
We are defining min-residency for a given state as the period of time after
which choosing that state become the most energy efficient option. A good
way to visualise this, is if we take the same graph above and compare some
states. Due to the limitations of ascii art we are only showing two made up
states C1, and C2:


          |
          |
          |
          |                                                  /-- C1
       e  |                                              /---      
       n  |                                         /----          
       e  |                                     /---               
       r  |                                /----     /----------- C2
       g  |                    /-------/-------------               
       y  |        ------------    /---|
          |       /           /----    |
          |      /        /---         |
          |     /    /----             |
          |    / /---                  |
          |   ---                      |
          |  /                         |
          | /                          |
          |/                           |                  time
       ---/----------------------------+------------------------
          |  better off with C1        | better off with C2
                                       |
                                   min-residency    
                                   for C2               



As you can see, having taken into account entry/exit costs there is period
were C1 is the better choice of state. This is mainly down to the fact that
entry/exit costs are low. However the lower power consumption of C2 means
that after a suitable time, C2 is the better choice. This interval of time
is what we want to call min-residency
                     
Cheers

Charles

> -----Original Message-----
> From: Santosh Shilimkar [mailto:santosh.shilimkar at ti.com]
> Sent: 19 June 2014 15:09
> To: Charles Garcia-Tobin; Lorenzo Pieralisi; Nicolas Pitre
> Cc: linux-arm-kernel at lists.infradead.org; linux-pm at vger.kernel.org;
> devicetree at vger.kernel.org; Mark Rutland; Sudeep Holla; Catalin
> Marinas; Rob Herring; grant.likely at linaro.org; Peter De Schrijver;
> Daniel Lezcano; Amit Kucheria; Vincent Guittot; Antti Miettinen;
> Stephen Boyd; Kevin Hilman; Sebastian Capella; Tomasz Figa; Mark Brown;
> Paul Walmsley; Chander Kashyap
> Subject: Re: [PATCH v4 1/6] Documentation: arm: define DT idle states
> bindings
> 
> Charles,
> 
> On Thursday 19 June 2014 03:33 AM, Charles Garcia-Tobin wrote:
> >
> >
> >> -----Original Message-----
> >> From: Santosh Shilimkar [mailto:santosh.shilimkar at ti.com]
> >> Sent: 18 June 2014 20:27
> >> To: Lorenzo Pieralisi; Nicolas Pitre
> 
> [..]
> 
> >>> +===========================================
> >>> +3 - state node
> >>> +===========================================
> >>> +
> >>> +A state node represents an idle state description and must be
> >> defined as
> >>> +follows:
> >>> +
> >>> +- state node
> >>> +
> >>> +	Description: must be child of the idle-states node
> >>> +
> >>> +	The state node name shall follow standard device tree naming
> >>> +	rules ([5], 2.2.1 "Node names"), in particular state nodes which
> >>> +	are siblings within a single common parent must be given a unique
> >> name.
> >>> +
> >>> +	The idle state entered by executing the wfi instruction
> >> (idle_standby
> >>> +	SBSA,[3][4]) is considered standard on all ARM platforms and
> >> therefore
> >>> +	must not be listed.
> >>> +
> >>> +	To correctly specify idle states timing and energy related
> >> properties,
> >>> +	the following definitions identify the different execution phases
> >>> +	a CPU goes through to enter and exit idle states and the implied
> >>> +	energy metrics:
> >>> +
> >>> +
> >> 	..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]
> >> __..
> >>> +		    |          |           |          |          |
> >>> +
> >>> +		    |<------ entry ------->|
> >>> +		    |       latency        |
> >>> +						      |<- exit ->|
> >>> +						      |  latency |
> >>> +		    |<-------- min-residency -------->|
> >>> +			       |<-------  wakeup-latency ------->|
> >>> +
> >> I don't know the wakeup latency makes much sense and also correct.
> >> Hardware wakeup latency is actually exit latency. Is it for failed
> >> or abort-able ilde case ? We are adding this as a new parameter
> >> at least from idle states perspective. I think we should just
> >> avoid it.
> >>
> >
> > Hi Santosh,
> >
> > To me wake up latency makes up a lot of sense. It is not always the
> same as
> > exit latency, it will depend on your system, and just how smart it
> is. In
> > some cases the [ENTRY] period may not be negligible in which case
> exit
> > latency will be less than the wake up latency.
> > In addition, it will generally always be shorter than entry+exit
> which is
> > the default value if omitted, this assumes the PREP time is not
> abortable,
> > but this is the safer assumption to make.
> > Wake up latency is really the number that folk have in their head for
> what
> > you'd stick into the pm_qos to veto entry into states when you are
> latency
> > constrained.
> > The one thing that really is an optimisation here is having a
> separate exit
> > latency, which is being proposed for use in core selection for the
> > scheduler.
> > So if anything was going to be made optional pending new scheduler
> patches
> > should that not be entry/exit latency?
> >
> PM QOS angle Nico pointed out and its clear. The wakeup latency as such
> is a
> worst case wakeup latency from QOS perspective so considering the
> aborted idle
> case it makes sense to have conservative number which includes entry +
> exit.
> 
> If you look at current idle governors, only exit latency and target
> residency
> is being used. No matter how we represent it, as long idle governor or
> idle
> C-state selection logic gets that information, things should be fine.
> So
> from that view your point of entry/exit optional makes sense
> considering
> wakeup latency can convey that information indirectly.
> 
> Regards,
> Santosh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
  2014-06-12  9:03       ` Lorenzo Pieralisi
@ 2014-07-06 10:01         ` Paul Burton
  -1 siblings, 0 replies; 74+ messages in thread
From: Paul Burton @ 2014-07-06 10:01 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Mark Rutland, Rafael J. Wysocki, Catalin Marinas, Tomasz Figa,
	Chander Kashyap, Vincent Guittot, Nicolas Pitre, Daniel Lezcano,
	linux-arm-kernel, grant.likely, Charles Garcia-Tobin, devicetree,
	Kevin Hilman, linux-pm, Sebastian Capella, Mark Brown,
	Antti Miettinen, Paul Walmsley, Peter De Schrijver, Stephen Boyd,
	Am

On Thu, Jun 12, 2014 at 10:03:39AM +0100, Lorenzo Pieralisi wrote:
> [CC'ing Preeti and Paul to check their opinions]
> 
> Hi Rafael,

snip

> > One question here.
> > 
> > Do you want this to be generally useful or is it just ARM-specific?
> 
> The first series was targeting ARM64, then I noticed that it might be
> used for ARM too (Daniel is working on that). Actually, I discovered
> that Power and MIPS can reuse at least the code that initializes the
> states data too, but I have to point out three things:
> 
> 1) state enter function method: in my bindings it is common for all
>    idle states, need to check if it applies to Power and MIPS too.
> 2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
>    add code that detects what idle states lose the tick device context.
>    At the moment I am adding the flag by default to all idle states
>    apart from standbywfi on ARM, but that can be optimised. Unless we
>    resort to power domains (but that's not trivial), we can add a flag
>    to the idle states in DT (ie local-timer-stop or suchlike) to support
>    that. I think that it will be frowned upon but it is worth trying, would
>    like to know what other people think about this.
> 3) idle states bindings should be reviewed, I expect them to be valid
>    on other architectures too, but I need acknowledgments.
> 
> I think this series is not far from being ready to be upstreamed, I
> would be certainly happy if it can be reused for other archs too so
> just let me know.
> 
> Thanks !
> Lorenzo

Sorry for the delayed reply, I'm still catching up with mail after some
time off.

On the MIPS CPS systems I'm working with we are able to detect which
states are supported by reading config registers, so there's no need
for these systems to have available states described in DT.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure
@ 2014-07-06 10:01         ` Paul Burton
  0 siblings, 0 replies; 74+ messages in thread
From: Paul Burton @ 2014-07-06 10:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 12, 2014 at 10:03:39AM +0100, Lorenzo Pieralisi wrote:
> [CC'ing Preeti and Paul to check their opinions]
> 
> Hi Rafael,

snip

> > One question here.
> > 
> > Do you want this to be generally useful or is it just ARM-specific?
> 
> The first series was targeting ARM64, then I noticed that it might be
> used for ARM too (Daniel is working on that). Actually, I discovered
> that Power and MIPS can reuse at least the code that initializes the
> states data too, but I have to point out three things:
> 
> 1) state enter function method: in my bindings it is common for all
>    idle states, need to check if it applies to Power and MIPS too.
> 2) CPUIDLE_FLAG_TIMER_STOP and how to set it. It is non-trivial to
>    add code that detects what idle states lose the tick device context.
>    At the moment I am adding the flag by default to all idle states
>    apart from standbywfi on ARM, but that can be optimised. Unless we
>    resort to power domains (but that's not trivial), we can add a flag
>    to the idle states in DT (ie local-timer-stop or suchlike) to support
>    that. I think that it will be frowned upon but it is worth trying, would
>    like to know what other people think about this.
> 3) idle states bindings should be reviewed, I expect them to be valid
>    on other architectures too, but I need acknowledgments.
> 
> I think this series is not far from being ready to be upstreamed, I
> would be certainly happy if it can be reused for other archs too so
> just let me know.
> 
> Thanks !
> Lorenzo

Sorry for the delayed reply, I'm still catching up with mail after some
time off.

On the MIPS CPS systems I'm working with we are able to detect which
states are supported by reading config registers, so there's no need
for these systems to have available states described in DT.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2014-07-06 10:01 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-11 16:18 [PATCH v4 0/6] ARM generic idle states Lorenzo Pieralisi
2014-06-11 16:18 ` Lorenzo Pieralisi
2014-06-11 16:18 ` [PATCH v4 1/6] Documentation: arm: define DT idle states bindings Lorenzo Pieralisi
2014-06-11 16:18   ` Lorenzo Pieralisi
2014-06-11 18:15   ` Nicolas Pitre
2014-06-11 18:15     ` Nicolas Pitre
2014-06-13 16:49     ` Lorenzo Pieralisi
2014-06-13 16:49       ` Lorenzo Pieralisi
2014-06-13 17:33       ` Nicolas Pitre
2014-06-13 17:33         ` Nicolas Pitre
2014-06-16 14:23         ` Lorenzo Pieralisi
2014-06-16 14:23           ` Lorenzo Pieralisi
2014-06-16 14:48           ` Nicolas Pitre
2014-06-16 14:48             ` Nicolas Pitre
2014-06-18 17:36         ` Lorenzo Pieralisi
2014-06-18 17:36           ` Lorenzo Pieralisi
2014-06-18 18:20           ` Sebastian Capella
2014-06-18 18:20             ` Sebastian Capella
2014-06-18 19:27           ` Santosh Shilimkar
2014-06-18 19:27             ` Santosh Shilimkar
2014-06-18 20:51             ` Nicolas Pitre
2014-06-18 20:51               ` Nicolas Pitre
2014-06-18 20:55               ` Santosh Shilimkar
2014-06-18 20:55                 ` Santosh Shilimkar
2014-06-18 21:09                 ` Nicolas Pitre
2014-06-18 21:09                   ` Nicolas Pitre
2014-06-18 23:13                   ` Santosh Shilimkar
2014-06-18 23:13                     ` Santosh Shilimkar
2014-06-19  7:33             ` Charles Garcia-Tobin
2014-06-19  7:33               ` Charles Garcia-Tobin
2014-06-19 14:08               ` Santosh Shilimkar
2014-06-19 14:08                 ` Santosh Shilimkar
2014-06-19 15:09                 ` Charles Garcia-Tobin
2014-06-19 15:09                   ` Charles Garcia-Tobin
2014-06-18 21:03           ` Nicolas Pitre
2014-06-18 21:03             ` Nicolas Pitre
2014-06-13 17:40       ` Sebastian Capella
2014-06-13 17:40         ` Sebastian Capella
     [not found] ` <1402503520-8611-1-git-send-email-lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org>
2014-06-11 16:18   ` [PATCH v4 2/6] Documentation: devicetree: psci: define CPU suspend parameter Lorenzo Pieralisi
2014-06-11 16:18     ` Lorenzo Pieralisi
2014-06-11 16:18   ` [PATCH v4 5/6] drivers: cpuidle: CPU idle ARM64 driver Lorenzo Pieralisi
2014-06-11 16:18     ` Lorenzo Pieralisi
2014-06-18 21:34     ` Daniel Lezcano
2014-06-18 21:34       ` Daniel Lezcano
2014-06-19  9:30       ` Lorenzo Pieralisi
2014-06-19  9:30         ` Lorenzo Pieralisi
2014-06-19  3:02     ` Rob Herring
2014-06-19  3:02       ` Rob Herring
2014-06-19  9:08       ` Lorenzo Pieralisi
2014-06-19  9:08         ` Lorenzo Pieralisi
2014-06-11 16:18 ` [PATCH v4 3/6] drivers: cpuidle: implement OF based idle states infrastructure Lorenzo Pieralisi
2014-06-11 16:18   ` Lorenzo Pieralisi
2014-06-11 18:24   ` Nicolas Pitre
2014-06-11 18:24     ` Nicolas Pitre
2014-06-12  8:46     ` Lorenzo Pieralisi
2014-06-12  8:46       ` Lorenzo Pieralisi
2014-06-11 18:25   ` Rafael J. Wysocki
2014-06-11 18:25     ` Rafael J. Wysocki
2014-06-12  9:03     ` Lorenzo Pieralisi
2014-06-12  9:03       ` Lorenzo Pieralisi
2014-06-13  3:48       ` Preeti U Murthy
2014-06-13  3:48         ` Preeti U Murthy
2014-06-13 17:16         ` Lorenzo Pieralisi
2014-06-13 17:16           ` Lorenzo Pieralisi
2014-07-06 10:01       ` Paul Burton
2014-07-06 10:01         ` Paul Burton
2014-06-11 18:38   ` Nicolas Pitre
2014-06-11 18:38     ` Nicolas Pitre
2014-06-12  9:19     ` Lorenzo Pieralisi
2014-06-12  9:19       ` Lorenzo Pieralisi
2014-06-11 16:18 ` [PATCH v4 4/6] arm64: add PSCI CPU_SUSPEND based cpu_suspend support Lorenzo Pieralisi
2014-06-11 16:18   ` Lorenzo Pieralisi
2014-06-11 16:18 ` [PATCH v4 6/6] arm64: boot: dts: update rtsm aemv8 dts with PSCI and idle states Lorenzo Pieralisi
2014-06-11 16:18   ` Lorenzo Pieralisi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.