* [PATCH V2] cpuidle/powernv: Read target_residency value of idle states from DT if available
@ 2015-01-28  2:13 ` Preeti U Murthy
  0 siblings, 0 replies; 9+ messages in thread
From: Preeti U Murthy @ 2015-01-28  2:13 UTC (permalink / raw)
  To: mpe; +Cc: rafael.j.wysocki, linuxppc-dev, linux-kernel, svaidy, linux-pm

The device tree now exposes the residency values for different idle states. Read
these values instead of calculating residency from the latency values. The values
exposed in the DT are validated for optimal power efficiency. However to maintain
compatibility with the older firmware code which does not expose residency
values, use default values as a fallback mechanism. While at it, handle some
cleanups.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
---

Changes from V1: https://lkml.org/lkml/2015/1/19/221
1. Used a better API for reading the DT property values.
2. Code cleanups
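
For readers following the description above, here is a minimal userspace
sketch of the fallback rule, not the driver code itself. The helper name
pick_residency_us() and the enum idle_kind are hypothetical and exist only
for illustration; the default values (100 us for Nap, 300000 us for
FastSleep) and the ns-to-us division are the ones used in the diff.

```c
#include <stdint.h>

/* Hypothetical state kinds, for illustration only (not kernel identifiers). */
enum idle_kind { IDLE_NAP, IDLE_FASTSLEEP };

/* Default target residencies in microseconds, as hardcoded in the patch. */
#define DEFAULT_NAP_RESIDENCY_US	100u
#define DEFAULT_FASTSLEEP_RESIDENCY_US	300000u

/*
 * Return the target residency in microseconds.
 *
 * have_dt_residency: nonzero if firmware exposed a residency property.
 * dt_residency_ns:   residency read from the device tree, in nanoseconds.
 */
static unsigned int pick_residency_us(enum idle_kind kind,
				      int have_dt_residency,
				      uint32_t dt_residency_ns)
{
	/* Prefer the firmware value; cpuidle expects microseconds. */
	if (have_dt_residency)
		return dt_residency_ns / 1000;

	/* Older firmware: fall back to the built-in defaults. */
	return kind == IDLE_NAP ? DEFAULT_NAP_RESIDENCY_US
				: DEFAULT_FASTSLEEP_RESIDENCY_US;
}
```

Note that the integer division truncates, so a firmware value below 1000 ns
would yield a residency of 0 us in this sketch.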

 drivers/cpuidle/cpuidle-powernv.c |   57 ++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 223d505..29fdbe7 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -13,6 +13,7 @@
 #include <linux/notifier.h>
 #include <linux/clockchips.h>
 #include <linux/of.h>
+#include <linux/slab.h>
 
 #include <asm/machdep.h>
 #include <asm/firmware.h>
@@ -161,9 +162,9 @@ static int powernv_add_idle_states(void)
 	int nr_idle_states = 1; /* Snooze */
 	int dt_idle_states;
 	const __be32 *idle_state_flags;
-        const __be32 *idle_state_latency;
-        u32 len_flags, flags, latency_ns;
-	int i;
+	u32 len_flags, flags;
+	u32 *latency_ns, *residency_ns;
+	int i, rc;
 
 	/* Currently we have snooze statically defined */
 
@@ -175,53 +176,57 @@ static int powernv_add_idle_states(void)
 
 	idle_state_flags = of_get_property(power_mgt, "ibm,cpu-idle-state-flags", &len_flags);
 	if (!idle_state_flags) {
-		pr_warn("DT-PowerMgmt: missing ibm,cpu-idle-state-flags\n");
+		pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-flags in DT\n");
 		return nr_idle_states;
 	}
 
-	idle_state_latency = of_get_property(power_mgt,
-			"ibm,cpu-idle-state-latencies-ns", NULL);
-	if (!idle_state_latency) {
-		pr_warn("DT-PowerMgmt: missing ibm,cpu-idle-state-latencies-ns\n");
+	dt_idle_states = len_flags / sizeof(u32);
+
+	latency_ns = kzalloc(sizeof(*latency_ns) * dt_idle_states, GFP_KERNEL);
+	rc = of_property_read_u32_array(power_mgt,
+			"ibm,cpu-idle-state-latencies-ns", latency_ns, dt_idle_states);
+	if (rc) {
+		pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-latencies-ns in DT\n");
 		return nr_idle_states;
 	}
 
-	dt_idle_states = len_flags / sizeof(u32);
+	residency_ns = kzalloc(sizeof(*residency_ns) * dt_idle_states, GFP_KERNEL);
+	rc = of_property_read_u32_array(power_mgt,
+			"ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
 
 	for (i = 0; i < dt_idle_states; i++) {
 
 		flags = be32_to_cpu(idle_state_flags[i]);
-
-		/* Cpuidle accepts exit_latency in us and we estimate
-		 * target residency to be 10x exit_latency
+		/*
+		 * Cpuidle accepts exit_latency and target_residency in us.
+		 * Use default target_residency values if f/w does not expose it.
 		 */
-		latency_ns = be32_to_cpu(idle_state_latency[i]);
 		if (flags & OPAL_PM_NAP_ENABLED) {
 			/* Add NAP state */
 			strcpy(powernv_states[nr_idle_states].name, "Nap");
 			strcpy(powernv_states[nr_idle_states].desc, "Nap");
 			powernv_states[nr_idle_states].flags = 0;
-			powernv_states[nr_idle_states].exit_latency =
-					((unsigned int)latency_ns) / 1000;
-			powernv_states[nr_idle_states].target_residency =
-					((unsigned int)latency_ns / 100);
+			powernv_states[nr_idle_states].target_residency = 100;
 			powernv_states[nr_idle_states].enter = &nap_loop;
-			nr_idle_states++;
-		}
-
-		if (flags & OPAL_PM_SLEEP_ENABLED ||
+		} else if (flags & OPAL_PM_SLEEP_ENABLED ||
 			flags & OPAL_PM_SLEEP_ENABLED_ER1) {
 			/* Add FASTSLEEP state */
 			strcpy(powernv_states[nr_idle_states].name, "FastSleep");
 			strcpy(powernv_states[nr_idle_states].desc, "FastSleep");
 			powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
-			powernv_states[nr_idle_states].exit_latency =
-					((unsigned int)latency_ns) / 1000;
-			powernv_states[nr_idle_states].target_residency =
-					((unsigned int)latency_ns / 100);
+			powernv_states[nr_idle_states].target_residency = 300000;
 			powernv_states[nr_idle_states].enter = &fastsleep_loop;
-			nr_idle_states++;
 		}
+
+		powernv_states[nr_idle_states].exit_latency =
+				((unsigned int)latency_ns[i]) / 1000;
+
+		if (!rc) {
+			powernv_states[nr_idle_states].target_residency =
+				((unsigned int)residency_ns[i]) / 1000;
+		}
+
+		nr_idle_states++;
 	}
 
 	return nr_idle_states;


* Re: [PATCH V2] cpuidle/powernv: Read target_residency value of idle states from DT if available
  2015-01-28  2:13 ` Preeti U Murthy
@ 2015-01-28  9:15   ` Stewart Smith
  -1 siblings, 0 replies; 9+ messages in thread
From: Stewart Smith @ 2015-01-28  9:15 UTC (permalink / raw)
  To: Preeti U Murthy, mpe
  Cc: rafael.j.wysocki, linuxppc-dev, linux-kernel, linux-pm

Preeti U Murthy <preeti@linux.vnet.ibm.com> writes:
> The device tree now exposes the residency values for different idle states. Read
> these values instead of calculating residency from the latency values. The values
> exposed in the DT are validated for optimal power efficiency. However to maintain
> compatibility with the older firmware code which does not expose residency
> values, use default values as a fallback mechanism. While at it, handle some
> cleanups.

From a "I just merged the patch that exports these values from firmware"
point of view, using them and falling back looks good.

(I find the hardcoding of snooze in the driver a bit odd, as is the
hardcoding of max power states to 8 - which could bite us in the future
if a future processor has more states... but these aren't problems with
this patch)
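
The concern about a fixed maximum can be illustrated with a small userspace
sketch, assuming a table of 8 slots as mentioned above. MAX_STATES and
usable_dt_states() are hypothetical names, not identifiers from the driver;
the sketch only shows how many firmware-described states would fit once
some slots (for example, one for snooze) are already taken.

```c
#include <stddef.h>

#define MAX_STATES 8	/* assumed fixed table size, per the remark above */

/*
 * Return how many firmware-described states fit in the table after
 * 'reserved' slots are already in use.
 */
static size_t usable_dt_states(size_t reserved, size_t dt_states)
{
	size_t room = reserved < MAX_STATES ? MAX_STATES - reserved : 0;

	return dt_states < room ? dt_states : room;
}
```

With one slot reserved, a firmware reporting 10 states would have only 7 of
them registered under this assumption.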

Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>


* Re: [PATCH V2] cpuidle/powernv: Read target_residency value of idle states from DT if available
  2015-01-28  9:15   ` Stewart Smith
@ 2015-01-28  9:50     ` Preeti U Murthy
  -1 siblings, 0 replies; 9+ messages in thread
From: Preeti U Murthy @ 2015-01-28  9:50 UTC (permalink / raw)
  To: Stewart Smith, mpe; +Cc: rafael.j.wysocki, linuxppc-dev, linux-kernel, linux-pm

On 01/28/2015 02:45 PM, Stewart Smith wrote:
> Preeti U Murthy <preeti@linux.vnet.ibm.com> writes:
>> The device tree now exposes the residency values for different idle states. Read
>> these values instead of calculating residency from the latency values. The values
>> exposed in the DT are validated for optimal power efficiency. However to maintain
>> compatibility with the older firmware code which does not expose residency
>> values, use default values as a fallback mechanism. While at it, handle some
>> cleanups.
> 
> From a "I just merged the patch that exports these values from firmware"
> point of view, using them and falling back looks good.
> 
> (I find the hardcoding of snooze in the driver a bit odd, as is the

Snooze is the only software defined idle state, the rest are platform
specific. The first idle state is usually associated with some sort of a
polling operation and each architecture has a variant to this. This is
why we end up hard-coding this idle state in the driver as far as my
understanding goes.

> hardcoding of max power states to 8 - which could bite us in the future

Hmm.. not sure about this. Need to check.

> if a future processor has more states... but these aren't problems with
> this patch)
> 
> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>

Thanks!

Regards
Preeti U Murthy
> 


* Re: [PATCH V2] cpuidle/powernv: Read target_residency value of idle states from DT if available
  2015-01-28  9:50     ` Preeti U Murthy
@ 2015-01-28 22:24       ` Stewart Smith
  -1 siblings, 0 replies; 9+ messages in thread
From: Stewart Smith @ 2015-01-28 22:24 UTC (permalink / raw)
  To: Preeti U Murthy, mpe
  Cc: rafael.j.wysocki, linuxppc-dev, linux-kernel, linux-pm

Preeti U Murthy <preeti@linux.vnet.ibm.com> writes:
> On 01/28/2015 02:45 PM, Stewart Smith wrote:
>> Preeti U Murthy <preeti@linux.vnet.ibm.com> writes:
>>> The device tree now exposes the residency values for different idle states. Read
>>> these values instead of calculating residency from the latency values. The values
>>> exposed in the DT are validated for optimal power efficiency. However to maintain
>>> compatibility with the older firmware code which does not expose residency
>>> values, use default values as a fallback mechanism. While at it, handle some
>>> cleanups.
>> 
>> From a "I just merged the patch that exports these values from firmware"
>> point of view, using them and falling back looks good.
>> 
>> (I find the hardcoding of snooze in the driver a bit odd, as is the
>
> Snooze is the only software defined idle state, the rest are platform
> specific. The first idle state is usually associated with some sort of a
> polling operation and each architecture has a variant to this. This is
> why we end up hard-coding this idle state in the driver as far as my
> understanding goes.

At least in the PowerISA 2.07 I could only see that lowering priority
would give priority to other threads in the core, I couldn't find
anything saying that or 31,31,31 would end up saving any power... but I
could be looking in the wrong place too.

Basically, I was wanting to check that it's actually written down and
architected somewhere that this is the case and it isn't something too
P7/P8 specific.


* Re: [V2] cpuidle/powernv: Read target_residency value of idle states from DT if available
  2015-01-28  2:13 ` Preeti U Murthy
  (?)
  (?)
@ 2015-01-30  3:56 ` Michael Ellerman
  -1 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2015-01-30  3:56 UTC (permalink / raw)
  To: Preeti U Murthy, mpe
  Cc: rafael.j.wysocki, linuxppc-dev, linux-kernel, linux-pm

On Wed, 2015-28-01 at 02:13:06 UTC, Preeti U Murthy wrote:
> The device tree now exposes the residency values for different idle states. Read
> these values instead of calculating residency from the latency values. The values
> exposed in the DT are validated for optimal power efficiency. However to maintain
> compatibility with the older firmware code which does not expose residency
> values, use default values as a fallback mechanism. While at it, handle some
> cleanups.
> 
> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>

This looks good to me.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

I'm assuming Rafael will take it.

cheers
