All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] cpuidle/menu: stop seeking deeper idle if current state is too deep
       [not found] <1483630187-29622-1-git-send-email-alex.shi@linaro.org>
@ 2017-01-05 15:29   ` Alex Shi
  2017-01-05 15:29   ` Alex Shi
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:29 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list
  Cc: linux-pm, Ulf Hansson, Rasmus Villemoes, Arjan van de Ven, Rik van Riel

The obsolete commit 71abbbf85 want to introduce a dynamic cstates,
but it was removed for long time. Just left the nonsense deeper cstate
checking.

Since all target_residency and exit_latency are going longer in deeper
idle state, no needs to waste some cpu cycle on useless seeking.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
To: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/menu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index d9b5b93..07e36bb 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -357,9 +357,9 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 		if (s->disabled || su->disable)
 			continue;
 		if (s->target_residency > data->predicted_us)
-			continue;
+			break;
 		if (s->exit_latency > latency_req)
-			continue;
+			break;
 
 		data->last_state_idx = i;
 	}
-- 
2.8.1.101.g72d917a

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 1/3] cpuidle/menu: stop seeking deeper idle if current state is too deep
@ 2017-01-05 15:29   ` Alex Shi
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:29 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list
  Cc: linux-pm, Ulf Hansson, Rasmus Villemoes, Arjan van de Ven, Rik van Riel

The obsolete commit 71abbbf85 want to introduce a dynamic cstates,
but it was removed for long time. Just left the nonsense deeper cstate
checking.

Since all target_residency and exit_latency are going longer in deeper
idle state, no needs to waste some cpu cycle on useless seeking.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
To: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/menu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index d9b5b93..07e36bb 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -357,9 +357,9 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 		if (s->disabled || su->disable)
 			continue;
 		if (s->target_residency > data->predicted_us)
-			continue;
+			break;
 		if (s->exit_latency > latency_req)
-			continue;
+			break;
 
 		data->last_state_idx = i;
 	}
-- 
2.8.1.101.g72d917a


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu
       [not found] <1483630187-29622-1-git-send-email-alex.shi@linaro.org>
@ 2017-01-05 15:29   ` Alex Shi
  2017-01-05 15:29   ` Alex Shi
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:29 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot,
	Greg Kroah-Hartman, open list
  Cc: linux-pm, Ulf Hansson

The cpu-dma PM QoS constraint impacts all the cpus in the system. There
is no way to let the user to choose a PM QoS constraint per cpu.

The following patch exposes to the userspace a per cpu based sysfs file
in order to let the userspace to change the value of the PM QoS latency
constraint.

This change is inoperative in its form and the cpuidle governors have to
take into account the per cpu latency constraint in addition to the
global cpu-dma latency constraint in order to operate properly.

BTW
The pm_qos_resume_latency usage defined in
Documentation/ABI/testing/sysfs-devices-power
The /sys/devices/.../power/pm_qos_resume_latency_us attribute
contains the PM QoS resume latency limit for the given device,
which is the maximum allowed time it can take to resume the
device, after it has been suspended at run time, from a resume
request to the moment the device will be ready to process I/O,
in microseconds.  If it is equal to 0, however, this means that
the PM QoS resume latency may be arbitrary.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
To: linux-kernel@vger.kernel.org
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/base/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 4c28e1a..29cf3459 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -17,6 +17,7 @@
 #include <linux/of.h>
 #include <linux/cpufeature.h>
 #include <linux/tick.h>
+#include <linux/pm_qos.h>
 
 #include "base.h"
 
@@ -376,6 +377,9 @@ int register_cpu(struct cpu *cpu, int num)
 
 	per_cpu(cpu_sys_devices, num) = &cpu->dev;
 	register_cpu_under_node(num, cpu_to_node(num));
+#ifdef CONFIG_CPU_IDLE_GOV_MENU
+	dev_pm_qos_expose_latency_limit(&cpu->dev, 0);
+#endif
 
 	return 0;
 }
-- 
2.8.1.101.g72d917a

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu
@ 2017-01-05 15:29   ` Alex Shi
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:29 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot,
	Greg Kroah-Hartman, open list
  Cc: linux-pm, Ulf Hansson

The cpu-dma PM QoS constraint impacts all the cpus in the system. There
is no way to let the user to choose a PM QoS constraint per cpu.

The following patch exposes to the userspace a per cpu based sysfs file
in order to let the userspace to change the value of the PM QoS latency
constraint.

This change is inoperative in its form and the cpuidle governors have to
take into account the per cpu latency constraint in addition to the
global cpu-dma latency constraint in order to operate properly.

BTW
The pm_qos_resume_latency usage defined in
Documentation/ABI/testing/sysfs-devices-power
The /sys/devices/.../power/pm_qos_resume_latency_us attribute
contains the PM QoS resume latency limit for the given device,
which is the maximum allowed time it can take to resume the
device, after it has been suspended at run time, from a resume
request to the moment the device will be ready to process I/O,
in microseconds.  If it is equal to 0, however, this means that
the PM QoS resume latency may be arbitrary.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
To: linux-kernel@vger.kernel.org
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/base/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 4c28e1a..29cf3459 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -17,6 +17,7 @@
 #include <linux/of.h>
 #include <linux/cpufeature.h>
 #include <linux/tick.h>
+#include <linux/pm_qos.h>
 
 #include "base.h"
 
@@ -376,6 +377,9 @@ int register_cpu(struct cpu *cpu, int num)
 
 	per_cpu(cpu_sys_devices, num) = &cpu->dev;
 	register_cpu_under_node(num, cpu_to_node(num));
+#ifdef CONFIG_CPU_IDLE_GOV_MENU
+	dev_pm_qos_expose_latency_limit(&cpu->dev, 0);
+#endif
 
 	return 0;
 }
-- 
2.8.1.101.g72d917a


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/3] cpuidle/menu: add per cpu pm_qos_resume_latency consideration
       [not found] <1483630187-29622-1-git-send-email-alex.shi@linaro.org>
@ 2017-01-05 15:29   ` Alex Shi
  2017-01-05 15:29   ` Alex Shi
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:29 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list
  Cc: linux-pm, Ulf Hansson, Rasmus Villemoes, Arjan van de Ven, Rik van Riel

Kernel or user may have special requirement on cpu response time, like
if a interrupt is pinned to a cpu, we don't want the cpu goes too deep
sleep. This patch can prevent this thing happen by consider per cpu
resume_latency setting in cpu sleep state selection in menu governor.

The pm_qos_resume_latency ask device to give reponse in this time. That's
similar with cpu cstates' entry_latency + exit_latency. But since
most of cpu cstate either has no entry_latency or add it into exit_latency
So, we just can restrict this time requirement as states exit_latency.

The 0 value of pm_qos_resume_latency is for no limitation according ABI
definition. So set the value 1 could get the 0 latency if it's needed.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
To: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/menu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 07e36bb..8d6d25c 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -19,6 +19,7 @@
 #include <linux/tick.h>
 #include <linux/sched.h>
 #include <linux/math64.h>
+#include <linux/cpu.h>
 
 /*
  * Please note when changing the tuning values:
@@ -280,17 +281,23 @@ static unsigned int get_typical_interval(struct menu_device *data)
 static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
 	struct menu_device *data = this_cpu_ptr(&menu_devices);
+	struct device *device = get_cpu_device(dev->cpu);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
 	unsigned int interactivity_req;
 	unsigned int expected_interval;
 	unsigned long nr_iowaiters, cpu_load;
+	int resume_latency = dev_pm_qos_read_value(device);
 
 	if (data->needs_update) {
 		menu_update(drv, dev);
 		data->needs_update = 0;
 	}
 
+	/* resume_latency is 0 means no restriction */
+	if (resume_latency && resume_latency < latency_req)
+		latency_req = resume_latency;
+
 	/* Special case when user has set very strict latency requirement */
 	if (unlikely(latency_req == 0))
 		return 0;
-- 
2.8.1.101.g72d917a

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/3] cpuidle/menu: add per cpu pm_qos_resume_latency consideration
@ 2017-01-05 15:29   ` Alex Shi
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:29 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list
  Cc: linux-pm, Ulf Hansson, Rasmus Villemoes, Arjan van de Ven, Rik van Riel

Kernel or user may have special requirement on cpu response time, like
if a interrupt is pinned to a cpu, we don't want the cpu goes too deep
sleep. This patch can prevent this thing happen by consider per cpu
resume_latency setting in cpu sleep state selection in menu governor.

The pm_qos_resume_latency ask device to give reponse in this time. That's
similar with cpu cstates' entry_latency + exit_latency. But since
most of cpu cstate either has no entry_latency or add it into exit_latency
So, we just can restrict this time requirement as states exit_latency.

The 0 value of pm_qos_resume_latency is for no limitation according ABI
definition. So set the value 1 could get the 0 latency if it's needed.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
To: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/menu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 07e36bb..8d6d25c 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -19,6 +19,7 @@
 #include <linux/tick.h>
 #include <linux/sched.h>
 #include <linux/math64.h>
+#include <linux/cpu.h>
 
 /*
  * Please note when changing the tuning values:
@@ -280,17 +281,23 @@ static unsigned int get_typical_interval(struct menu_device *data)
 static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
 	struct menu_device *data = this_cpu_ptr(&menu_devices);
+	struct device *device = get_cpu_device(dev->cpu);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
 	unsigned int interactivity_req;
 	unsigned int expected_interval;
 	unsigned long nr_iowaiters, cpu_load;
+	int resume_latency = dev_pm_qos_read_value(device);
 
 	if (data->needs_update) {
 		menu_update(drv, dev);
 		data->needs_update = 0;
 	}
 
+	/* resume_latency is 0 means no restriction */
+	if (resume_latency && resume_latency < latency_req)
+		latency_req = resume_latency;
+
 	/* Special case when user has set very strict latency requirement */
 	if (unlikely(latency_req == 0))
 		return 0;
-- 
2.8.1.101.g72d917a


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] cpuidle/menu: stop seeking deeper idle if current state is too deep
  2017-01-05 15:29   ` Alex Shi
  (?)
@ 2017-01-05 15:43   ` Rik van Riel
  2017-01-05 15:55     ` Arjan van de Ven
  -1 siblings, 1 reply; 15+ messages in thread
From: Rik van Riel @ 2017-01-05 15:43 UTC (permalink / raw)
  To: Alex Shi, Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list
  Cc: linux-pm, Ulf Hansson, Rasmus Villemoes, Arjan van de Ven

On Thu, 2017-01-05 at 23:29 +0800, Alex Shi wrote:
> The obsolete commit 71abbbf85 want to introduce a dynamic cstates,
> but it was removed for long time. Just left the nonsense deeper
> cstate
> checking.
> 
> Since all target_residency and exit_latency are going longer in
> deeper
> idle state, no needs to waste some cpu cycle on useless seeking.

Makes me wonder if it would be worth documenting the
requirement that c-states be listed in increasing
order?

Acked-by: Rik van Riel <riel@redhat.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] per cpu resume latency
       [not found] <1483630187-29622-1-git-send-email-alex.shi@linaro.org>
                   ` (2 preceding siblings ...)
  2017-01-05 15:29   ` Alex Shi
@ 2017-01-05 15:48 ` Alex Shi
  2017-01-10  8:02   ` Alex Shi
  3 siblings, 1 reply; 15+ messages in thread
From: Alex Shi @ 2017-01-05 15:48 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot,
	linux-kernel, linux-pm

Sorry for missing the mailing list.

Add linux-kernel and linux-pm.


On 01/05/2017 11:29 PM, Alex Shi wrote:
> cpu_dma_latency is designed to keep all cpu awake from deep c-state.
> That is good keep system with short response latency. But sometime we
> don't need all cpu power especially in a more and more multi-core day.
> So set all cpu restless that lead to a big power waste.
> 
> A better way is to keep the short cpu response latency on needed cpu, 
> while let other unnecesscary cpus go to deep idle. That is this 
> patchset. We just use the pm_qos_resume_latency on cpu. Giving the 
> short cpu latency on appointed cpu via setting value on
> /sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us
> We can set we wanted latency value according to the value of
> /sys/devices/system/cpu/cpuX/cpuidle/stateX/latency. to just a bit
> less related state's latency value. Then cpu can get to this state or
> higher.
> 
> Here is some testing data on my dragonboard 410c, the latency of state1
> is 280us. It has 4 cores.
> 
> Benchmark: cyclictest -t 1  -n -i 10000 -l 1000 -q --latency=10000
> 
> without the patch:
> Latency (us) Min:     87 Act:  209 Avg:  205 Max:     239
> With the patch and cpu0/power/pm_qos_resume_latency_us is lower than
> 280us, like set to 279
> benchmark result on cpu0:
> Latency (us) Min:     82 Act:   91 Avg:   95 Max:     110
> In repeat testing, the Avg latency always drop to half of vanilla kernel
> value, as well as Max latency value, although sometime the Max latency
> is similar with vanilla kernel. 
> 
> Also we could use the cpu_dma_latency to get the similar short latency.
> But 'idlestate' show all cpu are restless. Here is the idle status 
> compression between cpu_dma_latency and this feature:
> 
> To record idlestate
> #./idlestat --trace -t 10 -f /tmp/mytracepmlat -p -c -w -- cyclictest -t 1  -n -i 10000 -l 1000 -q --latency=10000
> 
> To compare the idle state, the 'total' colum show cpu1~3 nearly stay
> in WFI state with cpu_dma_latency. but w/ my patch, they can get about
> 10 second sleep in 'spc' state.
> # ./idlestat --import -f /tmp/mytracepmlat -b /tmp/mytrace -r comparison
> Log is 10.055305 secs long with 7514 events
> Log is 10.055370 secs long with 7545 events
> --------------------------------------------------------------------------------
> | C-state  |   min    |   max    |   avg    |   total  | hits  |  over | under |
> --------------------------------------------------------------------------------
> | clusterA                                                                     |
> --------------------------------------------------------------------------------
> |      WFI |      2us |  12.88ms |   4.18ms |    9.76s |  2334 |     0 |     0 |
> |          |     -2us |  -14.4ms |    -17us |  -72.5ms |    -8 |     0 |     0 |
> --------------------------------------------------------------------------------
> |             cpu0                                                             |
> --------------------------------------------------------------------------------
> |      WFI |      3us | 100.98ms |  26.81ms |   10.03s |   374 |     0 |     0 |
> |          |     -1us |     -1us |   -350us |   +5.0ms |    +5 |     0 |     0 |
> --------------------------------------------------------------------------------
> |             cpu1                                                             |
> --------------------------------------------------------------------------------
> |      WFI |    280us |   3.96ms |   1.96ms |  19.64ms |    10 |     0 |     5 |
> |          |   +221us | -891.7ms |   -9.1ms |    -9.9s |  -889 |     0 |     0 |
> |      spc |    234us |  19.71ms |   9.79ms |    9.91s |  1012 |     4 |     0 |
> |          |   +167us |  +17.9ms |   +8.6ms |    +9.9s | +1009 |    +1 |     0 |
> --------------------------------------------------------------------------------
> |             cpu2                                                             |
> --------------------------------------------------------------------------------
> |      WFI |     86us |   1.01ms |    637us |   1.91ms |     3 |     0 |     0 |
> |          |    -16us |  -26.5ms |   -8.8ms |   -10.0s | -1057 |     0 |     0 |
> |      spc |    930us |  47.67ms |  10.05ms |    9.92s |   987 |     2 |     0 |
> |          |   -1.4ms |  +43.7ms |   +6.9ms |    +9.9s |  +985 |    +2 |     0 |
> --------------------------------------------------------------------------------
> |             cpu3                                                             |
> --------------------------------------------------------------------------------
> |      WFI |      0us |      0us |      0us |      0us |     0 |     0 |     0 |
> |          |          |    -4.0s | -152.1ms |   -10.0s |   -66 |     0 |     0 |
> |      spc |    420us |    3.50s | 913.74ms |   10.05s |    11 |     3 |     0 |
> |          |   -891us |    +3.5s | +911.0ms |   +10.0s |    +8 |    +1 |     0 |
> --------------------------------------------------------------------------------
> 
> 
> Thanks
> Alex
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] cpuidle/menu: stop seeking deeper idle if current state is too deep
  2017-01-05 15:43   ` Rik van Riel
@ 2017-01-05 15:55     ` Arjan van de Ven
  0 siblings, 0 replies; 15+ messages in thread
From: Arjan van de Ven @ 2017-01-05 15:55 UTC (permalink / raw)
  To: Rik van Riel, Alex Shi, Daniel Lezcano, Rafael J . Wysocki,
	vincent.guittot, open list
  Cc: linux-pm, Ulf Hansson, Rasmus Villemoes

On 1/5/2017 7:43 AM, Rik van Riel wrote:
> On Thu, 2017-01-05 at 23:29 +0800, Alex Shi wrote:
>> The obsolete commit 71abbbf85 want to introduce a dynamic cstates,
>> but it was removed for long time. Just left the nonsense deeper
>> cstate
>> checking.
>>
>> Since all target_residency and exit_latency are going longer in
>> deeper
>> idle state, no needs to waste some cpu cycle on useless seeking.
>
> Makes me wonder if it would be worth documenting the
> requirement that c-states be listed in increasing
> order?

or better, a boot time quick check...

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] per cpu resume latency
  2017-01-05 15:48 ` [PATCH 0/3] per cpu resume latency Alex Shi
@ 2017-01-10  8:02   ` Alex Shi
  2017-01-19 21:46     ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Shi @ 2017-01-10  8:02 UTC (permalink / raw)
  To: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot,
	linux-kernel, linux-pm

Hi All,

Is there any comments for the testing or the patch?

Thanks!
Alex

On 01/05/2017 11:48 PM, Alex Shi wrote:
> On 01/05/2017 11:29 PM, Alex Shi wrote:
>> > cpu_dma_latency is designed to keep all cpu awake from deep c-state.
>> > That is good keep system with short response latency. But sometime we
>> > don't need all cpu power especially in a more and more multi-core day.
>> > So set all cpu restless that lead to a big power waste.
>> > 
>> > A better way is to keep the short cpu response latency on needed cpu, 
>> > while let other unnecesscary cpus go to deep idle. That is this 
>> > patchset. We just use the pm_qos_resume_latency on cpu. Giving the 
>> > short cpu latency on appointed cpu via setting value on
>> > /sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us
>> > We can set we wanted latency value according to the value of
>> > /sys/devices/system/cpu/cpuX/cpuidle/stateX/latency. to just a bit
>> > less related state's latency value. Then cpu can get to this state or
>> > higher.
>> > 
>> > Here is some testing data on my dragonboard 410c, the latency of state1
>> > is 280us. It has 4 cores.
>> > 
>> > Benchmark: cyclictest -t 1  -n -i 10000 -l 1000 -q --latency=10000
>> > 
>> > without the patch:
>> > Latency (us) Min:     87 Act:  209 Avg:  205 Max:     239
>> > With the patch and cpu0/power/pm_qos_resume_latency_us is lower than
>> > 280us, like set to 279
>> > benchmark result on cpu0:
>> > Latency (us) Min:     82 Act:   91 Avg:   95 Max:     110
>> > In repeat testing, the Avg latency always drop to half of vanilla kernel
>> > value, as well as Max latency value, although sometime the Max latency
>> > is similar with vanilla kernel. 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu
  2017-01-05 15:29   ` Alex Shi
  (?)
@ 2017-01-11  8:09   ` Greg Kroah-Hartman
  2017-01-11 14:33     ` Alex Shi
  -1 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2017-01-11  8:09 UTC (permalink / raw)
  To: Alex Shi
  Cc: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list,
	linux-pm, Ulf Hansson

On Thu, Jan 05, 2017 at 11:29:46PM +0800, Alex Shi wrote:
> The cpu-dma PM QoS constraint impacts all the cpus in the system. There
> is no way to let the user to choose a PM QoS constraint per cpu.
> 
> The following patch exposes to the userspace a per cpu based sysfs file
> in order to let the userspace to change the value of the PM QoS latency
> constraint.
> 
> This change is inoperative in its form and the cpuidle governors have to
> take into account the per cpu latency constraint in addition to the
> global cpu-dma latency constraint in order to operate properly.
> 
> BTW
> The pm_qos_resume_latency usage defined in
> Documentation/ABI/testing/sysfs-devices-power
> The /sys/devices/.../power/pm_qos_resume_latency_us attribute
> contains the PM QoS resume latency limit for the given device,
> which is the maximum allowed time it can take to resume the
> device, after it has been suspended at run time, from a resume
> request to the moment the device will be ready to process I/O,
> in microseconds.  If it is equal to 0, however, this means that
> the PM QoS resume latency may be arbitrary.
> 
> Signed-off-by: Alex Shi <alex.shi@linaro.org>
> To: linux-kernel@vger.kernel.org
> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-pm@vger.kernel.org
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 4c28e1a..29cf3459 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -17,6 +17,7 @@
>  #include <linux/of.h>
>  #include <linux/cpufeature.h>
>  #include <linux/tick.h>
> +#include <linux/pm_qos.h>
>  
>  #include "base.h"
>  
> @@ -376,6 +377,9 @@ int register_cpu(struct cpu *cpu, int num)
>  
>  	per_cpu(cpu_sys_devices, num) = &cpu->dev;
>  	register_cpu_under_node(num, cpu_to_node(num));
> +#ifdef CONFIG_CPU_IDLE_GOV_MENU
> +	dev_pm_qos_expose_latency_limit(&cpu->dev, 0);
> +#endif

No way to do this without the #ifdef?  That's really not recommended for
.c code :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu
  2017-01-11  8:09   ` Greg Kroah-Hartman
@ 2017-01-11 14:33     ` Alex Shi
  2017-01-11 18:40       ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Shi @ 2017-01-11 14:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Daniel Lezcano, Rafael J . Wysocki, vincent.guittot, open list,
	linux-pm, Ulf Hansson


>> >  #include "base.h"
>> >  
>> > @@ -376,6 +377,9 @@ int register_cpu(struct cpu *cpu, int num)
>> >  
>> >  	per_cpu(cpu_sys_devices, num) = &cpu->dev;
>> >  	register_cpu_under_node(num, cpu_to_node(num));
>> > +#ifdef CONFIG_CPU_IDLE_GOV_MENU
>> > +	dev_pm_qos_expose_latency_limit(&cpu->dev, 0);
>> > +#endif
> No way to do this without the #ifdef?  That's really not recommended for
> .c code :(
> 

Hi Greg,

Thanks for comments!

The function dev_pm_qos_expose_latency_limit() is null if no CONFIG_PM.
So when CONFIG_PM enabled, may we could consider the cpu idle is also
wanted. In this assumption the #ifdef could be removed. If user want to
use this feature, she/he should understand the feature only work on menu
gov only currently. So consider this, I guess we could remove this
#ifdef. :)

Any different concerns on this?

Regards
Alex

BTW,
Although I did try this patch on other platform, but it clearly other
multi core system, like x86 could also get the same benefit.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu
  2017-01-11 14:33     ` Alex Shi
@ 2017-01-11 18:40       ` Rafael J. Wysocki
  2017-01-12 13:04         ` Alex Shi
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-01-11 18:40 UTC (permalink / raw)
  To: Alex Shi
  Cc: Greg Kroah-Hartman, Daniel Lezcano, Rafael J . Wysocki,
	Vincent Guittot, open list, Linux PM, Ulf Hansson

On Wed, Jan 11, 2017 at 3:33 PM, Alex Shi <alex.shi@linaro.org> wrote:
>
>>> >  #include "base.h"
>>> >
>>> > @@ -376,6 +377,9 @@ int register_cpu(struct cpu *cpu, int num)
>>> >
>>> >    per_cpu(cpu_sys_devices, num) = &cpu->dev;
>>> >    register_cpu_under_node(num, cpu_to_node(num));
>>> > +#ifdef CONFIG_CPU_IDLE_GOV_MENU
>>> > +  dev_pm_qos_expose_latency_limit(&cpu->dev, 0);
>>> > +#endif
>> No way to do this without the #ifdef?  That's really not recommended for
>> .c code :(
>>
>
> Hi Greg,
>
> Thanks for comments!
>
> The function dev_pm_qos_expose_latency_limit() is null if no CONFIG_PM.
> So when CONFIG_PM enabled, may we could consider the cpu idle is also
> wanted. In this assumption the #ifdef could be removed. If user want to
> use this feature, she/he should understand the feature only work on menu
> gov only currently. So consider this, I guess we could remove this
> #ifdef. :)

But instead of putting the #ifdef into the function body, you can use
a wrapper function defined to be empty for CONFIG_PM unset.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu
  2017-01-11 18:40       ` Rafael J. Wysocki
@ 2017-01-12 13:04         ` Alex Shi
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Shi @ 2017-01-12 13:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Daniel Lezcano, Rafael J . Wysocki,
	Vincent Guittot, open list, Linux PM, Ulf Hansson



On 01/12/2017 02:40 AM, Rafael J. Wysocki wrote:
>> >
>> > Hi Greg,
>> >
>> > Thanks for comments!
>> >
>> > The function dev_pm_qos_expose_latency_limit() is null if no CONFIG_PM.
>> > So when CONFIG_PM enabled, may we could consider the cpu idle is also
>> > wanted. In this assumption the #ifdef could be removed. If user want to
>> > use this feature, she/he should understand the feature only work on menu
>> > gov only currently. So consider this, I guess we could remove this
>> > #ifdef. :)
> But instead of putting the #ifdef into the function body, you can use
> a wrapper function defined to be empty for CONFIG_PM unset.

Thanks Rafael!

The function dev_pm_qos_expose_latency_limit() is empty now when
CONFIG_PM disabled. :)
I will resend the patch without the #ifdef.

Thanks!
Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] per cpu resume latency
  2017-01-10  8:02   ` Alex Shi
@ 2017-01-19 21:46     ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-01-19 21:46 UTC (permalink / raw)
  To: Alex Shi
  Cc: Daniel Lezcano, Rafael J . Wysocki, Vincent Guittot,
	linux-kernel, linux-pm

Hi Alex,

On Tue, Jan 10, 2017 at 9:02 AM, Alex Shi <alex.shi@linaro.org> wrote:
> Hi All,
>
> Is there any comments for the testing or the patch?

Please see: http://marc.info/?l=linux-pm&m=148410629024194&w=2

I will get to the patches when I'm back.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-01-19 21:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1483630187-29622-1-git-send-email-alex.shi@linaro.org>
2017-01-05 15:29 ` [PATCH 1/3] cpuidle/menu: stop seeking deeper idle if current state is too deep Alex Shi
2017-01-05 15:29   ` Alex Shi
2017-01-05 15:43   ` Rik van Riel
2017-01-05 15:55     ` Arjan van de Ven
2017-01-05 15:29 ` [PATCH 2/3] cpu: expose pm_qos_resume_latency for each cpu Alex Shi
2017-01-05 15:29   ` Alex Shi
2017-01-11  8:09   ` Greg Kroah-Hartman
2017-01-11 14:33     ` Alex Shi
2017-01-11 18:40       ` Rafael J. Wysocki
2017-01-12 13:04         ` Alex Shi
2017-01-05 15:29 ` [PATCH 3/3] cpuidle/menu: add per cpu pm_qos_resume_latency consideration Alex Shi
2017-01-05 15:29   ` Alex Shi
2017-01-05 15:48 ` [PATCH 0/3] per cpu resume latency Alex Shi
2017-01-10  8:02   ` Alex Shi
2017-01-19 21:46     ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.