* [PATCH v2  0/8] Fix performance issue with ondemand governor
@ 2010-05-09 15:21 Arjan van de Ven
  2010-05-09 15:22 ` [PATCH v2 1/8] sched: Add a comment to get_cpu_idle_time_us() Arjan van de Ven
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: mingo, davej

[Version 2 includes the acks/etc. Andrew: no changes from the
patches that are in -mm]



There have been various reports of the ondemand governor causing
serious performance issues; one of the latest came from Andrew.
There are several fundamental issues with ondemand (being worked on),
but the report from Andrew can be fixed relatively easily.

The fundamental issue is that ondemand will go to a (too) low CPU
frequency for workloads that alternate between being disk and CPU
bound...


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  1/8] sched: Add a comment to get_cpu_idle_time_us()
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
@ 2010-05-09 15:22 ` Arjan van de Ven
  2010-05-10  5:52   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:22 ` [PATCH v2 2/8] sched: Introduce a function to update the idle statistics Arjan van de Ven
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:22 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

The exported function get_cpu_idle_time_us() has no comment
describing it; add a kerneldoc comment.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>

---

 kernel/time/tick-sched.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff -puN kernel/time/tick-sched.c~sched-add-a-comment-to-get_cpu_idle_time_us kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~sched-add-a-comment-to-get_cpu_idle_time_us
+++ a/kernel/time/tick-sched.c
@@ -179,6 +179,20 @@ static ktime_t tick_nohz_start_idle(stru
 	return now;
 }
 
+/**
+ * get_cpu_idle_time_us - get the total idle time of a cpu
+ * @cpu: CPU number to query
+ * @last_update_time: variable to store update time in
+ *
+ * Return the cumulative idle time (since boot) for a given
+ * CPU, in microseconds. The idle time returned includes
+ * the iowait time (unlike what "top" and co report).
+ *
+ * This time is measured via accounting rather than sampling,
+ * and is as accurate as ktime_get() is.
+ *
+ * This function returns -1 if NOHZ is not enabled.
+ */
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  2/8] sched: Introduce a function to update the idle statistics
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
  2010-05-09 15:22 ` [PATCH v2 1/8] sched: Add a comment to get_cpu_idle_time_us() Arjan van de Ven
@ 2010-05-09 15:22 ` Arjan van de Ven
  2010-05-10  5:52   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:23 ` [PATCH v2 3/8] sched: Update the idle statistics in get_cpu_idle_time_us() Arjan van de Ven
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:22 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

Currently, two places update the idle statistics (and more to
come later in this series).

This patch creates a helper function for updating these statistics.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>

---

 kernel/time/tick-sched.c |   29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff -puN kernel/time/tick-sched.c~sched-introduce-a-function-to-update-the-idle-statistics kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~sched-introduce-a-function-to-update-the-idle-statistics
+++ a/kernel/time/tick-sched.c
@@ -150,14 +150,25 @@ static void tick_nohz_update_jiffies(kti
 	touch_softlockup_watchdog();
 }
 
-static void tick_nohz_stop_idle(int cpu, ktime_t now)
+/*
+ * Updates the per cpu time idle statistics counters
+ */
+static void update_ts_time_stats(struct tick_sched *ts, ktime_t now)
 {
-	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
 	ktime_t delta;
 
-	delta = ktime_sub(now, ts->idle_entrytime);
 	ts->idle_lastupdate = now;
-	ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+	if (ts->idle_active) {
+		delta = ktime_sub(now, ts->idle_entrytime);
+		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+	}
+}
+
+static void tick_nohz_stop_idle(int cpu, ktime_t now)
+{
+	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+
+	update_ts_time_stats(ts, now);
 	ts->idle_active = 0;
 
 	sched_clock_idle_wakeup_event(0);
@@ -165,14 +176,12 @@ static void tick_nohz_stop_idle(int cpu,
 
 static ktime_t tick_nohz_start_idle(struct tick_sched *ts)
 {
-	ktime_t now, delta;
+	ktime_t now;
 
 	now = ktime_get();
-	if (ts->idle_active) {
-		delta = ktime_sub(now, ts->idle_entrytime);
-		ts->idle_lastupdate = now;
-		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
-	}
+
+	update_ts_time_stats(ts, now);
+
 	ts->idle_entrytime = now;
 	ts->idle_active = 1;
 	sched_clock_idle_sleep_event();
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  3/8] sched: Update the idle statistics in get_cpu_idle_time_us()
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
  2010-05-09 15:22 ` [PATCH v2 1/8] sched: Add a comment to get_cpu_idle_time_us() Arjan van de Ven
  2010-05-09 15:22 ` [PATCH v2 2/8] sched: Introduce a function to update the idle statistics Arjan van de Ven
@ 2010-05-09 15:23 ` Arjan van de Ven
  2010-05-10  5:53   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:24 ` [PATCH v2 4/8] sched: Fold updating of the last_update_time_info into update_ts_time_stats() Arjan van de Ven
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:23 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

Right now, get_cpu_idle_time_us() only reports the idle statistics
up to the point the CPU last entered idle, not what is valid right now.

This patch adds an update of the idle statistics to get_cpu_idle_time_us(),
so that calling this function always returns statistics that are accurate
at the point of the call.

This includes resetting the start of the idle time for accounting purposes
to avoid double accounting.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
---

 kernel/time/tick-sched.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff -puN kernel/time/tick-sched.c~sched-update-the-idle-statistics-in-get_cpu_idle_time_us kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~sched-update-the-idle-statistics-in-get_cpu_idle_time_us
+++ a/kernel/time/tick-sched.c
@@ -161,6 +161,7 @@ static void update_ts_time_stats(struct 
 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+		ts->idle_entrytime = now;
 	}
 }
 
@@ -205,14 +206,18 @@ static ktime_t tick_nohz_start_idle(stru
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+	ktime_t now;
 
 	if (!tick_nohz_enabled)
 		return -1;
 
+	now = ktime_get();
+	update_ts_time_stats(ts, now);
+
 	if (ts->idle_active)
 		*last_update_time = ktime_to_us(ts->idle_lastupdate);
 	else
-		*last_update_time = ktime_to_us(ktime_get());
+		*last_update_time = ktime_to_us(now);
 
 	return ktime_to_us(ts->idle_sleeptime);
 }
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  4/8] sched: Fold updating of the last_update_time_info into update_ts_time_stats()
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
                   ` (2 preceding siblings ...)
  2010-05-09 15:23 ` [PATCH v2 3/8] sched: Update the idle statistics in get_cpu_idle_time_us() Arjan van de Ven
@ 2010-05-09 15:24 ` Arjan van de Ven
  2010-05-10  5:53   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:24 ` [PATCH v2 5/8] sched: Eliminate the ts->idle_lastupdate field Arjan van de Ven
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:24 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

This patch folds the updating of the last_update_time into the
update_ts_time_stats() function, and updates the callers.

This allows for further cleanups that are done in the next patch.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
---

 kernel/time/tick-sched.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff -puN kernel/time/tick-sched.c~sched-fold-updating-of-the-last-update-time-into-update_ts_time_stats kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~sched-fold-updating-of-the-last-update-time-into-update_ts_time_stats
+++ a/kernel/time/tick-sched.c
@@ -153,7 +153,8 @@ static void tick_nohz_update_jiffies(kti
 /*
  * Updates the per cpu time idle statistics counters
  */
-static void update_ts_time_stats(struct tick_sched *ts, ktime_t now)
+static void
+update_ts_time_stats(struct tick_sched *ts, ktime_t now, u64 *last_update_time)
 {
 	ktime_t delta;
 
@@ -163,13 +164,19 @@ static void update_ts_time_stats(struct 
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}
+
+	if (ts->idle_active && last_update_time)
+		*last_update_time = ktime_to_us(ts->idle_lastupdate);
+	else
+		*last_update_time = ktime_to_us(now);
+
 }
 
 static void tick_nohz_stop_idle(int cpu, ktime_t now)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
 
-	update_ts_time_stats(ts, now);
+	update_ts_time_stats(ts, now, NULL);
 	ts->idle_active = 0;
 
 	sched_clock_idle_wakeup_event(0);
@@ -181,7 +188,7 @@ static ktime_t tick_nohz_start_idle(stru
 
 	now = ktime_get();
 
-	update_ts_time_stats(ts, now);
+	update_ts_time_stats(ts, now, NULL);
 
 	ts->idle_entrytime = now;
 	ts->idle_active = 1;
@@ -206,18 +213,11 @@ static ktime_t tick_nohz_start_idle(stru
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
-	ktime_t now;
 
 	if (!tick_nohz_enabled)
 		return -1;
 
-	now = ktime_get();
-	update_ts_time_stats(ts, now);
-
-	if (ts->idle_active)
-		*last_update_time = ktime_to_us(ts->idle_lastupdate);
-	else
-		*last_update_time = ktime_to_us(now);
+	update_ts_time_stats(ts, ktime_get(), last_update_time);
 
 	return ktime_to_us(ts->idle_sleeptime);
 }
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  5/8] sched: Eliminate the ts->idle_lastupdate field
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
                   ` (3 preceding siblings ...)
  2010-05-09 15:24 ` [PATCH v2 4/8] sched: Fold updating of the last_update_time_info into update_ts_time_stats() Arjan van de Ven
@ 2010-05-09 15:24 ` Arjan van de Ven
  2010-05-10  5:53   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:25 ` [PATCH v2 6/8] sched: Introduce get_cpu_iowait_time_us() Arjan van de Ven
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:24 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

Now that the only user of ts->idle_lastupdate is update_ts_time_stats(),
the entire field can be eliminated.

In update_ts_time_stats(), idle_lastupdate is first set to "now",
and a few lines later, the only user is an if() statement that
assigns a variable either to "now" or to ts->idle_lastupdate,
which has the value of "now" at that point.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
---

 include/linux/tick.h     |    1 -
 kernel/time/tick-sched.c |    5 +----
 2 files changed, 1 insertion(+), 5 deletions(-)

diff -puN include/linux/tick.h~sched-eliminate-the-ts-idle_lastupdate-field include/linux/tick.h
--- a/include/linux/tick.h~sched-eliminate-the-ts-idle_lastupdate-field
+++ a/include/linux/tick.h
@@ -60,7 +60,6 @@ struct tick_sched {
 	ktime_t				idle_waketime;
 	ktime_t				idle_exittime;
 	ktime_t				idle_sleeptime;
-	ktime_t				idle_lastupdate;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
 	unsigned long			next_jiffies;
diff -puN kernel/time/tick-sched.c~sched-eliminate-the-ts-idle_lastupdate-field kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~sched-eliminate-the-ts-idle_lastupdate-field
+++ a/kernel/time/tick-sched.c
@@ -158,16 +158,13 @@ update_ts_time_stats(struct tick_sched *
 {
 	ktime_t delta;
 
-	ts->idle_lastupdate = now;
 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}
 
-	if (ts->idle_active && last_update_time)
-		*last_update_time = ktime_to_us(ts->idle_lastupdate);
-	else
+	if (last_update_time)
 		*last_update_time = ktime_to_us(now);
 
 }
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  6/8] sched: Introduce get_cpu_iowait_time_us()
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
                   ` (4 preceding siblings ...)
  2010-05-09 15:24 ` [PATCH v2 5/8] sched: Eliminate the ts->idle_lastupdate field Arjan van de Ven
@ 2010-05-09 15:25 ` Arjan van de Ven
  2010-05-10  5:54   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:26 ` [PATCH v2 7/8] ondemand: Solve a big performance issue by counting IOWAIT time as busy Arjan van de Ven
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:25 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

For the ondemand cpufreq governor, it is desirable that iowait time
be micro-accounted in the same way as idle time is.

This patch introduces the infrastructure to account and expose
this information via the get_cpu_iowait_time_us() function.

[akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
---

 include/linux/tick.h     |    4 ++++
 kernel/time/tick-sched.c |   28 ++++++++++++++++++++++++++++
 kernel/time/timer_list.c |    1 +
 3 files changed, 33 insertions(+)

diff -puN include/linux/tick.h~sched-introduce-get_cpu_iowait_time_us include/linux/tick.h
--- a/include/linux/tick.h~sched-introduce-get_cpu_iowait_time_us
+++ a/include/linux/tick.h
@@ -42,6 +42,7 @@ enum tick_nohz_mode {
  * @idle_waketime:	Time when the idle was interrupted
  * @idle_exittime:	Time when the idle state was left
  * @idle_sleeptime:	Sum of the time slept in idle with sched tick stopped
+ * @iowait_sleeptime:	Sum of the time slept in idle with sched tick stopped, with IO outstanding
  * @sleep_length:	Duration of the current idle sleep
  * @do_timer_lst:	CPU was the last one doing do_timer before going idle
  */
@@ -60,6 +61,7 @@ struct tick_sched {
 	ktime_t				idle_waketime;
 	ktime_t				idle_exittime;
 	ktime_t				idle_sleeptime;
+	ktime_t				iowait_sleeptime;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
 	unsigned long			next_jiffies;
@@ -123,6 +125,7 @@ extern void tick_nohz_stop_sched_tick(in
 extern void tick_nohz_restart_sched_tick(void);
 extern ktime_t tick_nohz_get_sleep_length(void);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
+extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
 # else
 static inline void tick_nohz_stop_sched_tick(int inidle) { }
 static inline void tick_nohz_restart_sched_tick(void) { }
@@ -133,6 +136,7 @@ static inline ktime_t tick_nohz_get_slee
 	return len;
 }
 static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
+static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 # endif /* !NO_HZ */
 
 #endif
diff -puN kernel/time/tick-sched.c~sched-introduce-get_cpu_iowait_time_us kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~sched-introduce-get_cpu_iowait_time_us
+++ a/kernel/time/tick-sched.c
@@ -161,6 +161,8 @@ update_ts_time_stats(struct tick_sched *
 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+		if (nr_iowait_cpu() > 0)
+			ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}
 
@@ -220,6 +222,32 @@ u64 get_cpu_idle_time_us(int cpu, u64 *l
 }
 EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);
 
+/**
+ * get_cpu_iowait_time_us - get the total iowait time of a cpu
+ * @cpu: CPU number to query
+ * @last_update_time: variable to store update time in
+ *
+ * Return the cumulative iowait time (since boot) for a given
+ * CPU, in microseconds.
+ *
+ * This time is measured via accounting rather than sampling,
+ * and is as accurate as ktime_get() is.
+ *
+ * This function returns -1 if NOHZ is not enabled.
+ */
+u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
+{
+	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+
+	if (!tick_nohz_enabled)
+		return -1;
+
+	update_ts_time_stats(ts, ktime_get(), last_update_time);
+
+	return ktime_to_us(ts->iowait_sleeptime);
+}
+EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
+
 /**
  * tick_nohz_stop_sched_tick - stop the idle tick from the idle task
  *
diff -puN kernel/time/timer_list.c~sched-introduce-get_cpu_iowait_time_us kernel/time/timer_list.c
--- a/kernel/time/timer_list.c~sched-introduce-get_cpu_iowait_time_us
+++ a/kernel/time/timer_list.c
@@ -176,6 +176,7 @@ static void print_cpu(struct seq_file *m
 		P_ns(idle_waketime);
 		P_ns(idle_exittime);
 		P_ns(idle_sleeptime);
+		P_ns(iowait_sleeptime);
 		P(last_jiffies);
 		P(next_jiffies);
 		P_ns(idle_expires);
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  7/8] ondemand: Solve a big performance issue by counting IOWAIT time as busy
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
                   ` (5 preceding siblings ...)
  2010-05-09 15:25 ` [PATCH v2 6/8] sched: Introduce get_cpu_iowait_time_us() Arjan van de Ven
@ 2010-05-09 15:26 ` Arjan van de Ven
  2010-05-10  5:54   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 15:26 ` [PATCH v2 8/8] ondemand: Make the iowait-is-busy time a sysfs tunable Arjan van de Ven
  2010-05-09 17:49 ` [PATCH v2 0/8] Fix performance issue with ondemand governor Ingo Molnar
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:26 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>

The ondemand cpufreq governor uses CPU busy time (i.e. not-idle time) as
a measure for scaling the CPU frequency up or down.
If the CPU is busy, the CPU frequency scales up, if it's idle, the CPU
frequency scales down. Effectively, it uses the CPU busy time as proxy
variable for the more nebulous "how critical is performance right now"
question.

This algorithm falls flat on its face for workloads that alternate
between being disk and CPU bound, such as the ever-popular "git grep",
but also things like program startup and maildir-using email
clients... much to the chagrin of Andrew Morton.

This patch changes the ondemand algorithm to count iowait time as busy
time, not idle time. As the cases above show, iowait is often performance
critical, and counting it makes the proxy variable a more accurate
representation of the "how critical is performance" question.

The problem and fix are both verified with the "perf timechart" tool.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
---

 drivers/cpufreq/cpufreq_ondemand.c |   30 +++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff -puN drivers/cpufreq/cpufreq_ondemand.c~ondemand-solve-the-big-performance-issue-with-ondemand-during-disk-io drivers/cpufreq/cpufreq_ondemand.c
--- a/drivers/cpufreq/cpufreq_ondemand.c~ondemand-solve-the-big-performance-issue-with-ondemand-during-disk-io
+++ a/drivers/cpufreq/cpufreq_ondemand.c
@@ -73,6 +73,7 @@ enum {DBS_NORMAL_SAMPLE, DBS_SUB_SAMPLE}
 
 struct cpu_dbs_info_s {
 	cputime64_t prev_cpu_idle;
+	cputime64_t prev_cpu_iowait;
 	cputime64_t prev_cpu_wall;
 	cputime64_t prev_cpu_nice;
 	struct cpufreq_policy *cur_policy;
@@ -148,6 +149,16 @@ static inline cputime64_t get_cpu_idle_t
 	return idle_time;
 }
 
+static inline cputime64_t get_cpu_iowait_time(unsigned int cpu, cputime64_t *wall)
+{
+	u64 iowait_time = get_cpu_iowait_time_us(cpu, wall);
+
+	if (iowait_time == -1ULL)
+		return 0;
+
+	return iowait_time;
+}
+
 /*
  * Find right freq to be set now with powersave_bias on.
  * Returns the freq_hi to be used right now and will set freq_hi_jiffies,
@@ -465,14 +476,15 @@ static void dbs_check_cpu(struct cpu_dbs
 
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info_s *j_dbs_info;
-		cputime64_t cur_wall_time, cur_idle_time;
-		unsigned int idle_time, wall_time;
+		cputime64_t cur_wall_time, cur_idle_time, cur_iowait_time;
+		unsigned int idle_time, wall_time, iowait_time;
 		unsigned int load, load_freq;
 		int freq_avg;
 
 		j_dbs_info = &per_cpu(od_cpu_dbs_info, j);
 
 		cur_idle_time = get_cpu_idle_time(j, &cur_wall_time);
+		cur_iowait_time = get_cpu_iowait_time(j, &cur_wall_time);
 
 		wall_time = (unsigned int) cputime64_sub(cur_wall_time,
 				j_dbs_info->prev_cpu_wall);
@@ -482,6 +494,10 @@ static void dbs_check_cpu(struct cpu_dbs
 				j_dbs_info->prev_cpu_idle);
 		j_dbs_info->prev_cpu_idle = cur_idle_time;
 
+		iowait_time = (unsigned int) cputime64_sub(cur_iowait_time,
+				j_dbs_info->prev_cpu_iowait);
+		j_dbs_info->prev_cpu_iowait = cur_iowait_time;
+
 		if (dbs_tuners_ins.ignore_nice) {
 			cputime64_t cur_nice;
 			unsigned long cur_nice_jiffies;
@@ -499,6 +515,16 @@ static void dbs_check_cpu(struct cpu_dbs
 			idle_time += jiffies_to_usecs(cur_nice_jiffies);
 		}
 
+		/*
+		 * For the purpose of ondemand, waiting for disk IO is an
+		 * indication that you're performance critical, and not that
+		 * the system is actually idle. So subtract the iowait time
+		 * from the cpu idle time.
+		 */
+
+		if (idle_time >= iowait_time)
+			idle_time -= iowait_time;
+
 		if (unlikely(!wall_time || wall_time < idle_time))
 			continue;
 
_

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2  8/8] ondemand: Make the iowait-is-busy time a sysfs tunable
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
                   ` (6 preceding siblings ...)
  2010-05-09 15:26 ` [PATCH v2 7/8] ondemand: Solve a big performance issue by counting IOWAIT time as busy Arjan van de Ven
@ 2010-05-09 15:26 ` Arjan van de Ven
  2010-05-10  5:54   ` [tip:sched/core] " tip-bot for Arjan van de Ven
  2010-05-09 17:49 ` [PATCH v2 0/8] Fix performance issue with ondemand governor Ingo Molnar
  8 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2010-05-09 15:26 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, mingo, davej

From: Arjan van de Ven <arjan@linux.intel.com>
Date: Mon, 3 May 2010 21:40:32 -0400
Subject: [PATCH] ondemand: Make the iowait-is-busy-time a sysfs tunable

Pavel Machek pointed out that not all CPUs have an efficient idle
at high frequency. Specifically, older Intel and various AMD CPUs
would see higher power usage when copying files from USB.

Mike Chan pointed out that the same is true for various ARM chips
as well.

Thomas Renninger suggested making this a sysfs tunable with a
reasonable default.

This patch adds a sysfs tunable for the new behavior, and uses
a very simple function to determine a reasonable default, depending
on the CPU vendor/type.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
---
 drivers/cpufreq/cpufreq_ondemand.c |   46 +++++++++++++++++++++++++++++++++++-
 1 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index ed472f8..4877e8f 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -109,6 +109,7 @@ static struct dbs_tuners {
 	unsigned int down_differential;
 	unsigned int ignore_nice;
 	unsigned int powersave_bias;
+	unsigned int io_is_busy;
 } dbs_tuners_ins = {
 	.up_threshold = DEF_FREQUENCY_UP_THRESHOLD,
 	.down_differential = DEF_FREQUENCY_DOWN_DIFFERENTIAL,
@@ -260,6 +261,7 @@ static ssize_t show_##file_name						\
 	return sprintf(buf, "%u\n", dbs_tuners_ins.object);		\
 }
 show_one(sampling_rate, sampling_rate);
+show_one(io_is_busy, io_is_busy);
 show_one(up_threshold, up_threshold);
 show_one(ignore_nice_load, ignore_nice);
 show_one(powersave_bias, powersave_bias);
@@ -310,6 +312,22 @@ static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b,
 	return count;
 }
 
+static ssize_t store_io_is_busy(struct kobject *a, struct attribute *b,
+				   const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf(buf, "%u", &input);
+	if (ret != 1)
+		return -EINVAL;
+
+	mutex_lock(&dbs_mutex);
+	dbs_tuners_ins.io_is_busy = !!input;
+	mutex_unlock(&dbs_mutex);
+
+	return count;
+}
+
 static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
 				  const char *buf, size_t count)
 {
@@ -392,6 +410,7 @@ static struct global_attr _name = \
 __ATTR(_name, 0644, show_##_name, store_##_name)
 
 define_one_rw(sampling_rate);
+define_one_rw(io_is_busy);
 define_one_rw(up_threshold);
 define_one_rw(ignore_nice_load);
 define_one_rw(powersave_bias);
@@ -403,6 +422,7 @@ static struct attribute *dbs_attributes[] = {
 	&up_threshold.attr,
 	&ignore_nice_load.attr,
 	&powersave_bias.attr,
+	&io_is_busy.attr,
 	NULL
 };
 
@@ -527,7 +547,7 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
 		 * from the cpu idle time.
 		 */
 
-		if (idle_time >= iowait_time)
+		if (dbs_tuners_ins.io_is_busy && idle_time >= iowait_time)
 			idle_time -= iowait_time;
 
 		if (unlikely(!wall_time || wall_time < idle_time))
@@ -643,6 +663,29 @@ static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info)
 	cancel_delayed_work_sync(&dbs_info->work);
 }
 
+/*
+ * Not all CPUs want IO time to be accounted as busy; this depends on how
+ * efficient idling at a higher frequency/voltage is.
+ * Pavel Machek says this is not so for various generations of AMD and old
+ * Intel systems.
+ * Mike Chan (androidlcom) claims this is also not true for ARM.
+ * Because of this, whitelist specific known series of CPUs by default, and
+ * leave all others up to the user.
+ */
+static int should_io_be_busy(void)
+{
+#if defined(CONFIG_X86)
+	/*
+	 * For Intel, Core 2 (model 15) and later have an efficient idle.
+	 */
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+	    boot_cpu_data.x86 == 6 &&
+	    boot_cpu_data.x86_model >= 15)
+		return 1;
+#endif
+	return 0;
+}
+
 static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 				   unsigned int event)
 {
@@ -705,6 +748,7 @@ static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 			dbs_tuners_ins.sampling_rate =
 				max(min_sampling_rate,
 				    latency * LATENCY_MULTIPLIER);
+			dbs_tuners_ins.io_is_busy = should_io_be_busy();
 		}
 		mutex_unlock(&dbs_mutex);
 
-- 
1.6.1.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2  0/8] Fix performance issue with ondemand governor
  2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
                   ` (7 preceding siblings ...)
  2010-05-09 15:26 ` [PATCH v2 8/8] ondemand: Make the iowait-is-busy time a sysfs tunable Arjan van de Ven
@ 2010-05-09 17:49 ` Ingo Molnar
  2010-05-24 20:44   ` Rik van Riel
  8 siblings, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2010-05-09 17:49 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, davej, Andrew Morton, Peter Zijlstra


* Arjan van de Ven <arjan@infradead.org> wrote:

> [Version 2 includes the acks/etc. Andrew: no changes from the patches that 
> are in -mm]
> 
> There have been various reports of the ondemand governor causing some 
> serious performance issues, one of the latest ones from Andrew. There are 
> several fundamental issues with ondemand (being worked on), but the report 
> from Andrew can be fixed relatively easily.
> 
> The fundamental issue is that ondemand will go to a (too) low CPU frequency 
> for workloads that alternate between being disk and CPU bound...

I've applied your series to sched/core and started testing it, thanks Arjan!

	Ingo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [tip:sched/core] sched: Add a comment to get_cpu_idle_time_us()
  2010-05-09 15:22 ` [PATCH v2 1/8] sched: Add a comment to get_cpu_idle_time_us() Arjan van de Ven
@ 2010-05-10  5:52   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, riel, akpm, tglx, mingo

Commit-ID:  b1f724c3055fa75a31d272222213647547a2d3d4
Gitweb:     http://git.kernel.org/tip/b1f724c3055fa75a31d272222213647547a2d3d4
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:22:08 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:25 +0200

sched: Add a comment to get_cpu_idle_time_us()

The exported function get_cpu_idle_time_us() has no comment
describing it; add a kerneldoc comment

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082208.7cb721f0@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/time/tick-sched.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f25735a..358822e 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -179,6 +179,20 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts)
 	return now;
 }
 
+/**
+ * get_cpu_idle_time_us - get the total idle time of a cpu
+ * @cpu: CPU number to query
+ * @last_update_time: variable to store update time in
+ *
+ * Return the cumulative idle time (since boot) for a given
+ * CPU, in microseconds. The idle time returned includes
+ * the iowait time (unlike what "top" and co report).
+ *
+ * This time is measured via accounting rather than sampling,
+ * and is as accurate as ktime_get() is.
+ *
+ * This function returns -1 if NOHZ is not enabled.
+ */
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);

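The accounting scheme this kerneldoc describes can be illustrated with a minimal userspace sketch. This is not the kernel code: `struct idle_stats` and the function names are illustrative stand-ins for `struct tick_sched` and the tick-sched helpers, using plain integers for timestamps instead of ktime_t.

```c
#include <assert.h>

/* Hypothetical userspace model of the per-CPU idle accounting:
 * cumulative idle time only grows while the CPU is in idle. */
struct idle_stats {
	long long idle_entrytime;  /* when the CPU last entered idle */
	long long idle_sleeptime;  /* cumulative idle time since "boot" */
	int idle_active;
};

static void idle_enter(struct idle_stats *ts, long long now)
{
	ts->idle_entrytime = now;
	ts->idle_active = 1;
}

static void idle_exit(struct idle_stats *ts, long long now)
{
	if (ts->idle_active)
		ts->idle_sleeptime += now - ts->idle_entrytime;
	ts->idle_active = 0;
}

/* Mirrors the role of get_cpu_idle_time_us(): report the cumulative
 * idle time, including any idle period still in progress. */
static long long idle_time_us(struct idle_stats *ts, long long now)
{
	long long total = ts->idle_sleeptime;

	if (ts->idle_active)
		total += now - ts->idle_entrytime;
	return total;
}
```

The point the comment makes about measurement vs. sampling is visible here: the totals come from exact enter/exit timestamps, not from periodic tick samples.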

* [tip:sched/core] sched: Introduce a function to update the idle statistics
  2010-05-09 15:22 ` [PATCH v2 2/8] sched: Introduce a function to update the idle statistics Arjan van de Ven
@ 2010-05-10  5:52   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, riel, akpm, tglx, mingo

Commit-ID:  595aac488b546c7185be7e29c8ae165a588b2a9f
Gitweb:     http://git.kernel.org/tip/595aac488b546c7185be7e29c8ae165a588b2a9f
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:22:45 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:25 +0200

sched: Introduce a function to update the idle statistics

Currently, two places update the idle statistics (and more to
come later in this series).

This patch creates a helper function for updating these
statistics.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082245.163e67ed@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/time/tick-sched.c |   29 +++++++++++++++++++----------
 1 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 358822e..59d8762 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -150,14 +150,25 @@ static void tick_nohz_update_jiffies(ktime_t now)
 	touch_softlockup_watchdog();
 }
 
-static void tick_nohz_stop_idle(int cpu, ktime_t now)
+/*
+ * Updates the per cpu time idle statistics counters
+ */
+static void update_ts_time_stats(struct tick_sched *ts, ktime_t now)
 {
-	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
 	ktime_t delta;
 
-	delta = ktime_sub(now, ts->idle_entrytime);
 	ts->idle_lastupdate = now;
-	ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+	if (ts->idle_active) {
+		delta = ktime_sub(now, ts->idle_entrytime);
+		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+	}
+}
+
+static void tick_nohz_stop_idle(int cpu, ktime_t now)
+{
+	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+
+	update_ts_time_stats(ts, now);
 	ts->idle_active = 0;
 
 	sched_clock_idle_wakeup_event(0);
@@ -165,14 +176,12 @@ static void tick_nohz_stop_idle(int cpu, ktime_t now)
 
 static ktime_t tick_nohz_start_idle(struct tick_sched *ts)
 {
-	ktime_t now, delta;
+	ktime_t now;
 
 	now = ktime_get();
-	if (ts->idle_active) {
-		delta = ktime_sub(now, ts->idle_entrytime);
-		ts->idle_lastupdate = now;
-		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
-	}
+
+	update_ts_time_stats(ts, now);
+
 	ts->idle_entrytime = now;
 	ts->idle_active = 1;
 	sched_clock_idle_sleep_event();


* [tip:sched/core] sched: Update the idle statistics in get_cpu_idle_time_us()
  2010-05-09 15:23 ` [PATCH v2 3/8] sched: Update the idle statistics in get_cpu_idle_time_us() Arjan van de Ven
@ 2010-05-10  5:53   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, riel, akpm, tglx, mingo

Commit-ID:  8c7b09f43f4bf570654bcc458ce96819a932303c
Gitweb:     http://git.kernel.org/tip/8c7b09f43f4bf570654bcc458ce96819a932303c
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:23:23 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:26 +0200

sched: Update the idle statistics in get_cpu_idle_time_us()

Right now, get_cpu_idle_time_us() only reports the idle
statistics up to the point the CPU last entered idle, not what
is valid right now.

This patch adds an update of the idle statistics to
get_cpu_idle_time_us(), so that calling this function always
returns statistics that are accurate at the point of the call.

This includes resetting the start of the idle time for
accounting purposes to avoid double accounting.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082323.2d2f1945@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/time/tick-sched.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 59d8762..f15d18d 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -161,6 +161,7 @@ static void update_ts_time_stats(struct tick_sched *ts, ktime_t now)
 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+		ts->idle_entrytime = now;
 	}
 }
 
@@ -205,14 +206,18 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts)
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+	ktime_t now;
 
 	if (!tick_nohz_enabled)
 		return -1;
 
+	now = ktime_get();
+	update_ts_time_stats(ts, now);
+
 	if (ts->idle_active)
 		*last_update_time = ktime_to_us(ts->idle_lastupdate);
 	else
-		*last_update_time = ktime_to_us(ktime_get());
+		*last_update_time = ktime_to_us(now);
 
 	return ktime_to_us(ts->idle_sleeptime);
 }


* [tip:sched/core] sched: Fold updating of the last_update_time_info into update_ts_time_stats()
  2010-05-09 15:24 ` [PATCH v2 4/8] sched: Fold updating of the last_update_time_info into update_ts_time_stats() Arjan van de Ven
@ 2010-05-10  5:53   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, riel, akpm, tglx, mingo

Commit-ID:  8d63bf949e330588b80d30ca8f0a27a45297a9e9
Gitweb:     http://git.kernel.org/tip/8d63bf949e330588b80d30ca8f0a27a45297a9e9
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:24:03 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:26 +0200

sched: Fold updating of the last_update_time_info into update_ts_time_stats()

This patch folds the updating of the last_update_time into the
update_ts_time_stats() function, and updates the callers.

This allows for further cleanups that are done in the next
patch.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082403.60072967@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/time/tick-sched.c |   22 +++++++++++-----------
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f15d18d..e86e1c6 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -153,7 +153,8 @@ static void tick_nohz_update_jiffies(ktime_t now)
 /*
  * Updates the per cpu time idle statistics counters
  */
-static void update_ts_time_stats(struct tick_sched *ts, ktime_t now)
+static void
+update_ts_time_stats(struct tick_sched *ts, ktime_t now, u64 *last_update_time)
 {
 	ktime_t delta;
 
@@ -163,13 +164,19 @@ static void update_ts_time_stats(struct tick_sched *ts, ktime_t now)
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}
+
+	if (ts->idle_active && last_update_time)
+		*last_update_time = ktime_to_us(ts->idle_lastupdate);
+	else
+		*last_update_time = ktime_to_us(now);
+
 }
 
 static void tick_nohz_stop_idle(int cpu, ktime_t now)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
 
-	update_ts_time_stats(ts, now);
+	update_ts_time_stats(ts, now, NULL);
 	ts->idle_active = 0;
 
 	sched_clock_idle_wakeup_event(0);
@@ -181,7 +188,7 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts)
 
 	now = ktime_get();
 
-	update_ts_time_stats(ts, now);
+	update_ts_time_stats(ts, now, NULL);
 
 	ts->idle_entrytime = now;
 	ts->idle_active = 1;
@@ -206,18 +213,11 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts)
 u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
-	ktime_t now;
 
 	if (!tick_nohz_enabled)
 		return -1;
 
-	now = ktime_get();
-	update_ts_time_stats(ts, now);
-
-	if (ts->idle_active)
-		*last_update_time = ktime_to_us(ts->idle_lastupdate);
-	else
-		*last_update_time = ktime_to_us(now);
+	update_ts_time_stats(ts, ktime_get(), last_update_time);
 
 	return ktime_to_us(ts->idle_sleeptime);
 }


* [tip:sched/core] sched: Eliminate the ts->idle_lastupdate field
  2010-05-09 15:24 ` [PATCH v2 5/8] sched: Eliminate the ts->idle_lastupdate field Arjan van de Ven
@ 2010-05-10  5:53   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, riel, akpm, tglx, mingo

Commit-ID:  e0e37c200f1357db0dd986edb359c41c57d24f6e
Gitweb:     http://git.kernel.org/tip/e0e37c200f1357db0dd986edb359c41c57d24f6e
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:24:39 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:26 +0200

sched: Eliminate the ts->idle_lastupdate field

Now that the only user of ts->idle_lastupdate is
update_ts_time_stats(), the entire field can be eliminated.

In update_ts_time_stats(), idle_lastupdate is first set to
"now", and a few lines later, the only user is an if() statement
that assigns a variable either to "now" or to
ts->idle_lastupdate, which has the value of "now" at that point.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082439.2fab0b4f@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/tick.h     |    1 -
 kernel/time/tick-sched.c |    5 +----
 2 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index d2ae79e..0343eed 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -60,7 +60,6 @@ struct tick_sched {
 	ktime_t				idle_waketime;
 	ktime_t				idle_exittime;
 	ktime_t				idle_sleeptime;
-	ktime_t				idle_lastupdate;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
 	unsigned long			next_jiffies;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index e86e1c6..50953f4 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -158,16 +158,13 @@ update_ts_time_stats(struct tick_sched *ts, ktime_t now, u64 *last_update_time)
 {
 	ktime_t delta;
 
-	ts->idle_lastupdate = now;
 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}
 
-	if (ts->idle_active && last_update_time)
-		*last_update_time = ktime_to_us(ts->idle_lastupdate);
-	else
+	if (last_update_time)
 		*last_update_time = ktime_to_us(now);
 
 }


* [tip:sched/core] sched: Introduce get_cpu_iowait_time_us()
  2010-05-09 15:25 ` [PATCH v2 6/8] sched: Introduce get_cpu_iowait_time_us() Arjan van de Ven
@ 2010-05-10  5:54   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, riel, akpm, tglx, mingo

Commit-ID:  0224cf4c5ee0d7faec83956b8e21f7d89e3df3bd
Gitweb:     http://git.kernel.org/tip/0224cf4c5ee0d7faec83956b8e21f7d89e3df3bd
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:25:23 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:27 +0200

sched: Introduce get_cpu_iowait_time_us()

For the ondemand cpufreq governor, it is desired that the iowait
time is microaccounted in a similar way as idle time is.

This patch introduces the infrastructure to account and expose
this information via the get_cpu_iowait_time_us() function.

[akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082523.284feab6@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/tick.h     |    4 ++++
 kernel/time/tick-sched.c |   28 ++++++++++++++++++++++++++++
 kernel/time/timer_list.c |    1 +
 3 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 0343eed..b232ccc 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -42,6 +42,7 @@ enum tick_nohz_mode {
  * @idle_waketime:	Time when the idle was interrupted
  * @idle_exittime:	Time when the idle state was left
  * @idle_sleeptime:	Sum of the time slept in idle with sched tick stopped
+ * @iowait_sleeptime:	Sum of the time slept in idle with sched tick stopped, with IO outstanding
  * @sleep_length:	Duration of the current idle sleep
  * @do_timer_lst:	CPU was the last one doing do_timer before going idle
  */
@@ -60,6 +61,7 @@ struct tick_sched {
 	ktime_t				idle_waketime;
 	ktime_t				idle_exittime;
 	ktime_t				idle_sleeptime;
+	ktime_t				iowait_sleeptime;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
 	unsigned long			next_jiffies;
@@ -123,6 +125,7 @@ extern void tick_nohz_stop_sched_tick(int inidle);
 extern void tick_nohz_restart_sched_tick(void);
 extern ktime_t tick_nohz_get_sleep_length(void);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
+extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
 # else
 static inline void tick_nohz_stop_sched_tick(int inidle) { }
 static inline void tick_nohz_restart_sched_tick(void) { }
@@ -133,6 +136,7 @@ static inline ktime_t tick_nohz_get_sleep_length(void)
 	return len;
 }
 static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
+static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 # endif /* !NO_HZ */
 
 #endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 50953f4..1d7b9bc 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -161,6 +161,8 @@ update_ts_time_stats(struct tick_sched *ts, ktime_t now, u64 *last_update_time)
 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
+		if (nr_iowait_cpu() > 0)
+			ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}
 
@@ -220,6 +222,32 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 }
 EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);
 
+/**
+ * get_cpu_iowait_time_us - get the total iowait time of a cpu
+ * @cpu: CPU number to query
+ * @last_update_time: variable to store update time in
+ *
+ * Return the cumulative iowait time (since boot) for a given
+ * CPU, in microseconds.
+ *
+ * This time is measured via accounting rather than sampling,
+ * and is as accurate as ktime_get() is.
+ *
+ * This function returns -1 if NOHZ is not enabled.
+ */
+u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
+{
+	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+
+	if (!tick_nohz_enabled)
+		return -1;
+
+	update_ts_time_stats(ts, ktime_get(), last_update_time);
+
+	return ktime_to_us(ts->iowait_sleeptime);
+}
+EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
+
 /**
  * tick_nohz_stop_sched_tick - stop the idle tick from the idle task
  *
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 1a4a7dd..ab8f5e3 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -176,6 +176,7 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
 		P_ns(idle_waketime);
 		P_ns(idle_exittime);
 		P_ns(idle_sleeptime);
+		P_ns(iowait_sleeptime);
 		P(last_jiffies);
 		P(next_jiffies);
 		P_ns(idle_expires);

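The core of this patch, the extra accounting in update_ts_time_stats(), can be sketched as a small userspace model. This is illustrative only: `struct ts_model` and `ts_update()` are hypothetical names standing in for `struct tick_sched` and the real helper, and the iowait count is passed in explicitly instead of coming from nr_iowait_cpu().

```c
#include <assert.h>

/* Hypothetical model of update_ts_time_stats() after this patch:
 * an idle delta always grows idle_sleeptime, and additionally grows
 * iowait_sleeptime when at least one task on this CPU is blocked
 * on IO, so iowait time is a subset of idle time. */
struct ts_model {
	long long idle_entrytime;
	long long idle_sleeptime;
	long long iowait_sleeptime;
	int idle_active;
};

static void ts_update(struct ts_model *ts, long long now, int nr_iowait)
{
	if (ts->idle_active) {
		long long delta = now - ts->idle_entrytime;

		ts->idle_sleeptime += delta;
		if (nr_iowait > 0)
			ts->iowait_sleeptime += delta;
		/* reset the window start to avoid double accounting */
		ts->idle_entrytime = now;
	}
}
```

Resetting idle_entrytime on every update is what allows get_cpu_idle_time_us() and get_cpu_iowait_time_us() to be called at arbitrary points without counting the same interval twice.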

* [tip:sched/core] ondemand: Solve a big performance issue by counting IOWAIT time as busy
  2010-05-09 15:26 ` [PATCH v2 7/8] ondemand: Solve a big performance issue by counting IOWAIT time as busy Arjan van de Ven
@ 2010-05-10  5:54   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, davej, riel, akpm,
	tglx, mingo

Commit-ID:  6b8fcd9029f217a9ecce822db645e19111c11080
Gitweb:     http://git.kernel.org/tip/6b8fcd9029f217a9ecce822db645e19111c11080
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:26:06 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:27 +0200

ondemand: Solve a big performance issue by counting IOWAIT time as busy

The ondemand cpufreq governor uses CPU busy time (e.g. not-idle
time) as a measure for scaling the CPU frequency up or down.
If the CPU is busy, the CPU frequency scales up, if it's idle,
the CPU frequency scales down. Effectively, it uses the CPU busy
time as proxy variable for the more nebulous "how critical is
performance right now" question.

This algorithm falls flat on its face in the light of workloads
where you're alternately disk and CPU bound, such as the ever
popular "git grep", but also things like startup of programs and
maildir-using email clients... much to the chagrin of Andrew
Morton.

This patch changes the ondemand algorithm to count iowait time
as busy, not idle, time. As shown in the breakdown cases above,
iowait is often performance critical, and by counting iowait,
the proxy variable becomes a more accurate representation of the
"how critical is performance" question.

The problem and fix are both verified with the "perf timechart"
tool.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100509082606.3d9f00d0@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/cpufreq/cpufreq_ondemand.c |   30 ++++++++++++++++++++++++++++--
 1 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index bd444dc..ed472f8 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -73,6 +73,7 @@ enum {DBS_NORMAL_SAMPLE, DBS_SUB_SAMPLE};
 
 struct cpu_dbs_info_s {
 	cputime64_t prev_cpu_idle;
+	cputime64_t prev_cpu_iowait;
 	cputime64_t prev_cpu_wall;
 	cputime64_t prev_cpu_nice;
 	struct cpufreq_policy *cur_policy;
@@ -148,6 +149,16 @@ static inline cputime64_t get_cpu_idle_time(unsigned int cpu, cputime64_t *wall)
 	return idle_time;
 }
 
+static inline cputime64_t get_cpu_iowait_time(unsigned int cpu, cputime64_t *wall)
+{
+	u64 iowait_time = get_cpu_iowait_time_us(cpu, wall);
+
+	if (iowait_time == -1ULL)
+		return 0;
+
+	return iowait_time;
+}
+
 /*
  * Find right freq to be set now with powersave_bias on.
  * Returns the freq_hi to be used right now and will set freq_hi_jiffies,
@@ -470,14 +481,15 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
 
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info_s *j_dbs_info;
-		cputime64_t cur_wall_time, cur_idle_time;
-		unsigned int idle_time, wall_time;
+		cputime64_t cur_wall_time, cur_idle_time, cur_iowait_time;
+		unsigned int idle_time, wall_time, iowait_time;
 		unsigned int load, load_freq;
 		int freq_avg;
 
 		j_dbs_info = &per_cpu(od_cpu_dbs_info, j);
 
 		cur_idle_time = get_cpu_idle_time(j, &cur_wall_time);
+		cur_iowait_time = get_cpu_iowait_time(j, &cur_wall_time);
 
 		wall_time = (unsigned int) cputime64_sub(cur_wall_time,
 				j_dbs_info->prev_cpu_wall);
@@ -487,6 +499,10 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
 				j_dbs_info->prev_cpu_idle);
 		j_dbs_info->prev_cpu_idle = cur_idle_time;
 
+		iowait_time = (unsigned int) cputime64_sub(cur_iowait_time,
+				j_dbs_info->prev_cpu_iowait);
+		j_dbs_info->prev_cpu_iowait = cur_iowait_time;
+
 		if (dbs_tuners_ins.ignore_nice) {
 			cputime64_t cur_nice;
 			unsigned long cur_nice_jiffies;
@@ -504,6 +520,16 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
 			idle_time += jiffies_to_usecs(cur_nice_jiffies);
 		}
 
+		/*
+		 * For the purpose of ondemand, waiting for disk IO is an
+		 * indication that you're performance critical, and not that
+		 * the system is actually idle. So subtract the iowait time
+		 * from the cpu idle time.
+		 */
+
+		if (idle_time >= iowait_time)
+			idle_time -= iowait_time;
+
 		if (unlikely(!wall_time || wall_time < idle_time))
 			continue;
 

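The effect of the hunk above on the governor's load estimate can be shown with a standalone sketch. The function name and the percentage form of the result are illustrative (the real computation lives inline in dbs_check_cpu() and feeds into load_freq); the inputs are microsecond deltas over one sampling window.

```c
#include <assert.h>

/* Hypothetical sketch of the ondemand load calculation after this
 * patch: iowait time is subtracted from idle time, so time spent
 * waiting on disk counts as busy and keeps the CPU frequency up.
 * Returns the busy percentage over the sampling window. */
static unsigned int ondemand_load(unsigned int wall_time,
				  unsigned int idle_time,
				  unsigned int iowait_time)
{
	/* iowait counts as busy, not idle */
	if (idle_time >= iowait_time)
		idle_time -= iowait_time;

	/* mirror the sanity check in dbs_check_cpu() */
	if (!wall_time || wall_time < idle_time)
		return 0;

	return 100 * (wall_time - idle_time) / wall_time;
}
```

For an alternating disk/CPU workload that spends most of its "idle" time in iowait, the computed load jumps accordingly, which is exactly why the governor stops clocking down mid-workload.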

* [tip:sched/core] ondemand: Make the iowait-is-busy time a sysfs tunable
  2010-05-09 15:26 ` [PATCH v2 8/8] ondemand: Make the iowait-is-busy time a sysfs tunable Arjan van de Ven
@ 2010-05-10  5:54   ` tip-bot for Arjan van de Ven
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Arjan van de Ven @ 2010-05-10  5:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, arjan, pavel, riel, tglx, mingo

Commit-ID:  19379b11819efc1fc3b602e64f7e7531050aaddb
Gitweb:     http://git.kernel.org/tip/19379b11819efc1fc3b602e64f7e7531050aaddb
Author:     Arjan van de Ven <arjan@linux.intel.com>
AuthorDate: Sun, 9 May 2010 08:26:51 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 9 May 2010 19:35:27 +0200

ondemand: Make the iowait-is-busy time a sysfs tunable

Pavel Machek pointed out that not all CPUs have an efficient
idle at high frequency. Specifically, older Intel and various
AMD CPUs would get higher power usage when copying files from
USB.

Mike Chan pointed out that the same is true for various ARM
chips as well.

Thomas Renninger suggested making this a sysfs tunable with a
reasonable default.

This patch adds a sysfs tunable for the new behavior, and uses
a very simple function to determine a reasonable default,
depending on the CPU vendor/type.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: davej@redhat.com
LKML-Reference: <20100509082651.46914d04@infradead.org>
[ minor tidyup ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/cpufreq/cpufreq_ondemand.c |   47 +++++++++++++++++++++++++++++++++++-
 1 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index ed472f8..8e9dbdc 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -109,6 +109,7 @@ static struct dbs_tuners {
 	unsigned int down_differential;
 	unsigned int ignore_nice;
 	unsigned int powersave_bias;
+	unsigned int io_is_busy;
 } dbs_tuners_ins = {
 	.up_threshold = DEF_FREQUENCY_UP_THRESHOLD,
 	.down_differential = DEF_FREQUENCY_DOWN_DIFFERENTIAL,
@@ -260,6 +261,7 @@ static ssize_t show_##file_name						\
 	return sprintf(buf, "%u\n", dbs_tuners_ins.object);		\
 }
 show_one(sampling_rate, sampling_rate);
+show_one(io_is_busy, io_is_busy);
 show_one(up_threshold, up_threshold);
 show_one(ignore_nice_load, ignore_nice);
 show_one(powersave_bias, powersave_bias);
@@ -310,6 +312,23 @@ static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b,
 	return count;
 }
 
+static ssize_t store_io_is_busy(struct kobject *a, struct attribute *b,
+				   const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+
+	ret = sscanf(buf, "%u", &input);
+	if (ret != 1)
+		return -EINVAL;
+
+	mutex_lock(&dbs_mutex);
+	dbs_tuners_ins.io_is_busy = !!input;
+	mutex_unlock(&dbs_mutex);
+
+	return count;
+}
+
 static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
 				  const char *buf, size_t count)
 {
@@ -392,6 +411,7 @@ static struct global_attr _name = \
 __ATTR(_name, 0644, show_##_name, store_##_name)
 
 define_one_rw(sampling_rate);
+define_one_rw(io_is_busy);
 define_one_rw(up_threshold);
 define_one_rw(ignore_nice_load);
 define_one_rw(powersave_bias);
@@ -403,6 +423,7 @@ static struct attribute *dbs_attributes[] = {
 	&up_threshold.attr,
 	&ignore_nice_load.attr,
 	&powersave_bias.attr,
+	&io_is_busy.attr,
 	NULL
 };
 
@@ -527,7 +548,7 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
 		 * from the cpu idle time.
 		 */
 
-		if (idle_time >= iowait_time)
+		if (dbs_tuners_ins.io_is_busy && idle_time >= iowait_time)
 			idle_time -= iowait_time;
 
 		if (unlikely(!wall_time || wall_time < idle_time))
@@ -643,6 +664,29 @@ static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info)
 	cancel_delayed_work_sync(&dbs_info->work);
 }
 
+/*
+ * Not all CPUs want IO time to be accounted as busy; this depends on how
+ * efficient idling at a higher frequency/voltage is.
+ * Pavel Machek says this is not so for various generations of AMD and old
+ * Intel systems.
+ * Mike Chan (android.com) claims this is also not true for ARM.
+ * Because of this, whitelist specific known series of CPUs by default, and
+ * leave all others up to the user.
+ */
+static int should_io_be_busy(void)
+{
+#if defined(CONFIG_X86)
+	/*
+	 * For Intel, Core 2 (model 15) and later have an efficient idle.
+	 */
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+	    boot_cpu_data.x86 == 6 &&
+	    boot_cpu_data.x86_model >= 15)
+		return 1;
+#endif
+	return 0;
+}
+
 static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 				   unsigned int event)
 {
@@ -705,6 +749,7 @@ static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 			dbs_tuners_ins.sampling_rate =
 				max(min_sampling_rate,
 				    latency * LATENCY_MULTIPLIER);
+			dbs_tuners_ins.io_is_busy = should_io_be_busy();
 		}
 		mutex_unlock(&dbs_mutex);
 

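With this patch the iowait adjustment from the previous patch is gated on the tunable, and the sysfs store path clamps any written value to 0 or 1. A minimal sketch of both behaviors (function names are illustrative, not the driver's):

```c
#include <assert.h>

/* Hypothetical sketch of the io_is_busy gate: the iowait-as-busy
 * adjustment only applies when the tunable is set. */
static unsigned int effective_idle(unsigned int idle_time,
				   unsigned int iowait_time,
				   unsigned int io_is_busy)
{
	if (io_is_busy && idle_time >= iowait_time)
		idle_time -= iowait_time;
	return idle_time;
}

/* Mirrors the normalization in store_io_is_busy(): any nonzero
 * input written to the sysfs file is stored as 1. */
static unsigned int normalize_io_is_busy(unsigned int input)
{
	return !!input;
}
```

should_io_be_busy() then just picks the default for this gate at governor start: 1 on CPUs known to idle efficiently at high frequency, 0 everywhere else, with the sysfs file letting the user override either way.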

* Re: [PATCH v2  0/8] Fix performance issue with ondemand governor
  2010-05-09 17:49 ` [PATCH v2 0/8] Fix performance issue with ondemand governor Ingo Molnar
@ 2010-05-24 20:44   ` Rik van Riel
  2010-05-28  9:30     ` Ingo Molnar
  0 siblings, 1 reply; 20+ messages in thread
From: Rik van Riel @ 2010-05-24 20:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arjan van de Ven, linux-kernel, davej, Andrew Morton, Peter Zijlstra

On 05/09/2010 01:49 PM, Ingo Molnar wrote:
> * Arjan van de Ven<arjan@infradead.org>  wrote:
>
>> [Version 2 includes the acks/etc. Andrew: no changes from the patches that
>> are in -mm]
>>
>> There have been various reports of the ondemand governor causing some
>> serious performance issues, one of the latest ones from Andrew. There are
>> several fundamental issues with ondemand (being worked on), but the report
>> from Andrew can be fixed relatively easily.
>>
>> The fundamental issue is that ondemand will go to a (too) low CPU frequency
> >> for workloads that are alternately disk and CPU bound...
>
> I've applied your series to sched/core and started testing it, thanks Arjan!

This code seems to help significantly with some workloads,
allowing more systems to use the ondemand governor (and
having fewer systems waste power by using the performance
governor all the time).

It would be nice to see it in 2.6.35

-- 
All rights reversed


* Re: [PATCH v2  0/8] Fix performance issue with ondemand governor
  2010-05-24 20:44   ` Rik van Riel
@ 2010-05-28  9:30     ` Ingo Molnar
  0 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2010-05-28  9:30 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Arjan van de Ven, linux-kernel, davej, Andrew Morton, Peter Zijlstra


* Rik van Riel <riel@redhat.com> wrote:

> On 05/09/2010 01:49 PM, Ingo Molnar wrote:
> >* Arjan van de Ven<arjan@infradead.org>  wrote:
> >
> >>[Version 2 includes the acks/etc. Andrew: no changes from the patches that
> >>are in -mm]
> >>
> >>There have been various reports of the ondemand governor causing some
> >>serious performance issues, one of the latest ones from Andrew. There are
> >>several fundamental issues with ondemand (being worked on), but the report
> >>from Andrew can be fixed relatively easily.
> >>
> >>The fundamental issue is that ondemand will go to a (too) low CPU frequency
> >>for workloads that are alternately disk and CPU bound...
> >
> >I've applied your series to sched/core and started testing it, thanks Arjan!
> 
> This code seems to help significantly with some workloads,
> allowing more systems to use the ondemand governor (and
> having fewer systems waste power by using the performance
> governor all the time).
> 
> It would be nice to see it in 2.6.35

Yeah, it's upstream now :-)

Cheers,

	Ingo


end of thread, other threads:[~2010-05-28  9:30 UTC | newest]

Thread overview: 20+ messages
2010-05-09 15:21 [PATCH v2 0/8] Fix performance issue with ondemand governor Arjan van de Ven
2010-05-09 15:22 ` [PATCH v2 1/8] sched: Add a comment to get_cpu_idle_time_us() Arjan van de Ven
2010-05-10  5:52   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:22 ` [PATCH v2 2/8] sched: Introduce a function to update the idle statistics Arjan van de Ven
2010-05-10  5:52   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:23 ` [PATCH v2 3/8] sched: Update the idle statistics in get_cpu_idle_time_us() Arjan van de Ven
2010-05-10  5:53   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:24 ` [PATCH v2 4/8] sched: Fold updating of the last_update_time_info into update_ts_time_stats() Arjan van de Ven
2010-05-10  5:53   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:24 ` [PATCH v2 5/8] sched: Eliminate the ts->idle_lastupdate field Arjan van de Ven
2010-05-10  5:53   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:25 ` [PATCH v2 6/8] sched: Introduce get_cpu_iowait_time_us() Arjan van de Ven
2010-05-10  5:54   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:26 ` [PATCH v2 7/8] ondemand: Solve a big performance issue by counting IOWAIT time as busy Arjan van de Ven
2010-05-10  5:54   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 15:26 ` [PATCH v2 8/8] ondemand: Make the iowait-is-busy time a sysfs tunable Arjan van de Ven
2010-05-10  5:54   ` [tip:sched/core] " tip-bot for Arjan van de Ven
2010-05-09 17:49 ` [PATCH v2 0/8] Fix performance issue with ondemand governor Ingo Molnar
2010-05-24 20:44   ` Rik van Riel
2010-05-28  9:30     ` Ingo Molnar
