* [PATCH 0/1] sched: defer idle accounting till after load update period
From: Chase Douglas @ 2010-03-21 22:45 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner

The following patch fixes a load avg calculation bug. Details are included with
the patch. Essentially, a task that often runs for less than 10 ticks at a time
is likely to be left out of the load avg calculation. A test case is provided
below. If you run the test case on a near zero-load system, top reports 90% cpu
usage while the load avg stays at or near 0.00. With the patch, the load avg is
calculated correctly to be at least 0.90.
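
For readers who want to see why the sampled task count matters, here is a
rough user-space sketch of the fixed-point moving average behind the
1-minute load avg. It is not taken from the patch. The constants mirror the
kernel's FSHIFT, FIXED_1 and EXP_1 as I remember them from this era, so
treat them as assumptions; every toy_* and TOY_* name is hypothetical.

```c
/* Assumed to match the kernel's FSHIFT, FIXED_1 and EXP_1 of this era. */
#define TOY_FSHIFT  11                   /* bits of fixed-point precision */
#define TOY_FIXED_1 (1UL << TOY_FSHIFT)  /* 1.0 in fixed point */
#define TOY_EXP_1   1884UL               /* decay factor, 1-minute average */

/* One moving-average step: blend the old load with the new sample. */
static unsigned long toy_calc_load(unsigned long load, unsigned long exp,
				   unsigned long active)
{
	load *= exp;
	load += active * (TOY_FIXED_1 - exp);
	return load >> TOY_FSHIFT;
}

/* Feed the same task count for 'samples' update periods, starting from 0. */
static unsigned long toy_settle(unsigned long tasks, int samples)
{
	unsigned long load = 0;
	int i;

	for (i = 0; i < samples; i++)
		load = toy_calc_load(load, TOY_EXP_1, tasks * TOY_FIXED_1);
	return load;
}
```

If every sample sees zero tasks, toy_settle() never leaves 0, which matches
the 0.00 reading described above. If every sample sees one task, the result
climbs toward TOY_FIXED_1, i.e. a load avg close to 1.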

--

#include <asm/param.h>
#include <sys/time.h>
#include <time.h>

int main() {
	struct timespec ts;
	ts.tv_sec = 0;
	ts.tv_nsec = 1000000000 / HZ;

	/*
	 * Spin on gettimeofday for 9 ticks out of every 10, then sleep for
	 * 1 tick
	 */
	while (1) {
		struct timeval tv;

		do {
			gettimeofday(&tv, NULL);
		} while ((tv.tv_usec * HZ / 1000000) % 10 != 0);

		nanosleep(&ts, NULL);
	}
	return 0;
}

* [PATCH 1/1] sched: defer idle accounting till after load update period
From: Chase Douglas @ 2010-03-21 22:45 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner

There's a period of 10 ticks where calc_load_tasks is updated by all the
cpus for the load avg. Usually all the cpus do this during the first
tick. If any cpus go idle, calc_load_tasks is decremented accordingly.
However, if they wake up, calc_load_tasks is not incremented. Thus, if
cpus go idle during the 10 tick period, calc_load_tasks may be
decremented to a non-representative value. This issue can lead to
systems having a load avg of exactly 0, even though the real load avg
could theoretically be up to NR_CPUS.

Once a cpu has updated the count, this change defers any further
calc_load_tasks accounting for that cpu until after the 10 tick period.
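
To make the failure mode concrete, here is a toy user-space model of the
accounting for a single cpu running one task. It is only a sketch and is
not kernel code: the toy_* names, TOY_WINDOW_TICKS and TOY_SLEEP_TICK are
hypothetical, and the plain longs stand in for calc_load_tasks and the
deferred counter.

```c
#include <stdbool.h>

#define TOY_WINDOW_TICKS 10 /* length of the load update period */
#define TOY_SLEEP_TICK    5 /* tick where the lone task sleeps (arbitrary) */

struct toy_cpu {
	long nr_active;        /* tasks counted as active right now */
	long calc_load_active; /* value last reported by this cpu */
};

/* Fold this cpu's change into the global count immediately. */
static void toy_account(struct toy_cpu *cpu, long *global)
{
	long delta = cpu->nr_active - cpu->calc_load_active;

	cpu->calc_load_active = cpu->nr_active;
	*global += delta;
}

/* Return the global count as it would be sampled at the end of the window. */
static long toy_sample(bool defer_idle)
{
	struct toy_cpu cpu = { 0, 0 };
	long global = 0;   /* stands in for calc_load_tasks */
	long deferred = 0; /* stands in for the deferred counter */
	int tick;

	for (tick = 0; tick < TOY_WINDOW_TICKS; tick++) {
		bool running = (tick != TOY_SLEEP_TICK);

		cpu.nr_active = running ? 1 : 0;

		if (tick == 0) {
			/* Periodic update on the first tick of the window. */
			toy_account(&cpu, &global);
		} else if (!running) {
			/* The cpu goes idle inside the window. */
			if (defer_idle) {
				deferred += cpu.nr_active - cpu.calc_load_active;
				cpu.calc_load_active = cpu.nr_active;
			} else {
				toy_account(&cpu, &global);
			}
		}
		/* Wake-ups do not touch the global count in either variant. */
	}
	(void) deferred; /* would be folded in after the window ends */
	return global;
}
```

In this model toy_sample(false) returns 0, because the single idle tick
wipes out the task before the sample is taken, while toy_sample(true)
returns 1.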

BugLink: http://bugs.launchpad.net/bugs/513848

Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
---
 kernel/sched.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 9ab3cd7..c0aedac 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3064,7 +3064,8 @@ void calc_global_load(void)
  */
 static void calc_load_account_active(struct rq *this_rq)
 {
-	long nr_active, delta;
+	static atomic_long_t deferred;
+	long nr_active, delta, deferred_delta;
 
 	nr_active = this_rq->nr_running;
 	nr_active += (long) this_rq->nr_uninterruptible;
@@ -3072,6 +3073,17 @@ static void calc_load_account_active(struct rq *this_rq)
 	if (nr_active != this_rq->calc_load_active) {
 		delta = nr_active - this_rq->calc_load_active;
 		this_rq->calc_load_active = nr_active;
+
+		/* Need to defer idle accounting during load update period: */
+		if (unlikely(time_before(jiffies, this_rq->calc_load_update) &&
+			     time_after_eq(jiffies, calc_load_update))) {
+			atomic_long_add(delta, &deferred);
+			return;
+		}
+
+		deferred_delta = atomic_long_xchg(&deferred, 0);
+		delta += deferred_delta;
+
 		atomic_long_add(delta, &calc_load_tasks);
 	}
 }
@@ -3106,8 +3118,8 @@ static void update_cpu_load(struct rq *this_rq)
 	}
 
 	if (time_after_eq(jiffies, this_rq->calc_load_update)) {
-		this_rq->calc_load_update += LOAD_FREQ;
 		calc_load_account_active(this_rq);
+		this_rq->calc_load_update += LOAD_FREQ;
 	}
 }
 
-- 
1.6.3.3

