* [PATCH v4 0/2] sched: consider missed ticks when updating cpu load
From: byungchul.park @ 2015-10-14  9:47 UTC
  To: mingo, peterz; +Cc: linux-kernel, fweisbec, tglx, Byungchul Park

From: Byungchul Park <byungchul.park@lge.com>

Changes from v3 to v4
- focus the problem on full NOHZ

Changes from v2 to v3
- add a patch making __update_cpu_load() handle the active tickless case

Changes from v1 to v2
- add some additional commit message (the logic is exactly the same)

I will try to fix other issues caused by full NOHZ later with another
patch. I decided to send this patch first because the "cpu load update"
is a standalone problem, not coupled with anything else.

Byungchul Park (2):
  sched: make __update_cpu_load() handle active tickless case
  sched: consider missed ticks in full NOHZ

 include/linux/sched.h    |    4 ++--
 kernel/sched/fair.c      |   58 +++++++++++++++++++++++++++++++++++++++-------
 kernel/time/tick-sched.c |    8 +++----
 3 files changed, 56 insertions(+), 14 deletions(-)

-- 
1.7.9.5


* [PATCH v4 1/2] sched: make __update_cpu_load() handle active tickless case
From: byungchul.park @ 2015-10-14  9:47 UTC
  To: mingo, peterz; +Cc: linux-kernel, fweisbec, tglx, Byungchul Park

From: Byungchul Park <byungchul.park@lge.com>

There are some cases where the distance between ticks is more than one
tick while the cpu is not idle, e.g. full NOHZ.

However, __update_cpu_load() assumes it is the idle tickless case if the
distance between ticks is more than 1, even though it can be the active
tickless case. Thus in the active tickless case, the cpu load cannot be
updated correctly. To update the cpu load properly, this patch changes
__update_cpu_load() so that it can also handle the active tickless case.

We can consider a new_load for each omitted tick during tickless. Each
new_load would be zero in the idle tickless case, while that is not true
in the non-idle tickless case; there we can approximate
new_load ~= the last this_rq->cpu_load[0] during tickless. Now, let's
define a symbol L representing each new_load during tickless, so that
L = 0 in the idle tickless case and L = this_rq->cpu_load[0] in the
non-idle tickless case. Then we can get the cpu_load(n) during tickless
since the last update, by

cpu_load(n) = (1 - 1/s) * cpu_load(n-1) + (1/s) * L
            , where n = the current tick - 1, s = scale

            = A * cpu_load(n-1) + B
            , where A = 1 - 1/s, B = (1/s) * L

            = A * (A * cpu_load(n-2) + B) + B

            = A * (A * (A * cpu_load(n-3) + B) + B) + B

            = A^3 * cpu_load(n-3) + A^2 * B + A * B + B

            = A^i * cpu_load(n-i) + (A^(i-1) + A^(i-2) + ... + 1) * B
            , where i = pending_updates - 1

            = A^i * cpu_load(n-i) + B * (A^i - 1) / (A - 1)
            , by geometric series formula for sum

            = (1 - 1/s)^i * (cpu_load(n-i) - L) + L
            , by extending A and B

Calculation is over!

1. (1 - 1/s)^i can be handled by decay_load_missed().
2. cpu_load(n-i) is the "old_load".
3. L is this_rq->cpu_load[0], as it was at the last update.

Finally,

cpu_load(n) for the current tick will be handled in this turn.
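
To see the closed form in action, here is a minimal user-space sketch
(standalone, with arbitrary assumed values; decay() stands in for
decay_load_missed() without the kernel's precomputed degrade-factor
tables). When L is a multiple of 2^idx, so the shifts stay exact,
iterating the per-tick update and applying the closed form agree:

#include <stdio.h>

/* one averaging step per missed tick: load' = load - load/2^idx + L/2^idx */
static unsigned long decay_naive(unsigned long load, unsigned long L,
				 unsigned long missed, int idx)
{
	while (missed--)
		load = load - (load >> idx) + (L >> idx);
	return load;
}

/* pure decay: load' = (1 - 1/2^idx)^missed * load */
static unsigned long decay(unsigned long load, unsigned long missed, int idx)
{
	while (missed--)
		load -= load >> idx;
	return load;
}

int main(void)
{
	unsigned long load = 1024, L = 256, missed = 7;
	int idx = 2;

	/* closed form: (1 - 1/2^idx)^missed * (load - L) + L */
	printf("iterated:    %lu\n", decay_naive(load, L, missed, idx));
	printf("closed form: %lu\n", decay(load - L, missed, idx) + L);
	return 0;
}

Both lines print 360 here; the subtract, decay, add-back sequence in the
patch below is exactly this closed form.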

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 kernel/sched/fair.c |   56 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 077076f..26473fc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4307,13 +4307,54 @@ decay_load_missed(unsigned long load, unsigned long missed_updates, int idx)
 
 /*
  * Update rq->cpu_load[] statistics. This function is usually called every
- * scheduler tick (TICK_NSEC). With tickless idle this will not be called
- * every tick. We fix it up based on jiffies.
+ * scheduler tick (TICK_NSEC). With tickless this will not be called every
+ * tick. We fix it up based on jiffies.
+ *
+ * We can consider a new_load for each omitted tick during tickless. Each
+ * new_load would be zero in the idle tickless case, while that is not
+ * true in the non-idle tickless case, e.g. full NOHZ; there we can
+ * approximate new_load ~= the last this_rq->cpu_load[0] during tickless.
+ * Now, let's define a symbol L representing each new_load during
+ * tickless, so that L = 0 in the idle tickless case and
+ * L = this_rq->cpu_load[0] in the non-idle tickless case. Then we can
+ * get the cpu_load(n) during tickless since the last update, by
+ *
+ * cpu_load(n) = (1 - 1/s) * cpu_load(n-1) + (1/s) * L
+ *             , where n = the current tick - 1, s = scale
+ *
+ *             = A * cpu_load(n-1) + B
+ *             , where A = 1 - 1/s, B = (1/s) * L
+ *
+ *             = A * (A * cpu_load(n-2) + B) + B
+ *
+ *             = A * (A * (A * cpu_load(n-3) + B) + B) + B
+ *
+ *             = A^3 * cpu_load(n-3) + A^2 * B + A * B + B
+ *
+ *             = A^i * cpu_load(n-i) + (A^(i-1) + A^(i-2) + ... + 1) * B
+ *             , where i = pending_updates - 1
+ *
+ *             = A^i * cpu_load(n-i) + B * (A^i - 1) / (A - 1)
+ *             , by geometric series formula for sum
+ *
+ *             = (1 - 1/s)^i * (cpu_load(n-i) - L) + L
+ *             , by extending A and B
+ *
+ * Calculation is over!
+ *
+ * 1. (1 - 1/s)^i can be handled by decay_load_missed().
+ * 2. cpu_load(n-i) is the "old_load".
+ * 3. L is this_rq->cpu_load[0], as it was at the last update.
+ *
+ * Finally,
+ *
+ * cpu_load(n) for the current tick will be handled in this turn.
  */
 static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
-			      unsigned long pending_updates)
+			      unsigned long pending_updates, int active)
 {
 	int i, scale;
+	unsigned long tickless_load = active ? this_rq->cpu_load[0] : 0;
 
 	this_rq->nr_load_updates++;
 
@@ -4324,8 +4365,9 @@ static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
 
 		/* scale is effectively 1 << i now, and >> i divides by scale */
 
-		old_load = this_rq->cpu_load[i];
+		old_load = this_rq->cpu_load[i] - tickless_load;
 		old_load = decay_load_missed(old_load, pending_updates - 1, i);
+		old_load += tickless_load;
 		new_load = this_load;
 		/*
 		 * Round up the averaging division if load is increasing. This
@@ -4380,7 +4422,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
 	pending_updates = curr_jiffies - this_rq->last_load_update_tick;
 	this_rq->last_load_update_tick = curr_jiffies;
 
-	__update_cpu_load(this_rq, load, pending_updates);
+	__update_cpu_load(this_rq, load, pending_updates, 0);
 }
 
 /*
@@ -4403,7 +4445,7 @@ void update_cpu_load_nohz(void)
 		 * We were idle, this means load 0, the current load might be
 		 * !0 due to remote wakeups and the sort.
 		 */
-		__update_cpu_load(this_rq, 0, pending_updates);
+		__update_cpu_load(this_rq, 0, pending_updates, 0);
 	}
 	raw_spin_unlock(&this_rq->lock);
 }
@@ -4419,7 +4461,7 @@ void update_cpu_load_active(struct rq *this_rq)
 	 * See the mess around update_idle_cpu_load() / update_cpu_load_nohz().
 	 */
 	this_rq->last_load_update_tick = jiffies;
-	__update_cpu_load(this_rq, load, 1);
+	__update_cpu_load(this_rq, load, 1, 1);
 }
 
 /*
-- 
1.7.9.5


* [PATCH v4 2/2] sched: consider missed ticks in full NOHZ
From: byungchul.park @ 2015-10-14  9:47 UTC
  To: mingo, peterz; +Cc: linux-kernel, fweisbec, tglx, Byungchul Park

From: Byungchul Park <byungchul.park@lge.com>

Even though the cpu is non-idle when its tick is stopped in full NOHZ,
the current "update_cpu_load" code unconditionally treats the cpu as if
it had been idle. That is wrong. This patch makes the "update_cpu_load"
code aware of whether the calling path comes from full NOHZ or idle
NOHZ.
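
To make the error concrete with the formula from patch 1/2 (values
chosen purely for illustration): take s = 2, a cpu busy at a constant
cpu_load[0] = 1024, and 4 missed ticks. Treating the cpu as idle
(L = 0) gives

  (1 - 1/2)^4 * 1024 = 64

while the active form (L = 1024) gives

  (1 - 1/2)^4 * (1024 - 1024) + 1024 = 1024

which is the correct steady-state value for a constantly loaded cpu.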

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/sched.h    |    4 ++--
 kernel/sched/fair.c      |    4 ++--
 kernel/time/tick-sched.c |    8 ++++----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 699228b..f1b59aa 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -177,9 +177,9 @@ extern void get_iowait_load(unsigned long *nr_waiters, unsigned long *load);
 extern void calc_global_load(unsigned long ticks);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
-extern void update_cpu_load_nohz(void);
+extern void update_cpu_load_nohz(int active);
 #else
-static inline void update_cpu_load_nohz(void) { }
+static inline void update_cpu_load_nohz(int active) { }
 #endif
 
 extern unsigned long get_parent_ip(unsigned long addr);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 26473fc..63c5ffd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4428,7 +4428,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
 /*
  * Called from tick_nohz_idle_exit() -- try and fix up the ticks we missed.
  */
-void update_cpu_load_nohz(void)
+void update_cpu_load_nohz(int active)
 {
 	struct rq *this_rq = this_rq();
 	unsigned long curr_jiffies = READ_ONCE(jiffies);
@@ -4445,7 +4445,7 @@ void update_cpu_load_nohz(void)
 		 * We were idle, this means load 0, the current load might be
 		 * !0 due to remote wakeups and the sort.
 		 */
-		__update_cpu_load(this_rq, 0, pending_updates, 0);
+		__update_cpu_load(this_rq, 0, pending_updates, active);
 	}
 	raw_spin_unlock(&this_rq->lock);
 }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 7c7ec45..515edf3 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -694,11 +694,11 @@ out:
 	return tick;
 }
 
-static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
+static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now, int active)
 {
 	/* Update jiffies first */
 	tick_do_update_jiffies64(now);
-	update_cpu_load_nohz();
+	update_cpu_load_nohz(active);
 
 	calc_load_exit_idle();
 	touch_softlockup_watchdog();
@@ -725,7 +725,7 @@ static void tick_nohz_full_update_tick(struct tick_sched *ts)
 	if (can_stop_full_tick())
 		tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
 	else if (ts->tick_stopped)
-		tick_nohz_restart_sched_tick(ts, ktime_get());
+		tick_nohz_restart_sched_tick(ts, ktime_get(), 1);
 #endif
 }
 
@@ -916,7 +916,7 @@ void tick_nohz_idle_exit(void)
 		tick_nohz_stop_idle(ts, now);
 
 	if (ts->tick_stopped) {
-		tick_nohz_restart_sched_tick(ts, now);
+		tick_nohz_restart_sched_tick(ts, now, 0);
 		tick_nohz_account_idle_ticks(ts);
 	}
 
-- 
1.7.9.5


* Re: [PATCH v4 1/2] sched: make __update_cpu_load() handle active tickless case
From: Peter Zijlstra @ 2015-10-19 13:16 UTC
  To: byungchul.park; +Cc: mingo, linux-kernel, fweisbec, tglx

On Wed, Oct 14, 2015 at 06:47:35PM +0900, byungchul.park@lge.com wrote:
> 
> cpu_load(n) = (1 - 1/s) * cpu_load(n-1) + (1/s) * L

So I've been taught to use subscripts, not arguments, for progressive
values of the same thing. However, I can see the recursive nature of
your definition, so I can well imagine people having different views on
it.

>             , where n = the current tick - 1, s = scale
> 
>             = A * cpu_load(n-1) + B
>             , where A = 1 - 1/s, B = (1/s) * L
> 
>             = A * (A * cpu_load(n-2) + B) + B
> 
>             = A * (A * (A * cpu_load(n-3) + B) + B) + B
> 
>             = A^3 * cpu_load(n-3) + A^2 * B + A * B + B
> 
>             = A^i * cpu_load(n-i) + (A^(i-1) + A^(i-2) + ... + 1) * B
>             , where i = pending_updates - 1

You missed an opportunity here: if you take i == n you avoid the need
for i entirely.

>             = A^i * cpu_load(n-i) + B * (A^i - 1) / (A - 1)
>             , by geometric series formula for sum

That's wrong; the limited geometric series expands to:

  a * (1 - r^n) / (1 - r)

Which would give: A^i * cpu_load(n-i) + B * (1 - A^i) / (1 - A)

>             = (1 - 1/s)^i * (cpu_load(n-i) - L) + L
>             , by extending A and B

This appears to be correct, however; I think your above mistake must
have been a copying error.

I've rewritten the things a little; does this look good to you?

---
Subject: sched: make __update_cpu_load() handle active tickless case
From: Byungchul Park <byungchul.park@lge.com>
Date: Wed, 14 Oct 2015 18:47:35 +0900

XXX write new changelog...

Cc: mingo@kernel.org
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1444816056-11886-2-git-send-email-byungchul.park@lge.com
---
 kernel/sched/fair.c |   49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4298,14 +4298,46 @@ decay_load_missed(unsigned long load, un
 	return load;
 }
 
-/*
+/**
+ * __update_cpu_load - update the rq->cpu_load[] statistics
+ * @this_rq: The rq to update statistics for
+ * @this_load: The current load
+ * @pending_updates: The number of missed updates
+ * @active: !0 for NOHZ_FULL
+ *
  * Update rq->cpu_load[] statistics. This function is usually called every
- * scheduler tick (TICK_NSEC). With tickless idle this will not be called
- * every tick. We fix it up based on jiffies.
+ * scheduler tick (TICK_NSEC).
+ *
+ * This function computes a decaying average:
+ *
+ *   load[i]' = (1 - 1/2^i) * load[i] + (1/2^i) * load
+ *
+ * Because of NOHZ it might not get called on every tick which gives need for
+ * the @pending_updates argument.
+ *
+ *   load[i]_n = (1 - 1/2^i) * load[i]_n-1 + (1/2^i) * load_n-1
+ *             = A * load[i]_n-1 + B ; A := (1 - 1/2^i), B := (1/2^i) * load
+ *             = A * (A * load[i]_n-2 + B) + B
+ *             = A * (A * (A * load[i]_n-3 + B) + B) + B
+ *             = A^3 * load[i]_n-3 + (A^2 + A + 1) * B
+ *             = A^n * load[i]_0 + (A^(n-1) + A^(n-2) + ... + 1) * B
+ *             = A^n * load[i]_0 + ((1 - A^n) / (1 - A)) * B
+ *             = (1 - 1/2^i)^n * (load[i]_0 - load) + load
+ *
+ * In the above we've assumed load_n := load, which is true for NOHZ_FULL as
+ * any change in load would have resulted in the tick being turned back on.
+ *
+ * For regular NOHZ, this reduces to:
+ *
+ *   load[i]_n = (1 - 1/2^i)^n * load[i]_0
+ *
+ * see decay_load_missed(). For NOHZ_FULL we get to subtract and add the extra
+ * term. See the @active parameter.
  */
 static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
-			      unsigned long pending_updates)
+			      unsigned long pending_updates, int active)
 {
+	unsigned long tickless_load = active ? this_rq->cpu_load[0] : 0;
 	int i, scale;
 
 	this_rq->nr_load_updates++;
@@ -4317,8 +4349,9 @@ static void __update_cpu_load(struct rq
 
 		/* scale is effectively 1 << i now, and >> i divides by scale */
 
-		old_load = this_rq->cpu_load[i];
+		old_load = this_rq->cpu_load[i] - tickless_load;
 		old_load = decay_load_missed(old_load, pending_updates - 1, i);
+		old_load += tickless_load;
 		new_load = this_load;
 		/*
 		 * Round up the averaging division if load is increasing. This
@@ -4373,7 +4406,7 @@ static void update_idle_cpu_load(struct
 	pending_updates = curr_jiffies - this_rq->last_load_update_tick;
 	this_rq->last_load_update_tick = curr_jiffies;
 
-	__update_cpu_load(this_rq, load, pending_updates);
+	__update_cpu_load(this_rq, load, pending_updates, 0);
 }
 
 /*
@@ -4396,7 +4429,7 @@ void update_cpu_load_nohz(void)
 		 * We were idle, this means load 0, the current load might be
 		 * !0 due to remote wakeups and the sort.
 		 */
-		__update_cpu_load(this_rq, 0, pending_updates);
+		__update_cpu_load(this_rq, 0, pending_updates, 0);
 	}
 	raw_spin_unlock(&this_rq->lock);
 }
@@ -4412,7 +4445,7 @@ void update_cpu_load_active(struct rq *t
 	 * See the mess around update_idle_cpu_load() / update_cpu_load_nohz().
 	 */
 	this_rq->last_load_update_tick = jiffies;
-	__update_cpu_load(this_rq, load, 1);
+	__update_cpu_load(this_rq, load, 1, 1);
 }
 
 /*

* Re: [PATCH v4 1/2] sched: make __update_cpu_load() handle active tickless case
From: Byungchul Park @ 2015-10-20  0:49 UTC
  To: Peter Zijlstra; +Cc: mingo, linux-kernel, fweisbec, tglx

On Mon, Oct 19, 2015 at 03:16:18PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 14, 2015 at 06:47:35PM +0900, byungchul.park@lge.com wrote:
> > 
> > cpu_load(n) = (1 - 1/s) * cpu_load(n-1) + (1/s) * L
> 
> So I've been taught to use subscripts, not arguments, for progressive
> values of the same thing. However I can see the recursive nature of you
> definition so I can well imagine people having different views on it.
> 
> >             , where n = the current tick - 1, s = scale
> > 
> >             = A * cpu_load(n-1) + B
> >             , where A = 1 - 1/s, B = (1/s) * L
> > 
> >             = A * (A * cpu_load(n-2) + B) + B
> > 
> >             = A * (A * (A * cpu_load(n-3) + B) + B) + B
> > 
> >             = A^3 * cpu_load(n-3) + A^2 * B + A * B + B
> > 
> >             = A^i * cpu_load(n-i) + (A^(i-1) + A^(i-2) + ... + 1) * B
> >             , where i = pending_updates - 1
> 
> You missed an opportunity here, if you take i==n you avoid the need for
> i entirely.

I don't think so. As I said, _n_ is the current tick - 1 and _i_ is
pending_updates - 1. We cannot take i == n, but should keep (n-i).

> 
> >             = A^i * cpu_load(n-i) + B * (A^i - 1) / (A - 1)
> >             , by geometric series formula for sum
> 
> That's wrong; the limited geometric series expands to:

No, that's not wrong; it doesn't matter at all.

a * (1 - r^n) / (1 - r)
= a * (-1)(r^n - 1) / (-1)(r - 1)
= a * (r^n - 1) / (r - 1)

I mean these two are exactly the same.

> 
>   a * (1 - r^n) / (1 - r)

But I think this is also a good one.

> 
> Which would give: A^i * cpu_load(n-i) + B * (1 - A^i) / (1 - A)
> 
> >             = (1 - 1/s)^i * (cpu_load(n-i) - L) + L
> >             , by extending A and B
> 
> This appears to be correct however, I think your above mistake must have
> been one copying.
> 
> I've rewritten the things a little; does this look good to you?

However, your expressions and descriptions below look better than mine,
except for some logical errors. Could you keep my logical flow
unchanged?

Thanks anyway,
byungchul

> 
> ---
> Subject: sched: make __update_cpu_load() handle active tickless case
> From: Byungchul Park <byungchul.park@lge.com>
> Date: Wed, 14 Oct 2015 18:47:35 +0900
> 
> XXX write new changelog...
> 
> Cc: mingo@kernel.org
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: http://lkml.kernel.org/r/1444816056-11886-2-git-send-email-byungchul.park@lge.com
> ---
>  kernel/sched/fair.c |   49 +++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 41 insertions(+), 8 deletions(-)
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4298,14 +4298,46 @@ decay_load_missed(unsigned long load, un
>  	return load;
>  }
>  
> -/*
> +/**
> + * __update_cpu_load - update the rq->cpu_load[] statistics
> + * @this_rq: The rq to update statistics for
> + * @this_load: The current load
> + * @pending_updates: The number of missed updates
> + * @active: !0 for NOHZ_FULL
> + *
>   * Update rq->cpu_load[] statistics. This function is usually called every
> - * scheduler tick (TICK_NSEC). With tickless idle this will not be called
> - * every tick. We fix it up based on jiffies.
> + * scheduler tick (TICK_NSEC).
> + *
> + * This function computes a decaying average:
> + *
> + *   load[i]' = (1 - 1/2^i) * load[i] + (1/2^i) * load
> + *
> + * Because of NOHZ it might not get called on every tick which gives need for
> + * the @pending_updates argument.
> + *
> + *   load[i]_n = (1 - 1/2^i) * load[i]_n-1 + (1/2^i) * load_n-1
> + *             = A * load[i]_n-1 + B ; A := (1 - 1/2^i), B := (1/2^i) * load
> + *             = A * (A * load[i]_n-2 + B) + B
> + *             = A * (A * (A * load[i]_n-3 + B) + B) + B
> + *             = A^3 * load[i]_n-3 + (A^2 + A + 1) * B
> + *             = A^n * load[i]_0 + (A^(n-1) + A^(n-2) + ... + 1) * B
> + *             = A^n * load[i]_0 + ((1 - A^n) / (1 - A)) * B
> + *             = (1 - 1/2^i)^n * (load[i]_0 - load) + load
> + *
> + * In the above we've assumed load_n := load, which is true for NOHZ_FULL as
> + * any change in load would have resulted in the tick being turned back on.
> + *
> + * For regular NOHZ, this reduces to:
> + *
> + *   load[i]_n = (1 - 1/2^i)^n * load[i]_0
> + *
> + * see decay_load_missed(). For NOHZ_FULL we get to subtract and add the extra
> + * term. See the @active parameter.
>   */
>  static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
> -			      unsigned long pending_updates)
> +			      unsigned long pending_updates, int active)
>  {
> +	unsigned long tickless_load = active ? this_rq->cpu_load[0] : 0;
>  	int i, scale;
>  
>  	this_rq->nr_load_updates++;
> @@ -4317,8 +4349,9 @@ static void __update_cpu_load(struct rq
>  
>  		/* scale is effectively 1 << i now, and >> i divides by scale */
>  
> -		old_load = this_rq->cpu_load[i];
> +		old_load = this_rq->cpu_load[i] - tickless_load;
>  		old_load = decay_load_missed(old_load, pending_updates - 1, i);
> +		old_load += tickless_load;
>  		new_load = this_load;
>  		/*
>  		 * Round up the averaging division if load is increasing. This
> @@ -4373,7 +4406,7 @@ static void update_idle_cpu_load(struct
>  	pending_updates = curr_jiffies - this_rq->last_load_update_tick;
>  	this_rq->last_load_update_tick = curr_jiffies;
>  
> -	__update_cpu_load(this_rq, load, pending_updates);
> +	__update_cpu_load(this_rq, load, pending_updates, 0);
>  }
>  
>  /*
> @@ -4396,7 +4429,7 @@ void update_cpu_load_nohz(void)
>  		 * We were idle, this means load 0, the current load might be
>  		 * !0 due to remote wakeups and the sort.
>  		 */
> -		__update_cpu_load(this_rq, 0, pending_updates);
> +		__update_cpu_load(this_rq, 0, pending_updates, 0);
>  	}
>  	raw_spin_unlock(&this_rq->lock);
>  }
> @@ -4412,7 +4445,7 @@ void update_cpu_load_active(struct rq *t
>  	 * See the mess around update_idle_cpu_load() / update_cpu_load_nohz().
>  	 */
>  	this_rq->last_load_update_tick = jiffies;
> -	__update_cpu_load(this_rq, load, 1);
> +	__update_cpu_load(this_rq, load, 1, 1);
>  }
>  
>  /*

* Re: [PATCH v4 1/2] sched: make __update_cpu_load() handle active tickless case
From: Peter Zijlstra @ 2015-10-20  9:48 UTC
  To: Byungchul Park; +Cc: mingo, linux-kernel, fweisbec, tglx

On Tue, Oct 20, 2015 at 09:49:32AM +0900, Byungchul Park wrote:
> > >             = A^i * cpu_load(n-i) + (A^(i-1) + A^(i-2) + ... + 1) * B
> > >             , where i = pending_updates - 1
> > 
> > You missed an opportunity here, if you take i==n you avoid the need for
> > i entirely.
> 
> i don't think so. as i said, _n_ is the current tick -1 and _i_ is
> pending_updates - 1. we cannot take i == n, but should keep (n-i).

Just think relative; it doesn't matter when in time we do this.
So 0 to n is identical to any other interval.
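
Concretely, relabelling the last update as tick 0 and letting n count
the ticks since then leaves the recurrence unchanged, with i = n:

  cpu_load(n) = A^n * cpu_load(0) + B * (A^n - 1) / (A - 1)

so the separate symbol i disappears.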

> > >             = A^i * cpu_load(n-i) + B * (A^i - 1) / (A - 1)
> > >             , by geometric series formula for sum
> > 
> > That's wrong; the limited geometric series expands to:
> 
> NO, that's not wrong. it doesn't matter at all.
> 
> a * (1 - r^n) / (1 - r)
> = a * (-1)(r^n - 1) / (-1)(r - 1)
> = a * (r^n - 1) / (r - 1)
> 
> i mean these two are exactly same.

Ah indeed! Sorry for that. I clearly didn't think beyond the series
expansion I found.

> > I've rewritten the things a little; does this look good to you?
> 
> however, your expressions and descriptions below look better than me,
> except some logical errors. could you keep my logical flow unchagned?

> > + *   load[i]_n = (1 - 1/2^i) * load[i]_n-1 + (1/2^i) * load_n-1
> > + *             = A * load[i]_n-1 + B ; A := (1 - 1/2^i), B := (1/2^i) * load
> > + *             = A * (A * load[i]_n-2 + B) + B
> > + *             = A * (A * (A * load[i]_n-3 + B) + B) + B
> > + *             = A^3 * load[i]_n-3 + (A^2 + A + 1) * B
> > + *             = A^n * load[i]_0 + (A^(n-1) + A^(n-2) + ... + 1) * B
> > + *             = A^n * load[i]_0 + ((1 - A^n) / (1 - A)) * B
> > + *             = (1 - 1/2^i)^n * (load[i]_0 - load) + load

That is the same logic, right?

Please be more specific as to what you'd like restored.

* Re: [PATCH v4 1/2] sched: make __update_cpu_load() handle active tickless case
From: Byungchul Park @ 2015-10-22 10:28 UTC
  To: Peter Zijlstra; +Cc: mingo, linux-kernel, fweisbec, tglx

On Tue, Oct 20, 2015 at 11:48:48AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 20, 2015 at 09:49:32AM +0900, Byungchul Park wrote:
> > > >             = A^i * cpu_load(n-i) + (A^(i-1) + A^(i-2) + ... + 1) * B
> > > >             , where i = pending_updates - 1
> > > 
> > > You missed an opportunity here, if you take i==n you avoid the need for
> > > i entirely.
> > 
> > i don't think so. as i said, _n_ is the current tick -1 and _i_ is
> > pending_updates - 1. we cannot take i == n, but should keep (n-i).
> 
> Just think relative; it doesn't matter when in time we do this.
> So 0 to n is identical to any other interval.

You are right. I'm always tired these days. I was confused.
Sorry for that.

> 
> > > >             = A^i * cpu_load(n-i) + B * (A^i - 1) / (A - 1)
> > > >             , by geometric series formula for sum
> > > 
> > > That's wrong; the limited geometric series expands to:
> > 
> > NO, that's not wrong. it doesn't matter at all.
> > 
> > a * (1 - r^n) / (1 - r)
> > = a * (-1)(r^n - 1) / (-1)(r - 1)
> > = a * (r^n - 1) / (r - 1)
> > 
> > i mean these two are exactly same.
> 
> Ah indeed! Sorry for that. I clearly didn't think beyond the series
> expansion I found.
> 
> > > I've rewritten the things a little; does this look good to you?
> > 
> > however, your expressions and descriptions below look better than me,
> > except some logical errors. could you keep my logical flow unchagned?
> 
> > > + *   load[i]_n = (1 - 1/2^i) * load[i]_n-1 + (1/2^i) * load_n-1
> > > + *             = A * load[i]_n-1 + B ; A := (1 - 1/2^i), B := (1/2^i) * load
> > > + *             = A * (A * load[i]_n-2 + B) + B
> > > + *             = A * (A * (A * load[i]_n-3 + B) + B) + B
> > > + *             = A^3 * load[i]_n-3 + (A^2 + A + 1) * B
> > > + *             = A^n * load[i]_0 + (A^(n-1) + A^(n-2) + ... + 1) * B
> > > + *             = A^n * load[i]_0 + ((1 - A^n) / (1 - A)) * B
> > > + *             = (1 - 1/2^i)^n * (load[i]_0 - load) + load
> 
> That is the same logic, right?

Yes, absolutely same logic.

Thanks a lot,
byungchul

> 
> Please be more specific as to what you'd like restored.

* Re: [PATCH v4 1/2] sched: make __update_cpu_load() handle active tickless case
From: Peter Zijlstra @ 2015-10-22 11:05 UTC
  To: Byungchul Park; +Cc: mingo, linux-kernel, fweisbec, tglx

On Thu, Oct 22, 2015 at 07:28:52PM +0900, Byungchul Park wrote:

> You are right. I'm always tired these days. I was confused.
> Sorry for that.

No problem, I know the feeling, my second kid is growing teeth :-)

* Re: [PATCH v4 2/2] sched: consider missed ticks in full NOHZ
From: Frederic Weisbecker @ 2015-10-22 16:20 UTC
  To: byungchul.park; +Cc: mingo, peterz, linux-kernel, tglx

On Wed, Oct 14, 2015 at 06:47:36PM +0900, byungchul.park@lge.com wrote:
> From: Byungchul Park <byungchul.park@lge.com>
> 
> Even though the cpu is non-idle when its tick is stopped in full NOHZ,
> the current "update_cpu_load" code unconditionally treats the cpu as if
> it had been idle. That is wrong. This patch makes the "update_cpu_load"
> code aware of whether the calling path comes from full NOHZ or idle
> NOHZ.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>

ACK.

Thanks!

* Re: [PATCH v4 2/2] sched: consider missed ticks in full NOHZ
From: Peter Zijlstra @ 2015-11-02 16:10 UTC
  To: byungchul.park; +Cc: mingo, linux-kernel, fweisbec, tglx

On Wed, Oct 14, 2015 at 06:47:36PM +0900, byungchul.park@lge.com wrote:
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4428,7 +4428,7 @@ static void update_idle_cpu_load(struct rq *this_rq)

So if one were to read the comment above update_idle_cpu_load(), one
would find there's a problem with jiffy-based accounting.

>  /*
>   * Called from tick_nohz_idle_exit() -- try and fix up the ticks we missed.
>   */
> -void update_cpu_load_nohz(void)
> +void update_cpu_load_nohz(int active)

> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 7c7ec45..515edf3 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c

> -static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
> +static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now, int active)
>  {
>  	/* Update jiffies first */
>  	tick_do_update_jiffies64(now);
> -	update_cpu_load_nohz();
> +	update_cpu_load_nohz(active);
>  
>  	calc_load_exit_idle();
>  	touch_softlockup_watchdog();

And we could solve all that nicely if we pull up the hrtimer_forward()
result from tick_nohz_restart(); that way we have the actual number of
ticks lost on this cpu, and no need to start guessing about it.

Hmm?

* Re: [PATCH v4 2/2] sched: consider missed ticks in full NOHZ
From: Byungchul Park @ 2015-11-09  2:36 UTC
  To: Peter Zijlstra; +Cc: mingo, linux-kernel, fweisbec, tglx

On Mon, Nov 02, 2015 at 05:10:16PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 14, 2015 at 06:47:36PM +0900, byungchul.park@lge.com wrote:
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4428,7 +4428,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
> 
> So if one were to read the comment above update_idle_cpu_load() one
> would find there's a problem with jiffy based accounting.
> 
> >  /*
> >   * Called from tick_nohz_idle_exit() -- try and fix up the ticks we missed.
> >   */
> > -void update_cpu_load_nohz(void)
> > +void update_cpu_load_nohz(int active)
> 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 7c7ec45..515edf3 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> 
> > -static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
> > +static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now, int active)
> >  {
> >  	/* Update jiffies first */
> >  	tick_do_update_jiffies64(now);
> > -	update_cpu_load_nohz();
> > +	update_cpu_load_nohz(active);
> >  
> >  	calc_load_exit_idle();
> >  	touch_softlockup_watchdog();
> 
> And we could solve all that nicely if we pull up the hrtimer_forward()
> result from tick_nohz_restart(), that way we have the actual number of
> ticks lost on this cpu, and no need to start guessing about it.

Hello,

Are you talking about the lag between the jiffies writer and reader?
I think your proposal can solve the problem for update_cpu_load_nohz(),
but it is still hard to handle the cases of update_idle_cpu_load()
and update_cpu_load_active() even with the way you proposed.

Do you think it would be ok even if it solves only one case?
update_idle_cpu_load() still needs to guess about it. Is there something
I missed, or did I misunderstand what you intend?

> 
> Hmm?

* Re: [PATCH v4 2/2] sched: consider missed ticks in full NOHZ
From: Peter Zijlstra @ 2015-11-09 10:36 UTC
  To: Byungchul Park; +Cc: mingo, linux-kernel, fweisbec, tglx

On Mon, Nov 09, 2015 at 11:36:54AM +0900, Byungchul Park wrote:
> On Mon, Nov 02, 2015 at 05:10:16PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 14, 2015 at 06:47:36PM +0900, byungchul.park@lge.com wrote:
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -4428,7 +4428,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
> > 
> > So if one were to read the comment above update_idle_cpu_load() one
> > would find there's a problem with jiffy based accounting.
> > 
> > >  /*
> > >   * Called from tick_nohz_idle_exit() -- try and fix up the ticks we missed.
> > >   */
> > > -void update_cpu_load_nohz(void)
> > > +void update_cpu_load_nohz(int active)
> > 
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 7c7ec45..515edf3 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > 
> > > -static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
> > > +static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now, int active)
> > >  {
> > >  	/* Update jiffies first */
> > >  	tick_do_update_jiffies64(now);
> > > -	update_cpu_load_nohz();
> > > +	update_cpu_load_nohz(active);
> > >  
> > >  	calc_load_exit_idle();
> > >  	touch_softlockup_watchdog();
> > 
> > And we could solve all that nicely if we pull up the hrtimer_forward()
> > result from tick_nohz_restart(), that way we have the actual number of
> > ticks lost on this cpu, and no need to start guessing about it.
> 
> hello,
> 
> are you talking about the lag between writer and reader for jiffies?
> i think your proposal can solve the problem of update_cpu_load_nohz().
> but it's still hard to care the cases of update_idle_cpu_load()
> and update_cpu_load_active() even by the way you proposed.
> 
> do you think it would be ok even if it solves only one case?
> update_idle_cpu_load() still need to guess about it. is there something
> i missed? or did i mis-understand what you intend?

I was thinking of getting rid of rq->last_load_update_tick entirely. If
we can pass in how many (local) ticks were lost on this cpu, we don't
have to rely on the jiffy counter at all.
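
A rough sketch of that direction (hypothetical, not a posted patch; it
relies only on hrtimer_forward() returning the number of whole intervals
the timer was forwarded, and assumes a changed update_cpu_load_nohz()
that takes the precise tick count):

/*
 * Hypothetical sketch.  hrtimer_forward() reports how many tick
 * periods ts->sched_timer had to be forwarded to get past 'now',
 * which is exactly the number of local ticks this cpu skipped while
 * its tick was stopped.  Feeding that count down lets the load update
 * drop the jiffies diff and rq->last_load_update_tick entirely.
 */
static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
{
	u64 ticks_lost;

	/* Update jiffies first */
	tick_do_update_jiffies64(now);

	hrtimer_set_expires(&ts->sched_timer, ts->last_tick);
	ticks_lost = hrtimer_forward(&ts->sched_timer, now, tick_period);

	/* hypothetical signature: pass the precise local tick count */
	update_cpu_load_nohz(ticks_lost);

	calc_load_exit_idle();
	touch_softlockup_watchdog();
}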

* Re: [PATCH v4 2/2] sched: consider missed ticks in full NOHZ
From: Byungchul Park @ 2015-11-09 11:16 UTC
  To: Peter Zijlstra; +Cc: mingo, linux-kernel, fweisbec, tglx

On Mon, Nov 09, 2015 at 11:36:06AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 09, 2015 at 11:36:54AM +0900, Byungchul Park wrote:
> > On Mon, Nov 02, 2015 at 05:10:16PM +0100, Peter Zijlstra wrote:
> > > On Wed, Oct 14, 2015 at 06:47:36PM +0900, byungchul.park@lge.com wrote:
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -4428,7 +4428,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
> > > 
> > > So if one were to read the comment above update_idle_cpu_load() one
> > > would find there's a problem with jiffy based accounting.
> > > 
> > > >  /*
> > > >   * Called from tick_nohz_idle_exit() -- try and fix up the ticks we missed.
> > > >   */
> > > > -void update_cpu_load_nohz(void)
> > > > +void update_cpu_load_nohz(int active)
> > > 
> > > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > > index 7c7ec45..515edf3 100644
> > > > --- a/kernel/time/tick-sched.c
> > > > +++ b/kernel/time/tick-sched.c
> > > 
> > > > -static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
> > > > +static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now, int active)
> > > >  {
> > > >  	/* Update jiffies first */
> > > >  	tick_do_update_jiffies64(now);
> > > > -	update_cpu_load_nohz();
> > > > +	update_cpu_load_nohz(active);
> > > >  
> > > >  	calc_load_exit_idle();
> > > >  	touch_softlockup_watchdog();
> > > 
> > > And we could solve all that nicely if we pull up the hrtimer_forward()
> > > result from tick_nohz_restart(), that way we have the actual number of
> > > ticks lost on this cpu, and no need to start guessing about it.
> > 
> > hello,
> > 
> > are you talking about the lag between writer and reader for jiffies?
> > i think your proposal can solve the problem of update_cpu_load_nohz().
> > but it's still hard to care the cases of update_idle_cpu_load()
> > and update_cpu_load_active() even by the way you proposed.
> > 
> > do you think it would be ok even if it solves only one case?
> > update_idle_cpu_load() still need to guess about it. is there something
> > i missed? or did i mis-understand what you intend?
> 
> I was thinking of getting rid of rq->last_load_update_tick entirely. If
> we can pass in how many (local) ticks were lost on this cpu, we don't
> have to rely on the jiffy counter at all.

I see.

Thank you.


* [tip:sched/core] sched/fair: Prepare __update_cpu_load() to handle active tickless
From: tip-bot for Byungchul Park @ 2015-11-23 16:18 UTC
  To: linux-tip-commits
  Cc: linux-kernel, tglx, torvalds, byungchul.park, hpa, peterz, efault, mingo

Commit-ID:  59543275488d18d878cd2ab2b1072efc1e9ac1c4
Gitweb:     http://git.kernel.org/tip/59543275488d18d878cd2ab2b1072efc1e9ac1c4
Author:     Byungchul Park <byungchul.park@lge.com>
AuthorDate: Wed, 14 Oct 2015 18:47:35 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 23 Nov 2015 09:37:53 +0100

sched/fair: Prepare __update_cpu_load() to handle active tickless

There are some cases where the distance between ticks is more than one
tick while the CPU is not idle, e.g. full NOHZ.

However __update_cpu_load() assumes it is the idle tickless case if the
distance between ticks is more than 1, even though it can be the active
tickless case as well. Thus in the active tickless case, updating the CPU
load will not be performed correctly.

Where the current code assumes the load for each tick is zero, this is
(obviously) not true in the non-idle tickless case. We can approximately
consider the load ~= this_rq->cpu_load[0] during tickless in the
non-idle tickless case.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444816056-11886-2-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8f3905e..404006a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4283,14 +4283,46 @@ decay_load_missed(unsigned long load, unsigned long missed_updates, int idx)
 	return load;
 }
 
-/*
+/**
+ * __update_cpu_load - update the rq->cpu_load[] statistics
+ * @this_rq: The rq to update statistics for
+ * @this_load: The current load
+ * @pending_updates: The number of missed updates
+ * @active: !0 for NOHZ_FULL
+ *
  * Update rq->cpu_load[] statistics. This function is usually called every
- * scheduler tick (TICK_NSEC). With tickless idle this will not be called
- * every tick. We fix it up based on jiffies.
+ * scheduler tick (TICK_NSEC).
+ *
+ * This function computes a decaying average:
+ *
+ *   load[i]' = (1 - 1/2^i) * load[i] + (1/2^i) * load
+ *
+ * Because of NOHZ it might not get called on every tick which gives need for
+ * the @pending_updates argument.
+ *
+ *   load[i]_n = (1 - 1/2^i) * load[i]_n-1 + (1/2^i) * load_n-1
+ *             = A * load[i]_n-1 + B ; A := (1 - 1/2^i), B := (1/2^i) * load
+ *             = A * (A * load[i]_n-2 + B) + B
+ *             = A * (A * (A * load[i]_n-3 + B) + B) + B
+ *             = A^3 * load[i]_n-3 + (A^2 + A + 1) * B
+ *             = A^n * load[i]_0 + (A^(n-1) + A^(n-2) + ... + 1) * B
+ *             = A^n * load[i]_0 + ((1 - A^n) / (1 - A)) * B
+ *             = (1 - 1/2^i)^n * (load[i]_0 - load) + load
+ *
+ * In the above we've assumed load_n := load, which is true for NOHZ_FULL as
+ * any change in load would have resulted in the tick being turned back on.
+ *
+ * For regular NOHZ, this reduces to:
+ *
+ *   load[i]_n = (1 - 1/2^i)^n * load[i]_0
+ *
+ * see decay_load_missed(). For NOHZ_FULL we get to subtract and add the extra
+ * term. See the @active parameter.
  */
 static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
-			      unsigned long pending_updates)
+			      unsigned long pending_updates, int active)
 {
+	unsigned long tickless_load = active ? this_rq->cpu_load[0] : 0;
 	int i, scale;
 
 	this_rq->nr_load_updates++;
@@ -4302,8 +4334,9 @@ static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
 
 		/* scale is effectively 1 << i now, and >> i divides by scale */
 
-		old_load = this_rq->cpu_load[i];
+		old_load = this_rq->cpu_load[i] - tickless_load;
 		old_load = decay_load_missed(old_load, pending_updates - 1, i);
+		old_load += tickless_load;
 		new_load = this_load;
 		/*
 		 * Round up the averaging division if load is increasing. This
@@ -4358,7 +4391,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
 	pending_updates = curr_jiffies - this_rq->last_load_update_tick;
 	this_rq->last_load_update_tick = curr_jiffies;
 
-	__update_cpu_load(this_rq, load, pending_updates);
+	__update_cpu_load(this_rq, load, pending_updates, 0);
 }
 
 /*
@@ -4381,7 +4414,7 @@ void update_cpu_load_nohz(void)
 		 * We were idle, this means load 0, the current load might be
 		 * !0 due to remote wakeups and the sort.
 		 */
-		__update_cpu_load(this_rq, 0, pending_updates);
+		__update_cpu_load(this_rq, 0, pending_updates, 0);
 	}
 	raw_spin_unlock(&this_rq->lock);
 }
@@ -4397,7 +4430,7 @@ void update_cpu_load_active(struct rq *this_rq)
 	 * See the mess around update_idle_cpu_load() / update_cpu_load_nohz().
 	 */
 	this_rq->last_load_update_tick = jiffies;
-	__update_cpu_load(this_rq, load, 1);
+	__update_cpu_load(this_rq, load, 1, 1);
 }
 
 /*
