All of lore.kernel.org
 help / color / mirror / Atom feed
From: Morten Rasmussen <morten.rasmussen@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org,
	josef@toxicpanda.com, torvalds@linux-foundation.org,
	vincent.guittot@linaro.org, efault@gmx.de, pjt@google.com,
	clm@fb.com, dietmar.eggemann@arm.com, bsegall@google.com,
	yuyang.du@intel.com
Subject: Re: [PATCH -v2 12/18] sched/fair: Rewrite PELT migration propagation
Date: Mon, 9 Oct 2017 09:08:57 +0100	[thread overview]
Message-ID: <20171009080856.GB24129@e105550-lin.cambridge.arm.com> (raw)
In-Reply-To: <20170901132748.580255511@infradead.org>

On Fri, Sep 01, 2017 at 03:21:11PM +0200, Peter Zijlstra wrote:
> When an entity migrates in (or out) of a runqueue, we need to add (or
> remove) its contribution from the entire PELT hierarchy, because even
> non-runnable entities are included in the load average sums.
> 
> In order to do this we have some propagation logic that updates the
> PELT tree, however the way it 'propagates' the runnable (or load)
> change is (more or less):
> 
>                      tg->weight * grq->avg.load_avg
>   ge->avg.load_avg = ------------------------------
>                                tg->load_avg
> 
> But that is the expression for ge->weight, and per the definition of
> load_avg:
> 
>   ge->avg.load_avg := ge->weight * ge->avg.runnable_avg

You may want to replace "ge->weight" by "ge->load.weight" to be consistent
with the code comments introduced by the patch.

> That destroys the runnable_avg (by setting it to 1) we wanted to
> propagate.
> 
> Instead directly propagate runnable_sum.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/sched/debug.c |    2 
>  kernel/sched/fair.c  |  186 ++++++++++++++++++++++++++++-----------------------
>  kernel/sched/sched.h |    9 +-
>  3 files changed, 112 insertions(+), 85 deletions(-)
> 
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -565,6 +565,8 @@ void print_cfs_rq(struct seq_file *m, in
>  			cfs_rq->removed.load_avg);
>  	SEQ_printf(m, "  .%-30s: %ld\n", "removed.util_avg",
>  			cfs_rq->removed.util_avg);
> +	SEQ_printf(m, "  .%-30s: %ld\n", "removed.runnable_sum",
> +			cfs_rq->removed.runnable_sum);
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  	SEQ_printf(m, "  .%-30s: %lu\n", "tg_load_avg_contrib",
>  			cfs_rq->tg_load_avg_contrib);
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3330,11 +3330,77 @@ void set_task_rq_fair(struct sched_entit
>  	se->avg.last_update_time = n_last_update_time;
>  }
>  
> -/* Take into account change of utilization of a child task group */
> +
> +/*
> + * When on migration a sched_entity joins/leaves the PELT hierarchy, we need to
> + * propagate its contribution. The key to this propagation is the invariant
> + * that for each group:
> + *
> + *   ge->avg == grq->avg						(1)
> + *
> + * _IFF_ we look at the pure running and runnable sums. Because they
> + * represent the very same entity, just at different points in the hierarchy.
> + *
> + *
> + * Per the above update_tg_cfs_util() is trivial (and still 'wrong') and
> + * simply copies the running sum over.
> + *
> + * However, update_tg_cfs_runnable() is more complex. So we have:
> + *
> + *   ge->avg.load_avg = ge->load.weight * ge->avg.runnable_avg		(2)
> + *
> + * And since, like util, the runnable part should be directly transferable,
> + * the following would _appear_ to be the straight forward approach:
> + *
> + *   grq->avg.load_avg = grq->load.weight * grq->avg.running_avg	(3)

Should it be grq->avg.runnable_avg instead of running_avg?

cfs_rq->avg.load_avg has been defined previous (in patch 2 I think) to
be:

	load_avg = \Sum se->avg.load_avg
		 = \Sum se->load.weight * se->avg.runnable_avg

That sum will increase when ge is runnable regardless of whether it is
running or not. So, I think it has to be runnable_avg to make sense?

> + *
> + * And per (1) we have:
> + *
> + *   ge->avg.running_avg == grq->avg.running_avg

You just said further up that (1) only applies to running and runnable
sums? These are averages, so I think this is invalid use of (1). But
maybe that is part of your point about (4) being wrong?

I'm still trying to get my head around the remaining bits, but it sort
of depends if I understood the above bits correctly :)

Morten

> + *
> + * Which gives:
> + *
> + *                      ge->load.weight * grq->avg.load_avg
> + *   ge->avg.load_avg = -----------------------------------		(4)
> + *                               grq->load.weight
> + *
> + * Except that is wrong!
> + *
> + * Because while for entities historical weight is not important and we
> + * really only care about our future and therefore can consider a pure
> + * runnable sum, runqueues can NOT do this.
> + *
> + * We specifically want runqueues to have a load_avg that includes
> + * historical weights. Those represent the blocked load, the load we expect
> + * to (shortly) return to us. This only works by keeping the weights as
> + * integral part of the sum. We therefore cannot decompose as per (3).
> + *
> + * OK, so what then?
> + *
> + *
> + * Another way to look at things is:
> + *
> + *   grq->avg.load_avg = \Sum se->avg.load_avg
> + *
> + * Therefore, per (2):
> + *
> + *   grq->avg.load_avg = \Sum se->load.weight * se->avg.runnable_avg
> + *
> + * And the very thing we're propagating is a change in that sum (someone
> + * joined/left). So we can easily know the runnable change, which would be, per
> + * (2) the already tracked se->load_avg divided by the corresponding
> + * se->weight.
> + *
> + * Basically (4) but in differential form:
> + *
> + *   d(runnable_avg) += se->avg.load_avg / se->load.weight
> + *								   (5)
> + *   ge->avg.load_avg += ge->load.weight * d(runnable_avg)
> + */
> +

  reply	other threads:[~2017-10-09  8:09 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-01 13:20 [PATCH -v2 00/18] sched/fair: A bit of a cgroup/PELT overhaul Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 01/18] sched/fair: Clean up calc_cfs_shares() Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 02/18] sched/fair: Add comment to calc_cfs_shares() Peter Zijlstra
2017-09-28 10:03   ` Morten Rasmussen
2017-09-29 11:35     ` Peter Zijlstra
2017-09-29 13:03       ` Morten Rasmussen
2017-09-01 13:21 ` [PATCH -v2 03/18] sched/fair: Cure calc_cfs_shares() vs reweight_entity() Peter Zijlstra
2017-09-29  9:04   ` Morten Rasmussen
2017-09-29 11:38     ` Peter Zijlstra
2017-09-29 13:00       ` Morten Rasmussen
2017-09-01 13:21 ` [PATCH -v2 04/18] sched/fair: Remove se->load.weight from se->avg.load_sum Peter Zijlstra
2017-09-29 15:26   ` Morten Rasmussen
2017-09-29 16:39     ` Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 05/18] sched/fair: Change update_load_avg() arguments Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 06/18] sched/fair: Move enqueue migrate handling Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 07/18] sched/fair: Rename {en,de}queue_entity_load_avg() Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 08/18] sched/fair: Introduce {en,de}queue_load_avg() Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 09/18] sched/fair: More accurate reweight_entity() Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 10/18] sched/fair: Use reweight_entity() for set_user_nice() Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 11/18] sched/fair: Rewrite cfs_rq->removed_*avg Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 12/18] sched/fair: Rewrite PELT migration propagation Peter Zijlstra
2017-10-09  8:08   ` Morten Rasmussen [this message]
2017-10-09  9:45     ` Peter Zijlstra
2017-10-18 12:45       ` Morten Rasmussen
2017-10-30 13:35         ` Peter Zijlstra
2017-10-09 15:03   ` Vincent Guittot
2017-10-09 15:29     ` Vincent Guittot
2017-10-10  7:29       ` Peter Zijlstra
2017-10-10  7:44         ` Vincent Guittot
2017-10-13 15:22           ` Vincent Guittot
2017-10-13 20:41             ` Peter Zijlstra
2017-10-15 12:01               ` Vincent Guittot
2017-10-16 13:55               ` Vincent Guittot
2017-10-19 15:04                 ` Vincent Guittot
2017-10-30 17:20                   ` Peter Zijlstra
2017-10-31 11:14                     ` Vincent Guittot
2017-10-31 15:01                       ` Peter Zijlstra
2017-10-31 16:38                         ` Vincent Guittot
2017-11-16 14:09                         ` [PATCH v3] sched: Update runnable propagation rule Vincent Guittot
2017-11-16 14:21                           ` [PATCH v4] " Vincent Guittot
2017-12-06 11:40                             ` Peter Zijlstra
2017-12-06 17:10                               ` Ingo Molnar
2017-12-06 20:29                             ` [tip:sched/core] sched/fair: Update and fix the " tip-bot for Vincent Guittot
2017-09-01 13:21 ` [PATCH -v2 13/18] sched/fair: Propagate an effective runnable_load_avg Peter Zijlstra
2017-10-02 17:46   ` Dietmar Eggemann
2017-10-03  8:50     ` Peter Zijlstra
2017-10-03  9:29       ` Dietmar Eggemann
2017-10-03 12:26   ` Dietmar Eggemann
2017-09-01 13:21 ` [PATCH -v2 14/18] sched/fair: Synchonous PELT detach on load-balance migrate Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 15/18] sched/fair: Align PELT windows between cfs_rq and its se Peter Zijlstra
2017-10-04 19:27   ` Dietmar Eggemann
2017-10-06 13:02     ` Peter Zijlstra
2017-10-09 12:15       ` Dietmar Eggemann
2017-10-09 12:19         ` Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 16/18] sched/fair: More accurate async detach Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 17/18] sched/fair: Calculate runnable_weight slightly differently Peter Zijlstra
2017-09-01 13:21 ` [PATCH -v2 18/18] sched/fair: Update calc_group_*() comments Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171009080856.GB24129@e105550-lin.cambridge.arm.com \
    --to=morten.rasmussen@arm.com \
    --cc=bsegall@google.com \
    --cc=clm@fb.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=efault@gmx.de \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=yuyang.du@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.