Date: Mon, 16 Oct 2017 15:55:01 +0200
From: Vincent Guittot
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, Tejun Heo, Josef Bacik, Linus Torvalds,
 Mike Galbraith, Paul Turner, Chris Mason, Dietmar Eggemann,
 Morten Rasmussen, Ben Segall, Yuyang Du
Subject: Re: [PATCH -v2 12/18] sched/fair: Rewrite PELT migration propagation
Message-ID: <20171016135501.GA11688@linaro.org>
References: <20170901132059.342024223@infradead.org>
 <20170901132748.580255511@infradead.org>
 <20171010072945.rjeuripvfksfpdcf@hirez.programming.kicks-ass.net>
 <20171013152254.GA7393@linaro.org>
 <20171013204111.GB6524@worktop.programming.kicks-ass.net>
In-Reply-To: <20171013204111.GB6524@worktop.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

On Friday 13 Oct 2017 at 22:41:11 (+0200), Peter Zijlstra wrote:
> On Fri, Oct 13, 2017 at 05:22:54PM +0200, Vincent Guittot wrote:
> >
> > I have studied a bit more how to improve the propagation formula and the
> > changes below is doing the job for the UCs that I have tested.
> >
> > Unlike running, we can't directly propagate the runnable through hierarchy
> > when we migrate a task. Instead we must ensure that we will not
> > over/underestimate the impact of the migration thanks to several rules:
> > - ge->avg.runnable_sum can't be higher than LOAD_AVG_MAX
> > - ge->avg.runnable_sum can't be lower than ge->avg.running_sum (once scaled to
> > the same range)
> > - we can't directly propagate a negative delta of runnable_sum because part of
> > this runnable time can be "shared" with others sched_entities and stays on the
> > gcfs_rq.
>
> Right, that's about how far I got.
>
> > - ge->avg.runnable_sum can't increase when we detach a task.
>
> Yeah, that would be fairly broken.
>
> >
> > Instead, we can't estimate the new runnable_sum of the gcfs_rq with
>
> s/can't/can/ ?
>
> > the formula:
> >
> > gcfs_rq's runnable sum = gcfs_rq's load_sum / gcfs_rq's weight.
>
> That might be the best we can do.. its wrong, but then its less wrong
> that what we have now. The comments can be much improved though. Not to
> mention that the big comment on top needs a little help.

Subject: [PATCH] sched: Update runnable propagation rule

Unlike running, the runnable part can't be directly propagated through
the hierarchy when we migrate a task. The main reason is that runnable
time can be shared with other sched_entities that stay on the rq; this
shared runnable time also remains on the prev cfs_rq and must not be
removed.

Instead, we can estimate what the new runnable of the prev cfs_rq should
be and check that this estimate stays within a possible range.
The prop_runnable_sum is a good estimate when adding runnable_sum but
fails most often when we remove it. In that case, we can use the formula
below instead:

  gcfs_rq's runnable_sum = gcfs_rq->avg.load_sum / gcfs_rq->load.weight (1)

(1) assumes that tasks are equally runnable, which is not true but is easy
to compute.

Besides these estimates, we have several simple rules that help us filter
out wrong ones:

 - ge->avg.runnable_sum <= LOAD_AVG_MAX
 - ge->avg.runnable_sum >= ge->avg.running_sum
   (ge->avg.util_sum >> SCHED_CAPACITY_SHIFT)
 - ge->avg.runnable_sum can't increase when we detach a task

Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 45 ++++++++++++++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 350dbec0..08d2a58 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3489,33 +3489,56 @@ update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
 static inline void
 update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
-	long runnable_sum = gcfs_rq->prop_runnable_sum;
-	long runnable_load_avg, load_avg;
-	s64 runnable_load_sum, load_sum;
+	long running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
+	long runnable_load_avg, delta_avg, load_avg;
+	s64 runnable_load_sum, delta_sum, load_sum = 0;
 
 	if (!runnable_sum)
 		return;
 
 	gcfs_rq->prop_runnable_sum = 0;
 
+	if (runnable_sum >= 0) {
+		/* Get a rough estimate of the new gcfs_rq's runnable */
+		runnable_sum += se->avg.load_sum;
+		/* ge->avg.runnable_sum can't be higher than LOAD_AVG_MAX */
+		runnable_sum = min(runnable_sum, LOAD_AVG_MAX);
+	} else {
+		/* Get a rough estimate of the new gcfs_rq's runnable */
+		if (scale_load_down(gcfs_rq->load.weight))
+			load_sum = div_s64(gcfs_rq->avg.load_sum,
+				scale_load_down(gcfs_rq->load.weight));
+
+		/* ge->avg.runnable_sum can't increase when removing runnable */
+		runnable_sum = min(se->avg.load_sum, load_sum);
+	}
+
+	/* runnable_sum can't be lower than running_sum */
+	running_sum = se->avg.util_sum >> SCHED_CAPACITY_SHIFT;
+	runnable_sum = max(runnable_sum, running_sum);
+
 	load_sum = (s64)se_weight(se) * runnable_sum;
 	load_avg = div_s64(load_sum, LOAD_AVG_MAX);
 
-	add_positive(&se->avg.load_sum, runnable_sum);
-	add_positive(&se->avg.load_avg, load_avg);
+	delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
+	delta_avg = load_avg - se->avg.load_avg;
 
-	add_positive(&cfs_rq->avg.load_avg, load_avg);
-	add_positive(&cfs_rq->avg.load_sum, load_sum);
+	se->avg.load_sum = runnable_sum;
+	se->avg.load_avg = load_avg;
+	add_positive(&cfs_rq->avg.load_avg, delta_avg);
+	add_positive(&cfs_rq->avg.load_sum, delta_sum);
 
 	runnable_load_sum = (s64)se_runnable(se) * runnable_sum;
 	runnable_load_avg = div_s64(runnable_load_sum, LOAD_AVG_MAX);
+	delta_sum = runnable_load_sum - se_weight(se) * se->avg.runnable_load_sum;
+	delta_avg = runnable_load_avg - se->avg.runnable_load_avg;
 
-	add_positive(&se->avg.runnable_load_sum, runnable_sum);
-	add_positive(&se->avg.runnable_load_avg, runnable_load_avg);
+	se->avg.runnable_load_sum = runnable_sum;
+	se->avg.runnable_load_avg = runnable_load_avg;
 
 	if (se->on_rq) {
-		add_positive(&cfs_rq->avg.runnable_load_avg, runnable_load_avg);
-		add_positive(&cfs_rq->avg.runnable_load_sum, runnable_load_sum);
+		add_positive(&cfs_rq->avg.runnable_load_avg, delta_avg);
+		add_positive(&cfs_rq->avg.runnable_load_sum, delta_sum);
 	}
 }
 
-- 
2.7.4
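
For anyone who wants to play with the rules outside the kernel, here is a
rough userspace sketch of just the estimation and clamping step from the
patch above. It is not kernel code: estimate_runnable_sum(), min_l() and
max_l() are hypothetical names made up for illustration, plain long/int64_t
stand in for the kernel types, and the two constants are assumed to match the
kernel values (LOAD_AVG_MAX = 47742, SCHED_CAPACITY_SHIFT = 10). The weight
argument is assumed to be already passed through scale_load_down().

```c
#include <stdint.h>

/* Assumed to mirror the kernel constants; defined here for illustration. */
#define LOAD_AVG_MAX         47742
#define SCHED_CAPACITY_SHIFT 10

static long min_l(long a, long b) { return a < b ? a : b; }
static long max_l(long a, long b) { return a > b ? a : b; }

/*
 * Hypothetical helper, not a kernel function: estimate the group entity's
 * new runnable_sum from the propagated delta, following the changelog rules.
 *
 *   prop_delta    - propagated runnable delta (>= 0 attach, < 0 detach)
 *   se_load_sum   - current runnable_sum of the group entity
 *   se_util_sum   - current util_sum of the group entity
 *   gcfs_load_sum - load_sum of the group cfs_rq
 *   gcfs_weight   - weight of the group cfs_rq, already scaled down
 */
static long estimate_runnable_sum(long prop_delta, long se_load_sum,
                                  long se_util_sum, int64_t gcfs_load_sum,
                                  long gcfs_weight)
{
    long runnable_sum, running_sum;

    if (prop_delta >= 0) {
        /* Attach: add the delta, but never exceed LOAD_AVG_MAX. */
        runnable_sum = min_l(prop_delta + se_load_sum, LOAD_AVG_MAX);
    } else {
        /* Detach: formula (1), which assumes equally runnable tasks. */
        long est = 0;

        if (gcfs_weight)
            est = (long)(gcfs_load_sum / gcfs_weight);

        /* The sum must not increase when a task leaves. */
        runnable_sum = min_l(se_load_sum, est);
    }

    /* Runnable time can't be lower than running time. */
    running_sum = se_util_sum >> SCHED_CAPACITY_SHIFT;
    return max_l(runnable_sum, running_sum);
}
```

With these made-up inputs, attaching a delta of 10000 onto a sum of 40000
is capped at 47742. Detaching with gcfs load_sum = 20480000 and weight = 1024
gives an estimate of 20000, which replaces a current sum of 30000. If the
entity's util_sum is 25000 << 10 in that same detach case, the running floor
wins and the result is 25000.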