Date: Tue, 21 Jun 2016 15:17:46 +0200
From: Peter Zijlstra
To: Dietmar Eggemann
Cc: Vincent Guittot, Yuyang Du, Ingo Molnar, linux-kernel, Mike Galbraith,
    Benjamin Segall, Paul Turner, Morten Rasmussen, Matt Fleming
Subject: Re: [PATCH 4/4] sched,fair: Fix PELT integrity for new tasks
Message-ID: <20160621131746.GR30927@twins.programming.kicks-ass.net>
References: <20160617120136.064100812@infradead.org>
 <20160617120454.150630859@infradead.org>
 <20160617142814.GT30154@twins.programming.kicks-ass.net>
 <20160617160239.GL30927@twins.programming.kicks-ass.net>
 <20160617161831.GM30927@twins.programming.kicks-ass.net>
 <5767D51F.3080600@arm.com>
 <5768027E.1090408@arm.com>
 <20160621084119.GN30154@twins.programming.kicks-ass.net>
In-Reply-To: <20160621084119.GN30154@twins.programming.kicks-ass.net>

On Tue, Jun 21, 2016 at 10:41:19AM +0200, Peter Zijlstra wrote:
> On Mon, Jun 20, 2016 at 03:49:34PM +0100, Dietmar Eggemann wrote:
> > On 20/06/16 13:35, Vincent Guittot wrote:
> 
> > > It will go through wake_up_new_task() and post_init_entity_util_avg()
> > > during its fork, which is enough to set last_update_time. Then it will
> > > use switched_to_fair() if the task becomes a fair one.
> > 
> > Oh I see. We want to make sure that every task (even when forked as
> > !fair) has a last_update_time value != 0 when it becomes fair one day.
> 
> Right, see 2 below. I need to write a bunch of comments explaining PELT
> proper, as well as document these things.
> 
> The things we ran into with these patches were:
> 
>  1) You need to update the cfs_rq _before_ any entity attach/detach
>     (and might need to call update_tg_load_avg() when
>     update_cfs_rq_load_avg() returns true).
> 
>  2) (fair) entities are always attached; switched_from/to deal with !fair.
> 
>  3) cpu migration is the only exception and uses the last_update_time=0
>     thing -- because of the refusal to take the second rq->lock.
> 
> Which is why I dislike Yuyang's patches: they create more exceptions
> instead of applying the existing (albeit undocumented) rules.
> 
> Esp. 1 is important: for mathematical consistency alone you don't
> actually need to do this, you only need the entities to be up to date
> with the cfs_rq when you attach/detach, but that view forgets the
> temporal aspect of _when_ you do it.

I have the below for now; I'll continue poking at this for a bit.
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -692,6 +692,7 @@ void init_entity_runnable_average(struct
 
 static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
 static int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq);
+static void update_tg_load_avg(struct cfs_rq *cfs_rq, int force);
 static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se);
 
 /*
@@ -757,7 +758,8 @@ void post_init_entity_util_avg(struct sc
 		}
 	}
 
-	update_cfs_rq_load_avg(now, cfs_rq, false);
+	if (update_cfs_rq_load_avg(now, cfs_rq, false))
+		update_tg_load_avg(cfs_rq, false);
 	attach_entity_load_avg(cfs_rq, se);
 }
 
@@ -2919,7 +2921,21 @@ static inline void cfs_rq_util_change(st
 	WRITE_ONCE(*ptr, res);					\
 } while (0)
 
-/* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
+/**
+ * update_cfs_rq_load_avg - update the cfs_rq's load/util averages
+ * @now: current time, as per cfs_rq_clock_task()
+ * @cfs_rq: cfs_rq to update
+ * @update_freq: should we call cfs_rq_util_change() or will the call do so
+ *
+ * The cfs_rq avg is the direct sum of all its entities (blocked and runnable)
+ * avg. The immediate corollary is that all (fair) tasks must be attached, see
+ * post_init_entity_util_avg().
+ *
+ * cfs_rq->avg is used for task_h_load() and update_cfs_share() for example.
+ *
+ * Returns true if the load decayed or we removed utilization. It is expected
+ * that one calls update_tg_load_avg() on this condition.
+ */
 static inline int
 update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
 {
@@ -2974,6 +2990,14 @@ static inline void update_load_avg(struc
 		update_tg_load_avg(cfs_rq, 0);
 }
 
+/**
+ * attach_entity_load_avg - attach this entity to its cfs_rq load avg
+ * @cfs_rq: cfs_rq to attach to
+ * @se: sched_entity to attach
+ *
+ * Must call update_cfs_rq_load_avg() before this, since we rely on
+ * cfs_rq->avg.last_update_time being current.
+ */
 static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	if (!sched_feat(ATTACH_AGE_LOAD))
@@ -3005,6 +3029,14 @@ static void attach_entity_load_avg(struc
 	cfs_rq_util_change(cfs_rq);
 }
 
+/**
+ * detach_entity_load_avg - detach this entity from its cfs_rq load avg
+ * @cfs_rq: cfs_rq to detach from
+ * @se: sched_entity to detach
+ *
+ * Must call update_cfs_rq_load_avg() before this, since we rely on
+ * cfs_rq->avg.last_update_time being current.
+ */
 static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
@@ -8392,7 +8424,8 @@ static void detach_task_cfs_rq(struct ta
 	}
 
 	/* Catch up with the cfs_rq and remove our load when we leave */
-	update_cfs_rq_load_avg(now, cfs_rq, false);
+	if (update_cfs_rq_load_avg(now, cfs_rq, false))
+		update_tg_load_avg(cfs_rq, false);
 	detach_entity_load_avg(cfs_rq, se);
 }
 
@@ -8411,7 +8444,8 @@ static void attach_task_cfs_rq(struct ta
 #endif
 
 	/* Synchronize task with its cfs_rq */
-	update_cfs_rq_load_avg(now, cfs_rq, false);
+	if (update_cfs_rq_load_avg(now, cfs_rq, false))
+		update_tg_load_avg(cfs_rq, false);
 	attach_entity_load_avg(cfs_rq, se);
 
 	if (!vruntime_normalized(p))
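
For reference, the calling convention the patch establishes at every
attach/detach site (rule 1 above) can be distilled into a tiny standalone
model: bring the cfs_rq up to date first, propagate to the task-group level
when the update reports a change, and only then attach or detach the entity.
The sketch below is a simplified userspace mock-up, not kernel code; the
struct fields, function bodies, and signatures are invented stand-ins chosen
only to illustrate the ordering, and the assert mirrors the
"cfs_rq->avg.last_update_time must be current" requirement stated in the new
kernel-doc comments.

/* Standalone mock-up of the update-before-attach rule; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct cfs_rq_model {                   /* stand-in for struct cfs_rq */
        uint64_t last_update_time;
        long     load_avg;              /* sum of attached entities' load */
        long     tg_load_avg_contrib;   /* what the task group last saw */
};

struct se_model {                       /* stand-in for struct sched_entity */
        long load_avg;
};

/*
 * Age the cfs_rq averages up to 'now'; returns true when something changed
 * and the task-group level sum should be refreshed as well.
 */
static bool update_cfs_rq_load_avg(uint64_t now, struct cfs_rq_model *cfs_rq)
{
        bool decayed = cfs_rq->last_update_time != now;

        cfs_rq->last_update_time = now;
        return decayed;
}

static void update_tg_load_avg(struct cfs_rq_model *cfs_rq)
{
        cfs_rq->tg_load_avg_contrib = cfs_rq->load_avg;
}

/* Rule 1: callers must have made cfs_rq->last_update_time current already. */
static void attach_entity_load_avg(uint64_t now, struct cfs_rq_model *cfs_rq,
                                   struct se_model *se)
{
        assert(cfs_rq->last_update_time == now);
        cfs_rq->load_avg += se->load_avg;
}

int main(void)
{
        struct cfs_rq_model cfs_rq = { 0 };
        struct se_model se = { .load_avg = 1024 };
        uint64_t now = 42;

        /* The pattern the patch repeats at every attach/detach site: */
        if (update_cfs_rq_load_avg(now, &cfs_rq))
                update_tg_load_avg(&cfs_rq);
        attach_entity_load_avg(now, &cfs_rq, &se);

        printf("cfs_rq load_avg=%ld tg contrib=%ld\n",
               cfs_rq.load_avg, cfs_rq.tg_load_avg_contrib);
        return 0;
}

The point of the ordering is temporal: the attach adds the entity's
contribution to averages that have just been aged to 'now', so nothing gets
double-decayed or credited to the wrong time window.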