Date: Thu, 16 Jun 2016 22:07:11 +0200
From: Peter Zijlstra
To: Vincent Guittot
Cc: Yuyang Du, Ingo Molnar, linux-kernel, Mike Galbraith, Benjamin Segall,
	Paul Turner, Morten Rasmussen, Dietmar Eggemann, Matt Fleming
Subject: Re: [PATCH v6 1/4] sched/fair: Fix attaching task sched avgs twice
	when switching to fair or changing task group
Message-ID: <20160616200711.GK30154@twins.programming.kicks-ass.net>
References: <1465942870-28419-1-git-send-email-yuyang.du@intel.com>
	<1465942870-28419-2-git-send-email-yuyang.du@intel.com>
	<20160615152217.GN30921@twins.programming.kicks-ass.net>
	<20160616163013.GA32169@vingu-laptop>
	<20160616185115.GL30921@twins.programming.kicks-ass.net>

On Thu, Jun 16, 2016 at 09:00:57PM +0200, Vincent Guittot wrote:
> On 16 June 2016 at 20:51, Peter Zijlstra wrote:
> > On Thu, Jun 16, 2016 at 06:30:13PM +0200, Vincent Guittot wrote:
> >> With patch [1] for the init of the cfs_rq side, all use cases will be
> >> covered regarding the issue linked to a last_update_time set to 0 at
> >> init.
> >> [1] https://lkml.org/lkml/2016/5/30/508
> >
> > Aah, wait, now I get it :-)
> >
> > Still, we should put cfs_rq_clock_task(cfs_rq) in it, not 1. And since
> > we now acquire rq->lock on init this should well be possible. Lemme sort
> > that.
>
> Yes, with the rq->lock held we can use cfs_rq_clock_task(), which makes
> more sense than 1. But the delta can still be significant between the
> creation of the task group and the first task being attached to the
> cfs_rq.

Ah, I think I've spotted more fail. And I think you're right: it doesn't
matter; in fact, 0 should have been fine too!

  enqueue_entity()
    enqueue_entity_load_avg()
      update_cfs_rq_load_avg()
        now = clock()
        __update_load_avg(&cfs_rq->avg)
          cfs_rq->avg.last_update_time = now // ages 0 load/util for: now - 0
      if (migrated)
        attach_entity_load_avg()
          se->avg.last_update_time = cfs_rq->avg.last_update_time; // now != 0

So I don't see how it can end up being attached again.

Now I do see another problem: we're forgetting to call
update_cfs_rq_load_avg() in all detach_entity_load_avg() callers, and in
all but the enqueue caller of attach_entity_load_avg().

Something like the below.
---
 kernel/sched/fair.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f75930bdd326..5d8fa135bbc5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8349,6 +8349,7 @@ static void detach_task_cfs_rq(struct task_struct *p)
 {
 	struct sched_entity *se = &p->se;
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+	u64 now = cfs_rq_clock_task(cfs_rq);
 
 	if (!vruntime_normalized(p)) {
 		/*
@@ -8360,6 +8361,7 @@ static void detach_task_cfs_rq(struct task_struct *p)
 	}
 
 	/* Catch up with the cfs_rq and remove our load when we leave */
+	update_cfs_rq_load_avg(now, cfs_rq, false);
 	detach_entity_load_avg(cfs_rq, se);
 }
 
@@ -8367,6 +8369,7 @@ static void attach_task_cfs_rq(struct task_struct *p)
 {
 	struct sched_entity *se = &p->se;
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+	u64 now = cfs_rq_clock_task(cfs_rq);
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	/*
@@ -8377,6 +8380,7 @@ static void attach_task_cfs_rq(struct task_struct *p)
 #endif
 
 	/* Synchronize task with its cfs_rq */
+	update_cfs_rq_load_avg(now, cfs_rq, false);
 	attach_entity_load_avg(cfs_rq, se);
 
 	if (!vruntime_normalized(p))
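
To make the ordering concrete, here is a minimal, compilable userspace
sketch of the bookkeeping the patch enforces: age the cfs_rq average up
to "now" before folding an entity's load in. The toy_* types and
functions are invented stand-ins for illustration only (the real aging
is __update_load_avg() on struct sched_avg in kernel/sched/fair.c);
only the update-before-attach ordering mirrors the patch above.

#include <stdio.h>

struct toy_avg {
	unsigned long long last_update_time;	/* last aging timestamp (ms) */
	unsigned long load_avg;			/* decayed load contribution */
};

struct toy_cfs_rq {
	struct toy_avg avg;
};

struct toy_se {
	struct toy_avg avg;
};

/* Stand-in for __update_load_avg(): ~50% decay per 32ms period. */
static void toy_age(struct toy_avg *a, unsigned long long now)
{
	unsigned long long periods = (now - a->last_update_time) / 32;

	while (periods--)
		a->load_avg /= 2;
	a->last_update_time = now;
}

/* Stand-in for update_cfs_rq_load_avg(): catch the cfs_rq up to now. */
static void toy_update_cfs_rq(struct toy_cfs_rq *cfs_rq, unsigned long long now)
{
	toy_age(&cfs_rq->avg, now);
}

/* Stand-in for attach_entity_load_avg(): fold se load into the cfs_rq. */
static void toy_attach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	se->avg.last_update_time = cfs_rq->avg.last_update_time;
	cfs_rq->avg.load_avg += se->avg.load_avg;
}

int main(void)
{
	struct toy_cfs_rq cfs_rq = { { 0, 0 } };  /* fresh group: stamp 0 */
	struct toy_se se = { { 0, 1024 } };       /* task with full load */
	unsigned long long now = 320;             /* attach 10 periods later */

	/*
	 * Age the cfs_rq average up to "now" first; attaching against a
	 * stale stamp would let the next aging pass decay the fresh load
	 * for the whole now - 0 window.
	 */
	toy_update_cfs_rq(&cfs_rq, now);
	toy_attach(&cfs_rq, &se);

	printf("cfs_rq: load_avg=%lu last_update_time=%llu (se stamp %llu)\n",
	       cfs_rq.avg.load_avg, cfs_rq.avg.last_update_time,
	       se.avg.last_update_time);
	return 0;
}

Dropping the toy_update_cfs_rq() call leaves the cfs_rq stamp stale at
0, so the next aging pass decays the just-attached 1024 down to 1 over
the whole 0..now window; that stale-average skew is what the extra
update_cfs_rq_load_avg() calls in the detach/attach paths avoid.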