From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752372AbbLNEs4 (ORCPT <rfc822;w@1wt.eu>);
	Sun, 13 Dec 2015 23:48:56 -0500
Received: from mga09.intel.com ([134.134.136.24]:46934 "EHLO mga09.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752120AbbLNEsz (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Sun, 13 Dec 2015 23:48:55 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.20,425,1444719600"; 
   d="scan'208";a="860073393"
Date: Mon, 14 Dec 2015 05:02:25 +0800
From: Yuyang Du <yuyang.du@intel.com>
To: bsegall@google.com
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
        Morten Rasmussen <morten.rasmussen@arm.com>,
        Andrey Ryabinin <aryabinin@virtuozzo.com>,
        Peter Zijlstra <peterz@infradead.org>, mingo@redhat.com,
        linux-kernel@vger.kernel.org, Paul Turner <pjt@google.com>
Subject: Re: [PATCH] sched/fair: fix mul overflow on 32-bit systems
Message-ID: <20151213210225.GB28098@intel.com>
References: <1449838518-26543-1-git-send-email-aryabinin@virtuozzo.com>
 <20151211132551.GO6356@twins.programming.kicks-ass.net>
 <20151211133612.GG6373@twins.programming.kicks-ass.net>
 <566AD6E1.2070005@virtuozzo.com>
 <20151211175751.GA27552@e105550-lin.cambridge.arm.com>
 <566B16D8.2060109@arm.com>
 <xm26wpsk3ian.fsf@sword-of-the-dawn.mtv.corp.google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <xm26wpsk3ian.fsf@sword-of-the-dawn.mtv.corp.google.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 11, 2015 at 11:18:56AM -0800, bsegall@google.com wrote:
> First, I believe in theory util_avg on a cpu should add up to 100% or
> 1024 or whatever. However, recently migrated-in tasks don't have their
> utilization cleared, so if they were quickly migrated again you could
> have up to the number of cpus or so times 100%, which could lead to
> overflow here. This just leads to more questions though:
> 
> The whole removed_util_avg thing doesn't seem to make a ton of sense -
> the code doesn't add util_avg for a migrating task onto
> cfs_rq->avg.util_avg

The code does add util_avg for a migrating task onto cfs_rq->avg.util_avg:

enqueue_entity_load_avg() calls attach_entity_load_avg() for an entity
that has just migrated in.
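
To make that path concrete, here is a minimal userspace sketch of the
add-on-enqueue behaviour. It is a simplified model, not the kernel code:
struct avg_model, attach_model() and enqueue_model() are hypothetical
stand-ins, and treating last_update_time == 0 as "just migrated in" is how
I read the current code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields of struct sched_avg. */
struct avg_model {
	uint64_t last_update_time;	/* 0 marks an entity that just migrated in */
	uint32_t util_sum;
	unsigned long util_avg;
};

/* Models attach: fold the entity's utilization into the cfs_rq totals. */
static void attach_model(struct avg_model *cfs_rq, struct avg_model *se,
			 uint64_t now)
{
	assert(cfs_rq != NULL && se != NULL);

	se->last_update_time = now;
	cfs_rq->util_avg += se->util_avg;
	cfs_rq->util_sum += se->util_sum;
}

/* Models enqueue: only a migrated-in entity gets attached. */
static void enqueue_model(struct avg_model *cfs_rq, struct avg_model *se,
			  uint64_t now)
{
	int migrated = !se->last_update_time;

	if (migrated)
		attach_model(cfs_rq, se, now);
}
```

With a cfs_rq already at util_avg 900 and a migrated-in entity carrying
util_avg 800 (arbitrary illustrative numbers), one enqueue_model() call
leaves the cfs_rq at 1700, which is above 1024. A second enqueue of the
same entity adds nothing, because last_update_time is no longer zero.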

> and doing so would regularly give >100% values (it
> does so on attach/detach where it's less likely to cause issues, but not
> migration). Removing it only makes sense if the task has accumulated all
> that utilization on this cpu, and even then mostly only makes sense if
> this is the only task on the cpu (and then it would make sense to add it
> on migrate-enqueue). The whole add-on-enqueue-migrate,
> remove-on-dequeue-migrate thing comes from /load/, where doing so is a
> more globally applicable approximation than it is for utilization,
> though it could still be useful as a fast-start/fast-stop approximation,
> if the add-on-enqueue part was added. It could also I guess be cleared
> on migrate-in, as basically the opposite assumption (or do something
> like add on enqueue, up to 100% and then set the se utilization to the
> amount actually added or something).
> 
> If the choice was to not do the add/remove thing, then se->avg.util_sum
> would be unused except for attach/detach, which currently do the
> add/remove thing. It's not unreasonable for them, except that currently
> nothing uses anything other than the root's utilization, so migration
> between cgroups wouldn't actually change the relevant util number
> (except it could because changing the cfs_rq util_sum doesn't actually
> do /anything/ unless it's the root, so you'd have to wait until the
> cgroup se actually changed in utilization).
> 
> 
> So uh yeah, my initial impression is "rip it out", but if being
> immediately-correct is important in the case of one task being most of
> the utilization, rather than when it is more evenly distributed, it
> would probably make more sense to instead put in the add-on-enqueue
> code.