From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753927AbbIJRXO (ORCPT ); Thu, 10 Sep 2015 13:23:14 -0400 Received: from mail-pa0-f53.google.com ([209.85.220.53]:34935 "EHLO mail-pa0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753719AbbIJRXL (ORCPT ); Thu, 10 Sep 2015 13:23:11 -0400 From: bsegall@google.com To: Morten Rasmussen Cc: Peter Zijlstra , Vincent Guittot , Dietmar Eggemann , Steve Muckle , "mingo\@redhat.com" , "daniel.lezcano\@linaro.org" , "yuyang.du\@intel.com" , "mturquette\@baylibre.com" , "rjw\@rjwysocki.net" , Juri Lelli , "sgurrappadi\@nvidia.com" , "pang.xunlei\@zte.com.cn" , "linux-kernel\@vger.kernel.org" Subject: Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig References: <55E8DD00.2030706@linaro.org> <55EDAF43.30500@arm.com> <55EDDD5A.70904@arm.com> <20150908122606.GH3644@twins.programming.kicks-ass.net> <20150908125205.GW18673@twins.programming.kicks-ass.net> <20150908143157.GA27098@e105550-lin.cambridge.arm.com> <20150908153300.GK3644@twins.programming.kicks-ass.net> <20150910110638.GE27098@e105550-lin.cambridge.arm.com> Date: Thu, 10 Sep 2015 10:23:08 -0700 In-Reply-To: <20150910110638.GE27098@e105550-lin.cambridge.arm.com> (Morten Rasmussen's message of "Thu, 10 Sep 2015 12:06:39 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Morten Rasmussen writes: > On Wed, Sep 09, 2015 at 03:23:43PM -0700, bsegall@google.com wrote: >> Peter Zijlstra writes: >> >> > On Tue, Sep 08, 2015 at 03:31:58PM +0100, Morten Rasmussen wrote: >> >> On Tue, Sep 08, 2015 at 02:52:05PM +0200, Peter Zijlstra wrote: >> >> > > Tricky that, LOAD_AVG_MAX very much relies on the unit being 1<<10. >> >> >> >> I don't get why LOAD_AVG_MAX relies on the util_avg shifting being >> >> 1<<10, it is just the sum of the geometric series and the upper bound of >> >> util_sum? >> > >> > It needs a 1024, it might just have been the 1024 ns we use a period >> > instead of the scale unit though. >> > >> > The LOAD_AVG_MAX is the number where adding a next element to the series >> > doesn't change the result anymore, so scaling it up will allow more >> > significant elements to the series before we bottom out, which is the _N >> > thing. >> > >> >> Yes, as the comments say, the 1024ns unit is arbitrary (and is an >> average of not-quite-microseconds instead of just nanoseconds to allow >> more bits to load.weight when we multiply load.weight by this number). >> In fact there are two arbitrary 1024 units here, which are technically >> unrelated and are both unrelated to SCHED_LOAD_RESOLUTION/etc - we >> operate on units of almost-microseconds and we also do decays every >> almost-millisecond. >> >> There appears to be a bunch of confusion in the current code around >> util_sum/util_avg which appears to using SCHED_LOAD_SCALE >> for a fixed-point percentage or something, which is at least reasonable, >> but is initializing it as scale_load_down(SCHED_LOAD_SCALE), which >> results in either initializing as 100% or .1% depending on RESOLUTION. >> This'll get clobbered on first update, but if it needs to be >> initialized, it should either get initialized to something sane or at >> least consistent. > > This is what I thought too. The whole geometric series math is completely > independent of the scale used for priority in load_avg and the fixed > point shifting used for util_avg. > >> load_sum/load_avg appear to be scale_load_down()ed properly, and appear >> to be used as such at a quick glance. > > I don't think shifting by SCHED_LOAD_SHIFT in __update_load_avg() is > right: > > sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX; > > util_avg is initialized to low resolution (>> SCHED_LOAD_RESOLUTION): > > sa->util_avg = scale_load_down(SCHED_LOAD_SCALE); > > so it appear to be intended to be using low resolution like load_avg > (weight is scaled down before it is passed into __update_load_avg()), > but util_avg is shifted up to high resolution. It should be: > > sa->util_avg = (sa->util_sum << (SCHED_LOAD_SHIFT - > SCHED_LOAD_SHIFT)) / LOAD_AVG_MAX; > > to be consistent. Yeah, util_avg was/is screwed up in terms of either the initialization or which shift to use there. The load ones however appear to be fine.