Date: Wed, 26 Apr 2017 15:40:20 -0700
From: Tejun Heo
To: Vincent Guittot
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Linus Torvalds,
	Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH 1/2] sched/fair: Fix how load gets propagated from cfs_rq to its sched_entity
Message-ID: <20170426224020.GB11348@wtj.duckdns.org>
References: <20170424201344.GA14169@wtj.duckdns.org>
	<20170424201415.GB14169@wtj.duckdns.org>
	<20170425181219.GA15593@wtj.duckdns.org>
	<20170426165123.GA17921@linaro.org>
In-Reply-To: <20170426165123.GA17921@linaro.org>

Hello,

On Wed, Apr 26, 2017 at 06:51:23PM +0200, Vincent Guittot wrote:
> > It's not temporary.  The weight of a group is its shares, which is
> > its load fraction of the configured weight of the group.  Assuming
> > UP, if you configure a group to the weight of 1024 and have any task
> > running full-tilt in it, the group will converge to the load of
> > 1024.  The problem is that the propagation logic is currently doing
> > something completely different and temporarily pushes down the load
> > whenever it triggers.
>
> OK, I see your point and agree that there is an issue when propagating
> the load_avg of a task group which has tasks with a lower weight than
> the shares, but your proposal has an issue because it uses
> runnable_load_avg instead of load_avg, and this makes the propagation
> of load_avg incorrect.  Something like the below, which keeps using
> load_avg, solves the problem:
>
> +	if (gcfs_rq->load.weight) {
> +		long shares = scale_load_down(calc_cfs_shares(gcfs_rq, gcfs_rq->tg));
> +
> +		load = min(gcfs_rq->avg.load_avg *
> +			   shares / scale_load_down(gcfs_rq->load.weight), shares);
>
> I have run schbench with the change above on v4.11-rc8 and the
> latencies are OK.

Hmm... so, I'll test this, but it wouldn't solve the problem of the
root's runnable_load_avg being out of sync with the approximate sum of
all task loads, which is the cause of the latencies I'm seeing.  Are
you saying that with the above change you're no longer seeing the
higher-latency issue that you reported in the other reply?

Thanks.

--
tejun
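
As a concrete illustration of the clamp in Vincent's snippet, the
standalone sketch below computes min(load_avg * shares / weight, shares)
in plain userspace C.  The numeric values and the min_long() helper are
hypothetical stand-ins; calc_cfs_shares() and scale_load_down() are not
reproduced, so this demonstrates only the arithmetic, not the actual
kernel code path.

/*
 * Sketch of the proposed propagation clamp: the load propagated from a
 * group cfs_rq to its sched_entity is the group's load_avg scaled by
 * shares/weight, capped at shares.  All inputs are made-up example
 * fixed-point values, not taken from a running kernel.
 */
#include <stdio.h>

static long min_long(long a, long b)
{
	return a < b ? a : b;
}

int main(void)
{
	long shares   = 1024;	/* group's shares, as if already scale_load_down()'d */
	long weight   = 2048;	/* gcfs_rq->load.weight, e.g. two tasks of weight 1024 */
	long load_avg = 1500;	/* gcfs_rq->avg.load_avg, converging toward weight */

	/* load = min(load_avg * shares / weight, shares) */
	long load = min_long(load_avg * shares / weight, shares);

	/* Prints 750 here; the result can never exceed shares. */
	printf("propagated load = %ld\n", load);
	return 0;
}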