Date: Wed, 26 Apr 2017 18:51:23 +0200
From: Vincent Guittot
To: Tejun Heo
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Linus Torvalds, Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH 1/2] sched/fair: Fix how load gets propagated from cfs_rq to its sched_entity
Message-ID: <20170426165123.GA17921@linaro.org>
In-Reply-To: <20170425181219.GA15593@wtj.duckdns.org>

On Tuesday 25 Apr 2017 at 11:12:19 (-0700), Tejun Heo wrote:
> Hello,
> 
> On Tue, Apr 25, 2017 at 10:35:53AM +0200, Vincent Guittot wrote:
> > not sure to catch your example:
> > a task TA with a load_avg = 1 is the only task in a task group GB so
> > the cfs_rq load_avg = 1 too and the group_entity of this cfs_rq has
> > got a weight of 1024 (I use 10bits format for readability) which is
> > the total share of task group GB
> 
> The group_entity (the sched_entity corresponding to the cfs_rq) should
> behave as if it's a task which has the weight of 1024.
> 
> > Are you saying that the group_entity load_avg should be around 1024 and not 1 ?
> 
> Yes.
> 
> > I would say it depends of TA weight. I assume that TA weight is the
> > default value (1024) as you don't specify any value in your example
> 
> Please consider the following configuration, where GA is a group
> entity, and TA and TB are tasks.
> 
> ROOT - GA (weight 1024) - TA (weight 1)
>      \ GB (weight 1   ) - TB (weight 1)
> 
> Let's say both TA and TB are running full-tilt. Now let's take out GA
> and GB.
> 
> ROOT - TA1 (weight 1024)
>      \ TB1 (weight 1   )
> 
> GA should behave the same as TA1 and GB TB1. GA's load should match
> TA1's, and GA's load when seen from ROOT's cfs_rq has nothing to do
> with how much total absolute weight it has inside it.
> 
> ROOT - GA2 (weight 1024) - TA2 (weight 1   )
>      \ GB2 (weight 1   ) - TB2 (weight 1024)
> 
> If TA2 and TB2 are constantly running, GA2 and GB2's in ROOT's cfs_rq
> should match GA and GB's, respectively.

Yes I agree

> > If TA directly runs at parent level, its sched_entity would have a
> > load_avg of 1 so why the group entity load_avg should be 1024 ? it
> 
> Because then the hierarchical weight configuration doesn't mean
> anything.
> 
> > will just temporally show the cfs_rq more loaded than it is really and
> > at the end the group entity load_avg will go back to 1
> 
> It's not temporary. The weight of a group is its shares, which is its
> load fraction of the configured weight of the group. Assuming UP, if
> you configure a group to the weight of 1024 and have any task running
> full-tilt in it, the group will converge to the load of 1024. The
> problem is that the propagation logic is currently doing something
> completely different and temporarily push down the load whenever it
> triggers.
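
To make the expected numbers concrete, a short sketch of the GA example,
assuming UP, 10-bit weights, and that a full-tilt task's PELT load_avg
converges to roughly its weight:

    TA load_avg         ~= TA weight = 1         (full-tilt task)
    GA cfs_rq load_avg  ~= 1
    GA entity weight     = group shares = 1024   (only cfs_rq of the group on UP)
    GA entity load_avg  --> 1024                 (runnable 100% of the time, i.e.
                                                  the same load as a weight-1024
                                                  task TA1 would carry)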

Ok, I see your point and agree that there is an issue when propagating the
load_avg of a task group whose tasks have a lower weight than its share. But
your proposal has an issue too: it uses runnable_load_avg instead of load_avg,
which makes the propagation of load_avg incorrect. Something like the change
below, which keeps using load_avg, solves the problem:

+	if (gcfs_rq->load.weight) {
+		long shares = scale_load_down(calc_cfs_shares(gcfs_rq, gcfs_rq->tg));
+
+		load = min(gcfs_rq->avg.load_avg *
+			   shares / scale_load_down(gcfs_rq->load.weight), shares);

I have run schbench with the change above on v4.11-rc8 and the latencies are ok.

Thanks
Vincent

> 
> Thanks.
> 
> -- 
> tejun
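
For completeness, a minimal standalone sketch (not kernel code) of the
arithmetic in the change above, applied to the GA example: the shares value of
1024 is assumed to be what calc_cfs_shares() would return for that group, and
min_ul() merely stands in for the kernel's min():

#include <stdio.h>

/*
 * Illustrates load = min(gcfs_rq->avg.load_avg * shares / gcfs_rq->load.weight,
 *                        shares)
 * for a group whose only cfs_rq holds one full-tilt task of weight 1 while the
 * group entity's share is 1024 (10-bit fixed point, as in the thread).
 */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

int main(void)
{
	unsigned long gcfs_load_avg = 1;    /* gcfs_rq->avg.load_avg (task TA)    */
	unsigned long gcfs_weight   = 1;    /* gcfs_rq->load.weight               */
	unsigned long shares        = 1024; /* assumed calc_cfs_shares() result   */

	unsigned long load = min_ul(gcfs_load_avg * shares / gcfs_weight, shares);

	/* Prints 1024: the group entity carries the load of a weight-1024 task. */
	printf("propagated load = %lu\n", load);
	return 0;
}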