From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1432819AbdDYSMj (ORCPT ); Tue, 25 Apr 2017 14:12:39 -0400
Received: from mail-pg0-f45.google.com ([74.125.83.45]:33553 "EHLO
	mail-pg0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1432786AbdDYSMW (ORCPT );
	Tue, 25 Apr 2017 14:12:22 -0400
Date: Tue, 25 Apr 2017 11:12:19 -0700
From: Tejun Heo
To: Vincent Guittot
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Linus Torvalds,
	Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH 1/2] sched/fair: Fix how load gets propagated from
	cfs_rq to its sched_entity
Message-ID: <20170425181219.GA15593@wtj.duckdns.org>
References: <20170424201344.GA14169@wtj.duckdns.org>
	<20170424201415.GB14169@wtj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.8.0 (2017-02-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On Tue, Apr 25, 2017 at 10:35:53AM +0200, Vincent Guittot wrote:
> not sure to catch your example:
> a task TA with a load_avg = 1 is the only task in a task group GB so
> the cfs_rq load_avg = 1 too and the group_entity of this cfs_rq has
> got a weight of 1024 (I use 10bits format for readability) which is
> the total share of task group GB

The group_entity (the sched_entity corresponding to the cfs_rq) should
behave as if it's a task which has the weight of 1024.

> Are you saying that the group_entity load_avg should be around 1024 and not 1 ?

Yes.

> I would say it depends of TA weight. I assume that TA weight is the
> default value (1024) as you don't specify any value in your example

Please consider the following configuration, where GA is a group
entity, and TA and TB are tasks.

 ROOT - GA (weight 1024) - TA (weight 1)
      \ GB (weight 1   ) - TB (weight 1)

Let's say both TA and TB are running full-tilt. Now let's take out GA
and GB.

 ROOT - TA1 (weight 1024)
      \ TB1 (weight 1   )

GA should behave the same as TA1, and GB the same as TB1. GA's load
should match TA1's, and GA's load when seen from ROOT's cfs_rq has
nothing to do with how much total absolute weight it has inside it.

 ROOT - GA2 (weight 1024) - TA2 (weight 1   )
      \ GB2 (weight 1   ) - TB2 (weight 1024)

If TA2 and TB2 are constantly running, GA2 and GB2's loads in ROOT's
cfs_rq should match GA and GB's, respectively.

> If TA directly runs at parent level, its sched_entity would have a
> load_avg of 1 so why the group entity load_avg should be 1024 ? it

Because then the hierarchical weight configuration doesn't mean
anything.

> will just temporally show the cfs_rq more loaded than it is really and
> at the end the group entity load_avg will go back to 1

It's not temporary. The weight of a group is its shares, which is its
load fraction of the configured weight of the group. Assuming UP, if
you configure a group to the weight of 1024 and have any task running
full-tilt in it, the group will converge to the load of 1024. The
problem is that the propagation logic is currently doing something
completely different and temporarily pushes down the load whenever it
triggers.

Thanks.

-- 
tejun
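
The three configurations above can be checked with a minimal sketch of
the steady state being argued for. This is not kernel code: it assumes
UP and that every task is always runnable, and it ignores PELT decay
and SMP share distribution. The tuple-based tree layout and the helper
names root_loads and task_cpu_shares are made up for illustration.

```python
from fractions import Fraction

# Each entity is name -> (weight, children); children is None for a task.
# Assumption: UP, and every task runs full-tilt (always runnable).

def root_loads(tree):
    """Steady-state load of each top-level entity as seen from ROOT.

    An always-runnable entity contributes its configured weight,
    regardless of the weights of whatever sits inside it.
    """
    return {name: weight for name, (weight, _children) in tree.items()}

def task_cpu_shares(tree, share=Fraction(1)):
    """CPU fraction each task receives under hierarchical weights."""
    total = sum(weight for weight, _children in tree.values())
    shares = {}
    for name, (weight, children) in tree.items():
        mine = share * Fraction(weight, total)
        if children:
            # A group splits its own share among its children.
            shares.update(task_cpu_shares(children, mine))
        else:
            shares[name] = mine
    return shares

nested = {"GA": (1024, {"TA": (1, None)}), "GB": (1, {"TB": (1, None)})}
flat = {"TA1": (1024, None), "TB1": (1, None)}
swapped = {"GA2": (1024, {"TA2": (1, None)}),
           "GB2": (1, {"TB2": (1024, None)})}

for label, tree in (("nested", nested), ("flat", flat), ("swapped", swapped)):
    print(label, root_loads(tree), task_cpu_shares(tree))
```

In all three cases the two entries on ROOT's cfs_rq carry loads of 1024
and 1, and the task under the first entry gets 1024/1025 of the CPU.
The weights inside a group only matter for how that group's share is
divided among its own children.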