From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753427AbbDADhv (ORCPT );
	Tue, 31 Mar 2015 23:37:51 -0400
Received: from mail-ig0-f174.google.com ([209.85.213.174]:36225 "EHLO
	mail-ig0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751315AbbDADhr (ORCPT );
	Tue, 31 Mar 2015 23:37:47 -0400
MIME-Version: 1.0
In-Reply-To:
References: <1425052454-25797-1-git-send-email-vincent.guittot@linaro.org>
	<1425052454-25797-9-git-send-email-vincent.guittot@linaro.org>
Date: Wed, 1 Apr 2015 11:37:45 +0800
Message-ID:
Subject: Re: [PATCH v10 08/11] sched: replace capacity_factor by usage
From: Xunlei Pang
To: Vincent Guittot
Cc: Peter Zijlstra, Ingo Molnar, lkml, Preeti U Murthy, Morten Rasmussen,
	Kamalesh Babulal, Rik van Riel, Linaro Kernel Mailman List,
	Mike Galbraith, Dietmar Eggemann
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Vincent,

On 27 March 2015 at 23:59, Vincent Guittot wrote:
> On 27 March 2015 at 15:52, Xunlei Pang wrote:
>> Hi Vincent,
>>
>> On 27 February 2015 at 23:54, Vincent Guittot
>> wrote:
>>>  /**
>>> @@ -6432,18 +6435,19 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>>>
>>>  		/*
>>>  		 * In case the child domain prefers tasks go to siblings
>>> -		 * first, lower the sg capacity factor to one so that we'll try
>>> +		 * first, lower the sg capacity so that we'll try
>>>  		 * and move all the excess tasks away. We lower the capacity
>>>  		 * of a group only if the local group has the capacity to fit
>>> -		 * these excess tasks, i.e. nr_running < group_capacity_factor. The
>>> -		 * extra check prevents the case where you always pull from the
>>> -		 * heaviest group when it is already under-utilized (possible
>>> -		 * with a large weight task outweighs the tasks on the system).
>>> +		 * these excess tasks. The extra check prevents the case where
>>> +		 * you always pull from the heaviest group when it is already
>>> +		 * under-utilized (possible with a large weight task outweighs
>>> +		 * the tasks on the system).
>>>  		 */
>>>  		if (prefer_sibling && sds->local &&
>>> -		    sds->local_stat.group_has_free_capacity) {
>>> -			sgs->group_capacity_factor = min(sgs->group_capacity_factor, 1U);
>>> -			sgs->group_type = group_classify(sg, sgs);
>>> +		    group_has_capacity(env, &sds->local_stat) &&
>>> +		    (sgs->sum_nr_running > 1)) {
>>> +			sgs->group_no_capacity = 1;
>>> +			sgs->group_type = group_overloaded;
>>>  		}
>>>
>>
>> For SD_PREFER_SIBLING, if local has 1 task and group_has_capacity()
>> returns true for it (but it is not overloaded), and the sgs group has
>> 2 tasks, should we still mark this group overloaded?
>
> yes, the load balance will then choose if it's worth pulling or not
> depending on the load of each group

Maybe I didn't make it clear. For example: CPU0~CPU1 are SMT siblings,
and CPU2~CPU3 are another pair. CPU0 is idle, and each of the other
CPUs has one task. Then, according to this patch, CPU2~CPU3 (as one
group) will be viewed as overloaded (CPU0~CPU1 being the local group,
group_has_capacity() returns true here), so the balancer may initiate
active task movement. This differs from what the current
SD_PREFER_SIBLING logic does. Is this problematic?

>
>>
>> -Xunlei