Date: Fri, 26 Sep 2014 10:38:32 +0100
From: Morten Rasmussen
To: Vincent Guittot
Cc: Peter Zijlstra, "mingo@redhat.com", Dietmar Eggemann, Paul Turner,
    Benjamin Segall, Nicolas Pitre, Mike Turquette, "rjw@rjwysocki.net",
    linux-kernel
Subject: Re: [PATCH 1/7] sched: Introduce scale-invariant load tracking
Message-ID: <20140926093832.GY23693@e103034-lin>
References: <1411403047-32010-1-git-send-email-morten.rasmussen@arm.com>
    <1411403047-32010-2-git-send-email-morten.rasmussen@arm.com>
    <20140925172343.GX23693@e103034-lin>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 26, 2014 at 08:36:53AM +0100, Vincent Guittot wrote:
> On 25 September 2014 19:23, Morten Rasmussen wrote:
>
> [snip]
>
> >> >       /* Remainder of delta accrued against u_0` */
> >> >       if (runnable)
> >> > -             sa->runnable_avg_sum += delta;
> >> > +             sa->runnable_avg_sum += (delta * scale_cap)
> >> > +                             >> SCHED_CAPACITY_SHIFT;
> >>
> >> If we take the example of an always running task, its runnable_avg_sum
> >> should stay at the LOAD_AVG_MAX value whatever the frequency of the
> >> CPU on which it runs.
> >> But your change links the max value of
> >> runnable_avg_sum with the current frequency of the CPU so an always
> >> running task will have a load contribution of 25%
> >> your proposed scaling is fine with usage_avg_sum which reflects the
> >> effective running time on the CPU but the runnable_avg_sum should be
> >> able to reach LOAD_AVG_MAX whatever the current frequency is
> >
> > I don't think it makes sense to scale one metric and not the other. You
> > will end up with two very different (potentially opposite) views of the
>
> you have missed my point, i fully agree that scaling in-variance is a
> good enhancement but IIUC your patchset doesn't solve the whole
> problem.
>
> Let me try to explain with examples :
> - A task with a load of 10% on a CPU at max frequency will keep a load
> of 10% if the frequency of the CPU is divided by 2 which is fine

Yes.

> - But an always running task with a load of 100% on a CPU at max
> frequency will have a load of 50% if the frequency of the CPU is
> divided by 2 which is not what we want; the load of such task should
> stay at 100%

I think that is fine too, and it is intentional. We can't say anything
about the load/utilization of an always running task, no matter what cpu
it is running on and at what frequency. As soon as the tracked
load/utilization indicates always running, we don't know how much
load/utilization it will cause on a faster cpu. However, if it is 99.9%
we are fine (well, we probably do want a somewhat bigger margin).

As I see it, always running tasks must be treated specially. We can
easily figure out which tasks are always running by comparing the scaled
load divided by se->load.weight to the current compute capacity of the
cpu it is running on. If they are equal (or close), the task is always
running. If we migrate it to a different cpu, we should take into
account that its load might increase if it gets more cycles to spend.
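For illustration only (this is not part of the patch set): a minimal
sketch of the scaling from the quoted hunk and of the always-running
check described above. The helper names, the margin parameter and the
conversion of the load into capacity units before comparing are
assumptions made for this sketch.

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1UL << SCHED_CAPACITY_SHIFT)

/* Scale a runnable time delta by the current capacity, as in the quoted hunk. */
static unsigned long scale_delta(unsigned long delta, unsigned long scale_cap)
{
	return (delta * scale_cap) >> SCHED_CAPACITY_SHIFT;
}

/*
 * Hypothetical check: express the tracked load per unit of weight in
 * capacity units (0..SCHED_CAPACITY_SCALE) and compare it with the
 * current capacity of the cpu. 'margin' is an assumed tolerance, also
 * in capacity units.
 */
static bool looks_always_running(unsigned long load_contrib,
				 unsigned long weight,
				 unsigned long curr_cap,
				 unsigned long margin)
{
	unsigned long load_per_weight;

	if (!weight)
		return false;

	load_per_weight = (load_contrib << SCHED_CAPACITY_SHIFT) / weight;
	return load_per_weight + margin >= curr_cap;
}
```

With the cpu at half capacity (scale_cap = 512), a delta of 1000 accrues
as 500, which is why an always running task ends up at 50% there. A task
whose load per unit of weight is within the margin of 512 would then be
flagged as always running on that cpu.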
You could even do something like:

	unsigned long migration_load(struct sched_entity *se)
	{
		if (se->avg.load_avg_contrib >=
				current_capacity(cpu_of(se)) * se->load.weight)
			return se->load.weight;
		return se->avg.load_avg_contrib;
	}

for use when moving tasks between cpus when the source cpu is fully
loaded at its current capacity. The task load is actually 100% relative
to the current compute capacity of the task cpu, but not compared to the
fastest cpu in the system.

As I said in my previous reply, this isn't covered yet by this patch
set. It is of course necessary to go through the load-balancing
conditions to see where/if modifications are needed to do the right
thing for scale-invariant load.

> - if we have 2 identical always running tasks on CPUs with different
> frequency, their load will be different

Yes, in terms of absolute load, and that is only the case for always
running tasks. However, for both of them se->avg.load_avg_contrib
divided by se->load.weight would equal the current capacity of their
cpu, so we can easily identify them.

> So your patchset adds scaling invariance for small tasks but add some
> scaling variances for heavy tasks

For always running tasks, yes, but I don't see how we can avoid treating
them specially anyway, as we don't know anything about their true load.
That doesn't change by changing how we scale their load.

Better suggestions are of course welcome :)

Morten
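For illustration only: the migration_load() idea from this mail can be
made compilable with stand-in types. The struct layouts below are
simplified placeholders rather than the kernel definitions, the current
capacity is passed in as a parameter instead of being looked up, and the
shift by SCHED_CAPACITY_SHIFT is an assumption about the units of the
capacity value (0..1024).

```c
#define SCHED_CAPACITY_SHIFT	10

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct load_weight {
	unsigned long weight;
};

struct sched_avg {
	unsigned long load_avg_contrib;
};

struct sched_entity {
	struct load_weight load;
	struct sched_avg avg;
};

/*
 * Load to assume when migrating 'se' away from a cpu that currently runs
 * at 'curr_cap'. If the tracked contribution has reached what an always
 * running task can accumulate at that capacity, fall back to the full
 * weight, since the task may use more cycles on a faster cpu.
 */
static unsigned long migration_load(const struct sched_entity *se,
				    unsigned long curr_cap)
{
	unsigned long full = (se->load.weight * curr_cap) >> SCHED_CAPACITY_SHIFT;

	if (se->avg.load_avg_contrib >= full)
		return se->load.weight;

	return se->avg.load_avg_contrib;
}
```

For a task with weight 1024 on a cpu at capacity 512, a contribution of
512 is reported as 1024, while a contribution of 100 is reported
unchanged.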