From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: "pang.xunlei@zte.com.cn" <pang.xunlei@zte.com.cn>
Cc: Juri Lelli <Juri.Lelli@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-kernel-owner@vger.kernel.org" 
	<linux-kernel-owner@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	Morten Rasmussen <Morten.Rasmussen@arm.com>,
	"mturquette@linaro.org" <mturquette@linaro.org>,
	"nico@linaro.org" <nico@linaro.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"preeti@linux.vnet.ibm.com" <preeti@linux.vnet.ibm.com>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"yuyang.du@intel.com" <yuyang.du@intel.com>
Subject: Re: [RFCv3 PATCH 12/48] sched: Make usage tracking cpu scale-invariant
Date: Wed, 06 May 2015 10:49:42 +0100
Message-ID: <5549E3B6.2060709@arm.com>
In-Reply-To: <OF8A3E3617.0D4400A5-ON48257E3A.001B38D9-48257E3A.002379A4@zte.com.cn>

On 03/05/15 07:27, pang.xunlei@zte.com.cn wrote:
> Hi Dietmar,
> 
> Dietmar Eggemann <dietmar.eggemann@arm.com>  wrote 2015-03-24 AM 03:19:41:
>>
>> Re: [RFCv3 PATCH 12/48] sched: Make usage tracking cpu scale-invariant

[...]

>> In the previous patch-set https://lkml.org/lkml/2014/12/2/332 we
>> cpu-scaled both (sched_avg::runnable_avg_sum (load) and
>> sched_avg::running_avg_sum (utilization)) but during the review Vincent
>> pointed out that a cpu-scaled invariant load signal messes up
>> load-balancing based on s[dg]_lb_stats::avg_load in overload scenarios.
>>
>> avg_load = load/capacity and load can't be simply replaced here by
>> 'cpu-scale invariant load' (which is load*capacity).
> 
> I can't see why it shouldn't.
> 
> For "avg_load = load/capacity", "avg_load" stands for how busy the cpu
> is; it is actually a value relative to its capacity. The system is seen
> as balanced when one task runs on a 512-capacity cpu contributing 50%
> usage, and two such tasks run on the 1024-capacity cpu contributing 50%
> usage. "capacity" in this formula contains uarch capacity; "load" in
> this formula must be an absolute real load, not relative.
> 
> But with the current kernel implementation, "load" computed without this
> patch is a relative value. For example, when one task (1024 weight) runs
> on a 1024-capacity CPU, it gets a 256 load contribution (25% on this CPU).
> When it runs on a 512-capacity CPU, it gets a 512 load contribution (50%
> on this CPU).
> See, currently runnable "load" is relative, so "avg_load" is actually wrong
> and its value equals that of "load". So I think the runnable load should be
> made cpu scale-invariant as well.
> 
> Please point out if I am wrong.

Cpu-scaled load leads to wrong lb decisions in overload scenarios:

(1) Overload example taken from email thread between Vincent and Morten:
    https://lkml.org/lkml/2014/12/30/114

7 always-running tasks, 4 on cluster 0, 3 on cluster 1:

		cluster 0	cluster 1
capacity	1024 (2*512)	1024 (1*1024)
load		4096		3072
scale_load	2048		3072

Simply using cpu-scaled load in the existing lb code would declare
cluster 1 busier than cluster 0, although the compute capacity budget
for one task is higher on cluster 1 (1024/3 = 341) than on cluster 0
(2*512/4 = 256).
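
To make the arithmetic in (1) easy to check, here is a minimal sketch
that recomputes the table and the resulting "busiest" pick. It is not
kernel code: cluster_stats() and the dictionary keys are made up for
illustration, and it assumes each always-running task contributes 1024
load and that cpu-scaled load is load * cpu_capacity / 1024.

```python
SCHED_CAPACITY_SCALE = 1024  # capacity of the biggest cpu
TASK_LOAD = 1024             # assumed load of one always-running task

def cluster_stats(nr_tasks, nr_cpus, cpu_capacity):
    """Recompute the per-cluster figures used in example (1)."""
    capacity = nr_cpus * cpu_capacity
    load = nr_tasks * TASK_LOAD
    # cpu-scaled load: load weighted by the capacity of the cpu it runs on
    scaled = load * cpu_capacity // SCHED_CAPACITY_SCALE
    return {
        "capacity": capacity,
        "load": load,
        "scale_load": scaled,
        # avg_load = load / capacity, kept in 1024 fixed point
        "avg_load": load * SCHED_CAPACITY_SCALE // capacity,
        "avg_scale_load": scaled * SCHED_CAPACITY_SCALE // capacity,
        # compute capacity budget available to each task
        "budget_per_task": capacity // nr_tasks,
    }

stats = {
    "cluster 0": cluster_stats(nr_tasks=4, nr_cpus=2, cpu_capacity=512),
    "cluster 1": cluster_stats(nr_tasks=3, nr_cpus=1, cpu_capacity=1024),
}

# Which cluster would be declared busiest under each signal?
busiest_by_load = max(stats, key=lambda c: stats[c]["avg_load"])
busiest_by_scale_load = max(stats, key=lambda c: stats[c]["avg_scale_load"])
# Which cluster really leaves the least capacity per task?
tightest_budget = min(stats, key=lambda c: stats[c]["budget_per_task"])

for name, s in stats.items():
    print(name, s)
print("busiest by load:       ", busiest_by_load)
print("busiest by scale_load: ", busiest_by_scale_load)
print("least budget per task: ", tightest_budget)
```

With load, the pick agrees with the per-task budget (cluster 0); with
cpu-scaled load it flips to cluster 1.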

(2) A non-overload example does not show this problem:

7 tasks, each 12.5% (scaled to 1024), 4 on cluster 0, 3 on cluster 1:

		cluster 0	cluster 1
capacity	1024 (2*512)	1024 (1*1024)
load		1024		384
scale_load	512		384

Here cluster 0 is the busier one whether we take load or cpu-scaled load.
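
The numbers in (2) can be reproduced the same way. This is only a
sketch under the assumption that a task needing 12.5% of a 1024-capacity
cpu (128) shows up as twice that share on a 512-capacity cpu when load
is not cpu-scaled; task_load() is a hypothetical helper, not a kernel
function.

```python
SCHED_CAPACITY_SCALE = 1024

def task_load(invariant_demand, cpu_capacity):
    """Load a task shows on a cpu when load is NOT cpu-scaled.

    invariant_demand is the task's demand scaled to 1024; the same
    work occupies a larger share of a smaller cpu.
    """
    return invariant_demand * SCHED_CAPACITY_SCALE // cpu_capacity

DEMAND = 128  # 12.5% of 1024

# cluster 0: 4 tasks on cpus of capacity 512
cluster0_load = 4 * task_load(DEMAND, 512)
cluster0_scale_load = cluster0_load * 512 // SCHED_CAPACITY_SCALE

# cluster 1: 3 tasks on a cpu of capacity 1024
cluster1_load = 3 * task_load(DEMAND, 1024)
cluster1_scale_load = cluster1_load * 1024 // SCHED_CAPACITY_SCALE

# Both clusters have capacity 1024, so comparing the sums directly is
# equivalent to comparing avg_load.
print("load:      ", cluster0_load, cluster1_load)
print("scale_load:", cluster0_scale_load, cluster1_scale_load)
```

Both signals rank cluster 0 above cluster 1, so the choice of signal
does not change the decision in this case.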

We should continue to use avg_load based on load (maybe calculated from
scaled load once that is introduced?) for overload scenarios and use
scale_load for non-overload scenarios. Since this hasn't been
implemented yet, we got rid of cpu-scaled load in this RFC.
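
Purely as an illustration of that idea (nothing like this exists in the
RFC), the selection could look roughly like the sketch below.
pick_busiest() and its "overloaded" argument are hypothetical; how
overload would actually be detected is left open here.

```python
SCHED_CAPACITY_SCALE = 1024

def pick_busiest(groups, overloaded):
    """Pick the busiest group, switching the signal on overload.

    groups maps a name to a dict with 'capacity', 'load' and
    'scale_load'. 'overloaded' is a hypothetical input; this sketch
    does not say how it would be determined.
    """
    metric = "load" if overloaded else "scale_load"

    def avg(group):
        # same shape as avg_load = load / capacity, in 1024 fixed point
        return group[metric] * SCHED_CAPACITY_SCALE // group["capacity"]

    return max(groups, key=lambda name: avg(groups[name]))

# Figures from examples (1) and (2).
overload = {
    "cluster 0": {"capacity": 1024, "load": 4096, "scale_load": 2048},
    "cluster 1": {"capacity": 1024, "load": 3072, "scale_load": 3072},
}
non_overload = {
    "cluster 0": {"capacity": 1024, "load": 1024, "scale_load": 512},
    "cluster 1": {"capacity": 1024, "load": 384, "scale_load": 384},
}

print(pick_busiest(overload, overloaded=True))
print(pick_busiest(non_overload, overloaded=False))
```

Using scale_load on the overload figures instead would select
cluster 1, which is the wrong decision described in (1).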

[...]


