From: bsegall@google.com
To: Yuyang Du <yuyang.du@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steve Muckle <steve.muckle@linaro.org>,
	"mingo\@redhat.com" <mingo@redhat.com>,
	"daniel.lezcano\@linaro.org" <daniel.lezcano@linaro.org>,
	"mturquette\@baylibre.com" <mturquette@baylibre.com>,
	"rjw\@rjwysocki.net" <rjw@rjwysocki.net>,
	Juri Lelli <Juri.Lelli@arm.com>,
	"sgurrappadi\@nvidia.com" <sgurrappadi@nvidia.com>,
	"pang.xunlei\@zte.com.cn" <pang.xunlei@zte.com.cn>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig
Date: Wed, 23 Sep 2015 09:54:08 -0700
Message-ID: <xm26zj0d84b3.fsf@sword-of-the-dawn.mtv.corp.google.com>
In-Reply-To: <20150922232222.GF11102@intel.com> (Yuyang Du's message of "Wed, 23 Sep 2015 07:22:22 +0800")

Yuyang Du <yuyang.du@intel.com> writes:

> On Tue, Sep 22, 2015 at 10:18:30AM -0700, bsegall@google.com wrote:
>> Yuyang Du <yuyang.du@intel.com> writes:
>> 
>> > On Mon, Sep 21, 2015 at 10:30:04AM -0700, bsegall@google.com wrote:
>> >> > But first, I think load_sum and load_avg can afford NICE_0_LOAD with either high
>> >> > or low resolution. So we have no reason to keep a low-resolution (10-bit) load_avg
>> >> > when NICE_0_LOAD has high resolution (20 bits): we would have load_avg = runnable% * load,
>> >> > as opposed to the current load_avg = runnable% * scale_load_down(load).
>> >> >
>> >> > Can we get rid of all scale_load_down() calls for the runnable load average?
>> >> 
>> >> Hmm, LOAD_AVG_MAX * prio_to_weight[0] is 4237627662, ie barely within a
>> >> 32-bit unsigned long, but in fact LOAD_AVG_MAX * MAX_SHARES is already
>> >> going to give errors on 32-bit (even with the old code in fact). This
>> >> should probably be fixed... somehow (dividing by 4 for load_sum on
>> >> 32-bit would work, though be ugly. Reducing MAX_SHARES by 2 bits on
>> >> 32-bit might have made sense but would be a weird difference between 32
>> >> and 64, and could break userspace anyway, so it's presumably too late
>> >> for that).
>> >> 
>> >> 64-bit has ~30 bits free, so this would be fine so long as SLR is 0 on
>> >> 32-bit.
>> >> 
>> >
>> > load_avg has no LOAD_AVG_MAX term in it; it is runnable% * load, IOW, load_avg <= load.
>> > So, on 32-bit, cfs_rq's load_avg can hold 2^32/prio_to_weight[0]/1024 = 47 entities of
>> > weight prio_to_weight[0] with 20 bits of load resolution. This is OK, because struct
>> > load_weight's load is also unsigned long: if load_avg overflows, cfs_rq->load.weight
>> > will have overflowed first.
>> >
>> > However, on second thought, this is not quite right, because load_avg is not
>> > necessarily bounded by load, since load_avg has blocked load in it. Although
>> > load_avg stays at the same level as load (converging to <= load), we may not
>> > want to risk overflow on 32-bit.
>  
> That second thought was a mistake (what was wrong with me). load_avg is for sure
> no greater than load, with or without blocked load.
>
> With that said, it really does not matter what the following numbers are, on a 32-bit
> or a 64-bit machine. What matters is that cfs_rq->load.weight is the one that needs to
> worry about overflow, not load_avg. It is as simple as that.
>
> With that, I think we can and should get rid of the scale_load_down()
> for load_avg.

Yes, load_avg is bounded by load.weight, but on 64-bit load_sum is
bounded only by load.weight * LOAD_AVG_MAX and is the same size as
load.weight (as I said below). There's still room for anything
reasonable with 10 bits of SLR, though.
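
For reference, the arithmetic behind these bounds can be checked from
userspace. The following is only a sketch: the constants (LOAD_AVG_MAX =
47742, prio_to_weight[0] = 88761, MAX_SHARES = 1 << 18) are assumed to
match the tree under discussion, SLR is taken to mean
SCHED_LOAD_RESOLUTION, and the helper names are made up for illustration;
none of them are kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values; check them against your own tree. */
#define LOAD_AVG_MAX      47742ULL      /* max of the decayed load series */
#define PRIO_TO_WEIGHT_0  88761ULL      /* prio_to_weight[0], i.e. nice -20 */
#define MAX_SHARES        (1ULL << 18)  /* upper limit for cgroup shares */
#define SLR               10            /* extra bits of load resolution */

/* Number of bits needed to represent v (0 for v == 0). */
unsigned int bits_needed(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/* Upper bound of load_sum for one entity: weight * LOAD_AVG_MAX. */
uint64_t load_sum_bound(uint64_t weight)
{
	/* Make sure the illustration itself does not wrap around. */
	assert(weight <= UINT64_MAX / LOAD_AVG_MAX);
	return weight * LOAD_AVG_MAX;
}

/* Would v fit in a 32-bit unsigned long? */
int fits_u32(uint64_t v)
{
	return v <= UINT32_MAX;
}

/* How many entities of the given weight sum to less than 2^32? */
uint64_t max_entities_u32(uint64_t weight)
{
	return (1ULL << 32) / weight;
}

/* Print the bounds discussed in this thread. */
void print_bounds(void)
{
	uint64_t nice = load_sum_bound(PRIO_TO_WEIGHT_0);
	uint64_t shares = load_sum_bound(MAX_SHARES);
	uint64_t shares_slr = load_sum_bound(MAX_SHARES << SLR);

	printf("LOAD_AVG_MAX * prio_to_weight[0] = %llu, fits in 32 bits: %d\n",
	       (unsigned long long)nice, fits_u32(nice));
	printf("LOAD_AVG_MAX * MAX_SHARES = %llu, fits in 32 bits: %d\n",
	       (unsigned long long)shares, fits_u32(shares));
	printf("LOAD_AVG_MAX * (MAX_SHARES << SLR) needs %u of 64 bits\n",
	       bits_needed(shares_slr));
	printf("32-bit sum holds %llu entities of prio_to_weight[0] << SLR\n",
	       (unsigned long long)max_entities_u32(PRIO_TO_WEIGHT_0 << SLR));
}
```

Calling print_bounds() from a main() should show that the
prio_to_weight[0] product (4237627662) still fits in 32 bits while the
MAX_SHARES product does not, and that with 10 bits of SLR the MAX_SHARES
bound needs 44 bits, which leaves 20 bits spare in a u64.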

>
>> Yeah, I missed that load_sum was u64 and only load_avg was long. This
>> means we're fine on 32-bit with no SLR (or more precisely, cfs_rq
>> runnable_load_avg can overflow, but only when cfs_rq load.weight does,
>> so whatever). On 64-bit you can currently have 2^36 cgroups or 2^37
>> tasks before load.weight overflows, and ~2^31 tasks before
>> runnable_load_avg does, which is obviously fine (and in fact you'd hit
>> PID_MAX_LIMIT first, even if you had the cpu/ram/etc to not fall over).
>> 
>> Now, applying SLR to runnable_load_avg would cut this down to ~2^21
>> tasks running at once or 2^20 with cgroups, which is technically
>> allowed, though it seems utterly implausible (especially since this
>> would have to all be on one cpu). If SLR was increased as peterz asked
>> about, this could be an issue though.
>> 
>> All that said, using SLR on load_sum/load_avg as opposed to cfs_rq
>> runnable_load_avg would be fine, as they're limited to only one
>> task/cgroup's weight. Having it SLRed and cfs_rq not would be a
>> little odd, but not impossible.
>  
>
>> > If NICE_0_LOAD is nice-0's load, and if SCHED_LOAD_SHIFT is to say how to get 
>> > nice-0's load, I don't understand why you want to separate them.
>> 
>> SCHED_LOAD_SHIFT is not how to get nice-0's load, it just happens to
>> have the same value as NICE_0_SHIFT. (I think anyway, SCHED_LOAD_* is
>> used in precisely one place other than the newish util_avg, and as I
>> mentioned it's not remotely clear what compute_imbalance is doing there)
>
> Yes, it is not clear to me either.
>
> With the above proposal to get rid of scale_load_down() for load_avg, I think
> we can now remove SCHED_LOAD_*, rename scale_load() to user_to_kernel_load(),
> and rename scale_load_down() to kernel_to_user_load().
>
> Hmm?

I have no opinion on renaming the scale_load functions; it's certainly
reasonable, but the scale_load names seem fine too.
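
To make the proposed names concrete, here is a minimal sketch of what
the renamed helpers would compute. It assumes SCHED_LOAD_RESOLUTION is
10 (the high-resolution case; with a resolution of 0 both helpers are
the identity) and writes them as plain functions rather than the
kernel's macros. check_round_trip() is a hypothetical helper added only
for illustration.

```c
#include <assert.h>

/* Assumption: 10 extra bits of load resolution. */
#define SCHED_LOAD_RESOLUTION 10

/* Proposed name for scale_load(): user-visible weight -> internal load. */
unsigned long user_to_kernel_load(unsigned long w)
{
	return w << SCHED_LOAD_RESOLUTION;
}

/* Proposed name for scale_load_down(): internal load -> user-visible weight. */
unsigned long kernel_to_user_load(unsigned long w)
{
	return w >> SCHED_LOAD_RESOLUTION;
}

/*
 * Scaling up and back down returns the original weight, provided the
 * shifted value does not overflow unsigned long.
 */
void check_round_trip(unsigned long w)
{
	assert(kernel_to_user_load(user_to_kernel_load(w)) == w);
}
```

With these definitions a user-visible weight of 1024 maps to an internal
load of 1 << 20, which matches the 10-bit versus 20-bit resolutions
quoted earlier in the thread.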
