From: bsegall@google.com
To: Yuyang Du <yuyang.du@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Morten Rasmussen <morten.rasmussen@arm.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steve Muckle <steve.muckle@linaro.org>,
"mingo\@redhat.com" <mingo@redhat.com>,
"daniel.lezcano\@linaro.org" <daniel.lezcano@linaro.org>,
"mturquette\@baylibre.com" <mturquette@baylibre.com>,
"rjw\@rjwysocki.net" <rjw@rjwysocki.net>,
Juri Lelli <Juri.Lelli@arm.com>,
"sgurrappadi\@nvidia.com" <sgurrappadi@nvidia.com>,
"pang.xunlei\@zte.com.cn" <pang.xunlei@zte.com.cn>,
"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig
Date: Wed, 23 Sep 2015 09:54:08 -0700 [thread overview]
Message-ID: <xm26zj0d84b3.fsf@sword-of-the-dawn.mtv.corp.google.com> (raw)
In-Reply-To: <20150922232222.GF11102@intel.com> (Yuyang Du's message of "Wed, 23 Sep 2015 07:22:22 +0800")
Yuyang Du <yuyang.du@intel.com> writes:
> On Tue, Sep 22, 2015 at 10:18:30AM -0700, bsegall@google.com wrote:
>> Yuyang Du <yuyang.du@intel.com> writes:
>>
>> > On Mon, Sep 21, 2015 at 10:30:04AM -0700, bsegall@google.com wrote:
>> >> > But first, I think load_sum and load_avg can afford NICE_0_LOAD at either high
>> >> > or low resolution. So we have no reason to keep load_avg at low resolution (10 bits)
>> >> > when NICE_0_LOAD has high resolution (20 bits), because load_avg = runnable% * load,
>> >> > as opposed to the current load_avg = runnable% * scale_load_down(load).
>> >> >
>> >> > We get rid of all scale_load_down() for runnable load average?
>> >>
>> >> Hmm, LOAD_AVG_MAX * prio_to_weight[0] is 4237627662, ie barely within a
>> >> 32-bit unsigned long, but in fact LOAD_AVG_MAX * MAX_SHARES is already
>> >> going to give errors on 32-bit (even with the old code in fact). This
>> >> should probably be fixed... somehow (dividing by 4 for load_sum on
>> >> 32-bit would work, though be ugly. Reducing MAX_SHARES by 2 bits on
>> >> 32-bit might have made sense but would be a weird difference between 32
>> >> and 64, and could break userspace anyway, so it's presumably too late
>> >> for that).
>> >>
>> >> 64-bit has ~30 bits free, so this would be fine so long as SLR is 0 on
>> >> 32-bit.
>> >>
>> >
>> > load_avg has no LOAD_AVG_MAX term in it; it is runnable% * load, IOW load_avg <= load.
>> > So, on 32-bit, a cfs_rq's load_avg can host 2^32/prio_to_weight[0]/1024 = 47 such tasks
>> > with 20-bit load resolution. This is ok, because struct load_weight's load is also an
>> > unsigned long: if that were exceeded, cfs_rq->load.weight would have overflowed first.
>> >
>> > However, on second thought, this is not quite right, because load_avg is not
>> > necessarily bounded by load: load_avg has blocked load in it. Although
>> > load_avg is still at the same level as load (converging to be <= load), we may not
>> > want to risk overflow on 32-bit.
>
> This second thought was mistaken (what was wrong with me). load_avg is for sure
> no greater than load, with or without blocked load.
>
> With that said, it really does not matter what the following numbers are, on a 32-bit
> or 64-bit machine. What matters is that cfs_rq->load.weight is the one that needs to
> worry about overflow, not load_avg. It is as simple as that.
>
> With that, I think we can and should get rid of the scale_load_down()
> for load_avg.
load_avg, yes, is bounded by load.weight, but on 64-bit load_sum is only
bounded by load.weight * LOAD_AVG_MAX and is the same size as
load.weight (as I said below). There's still space for anything
reasonable, though, with 10 bits of SLR.
>
>> Yeah, I missed that load_sum was u64 and only load_avg was long. This
>> means we're fine on 32-bit with no SLR (or more precisely, cfs_rq
>> runnable_load_avg can overflow, but only when cfs_rq load.weight does,
>> so whatever). On 64-bit you can currently have 2^36 cgroups or 2^37
>> tasks before load.weight overflows, and ~2^31 tasks before
>> runnable_load_avg does, which is obviously fine (and in fact you'd hit
>> PID_MAX_LIMIT first, even if you had the cpu/ram/etc to not fall over).
>>
>> Now, applying SLR to runnable_load_avg would cut this down to ~2^21
>> tasks running at once or 2^20 with cgroups, which is technically
>> allowed, though it seems utterly implausible (especially since this
>> would have to all be on one cpu). If SLR was increased as peterz asked
>> about, this could be an issue though.
>>
>> All that said, using SLR on load_sum/load_avg as opposed to cfs_rq
>> runnable_load_avg would be fine, as they're limited to only one
>> task/cgroup's weight. Having it SLRed and cfs_rq not would be a
>> little odd, but not impossible.
>
>
>> > If NICE_0_LOAD is nice-0's load, and SCHED_LOAD_SHIFT says how to get
>> > nice-0's load, I don't understand why you want to separate them.
>>
>> SCHED_LOAD_SHIFT is not how to get nice-0's load, it just happens to
>> have the same value as NICE_0_SHIFT. (I think anyway, SCHED_LOAD_* is
>> used in precisely one place other than the newish util_avg, and as I
>> mentioned it's not remotely clear what compute_imbalance is doing there)
>
> Yes, it is not clear to me either.
>
> With the above proposal to get rid of scale_load_down() for load_avg, I think
> we can now remove SCHED_LOAD_*, rename scale_load() to user_to_kernel_load(),
> and rename scale_load_down() to kernel_to_user_load().
>
> Hmm?
I have no opinion on renaming the scale_load functions, it's certainly
reasonable, but the scale_load names seem fine too.
Thread overview: 97+ messages
2015-08-14 16:23 [PATCH 0/6] sched/fair: Compute capacity invariant load/utilization tracking Morten Rasmussen
2015-08-14 16:23 ` [PATCH 1/6] sched/fair: Make load tracking frequency scale-invariant Morten Rasmussen
2015-09-13 11:03 ` [tip:sched/core] " tip-bot for Dietmar Eggemann
2015-08-14 16:23 ` [PATCH 2/6] sched/fair: Convert arch_scale_cpu_capacity() from weak function to #define Morten Rasmussen
2015-09-02 9:31 ` Vincent Guittot
2015-09-02 12:41 ` Vincent Guittot
2015-09-03 19:58 ` Dietmar Eggemann
2015-09-04 7:26 ` Vincent Guittot
2015-09-07 13:25 ` Dietmar Eggemann
2015-09-11 13:21 ` Dietmar Eggemann
2015-09-11 14:45 ` Vincent Guittot
2015-09-13 11:03 ` [tip:sched/core] " tip-bot for Morten Rasmussen
2015-08-14 16:23 ` [PATCH 3/6] sched/fair: Make utilization tracking cpu scale-invariant Morten Rasmussen
2015-08-14 23:04 ` Dietmar Eggemann
2015-09-04 7:52 ` Vincent Guittot
2015-09-13 11:04 ` [tip:sched/core] sched/fair: Make utilization tracking CPU scale-invariant tip-bot for Dietmar Eggemann
2015-08-14 16:23 ` [PATCH 4/6] sched/fair: Name utilization related data and functions consistently Morten Rasmussen
2015-09-04 9:08 ` Vincent Guittot
2015-09-11 16:35 ` Dietmar Eggemann
2015-09-13 11:04 ` [tip:sched/core] " tip-bot for Dietmar Eggemann
2015-08-14 16:23 ` [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig Morten Rasmussen
2015-09-03 23:51 ` Steve Muckle
2015-09-07 15:37 ` Dietmar Eggemann
2015-09-07 16:21 ` Vincent Guittot
2015-09-07 18:54 ` Dietmar Eggemann
2015-09-07 19:47 ` Peter Zijlstra
2015-09-08 12:47 ` Dietmar Eggemann
2015-09-08 7:22 ` Vincent Guittot
2015-09-08 12:26 ` Peter Zijlstra
2015-09-08 12:52 ` Peter Zijlstra
2015-09-08 14:06 ` Vincent Guittot
2015-09-08 14:35 ` Morten Rasmussen
2015-09-08 14:40 ` Vincent Guittot
2015-09-08 14:31 ` Morten Rasmussen
2015-09-08 15:33 ` Peter Zijlstra
2015-09-09 22:23 ` bsegall
2015-09-10 11:06 ` Morten Rasmussen
2015-09-10 11:11 ` Vincent Guittot
2015-09-10 12:10 ` Morten Rasmussen
2015-09-11 0:50 ` Yuyang Du
2015-09-10 17:23 ` bsegall
2015-09-08 16:53 ` Morten Rasmussen
2015-09-09 9:43 ` Peter Zijlstra
2015-09-09 9:45 ` Peter Zijlstra
2015-09-09 11:13 ` Morten Rasmussen
2015-09-11 17:22 ` Morten Rasmussen
2015-09-17 9:51 ` Peter Zijlstra
2015-09-17 10:38 ` Peter Zijlstra
2015-09-21 1:16 ` Yuyang Du
2015-09-21 17:30 ` bsegall
2015-09-21 23:39 ` Yuyang Du
2015-09-22 17:18 ` bsegall
2015-09-22 23:22 ` Yuyang Du
2015-09-23 16:54 ` bsegall [this message]
2015-09-24 0:22 ` Yuyang Du
2015-09-30 12:52 ` Peter Zijlstra
2015-09-11 7:46 ` Leo Yan
2015-09-11 10:02 ` Morten Rasmussen
2015-09-11 14:11 ` Leo Yan
2015-09-09 19:07 ` Yuyang Du
2015-09-10 10:06 ` Peter Zijlstra
2015-09-08 13:39 ` Vincent Guittot
2015-09-08 14:10 ` Peter Zijlstra
2015-09-08 15:17 ` Vincent Guittot
2015-09-08 12:50 ` Dietmar Eggemann
2015-09-08 14:01 ` Vincent Guittot
2015-09-08 14:27 ` Dietmar Eggemann
2015-09-09 20:15 ` Yuyang Du
2015-09-10 10:07 ` Peter Zijlstra
2015-09-11 0:28 ` Yuyang Du
2015-09-11 10:31 ` Morten Rasmussen
2015-09-11 17:05 ` bsegall
2015-09-11 18:24 ` Yuyang Du
2015-09-14 17:36 ` bsegall
2015-09-14 12:56 ` Morten Rasmussen
2015-09-14 17:34 ` bsegall
2015-09-14 22:56 ` Yuyang Du
2015-09-15 17:11 ` bsegall
2015-09-15 18:39 ` Yuyang Du
2015-09-16 17:06 ` bsegall
2015-09-17 2:31 ` Yuyang Du
2015-09-15 8:43 ` Morten Rasmussen
2015-09-16 15:36 ` Peter Zijlstra
2015-09-08 11:44 ` Peter Zijlstra
2015-09-13 11:04 ` [tip:sched/core] " tip-bot for Dietmar Eggemann
2015-08-14 16:23 ` [PATCH 6/6] sched/fair: Initialize task load and utilization before placing task on rq Morten Rasmussen
2015-09-13 11:05 ` [tip:sched/core] " tip-bot for Morten Rasmussen
2015-08-16 20:46 ` [PATCH 0/6] sched/fair: Compute capacity invariant load/utilization tracking Peter Zijlstra
2015-08-17 11:29 ` Morten Rasmussen
2015-08-17 11:48 ` Peter Zijlstra
2015-08-31 9:24 ` Peter Zijlstra
2015-09-02 9:51 ` Dietmar Eggemann
2015-09-07 12:42 ` Peter Zijlstra
2015-09-07 13:21 ` Peter Zijlstra
2015-09-07 13:23 ` Peter Zijlstra
2015-09-07 14:44 ` Dietmar Eggemann
2015-09-13 11:06 ` [tip:sched/core] sched/fair: Defer calling scaling functions tip-bot for Dietmar Eggemann