linux-kernel.vger.kernel.org archive mirror
From: changhuaixin <changhuaixin@linux.alibaba.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: changhuaixin <changhuaixin@linux.alibaba.com>,
	Benjamin Segall <bsegall@google.com>,
	dietmar.eggemann@arm.com, juri.lelli@redhat.com,
	khlebnikov@yandex-team.ru,
	open list <linux-kernel@vger.kernel.org>,
	mgorman@suse.de, mingo@redhat.com, Odin Ugedal <odin@uged.al>,
	Odin Ugedal <odin@ugedal.com>,
	pauld@redhead.com, Paul Turner <pjt@google.com>,
	rostedt@goodmis.org, Shanpei Chen <shanpeic@linux.alibaba.com>,
	Tejun Heo <tj@kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	xiyou.wangcong@gmail.com
Subject: Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst
Date: Fri, 19 Mar 2021 20:39:47 +0800
Message-ID: <2F207CE6-F849-457A-B0A6-3A8BFFE0AFFB@linux.alibaba.com>
In-Reply-To: <YFNsKGKRL3SaJNZk@hirez.programming.kicks-ass.net>



> On Mar 18, 2021, at 11:05 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, Mar 18, 2021 at 09:26:58AM +0800, changhuaixin wrote:
>>> On Mar 17, 2021, at 4:06 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
>>> So what is the typical avg,stdev,max and mode for the workloads where you find
>>> you need this?
>>> 
>>> I would really like to put a limit on the burst. IMO a workload that has
>>> a burst many times longer than the quota is plain broken.
>> 
>> I see. Then the problem comes down to how large the limit on burst shall be.
>> 
>> I have sampled the CPU usage of a bursty container in 100ms periods. The statistics are:
> 
> So CPU usage isn't exactly what is required, job execution time is what
> you're after. Assuming there is a relation...
> 

Yes, job execution time is important. To be specific, the goal is to raise the CPU utilization
of the whole system, and so reduce the total cost of ownership, without hurting job execution
time. This requires lowering the average CPU resources given to underutilized cgroups while
still allowing them to burst.

>> average	: 42.2%
>> stddev	: 81.5%
>> max		: 844.5%
>> P95		: 183.3%
>> P99		: 437.0%
> 
> Then your WCET is 844% of 100ms ? , which is .84s.
> 
> But you forgot your mode; what is the most common duration, given P95 is
> so high, I doubt that avg is representative of the most common duration.
> 

That is true.

>> If quota is 100000ms, burst buffer needs to be 8 times more in order
>> for this workload not to be throttled.
> 
> Where does that 100s come from? And an 800s burst is bizarre.
> 
> Did you typo [us] as [ms] ?
> 

Sorry, it should be 100000us.
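
With the units corrected, the arithmetic behind the "8 times" figure can be sketched as
below. This is only a rough illustration: it assumes the sampled percentages are relative to
one 100000us period, that quota is 100000us per period, and that a single peak period has to
be absorbed. The function names are made up for this example and are not part of the patch.

```python
import math

PERIOD_US = 100_000  # sampling period: 100ms expressed in microseconds
QUOTA_US = 100_000   # assumed quota per period (the corrected 100000us)

def usage_us(percent, period_us=PERIOD_US):
    """CPU time consumed in one period, given usage as a percentage of the period."""
    return percent * period_us / 100

def burst_needed_us(peak_percent, quota_us=QUOTA_US, period_us=PERIOD_US):
    """Runtime beyond quota needed to absorb one peak period without throttling."""
    return max(0.0, usage_us(peak_percent, period_us) - quota_us)

peak_us = usage_us(844.5)          # max sample: 844500us, about 0.84s
burst_us = burst_needed_us(844.5)  # 744500us beyond the quota
ratio = burst_us / QUOTA_US        # about 7.4 times quota
burst_multiple = math.ceil(ratio)  # rounded up to a whole multiple of quota: 8
```

Several peak periods in a row would need more than this, so the result is a lower bound
for this sample set rather than a recommendation.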

>> I can't say this is typical, but these workloads exist. On a machine
>> running Kubernetes containers, where there is often room for such
>> burst and the interference is hard to notice, users would prefer
>> allowing such burst to being throttled occasionally.
> 
> Users also want ponies. I've no idea what kubernetes actually is or what
> it has to do with containers. That's all just word salad.
> 
>> In this sense, I suggest limit burst buffer to 16 times of quota or
>> around. That should be enough for users to improve tail latency caused
>> by throttling. And users might choose a smaller one or even none, if
>> the interference is unacceptable. What do you think?
> 
> Well, normal RT theory would suggest you pick your runtime around 200%
> to get that P95 and then allow a full period burst to get your P99, but
> that same RT theory would also have you calculate the resulting
> interference and see if that works with the rest of the system...
> 

I am sorry, but I don't know much about the RT theory you mentioned and can't provide
the desired calculation now. I am willing to do some reading on it if that is needed.
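
One possible reading of the sizing rule quoted above, sketched in Python: choose the
smallest runtime such that P95 fits in the quota alone and P99 fits in the quota plus one
extra period's worth of quota as burst. Treating "a full period burst" as burst equal to
quota is an assumption, the helper name is hypothetical, and the sketch does not compute
the resulting interference at all.

```python
PERIOD_US = 100_000  # 100ms period in microseconds

def rt_style_quota_percent(p95, p99):
    """Smallest quota (percent of a period) such that P95 fits in quota alone
    and P99 fits in quota plus one extra period of quota (assumed burst == quota)."""
    return max(p95, p99 / 2)

# Sampled statistics from this thread: P95 = 183.3%, P99 = 437.0%.
quota_percent = rt_style_quota_percent(183.3, 437.0)  # 218.5, i.e. "around 200%"
quota_us = quota_percent * PERIOD_US / 100            # 218500us of runtime per period
burst_us = quota_us                                   # one full period's worth of burst
```

For these samples the burst would be 1 times quota under this reading, far below 16 times.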

> 16 times is horrific.

So can we decide on a more reasonable value now? Or are the interference probabilities still
the missing piece?

Is the paper you mentioned the one called "Insensitivity results in statistical bandwidth
sharing", or a related one on statistical bandwidth results under some kind of fairness?

