From: Peter Zijlstra <peterz@infradead.org>
To: Huaixin Chang <changhuaixin@linux.alibaba.com>
Cc: luca.abeni@santannapisa.it, anderson@cs.unc.edu,
	baruah@wustl.edu, bsegall@google.com, dietmar.eggemann@arm.com,
	dtcccc@linux.alibaba.com, juri.lelli@redhat.com,
	khlebnikov@yandex-team.ru, linux-kernel@vger.kernel.org,
	mgorman@suse.de, mingo@redhat.com, odin@uged.al, odin@ugedal.com,
	pauld@redhead.com, pjt@google.com, rostedt@goodmis.org,
	shanpeic@linux.alibaba.com, tj@kernel.org,
	tommaso.cucinotta@santannapisa.it, vincent.guittot@linaro.org,
	xiyou.wangcong@gmail.com
Subject: Re: [PATCH v6 1/3] sched/fair: Introduce the burstable CFS controller
Date: Tue, 22 Jun 2021 15:19:34 +0200
Message-ID: <YNHjZqbtzoOy8w87@hirez.programming.kicks-ass.net>
In-Reply-To: <20210621092800.23714-2-changhuaixin@linux.alibaba.com>

On Mon, Jun 21, 2021 at 05:27:58PM +0800, Huaixin Chang wrote:
> The CFS bandwidth controller limits CPU requests of a task group to
> quota during each period. However, parallel workloads might be bursty
> so that they get throttled even when their average utilization is under
> quota. And they are latency sensitive at the same time so that
> throttling them is undesired.
> 
> We borrow time now against our future underrun, at the cost of increased
> interference against the other system users. All nicely bounded.
> 
> Traditional (UP-EDF) bandwidth control is something like:
> 
>   (U = \Sum u_i) <= 1
> 
> This guarantees both that every deadline is met and that the system is
> stable. After all, if U were > 1, then for every second of walltime,
> we'd have to run more than a second of program time, and obviously miss
> our deadline, but the next deadline will be further out still, there is
> never time to catch up, unbounded fail.
> 
> This work observes that a workload doesn't always execute the full
> quota; this enables one to describe u_i as a statistical distribution.
> 
> For example, have u_i = {x,e}_i, where x is the p(95) and x+e p(100)
> (the traditional WCET). This effectively allows u to be smaller,
> increasing the efficiency (we can pack more tasks in the system), but at
> the cost of missing deadlines when all the odds line up. However, it
> does maintain stability, since every overrun must be paired with an
> underrun as long as our x is above the average.
> 
> That is, suppose we have 2 tasks, both specify a p(95) value, then we
> have a p(95)*p(95) = 90.25% chance both tasks are within their quota and
> everything is good. At the same time we have a p(5)*p(5) = 0.25% chance
> both tasks will exceed their quota at the same time (guaranteed deadline
> fail). Somewhere in between there's a threshold where one exceeds and
> the other doesn't underrun enough to compensate; this depends on the
> specific CDFs.
> 
> At the same time, we can say that the worst case deadline miss will be
> \Sum e_i; that is, there is a bounded tardiness (under the assumption
> that x+e is indeed WCET).
> 
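A minimal sketch of the arithmetic above, in Python and not part of the
patch. It assumes the tasks overrun independently of each other, which
the two-task example relies on but does not state. The function names
and the e_i values in the usage lines are hypothetical.

```python
def joint_probabilities(p_within, n_tasks):
    """Return (P(all tasks within quota), P(all tasks exceed quota)).

    Assumes each task stays within its quota with probability p_within,
    independently of the others.
    """
    all_within = p_within ** n_tasks
    all_exceed = (1.0 - p_within) ** n_tasks
    return all_within, all_exceed


def worst_case_tardiness(excess):
    """Bound on the deadline miss: the sum of the e_i, assuming x+e is WCET."""
    return sum(excess)


if __name__ == "__main__":
    both_ok, both_over = joint_probabilities(0.95, 2)
    print(f"both within quota: {both_ok:.2%}, both over quota: {both_over:.2%}")
    # Hypothetical e_i values in milliseconds, for illustration only.
    print("tardiness bound (ms):", worst_case_tardiness([5.0, 7.5]))
```

Everything between the two printed probabilities is the region where the
outcome depends on the specific CDFs, so this sketch says nothing about it.
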
> The benefit of burst is seen when testing with schbench. The default
> values of kernel.sched_cfs_bandwidth_slice_us (5ms) and CONFIG_HZ (1000)
> are used.
> 
> 	mkdir /sys/fs/cgroup/cpu/test
> 	echo $$ > /sys/fs/cgroup/cpu/test/cgroup.procs
> 	echo 100000 > /sys/fs/cgroup/cpu/test/cpu.cfs_quota_us
> 	echo 100000 > /sys/fs/cgroup/cpu/test/cpu.cfs_burst_us
> 
> 	./schbench -m 1 -t 3 -r 20 -c 80000 -R 10
> 
> The average CPU usage is at 80%. I ran this 10 times, and got long tail
> latency 6 times and got throttled 8 times.
> 
> Tail latencies are shown below, and this wasn't the worst case.
> 
> 	Latency percentiles (usec)
> 		50.0000th: 19872
> 		75.0000th: 21344
> 		90.0000th: 22176
> 		95.0000th: 22496
> 		*99.0000th: 22752
> 		99.5000th: 22752
> 		99.9000th: 22752
> 		min=0, max=22727
> 	rps: 9.90 p95 (usec) 22496 p99 (usec) 22752 p95/cputime 28.12% p99/cputime 28.44%
> 
> The interference when using burst is evaluated by the probability of
> missing the deadline and the average WCET. Test results showed that when
> there are many cgroups or the CPU is underutilized, the interference is
> limited. More details are shown in:
> https://lore.kernel.org/lkml/5371BD36-55AE-4F71-B9D7-B86DC32E3D2B@linux.alibaba.com/
> 
> Co-developed-by: Shanpei Chen <shanpeic@linux.alibaba.com>
> Signed-off-by: Shanpei Chen <shanpeic@linux.alibaba.com>
> Co-developed-by: Tianchen Ding <dtcccc@linux.alibaba.com>
> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
> Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
> ---

Ben, what say you? I'm tempted to pick up at least this first patch.
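
For illustration only, a toy model of "borrow time now against our future
underrun", using the quota and burst values from the quoted test setup
(100ms each). The refill rule below (add quota each period, cap the
balance at quota + burst) is a simplifying assumption made for this
sketch and not a statement of what the patch does. The function name and
the demand sequence are hypothetical, and unmet demand is only recorded,
not carried into the next period.

```python
def simulate(demand_ms, quota_ms, burst_ms):
    """Return the unmet (throttled) demand per period, in milliseconds.

    Toy model: unused runtime carries over, capped at quota + burst.
    With burst_ms == 0 the balance is reset to quota every period.
    """
    runtime = quota_ms  # assumed starting balance: one period of quota
    unmet = []
    for want in demand_ms:
        used = min(want, runtime)
        unmet.append(want - used)
        runtime -= used
        # Period boundary: refill by quota, cap at quota + burst.
        runtime = min(runtime + quota_ms, quota_ms + burst_ms)
    return unmet


if __name__ == "__main__":
    demand = [60, 60, 140, 60]  # hypothetical bursty load, 80ms average
    print("no burst:  ", simulate(demand, 100, 0))
    print("with burst:", simulate(demand, 100, 100))
```

In this model the 140ms period is throttled without burst even though
the average demand is under quota, and it is covered by the earlier
underrun once burst is allowed.
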


