From: Shaohua Li <shli@fb.com>
To: Paolo Valente <paolo.valente@unimore.it>
Cc: Tejun Heo <tj@kernel.org>, Vivek Goyal <vgoyal@redhat.com>,
	<linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Jens Axboe <axboe@fb.com>, <Kernel-team@fb.com>,
	<jmoyer@redhat.com>, Mark Brown <broonie@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Ulf Hansson <ulf.hansson@linaro.org>
Subject: Re: [PATCH V3 00/11] block-throttle: add .high limit
Date: Wed, 5 Oct 2016 13:36:24 -0700
Message-ID: <20161005203623.GA1754@anikkar-mbp.local.dhcp.thefacebook.com>
In-Reply-To: <98C2E984-2CF4-41F7-8A7D-6569C45A627A@unimore.it>

On Wed, Oct 05, 2016 at 09:57:22PM +0200, Paolo Valente wrote:
> 
> > > On 5 Oct 2016, at 21:08, Shaohua Li <shli@fb.com> wrote:
> > 
> > On Wed, Oct 05, 2016 at 11:30:53AM -0700, Shaohua Li wrote:
> >> On Wed, Oct 05, 2016 at 10:49:46AM -0400, Tejun Heo wrote:
> >>> Hello, Paolo.
> >>> 
> >>> On Wed, Oct 05, 2016 at 02:37:00PM +0200, Paolo Valente wrote:
> >>>> In this respect, for your generic, unpredictable scenario to make
> >>>> sense, there must exist at least one real system that meets the
> >>>> requirements of such a scenario.  Or, if such a real system does not
> >>>> yet exist, it must be possible to emulate it.  If it is impossible to
> >>>> achieve this last goal either, then I fail to see the usefulness
> >>>> of looking for solutions for such a scenario.
> >>>> 
> >>>> That said, let's define the instance(s) of the scenario that you find
> >>>> most representative, and let's test BFQ on it/them.  Numbers will give
> >>>> us the answers.  For example, what about all or part of the following
> >>>> groups:
> >>>> . one cyclically doing random I/O for a few seconds and then sequential I/O
> >>>> for the next few seconds
> >>>> . one doing, say, quasi-sequential I/O in ON/OFF cycles
> >>>> . one starting an application cyclically
> >>>> . one playing back or streaming a movie
> >>>> 
> >>>> For each group, we could then measure the time needed to complete each
> >>>> phase of I/O in each cycle, plus the responsiveness in the group
> >>>> starting an application, plus the frame drop in the group streaming
> >>>> the movie.  In addition, we can measure the bandwidth/iops enjoyed by
> >>>> each group, plus, of course, the aggregate throughput of the whole
> >>>> system.  In particular we could compare results with throttling, BFQ,
> >>>> and CFQ.
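
As an aside, the first group above could be emulated with a fio job file along
these lines (the device name, block sizes and 5-second phases are only
examples; the two-phase file would be re-run in a loop from a wrapper script
to get the cyclic behaviour):

  [cycle-random]
  filename=/dev/sdc
  ioengine=libaio
  direct=1
  rw=randread
  bs=4k
  runtime=5
  time_based=1

  [cycle-seq]
  ; wait for the random phase to finish before starting the sequential phase
  stonewall
  filename=/dev/sdc
  ioengine=libaio
  direct=1
  rw=read
  bs=64k
  runtime=5
  time_based=1
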
> >>>> 
> >>>> Then we could write resulting numbers on the stone, and stick to them
> >>>> until something proves them wrong.
> >>>> 
> >>>> What do you (or others) think about it?
> >>> 
> >>> That sounds great and yeah it's lame that we didn't start with that.
> >>> Shaohua, would it be difficult to compare how bfq performs against
> >>> blk-throttle?
> >> 
> >> I ran a test of BFQ. I'm using the BFQ found at
> >> http://algogroup.unimore.it/people/paolo/disk_sched/sources.php, version
> >> 4.7.0-v8r3. It's an LSI SSD, queue depth 32. I use the default settings.
> >> The fio script is:
> >> 
> >> [global]
> >> ioengine=libaio
> >> direct=1
> >> readwrite=randread
> >> bs=4k
> >> runtime=60
> >> time_based=1
> >> file_service_type=random:36
> >> overwrite=1
> >> thread=0
> >> group_reporting=1
> >> filename=/dev/sdb
> >> iodepth=1
> >> numjobs=8
> >> 
> >> [groupA]
> >> prio=2
> >> 
> >> [groupB]
> >> new_group
> >> prio=6
> >> 
> >> I'll change iodepth, numjobs, and prio in different tests. The result unit is MB/s.
> >> 
> >> iodepth=1 numjobs=1 prio 4:4
> >> CFQ: 28:28 BFQ: 21:21 deadline: 29:29
> >> 
> >> iodepth=8 numjobs=1 prio 4:4
> >> CFQ: 162:162 BFQ: 102:98 deadline: 205:205
> >> 
> >> iodepth=1 numjobs=8 prio 4:4
> >> CFQ: 157:157 BFQ: 81:92 deadline: 196:197
> >> 
> >> iodepth=1 numjobs=1 prio 2:6
> >> CFQ: 26.7:27.6 BFQ: 20:6 deadline: 29:29
> >> 
> >> iodepth=8 numjobs=1 prio 2:6
> >> CFQ: 166:174 BFQ: 139:72  deadline: 202:202
> >> 
> >> iodepth=1 numjobs=8 prio 2:6
> >> CFQ: 148:150 BFQ: 90:77 deadline: 198:197
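
For reproducing the runs, a minimal sketch of driving one configuration,
assuming the BFQ patches are applied, the device is /dev/sdb as in the job
file, and the job file is saved as test.fio (both names are examples):

  # pick the scheduler for the test device, then run the job file
  echo bfq > /sys/block/sdb/queue/scheduler    # or cfq / deadline
  cat /sys/block/sdb/queue/scheduler           # confirm which one is active
  fio test.fio
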
> > 
> > More tests:
> > 
> > iodepth=8 numjobs=1 prio 2:6, group A has 50M/s limit
> > CFQ: 51:207 BFQ: 51:45 deadline: 51:216
> > 
> > iodepth=1 numjobs=1 prio 2:6, group A bs=4k, group B bs=64k
> > CFQ: 25:249 BFQ: 23:42 deadline: 26:251
> > 
> 
> A true proportional-share scheduler like BFQ works under the
> assumption that it is the only limiter of its clients' bandwidth.
> And the availability of such a scheduler should apparently make
> bandwidth limiting useless: once you have a mechanism that lets you
> give each group the desired fraction of the bandwidth, and
> redistribute excess bandwidth seamlessly when needed, what do you need
> additional limiting for?
> 
> But I'm not an expert on every possible system configuration or
> requirement.  So, if you have practical examples, I would really
> appreciate them.  And I don't think it will be difficult to see what
> goes wrong in BFQ with external bandwidth limiting, and to fix the
> problem.

I think the test emulates a very common configuration. We assign more IO
resources to the high-priority workload, but such a workload doesn't always
dispatch enough IO; that's why I set a rate limit. When this happens, we hope
the low-priority workload uses the spare disk bandwidth. That's the whole
point of disk sharing.
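
For concreteness, the rate-limited case above ("group A has 50M/s limit")
can be set up with the stock cgroup v1 blkio interface; a rough sketch, where
the mount point, the device number (8:16) and the group names are only
examples, and the per-job I/O priorities are still set via fio's prio= as in
the job file above:

  # cgroup v1, blkio controller assumed mounted at /sys/fs/cgroup/blkio
  mkdir /sys/fs/cgroup/blkio/groupA /sys/fs/cgroup/blkio/groupB

  # cap groupA's reads at 50MB/s on the test device (8:16 = example major:minor)
  echo "8:16 52428800" > /sys/fs/cgroup/blkio/groupA/blkio.throttle.read_bps_device

  # move the fio worker pids (placeholders here) into their groups before
  # they start issuing IO
  echo $FIO_PID_A > /sys/fs/cgroup/blkio/groupA/tasks
  echo $FIO_PID_B > /sys/fs/cgroup/blkio/groupB/tasks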

Thanks,
Shaohua
