From: Paolo Valente <paolo.valente@unimore.it>
To: Shaohua Li <shli@fb.com>
Cc: Tejun Heo <tj@kernel.org>, Vivek Goyal <vgoyal@redhat.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jens Axboe <axboe@fb.com>,
	Kernel-team@fb.com, jmoyer@redhat.com,
	Mark Brown <broonie@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Ulf Hansson <ulf.hansson@linaro.org>
Subject: Re: [PATCH V3 00/11] block-throttle: add .high limit
Date: Thu, 6 Oct 2016 09:22:05 +0200	[thread overview]
Message-ID: <5F716FD2-2027-434E-BC7F-8B0385722E05@unimore.it> (raw)
In-Reply-To: <20161005203623.GA1754@anikkar-mbp.local.dhcp.thefacebook.com>


> On 05 Oct 2016, at 22:36, Shaohua Li <shli@fb.com> wrote:
> 
> On Wed, Oct 05, 2016 at 09:57:22PM +0200, Paolo Valente wrote:
>> 
>>> On 05 Oct 2016, at 21:08, Shaohua Li <shli@fb.com> wrote:
>>> 
>>> On Wed, Oct 05, 2016 at 11:30:53AM -0700, Shaohua Li wrote:
>>>> On Wed, Oct 05, 2016 at 10:49:46AM -0400, Tejun Heo wrote:
>>>>> Hello, Paolo.
>>>>> 
>>>>> On Wed, Oct 05, 2016 at 02:37:00PM +0200, Paolo Valente wrote:
>>>>>> In this respect, for your generic, unpredictable scenario to make
>>>>>> sense, there must exist at least one real system that meets the
>>>>>> requirements of such a scenario.  Or, if such a real system does not
>>>>>> yet exist, it must be possible to emulate it.  If even this last goal
>>>>>> is impossible to achieve, then I fail to see the usefulness
>>>>>> of looking for solutions for such a scenario.
>>>>>> 
>>>>>> That said, let's define the instance(s) of the scenario that you find
>>>>>> most representative, and let's test BFQ on it/them.  Numbers will give
>>>>>> us the answers.  For example, what about all or part of the following
>>>>>> groups:
>>>>>> . one cyclically doing random I/O for some seconds and then sequential I/O
>>>>>> for the next seconds
>>>>>> . one doing, say, quasi-sequential I/O in ON/OFF cycles
>>>>>> . one starting an application cyclically
>>>>>> . one playing back or streaming a movie
>>>>>> 
>>>>>> For each group, we could then measure the time needed to complete each
>>>>>> phase of I/O in each cycle, plus the responsiveness of the group
>>>>>> starting an application, plus the frame drops in the group streaming
>>>>>> the movie.  In addition, we can measure the bandwidth/IOPS enjoyed by
>>>>>> each group, plus, of course, the aggregate throughput of the whole
>>>>>> system.  In particular, we could compare results with throttling, BFQ,
>>>>>> and CFQ.
>>>>>> 
>>>>>> Then we could write the resulting numbers in stone, and stick to them
>>>>>> until something proves them wrong.
>>>>>> 
>>>>>> What do you (or others) think about it?
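
(For concreteness, a minimal sketch of how the first group above could be
emulated: a shell loop alternating two fio invocations on the same device.
The device name, the 5-second phase lengths, and the 4k block size are
assumptions for illustration, not part of the proposal.)

  # Alternate random and sequential read phases on /dev/sdb, forever.
  while true; do
      fio --name=rand --filename=/dev/sdb --direct=1 --ioengine=libaio \
          --rw=randread --bs=4k --time_based --runtime=5
      fio --name=seq --filename=/dev/sdb --direct=1 --ioengine=libaio \
          --rw=read --bs=4k --time_based --runtime=5
  done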
>>>>> 
>>>>> That sounds great and yeah it's lame that we didn't start with that.
>>>>> Shaohua, would it be difficult to compare how bfq performs against
>>>>> blk-throttle?
>>>> 
>>>> I ran a test of BFQ. I'm using the BFQ found at
>>>> http://algogroup.unimore.it/people/paolo/disk_sched/sources.php, version
>>>> 4.7.0-v8r3. The device is an LSI SSD with queue depth 32. I use the default
>>>> settings. The fio script is:
>>>> 
>>>> [global]
>>>> ioengine=libaio
>>>> direct=1
>>>> readwrite=randread
>>>> bs=4k
>>>> runtime=60
>>>> time_based=1
>>>> file_service_type=random:36
>>>> overwrite=1
>>>> thread=0
>>>> group_reporting=1
>>>> filename=/dev/sdb
>>>> iodepth=1
>>>> numjobs=8
>>>> 
>>>> [groupA]
>>>> prio=2
>>>> 
>>>> [groupB]
>>>> new_group
>>>> prio=6
>>>> 
>>>> I'll change iodepth, numjobs and prio across the tests. The result unit is MB/s.
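
(A reproduction note, added for clarity: with the out-of-tree BFQ build
above, the scheduler under test is selected per device before each run.
The job-file name below is an assumption.)

  # Pick the scheduler to test, then run the job file above.
  echo bfq > /sys/block/sdb/queue/scheduler    # or: cfq, deadline
  fio test.fio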
>>>> 
>>>> iodepth=1 numjobs=1 prio 4:4
>>>> CFQ: 28:28 BFQ: 21:21 deadline: 29:29
>>>> 
>>>> iodepth=8 numjobs=1 prio 4:4
>>>> CFQ: 162:162 BFQ: 102:98 deadline: 205:205
>>>> 
>>>> iodepth=1 numjobs=8 prio 4:4
>>>> CFQ: 157:157 BFQ: 81:92 deadline: 196:197
>>>> 
>>>> iodepth=1 numjobs=1 prio 2:6
>>>> CFQ: 26.7:27.6 BFQ: 20:6 deadline: 29:29
>>>> 
>>>> iodepth=8 numjobs=1 prio 2:6
>>>> CFQ: 166:174 BFQ: 139:72  deadline: 202:202
>>>> 
>>>> iodepth=1 numjobs=8 prio 2:6
>>>> CFQ: 148:150 BFQ: 90:77 deadline: 198:197
>>> 
>>> More tests:
>>> 
>>> iodepth=8 numjobs=1 prio 2:6, group A has a 50 MB/s limit
>>> CFQ: 51:207  BFQ: 51:45  deadline: 51:216
>>> 
>>> iodepth=1 numjobs=1 prio 2:6, group A bs=4k, group B bs=64k
>>> CFQ: 25:249  BFQ: 23:42  deadline: 26:251
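
(For reference, a sketch of how a 50 MB/s read cap like the one on group A
is typically set through the cgroup v1 blk-throttle interface. The 8:16
major:minor pair for /dev/sdb is an assumption; adjust to the actual device.)

  mkdir -p /sys/fs/cgroup/blkio/groupA
  # Format is "MAJ:MIN bytes_per_second"; 52428800 bytes/s = 50 MiB/s.
  echo "8:16 52428800" > /sys/fs/cgroup/blkio/groupA/blkio.throttle.read_bps_device
  echo $$ > /sys/fs/cgroup/blkio/groupA/cgroup.procs   # move the test shell in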
>>> 
>> 
>> A true proportional-share scheduler like BFQ works under the
>> assumption that it is the only limiter of the bandwidth of its clients.
>> And the availability of such a scheduler should apparently make
>> bandwidth limiting useless: once you have a mechanism that lets you
>> give each group the desired fraction of the bandwidth, and that
>> redistributes excess bandwidth seamlessly when needed, what do you need
>> additional limiting for?
>> 
>> But I'm no expert on every possible system configuration or
>> requirement.  So, if you have practical examples, I would really
>> appreciate them.  And I don't think it will be difficult to see what
>> goes wrong in BFQ with external bandwidth limiting, and to fix the
>> problem.
> 
> I think the test emulates a very common configuration. We assign more I/O
> resources to the high-priority workload, but such a workload doesn't always
> dispatch enough I/O. That's why I set a rate limit. When this happens, we want
> the low-priority workload to use the disk bandwidth. That's the whole point of
> disk sharing.
> 

But that's exactly the configuration for which a proportional-share
scheduler is designed: it systematically and seamlessly redistributes
excess bandwidth, with no configuration needed.  Or is there something
else in the scenario you have in mind?
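
(A concrete illustration with made-up numbers: on a 200 MB/s device with
weights 8:1, group A is entitled to about 178 MB/s. If A dispatches only
50 MB/s worth of I/O, a work-conserving proportional-share scheduler hands
the unused ~150 MB/s to B with no extra configuration, whereas a fixed
50 MB/s cap on A has to be chosen by hand and wastes bandwidth whenever
A could have used more.)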

Thanks,
Paolo

> Thanks,
> Shaohua


--
Paolo Valente
Algogroup
Dipartimento di Scienze Fisiche, Informatiche e Matematiche
Via Campi 213/B
41125 Modena - Italy
http://algogroup.unimore.it/people/paolo/





