linux-kernel.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	Fabio Checconi <fchecconi@gmail.com>,
	Arianna Avanzini <avanzini.arianna@gmail.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	broonie@kernel.org
Subject: Re: [PATCH RFC 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler
Date: Wed, 13 Apr 2016 16:41:10 -0400
Message-ID: <20160413204110.GF20142@htj.duckdns.org>
In-Reply-To: <E0694788-2787-4D99-88FC-8EAC5E335CBE@linaro.org>

Hello,

Sorry about the long delay.

On Wed, Mar 09, 2016 at 07:34:15AM +0100, Paolo Valente wrote:
> This is probably the focal point of our discussion. Unfortunately, I
> am still not convinced of your claim. In fact, basing budgets on
> sectors (service), instead of time, still seems to me the only way to
> provide the stronger bandwidth and low-latency guarantees that I have
> tried to highlight in my previous email. And these guarantees do not
> seem to concern only a single special case, but several, common use
> cases for server and desktop systems. I will try to repeat these facts
> more concisely, and hopefully more clearly, in my replies to next
> points.

I'm still trying to get my head wrapped around why basing the
scheduling on bandwidth would have those benefits because the
connection isn't intuitive to me at all.  If you're saying that most
workloads care about bandwidth a lot more and the specifics of their
IO patterns are mostly accidental and should be discounted for
scheduling decisions, I can understand how that could be.  Is that
what you're saying?

> > I see.  Once a queue starts timing out its slice, it gets switched to
> > time based scheduling; however, as you mentioned, workloads which
> > generate moderate random IOs would still get preferential treatment
> > over workloads which generate sequential IOs, by definition.
> 
> Exactly. However, there is still something I don’t fully understand in
> your doubt. With BFQ, workloads that generate moderate random IOs
> would actually do less damage to throughput, on average, than with
> CFQ. In fact, with CFQ the queues containing these IOs would
> systematically get a full time slice, while there are two
> possibilities with BFQ:
> 1) If the degree of randomness of (the IOs in) these queues is not too
> high, then these queues are likely to finish budgets before
> timeouts. In this case, with BFQ these queues get less service than
> with CFQ, and thus can waste throughput less.
> 2) If the degree of randomness of these queues is very high, then they
> consume full time slices with BFQ, exactly as with CFQ.
> 
> Of course, performance may differ if time slices, i.e., timeouts,
> differ between BFQ and CFQ, but this is easy to tune, if needed.

Hmm.. the above doesn't really make sense to me.  You're saying that
bandwidth based control only cuts down the slice a random workload
would use and thus wouldn't benefit it; however, that cut-down of
slice is based on bandwidth consumption, so it would kick in a lot
more for sequential workloads.  It wouldn't make a random workload's
slice longer than the timeout but it would make sequential ones'
slices shorter.  What am I missing?
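
The point above can be put in numbers with a minimal sketch.  This is
an illustration only, not BFQ's actual code: it assumes a queue holds
the device until it has either been served a sector budget or hit a
timeout.  The function name, the budget, the timeout and the two
throughput figures are all hypothetical.

```python
def slice_length_ms(rate_sectors_per_ms, budget_sectors, timeout_ms):
    """Time a queue holds the device under a budget-plus-timeout rule.

    The slice ends when the sector budget is served or when the
    timeout fires, whichever comes first (an assumed, simplified rule).
    """
    time_to_finish_budget = budget_sectors / rate_sectors_per_ms
    return min(time_to_finish_budget, timeout_ms)


# Hypothetical parameters, chosen only to make the arithmetic visible.
BUDGET_SECTORS = 16384
TIMEOUT_MS = 125

# A sequential queue is served fast and exhausts its budget early.
sequential_ms = slice_length_ms(200.0, BUDGET_SECTORS, TIMEOUT_MS)

# A random queue is served slowly and runs into the timeout.
random_ms = slice_length_ms(2.0, BUDGET_SECTORS, TIMEOUT_MS)

print(f"sequential slice: {sequential_ms:.2f} ms")
print(f"random slice:     {random_ms:.2f} ms")
```

Under these assumed numbers the budget never shortens the random
queue's slice below the timeout, while it does shorten the sequential
queue's slice.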

> > Workloads are varied and underlying device performs wildly differently
> > depending on the specific IO pattern.  Because rotating disks suck so
> > badly, it's true that there's a lot of wiggle room in what the IO
> > scheduler can do.  People are accustomed to dealing with random
> > behaviors.  That said, it still doesn't feel comfortable to use the
> > obviously wrong unit as the fundamental basis of resource
> > distribution.
> 
> Actually this does not seem to match our (admittedly limited)
> experience with: low-to-high-end rotational devices, RAIDS, SSDs, SD
> cards and eMMCs. When stimulated with the same patterns in our tests,
> these devices always responded with about the same IO service
> times. And this seems to comply with intuition, because, apart from
> different initial cache states, the same stimuli cause about the same
> arm movements, cache hits/misses, and circuitry operations.

Oh, that's not what I meant.  If you feed the same sequence of IOs,
they would behave similarly.  What I meant was that the composition of
IOs themselves would change significantly depending on how different
types of workloads get scheduled.

> > So, yes, I see that bandwidth based control would yield a better
> > result for this specific use case but at the same time this is a very
> > specialized use case and probably the *only* use case where bandwidth
> > based distribution makes sense - equivalent logically sequential
> > workloads where the specifics of IO pattern are completely accidental.
> > We can't really design the whole thing around that single use case.
> 
> Actually, the tight bandwidth guarantees that I have tried to
> highlight are the ground on which the other low-latency guarantees are
> built. So, to sum up the set of guarantees that Linus discussed in
> more detail in his email, BFQ mainly guarantees, even in the presence
> of throughput fluctuations, and thanks also to sector-based
> scheduling:
> 1) Stable(r) and tight bandwidth distribution for mostly-sequential
> reads/writes

So, yeah, the above makes total sense.

> 2) Stable(r) and high responsiveness
> 3) Stable(r) and low latency for soft real-time applications
> 4) Faster execution of dev tasks, such as compile and git operations
> (checkout, merge, …), in the presence of background workloads, and
> while guaranteeing a high responsiveness too

But can you please enlighten me on why 2-4 are inherently tied to
bandwidth-based scheduling?

> > Hmmm... it could be that I'm mistaken on how trigger happy the switch
> > to time based scheduling is.  Maybe it's sensitive enough that
> > bandwidth based scheduling is only applied to workloads which are
> > mostly sequential.  I'm sorry if I'm being too dense on this point but
> > can you please give me some examples on what would happen when
> > sequential workloads and random ones mix?
> > 
> 
> In the simplest case,
> . sequential workloads would get sector-based service guarantees, with
> the resulting stability and low-latency properties that I have tried
> to highlight;
> . random workloads would get time-based service, and thus similar
> service guarantees as with CFQ (actually guarantees would still be
> tighter, because of the more accurate scheduling policy of BFQ).

But don't the above inherently mean that sequential workloads would
get less in terms of IO time?
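
A rough sketch of the IO-time question, under stated assumptions: two
queues simply take turns, and each turn lasts one slice.  The slice
lengths are hypothetical; 125 ms stands for a timeout-bounded slice
and 80 ms for a sequential slice that ends early because its sector
budget is used up.  None of this is taken from the BFQ code.

```python
def device_time_shares(slice_ms):
    """Fraction of device time per queue when the queues take turns."""
    total = sum(slice_ms.values())
    return {name: length / total for name, length in slice_ms.items()}


# Pure time-based service: both queues get a full, equal slice.
time_based = device_time_shares({"sequential": 125.0, "random": 125.0})

# Budget-plus-timeout service: the sequential queue finishes its
# budget early, the random queue still runs until the timeout.
budget_based = device_time_shares({"sequential": 80.0, "random": 125.0})

for label, shares in (("time-based", time_based),
                      ("budget-based", budget_based)):
    print(label, {name: round(s, 3) for name, s in shares.items()})
```

With these made-up slices the sequential queue's share of device time
drops from one half to about 39%, even though it still moves far more
sectors per unit of time than the random queue.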

To summarize,

1. I still don't understand why bandwidth-based scheduling is better
   (sorry).  The only reason I can think of is that most workloads
   that we care about are at least quasi-sequential and can benefit
   from ignoring randomness to a certain degree.  Is that it?

2. I don't think strict fairness is all that important for IO
   scheduling in general.  Whatever gives us the best overall result
   should work, so if bandwidth based scheduling does that, great;
   however, fairness does matter across cgroups.  A cgroup configured
   to receive 50% of IO resources should get close to that no matter
   what others are doing.  Would bfq be able to do that?
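
What "50% of IO resources" means depends on the unit, which a small
sketch can show.  It assumes two equally weighted cgroups and strict
sector-based fairness with no timeout at all, so it is a worst case
and not a model of BFQ, which, as discussed earlier in this thread,
switches queues that time out to time-based service.  The cgroup names
and throughput figures are hypothetical.

```python
def time_share_under_equal_sectors(rate_sectors_per_ms):
    """Device-time share per cgroup when every cgroup gets equal sectors.

    Serving one sector costs 1/rate ms, so the time share of a cgroup
    is proportional to the inverse of its throughput.
    """
    cost_ms = {name: 1.0 / rate for name, rate in rate_sectors_per_ms.items()}
    total = sum(cost_ms.values())
    return {name: cost / total for name, cost in cost_ms.items()}


# Hypothetical throughputs: cgroup_a is sequential, cgroup_b is random.
rates = {"cgroup_a": 200.0, "cgroup_b": 2.0}
shares = time_share_under_equal_sectors(rates)

for name, share in shares.items():
    print(f"{name}: {share:.1%} of device time")
```

With a 100:1 throughput gap, equal sectors leave cgroup_a with roughly
1% of the device time, which is why the unit of fairness across
cgroups needs to be stated explicitly.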

Thanks.

-- 
tejun
