All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Valente <paolo.valente@linaro.org>
To: Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>
Cc: Fabio Checconi <fchecconi@gmail.com>,
	Arianna Avanzini <avanzini.arianna@gmail.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	ulf.hansson@linaro.org, linus.walleij@linaro.org,
	broonie@kernel.org, Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH RFC 20/22] block, bfq: boost the throughput on NCQ-capable flash-based devices
Date: Mon,  1 Feb 2016 23:12:56 +0100	[thread overview]
Message-ID: <1454364778-25179-21-git-send-email-paolo.valente@linaro.org> (raw)
In-Reply-To: <1454364778-25179-1-git-send-email-paolo.valente@linaro.org>

This patch boosts the throughput on NCQ-capable flash-based devices,
while still preserving latency guarantees for interactive and soft
real-time applications. The throughput is boosted by just not idling
the device when the in-service queue remains empty, even if the queue
is sync and has a non-null idle window. This helps to keep the drive's
internal queue full, which is necessary to achieve maximum
performance. This solution to boost the throughput is a port of
commits a68bbdd and f7d7b7a for CFQ.

As already highlighted in a previous patch, allowing the device to
prefetch and internally reorder requests trivially causes loss of
control on the request service order, and hence on service guarantees.
Fortunately, as discussed in detail in the comments on the function
bfq_bfqq_must_not_expire(), if every process has to receive the same
fraction of the throughput, then the service order enforced by the
internal scheduler of a flash-based device is relatively close to that
enforced by BFQ. In particular, it is close enough to let service
guarantees be substantially preserved.

Things change in an asymmetric scenario, i.e., if not every process
has to receive the same fraction of the throughput. In this case, to
guarantee the desired throughput distribution, the device must be
prevented from prefetching requests. This is exactly what this patch
does in asymmetric scenarios.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
---
 block/cfq-iosched.c | 86 ++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 30 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index cddc8c6..b8ee88c 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4411,15 +4411,25 @@ static bool bfq_bfqq_may_idle(struct bfq_queue *bfqq)
 	 * The value of the variable is computed considering that
 	 * idling is usually beneficial for the throughput if:
 	 * (a) the device is not NCQ-capable, or
-	 * (b) regardless of the presence of NCQ, the request pattern
-	 *     for bfqq is I/O-bound (possible throughput losses
-	 *     caused by granting idling to seeky queues are mitigated
-	 *     by the fact that, in all scenarios where boosting
-	 *     throughput is the best thing to do, i.e., in all
-	 *     symmetric scenarios, only a minimal idle time is
-	 *     allowed to seeky queues).
+	 * (b) regardless of the presence of NCQ, the device is rotational
+	 *     and the request pattern for bfqq is I/O-bound (possible
+	 *     throughput losses caused by granting idling to seeky queues
+	 *     are mitigated by the fact that, in all scenarios where
+	 *     boosting throughput is the best thing to do, i.e., in all
+	 *     symmetric scenarios, only a minimal idle time is allowed to
+	 *     seeky queues).
+	 *
+	 * Secondly, and in contrast to the above item (b), idling an
+	 * NCQ-capable flash-based device would not boost the
+	 * throughput even with intense I/O; rather it would lower
+	 * the throughput in proportion to how fast the device
+	 * is. Accordingly, the next variable is true if any of the
+	 * above conditions (a) and (b) is true, and, in particular,
+	 * happens to be false if bfqd is an NCQ-capable flash-based
+	 * device.
 	 */
-	idling_boosts_thr = !bfqd->hw_tag || bfq_bfqq_IO_bound(bfqq);
+	idling_boosts_thr = !bfqd->hw_tag ||
+		(!blk_queue_nonrot(bfqd->queue) && bfq_bfqq_IO_bound(bfqq));
 
 	/*
 	 * The value of the next variable,
@@ -4491,38 +4501,54 @@ static bool bfq_bfqq_may_idle(struct bfq_queue *bfqq)
 	 * receives its assigned fraction of the device throughput
 	 * (see [1] for details).
 	 *
-	 * As for sub-condition (i), actually we check only whether
-	 * bfqq is being weight-raised. In fact, if bfqq is not being
-	 * weight-raised, we have that:
-	 * - if the process associated to bfqq is not I/O-bound, then
-	 *   it is not either latency- or throughput-critical; therefore
-	 *   idling is not needed for bfqq;
-	 * - if the process asociated to bfqq is I/O-bound, then
-	 *   idling is already granted to bfqq (see the comments on
-	 *   idling_boosts_thr).
+	 * We address this issue by controlling, actually, only the
+	 * symmetry sub-condition (i), i.e., provided that
+	 * sub-condition (i) holds, idling is not performed,
+	 * regardless of whether sub-condition (ii) holds. In other
+	 * words, only if sub-condition (i) does not hold, then idling
+	 * is allowed, and the device tends to be prevented from
+	 * queueing many requests, possibly of several processes. The
+	 * reason for not controlling also sub-condition (ii) is that,
+	 * first, in the case of an HDD, idling is always performed
+	 * for I/O-bound processes (see the comments on
+	 * idling_boosts_thr above). Secondly, in the case of a
+	 * flash-based device, we prefer however to privilege
+	 * throughput (and idling lowers throughput for this type of
+	 * devices), for the following reasons:
+	 * 1) differently from HDDs, the service time of random
+	 *    requests is not orders of magnitudes lower than the service
+	 *    time of sequential requests; thus, even if processes doing
+	 *    sequential I/O get a preferential treatment with respect to
+	 *    others doing random I/O, the consequences are not as
+	 *    dramatic as with HDDs;
+	 * 2) if a process doing random I/O does need strong
+	 *    throughput guarantees, it is hopefully already being
+	 *    weight-raised, or the user is likely to have assigned it a
+	 *    higher weight than the other processes (and thus
+	 *    sub-condition (i) is likely to be false, which triggers
+	 *    idling).
 	 *
-	 * We do not check sub-condition (ii) at all, i.e., the next
-	 * variable is true if and only if bfqq is being
-	 * weight-raised. We do not need to control sub-condition (ii)
-	 * for the following reason:
-	 * - if bfqq is being weight-raised, then idling is already
-	 *   guaranteed to bfqq by sub-condition (i);
-	 * - if bfqq is not being weight-raised, then idling is
-	 *   already guaranteed to bfqq (only) if it matters, i.e., if
-	 *   bfqq is associated to a currently I/O-bound process (see
-	 *   the above comment on sub-condition (i)).
+	 * According to the above considerations, the next variable is
+	 * true (only) if sub-condition (i) holds. To compute the
+	 * value of this variable, we not only use the return value of
+	 * the function bfq_symmetric_scenario(), but also check
+	 * whether bfqq is being weight-raised, because
+	 * bfq_symmetric_scenario() does not take into account also
+	 * weight-raised queues (see comments to
+	 * bfq_weights_tree_add()).
 	 *
 	 * As a side note, it is worth considering that the above
 	 * device-idling countermeasures may however fail in the
 	 * following unlucky scenario: if idling is (correctly)
-	 * disabled in a time period during which the symmetry
-	 * sub-condition holds, and hence the device is allowed to
+	 * disabled in a time period during which all symmetry
+	 * sub-conditions hold, and hence the device is allowed to
 	 * enqueue many requests, but at some later point in time some
 	 * sub-condition stops to hold, then it may become impossible
 	 * to let requests be served in the desired order until all
 	 * the requests already queued in the device have been served.
 	 */
-	asymmetric_scenario = bfqq->wr_coeff > 1;
+	asymmetric_scenario = bfqq->wr_coeff > 1 ||
+		!bfq_symmetric_scenario(bfqd);
 
 	/*
 	 * We have now all the components we need to compute the return
-- 
1.9.1

  parent reply	other threads:[~2016-02-01 22:55 UTC|newest]

Thread overview: 109+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-01 22:12 [PATCH RFC 00/22] Replace the CFQ I/O Scheduler with BFQ Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 01/22] block, cfq: remove queue merging for close cooperators Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 02/22] block, cfq: remove close-based preemption Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 03/22] block, cfq: remove deep seek queues logic Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 04/22] block, cfq: remove SSD-related logic Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 05/22] block, cfq: get rid of hierarchical support Paolo Valente
2016-02-10 23:04   ` Tejun Heo
2016-02-01 22:12 ` [PATCH RFC 06/22] block, cfq: get rid of queue preemption Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 07/22] block, cfq: get rid of workload type Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 08/22] block, cfq: get rid of latency tunables Paolo Valente
2016-02-10 23:05   ` Tejun Heo
2016-02-01 22:12 ` [PATCH RFC 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler Paolo Valente
2016-02-11 22:22   ` Tejun Heo
2016-02-12  0:35     ` Mark Brown
2016-02-17 15:57       ` Tejun Heo
2016-02-17 16:02         ` Mark Brown
2016-02-17 17:04           ` Tejun Heo
2016-02-17 18:13             ` Jonathan Corbet
2016-02-17 19:45               ` Tejun Heo
2016-02-17 19:56                 ` Jonathan Corbet
2016-02-17 20:14                   ` Tejun Heo
2016-02-17  9:02     ` Paolo Valente
2016-02-17 17:02       ` Tejun Heo
2016-02-20 10:23         ` Paolo Valente
2016-02-20 11:02           ` Paolo Valente
2016-03-01 18:46           ` Tejun Heo
2016-03-04 17:29             ` Linus Walleij
2016-03-04 17:39               ` Christoph Hellwig
2016-03-04 18:10                 ` Austin S. Hemmelgarn
2016-03-11 11:16                   ` Christoph Hellwig
2016-03-11 13:38                     ` Austin S. Hemmelgarn
2016-03-05 12:18                 ` Linus Walleij
2016-03-11 11:17                   ` Christoph Hellwig
2016-03-11 11:24                     ` Nikolay Borisov
2016-03-11 11:49                       ` Christoph Hellwig
2016-03-11 14:53                     ` Linus Walleij
2016-03-09  6:55                 ` Paolo Valente
2016-04-13 19:54                 ` Tejun Heo
2016-04-14  5:03                   ` Mark Brown
2016-03-09  6:34             ` Paolo Valente
2016-04-13 20:41               ` Tejun Heo
2016-04-14 10:23                 ` Paolo Valente
2016-04-14 16:29                   ` Tejun Heo
2016-04-15 14:20                     ` Paolo Valente
2016-04-15 15:08                       ` Tejun Heo
2016-04-15 16:17                         ` Paolo Valente
2016-04-15 19:29                           ` Tejun Heo
2016-04-15 22:08                             ` Paolo Valente
2016-04-15 22:45                               ` Tejun Heo
2016-04-16  6:03                                 ` Paolo Valente
2016-04-15 14:49                     ` Linus Walleij
2016-02-01 22:12 ` [PATCH RFC 10/22] block, bfq: add full hierarchical scheduling and cgroups support Paolo Valente
2016-02-11 22:28   ` Tejun Heo
2016-02-17  9:07     ` Paolo Valente
2016-02-17  9:07       ` Paolo Valente
2016-02-17 17:14       ` Tejun Heo
2016-02-17 17:45         ` Tejun Heo
2016-04-20  9:21     ` Paolo
2016-04-20  9:32     ` Paolo
2016-04-22 18:13       ` Tejun Heo
2016-04-22 18:19         ` Paolo Valente
2016-04-22 18:19           ` Paolo Valente
2016-04-22 18:41           ` Tejun Heo
2016-04-22 19:05             ` Paolo Valente
2016-04-22 19:05               ` Paolo Valente
2016-04-22 19:32               ` Tejun Heo
2016-04-23  7:07                 ` Paolo Valente
2016-04-23  7:07                   ` Paolo Valente
2016-04-25 19:24                   ` Tejun Heo
2016-04-25 20:30                     ` Paolo
2016-05-06 20:20                       ` Paolo Valente
2016-05-06 20:20                         ` Paolo Valente
2016-05-12 13:11                         ` Paolo
2016-07-27 16:13                         ` [PATCH RFC V8 00/22] Replace the CFQ I/O Scheduler with BFQ Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 01/22] block, cfq: remove queue merging for close cooperators Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 02/22] block, cfq: remove close-based preemption Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 03/22] block, cfq: remove deep seek queues logic Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 04/22] block, cfq: remove SSD-related logic Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 05/22] block, cfq: get rid of hierarchical support Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 06/22] block, cfq: get rid of queue preemption Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 07/22] block, cfq: get rid of workload type Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 08/22] block, cfq: get rid of latency tunables Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 10/22] block, bfq: add full hierarchical scheduling and cgroups support Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 11/22] block, bfq: improve throughput boosting Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 12/22] block, bfq: modify the peak-rate estimator Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 13/22] block, bfq: add more fairness with writes and slow processes Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 14/22] block, bfq: improve responsiveness Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 15/22] block, bfq: reduce I/O latency for soft real-time applications Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 16/22] block, bfq: preserve a low latency also with NCQ-capable drives Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 17/22] block, bfq: reduce latency during request-pool saturation Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 18/22] block, bfq: add Early Queue Merge (EQM) Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 19/22] block, bfq: reduce idling only in symmetric scenarios Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 20/22] block, bfq: boost the throughput on NCQ-capable flash-based devices Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 21/22] block, bfq: boost the throughput with random I/O on NCQ-capable HDDs Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 22/22] block, bfq: handle bursts of queue activations Paolo Valente
2016-07-28 16:50                           ` [PATCH RFC V8 00/22] Replace the CFQ I/O Scheduler with BFQ Paolo
2016-02-01 22:12 ` [PATCH RFC 11/22] block, bfq: improve throughput boosting Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 12/22] block, bfq: modify the peak-rate estimator Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 13/22] block, bfq: add more fairness to boost throughput and reduce latency Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 14/22] block, bfq: improve responsiveness Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 15/22] block, bfq: reduce I/O latency for soft real-time applications Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 16/22] block, bfq: preserve a low latency also with NCQ-capable drives Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 17/22] block, bfq: reduce latency during request-pool saturation Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 18/22] block, bfq: add Early Queue Merge (EQM) Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 19/22] block, bfq: reduce idling only in symmetric scenarios Paolo Valente
2016-02-01 22:12 ` Paolo Valente [this message]
2016-02-01 22:12 ` [PATCH RFC 21/22] block, bfq: boost the throughput with random I/O on NCQ-capable HDDs Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 22/22] block, bfq: handle bursts of queue activations Paolo Valente

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1454364778-25179-21-git-send-email-paolo.valente@linaro.org \
    --to=paolo.valente@linaro.org \
    --cc=avanzini.arianna@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=broonie@kernel.org \
    --cc=fchecconi@gmail.com \
    --cc=linus.walleij@linaro.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.