LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Paolo Valente <paolo.valente@linaro.org>
To: Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>
Cc: Fabio Checconi <fchecconi@gmail.com>,
	Arianna Avanzini <avanzini.arianna@gmail.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	ulf.hansson@linaro.org, linus.walleij@linaro.org,
	broonie@kernel.org, Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH RFC 12/22] block, bfq: modify the peak-rate estimator
Date: Mon,  1 Feb 2016 23:12:48 +0100
Message-ID: <1454364778-25179-13-git-send-email-paolo.valente@linaro.org> (raw)
In-Reply-To: <1454364778-25179-1-git-send-email-paolo.valente@linaro.org>

Unless the maximum budget B_max that BFQ can assign to a queue is set
explicitly by the user, BFQ automatically updates B_max. In
particular, BFQ dynamically sets B_max to the number of sectors that
can be read, at the current estimated peak rate, during the maximum
time, T_max, allowed before a budget timeout occurs. In formulas, if
we denote as R_est the estimated peak rate, then B_max = T_max ∗
R_est. Hence, the higher R_est is with respect to the actual device
peak rate, the higher the probability that processes incur budget
timeouts unjustly is. Besides, a too high value of B_max unnecessarily
increases the deviation from an ideal, smooth service.

To filter out spikes, the estimated peak rate is updated only on the
expiration of queues that have been served for a long-enough time.  As
a first step, the estimator computes the device rate, R_meas, during
the service of the queue. After that, if R_est < R_meas, then R_est is
set to R_meas.

Unfortunately, our experiments highlighted the following two
problems. First, because of ZBR, depending on the locality of the
workload, the estimator may easily converge to a value that is
appropriate only for part of a disk. Second, R_est may jump (and
remain forever equal) to a much higher value than the actual device
peak rate, in case of hits in the drive cache, which may let sectors
be transferred in practice at bus rate.

To try to converge to the actual average peak rate over the disk
surface (in case of rotational devices), and to smooth the spikes
caused by the drive cache, this patch changes the estimator as
follows. In the description of the changes, we refer to a queue
containing random requests as 'seeky', according to the terminology
used in the code, and inherited from CFQ.

- First, now R_est may be updated also in case the just-expired queue,
  despite not being detected as seeky, has not been however able to
  consume all of its budget within the maximum time slice T_max. In
  fact, this is an indication that B_max is too large. Since B_max =
  T_max ∗ R_est, R_est is then probably too large, and should be
  reduced.

- Second, to filter the spikes in R_meas, a discrete low-pass filter
  is now used to update R_est instead of just keeping the highest rate
  sampled. The rationale is that the average peak rate of a disk over
  its surface is a relatively stable quantity, hence a low-pass filter
  should converge more or less quickly to the right value.

With the current values of the constants used in the filter, the
latter seems to effectively smooth fluctuations and allow the
estimator to converge to the actual peak rate with all the devices we
tested.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
---
 block/cfq-iosched.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index de914a3..9792cd7 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3038,7 +3038,7 @@ static unsigned long bfq_calc_max_budget(u64 peak_rate, u64 timeout)
  * throughput. See the code for more details.
  */
 static bool bfq_update_peak_rate(struct bfq_data *bfqd, struct bfq_queue *bfqq,
-				 bool compensate)
+				 bool compensate, enum bfqq_expiration reason)
 {
 	u64 bw, usecs, expected, timeout;
 	ktime_t delta;
@@ -3074,10 +3074,23 @@ static bool bfq_update_peak_rate(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	 * the peak rate estimation.
 	 */
 	if (usecs > 20000) {
-		if (bw > bfqd->peak_rate) {
-			bfqd->peak_rate = bw;
+		if (bw > bfqd->peak_rate ||
+		   (!BFQQ_SEEKY(bfqq) &&
+		    reason == BFQ_BFQQ_BUDGET_TIMEOUT)) {
+			bfq_log(bfqd, "measured bw =%llu", bw);
+			/*
+			 * To smooth oscillations use a low-pass filter with
+			 * alpha=7/8, i.e.,
+			 * new_rate = (7/8) * old_rate + (1/8) * bw
+			 */
+			do_div(bw, 8);
+			if (bw == 0)
+				return 0;
+			bfqd->peak_rate *= 7;
+			do_div(bfqd->peak_rate, 8);
+			bfqd->peak_rate += bw;
 			update = 1;
-			bfq_log(bfqd, "new peak_rate=%llu", bw);
+			bfq_log(bfqd, "new peak_rate=%llu", bfqd->peak_rate);
 		}
 
 		update |= bfqd->peak_rate_samples == BFQ_PEAK_RATE_SAMPLES - 1;
@@ -3157,7 +3170,7 @@ static void bfq_bfqq_expire(struct bfq_data *bfqd,
 	 * Update disk peak rate for autotuning and check whether the
 	 * process is slow (see bfq_update_peak_rate).
 	 */
-	slow = bfq_update_peak_rate(bfqd, bfqq, compensate);
+	slow = bfq_update_peak_rate(bfqd, bfqq, compensate, reason);
 
 	/*
 	 * As above explained, 'punish' slow (i.e., seeky), timed-out
-- 
1.9.1

  parent reply index

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-01 22:12 [PATCH RFC 00/22] Replace the CFQ I/O Scheduler with BFQ Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 01/22] block, cfq: remove queue merging for close cooperators Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 02/22] block, cfq: remove close-based preemption Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 03/22] block, cfq: remove deep seek queues logic Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 04/22] block, cfq: remove SSD-related logic Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 05/22] block, cfq: get rid of hierarchical support Paolo Valente
2016-02-10 23:04   ` Tejun Heo
2016-02-01 22:12 ` [PATCH RFC 06/22] block, cfq: get rid of queue preemption Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 07/22] block, cfq: get rid of workload type Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 08/22] block, cfq: get rid of latency tunables Paolo Valente
2016-02-10 23:05   ` Tejun Heo
2016-02-01 22:12 ` [PATCH RFC 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler Paolo Valente
2016-02-11 22:22   ` Tejun Heo
2016-02-12  0:35     ` Mark Brown
2016-02-17 15:57       ` Tejun Heo
2016-02-17 16:02         ` Mark Brown
2016-02-17 17:04           ` Tejun Heo
2016-02-17 18:13             ` Jonathan Corbet
2016-02-17 19:45               ` Tejun Heo
2016-02-17 19:56                 ` Jonathan Corbet
2016-02-17 20:14                   ` Tejun Heo
2016-02-17  9:02     ` Paolo Valente
2016-02-17 17:02       ` Tejun Heo
2016-02-20 10:23         ` Paolo Valente
2016-02-20 11:02           ` Paolo Valente
2016-03-01 18:46           ` Tejun Heo
2016-03-04 17:29             ` Linus Walleij
2016-03-04 17:39               ` Christoph Hellwig
2016-03-04 18:10                 ` Austin S. Hemmelgarn
2016-03-11 11:16                   ` Christoph Hellwig
2016-03-11 13:38                     ` Austin S. Hemmelgarn
2016-03-05 12:18                 ` Linus Walleij
2016-03-11 11:17                   ` Christoph Hellwig
2016-03-11 11:24                     ` Nikolay Borisov
2016-03-11 11:49                       ` Christoph Hellwig
2016-03-11 14:53                     ` Linus Walleij
2016-03-09  6:55                 ` Paolo Valente
2016-04-13 19:54                 ` Tejun Heo
2016-04-14  5:03                   ` Mark Brown
2016-03-09  6:34             ` Paolo Valente
2016-04-13 20:41               ` Tejun Heo
2016-04-14 10:23                 ` Paolo Valente
2016-04-14 16:29                   ` Tejun Heo
2016-04-15 14:20                     ` Paolo Valente
2016-04-15 15:08                       ` Tejun Heo
2016-04-15 16:17                         ` Paolo Valente
2016-04-15 19:29                           ` Tejun Heo
2016-04-15 22:08                             ` Paolo Valente
2016-04-15 22:45                               ` Tejun Heo
2016-04-16  6:03                                 ` Paolo Valente
2016-04-15 14:49                     ` Linus Walleij
2016-02-01 22:12 ` [PATCH RFC 10/22] block, bfq: add full hierarchical scheduling and cgroups support Paolo Valente
2016-02-11 22:28   ` Tejun Heo
2016-02-17  9:07     ` Paolo Valente
2016-02-17 17:14       ` Tejun Heo
2016-02-17 17:45         ` Tejun Heo
2016-04-20  9:32     ` Paolo
2016-04-22 18:13       ` Tejun Heo
2016-04-22 18:19         ` Paolo Valente
2016-04-22 18:41           ` Tejun Heo
2016-04-22 19:05             ` Paolo Valente
2016-04-22 19:32               ` Tejun Heo
2016-04-23  7:07                 ` Paolo Valente
2016-04-25 19:24                   ` Tejun Heo
2016-04-25 20:30                     ` Paolo
2016-05-06 20:20                       ` Paolo Valente
2016-05-12 13:11                         ` Paolo
2016-07-27 16:13                         ` [PATCH RFC V8 00/22] Replace the CFQ I/O Scheduler with BFQ Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 01/22] block, cfq: remove queue merging for close cooperators Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 02/22] block, cfq: remove close-based preemption Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 03/22] block, cfq: remove deep seek queues logic Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 04/22] block, cfq: remove SSD-related logic Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 05/22] block, cfq: get rid of hierarchical support Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 06/22] block, cfq: get rid of queue preemption Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 07/22] block, cfq: get rid of workload type Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 08/22] block, cfq: get rid of latency tunables Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 10/22] block, bfq: add full hierarchical scheduling and cgroups support Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 11/22] block, bfq: improve throughput boosting Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 12/22] block, bfq: modify the peak-rate estimator Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 13/22] block, bfq: add more fairness with writes and slow processes Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 14/22] block, bfq: improve responsiveness Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 15/22] block, bfq: reduce I/O latency for soft real-time applications Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 16/22] block, bfq: preserve a low latency also with NCQ-capable drives Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 17/22] block, bfq: reduce latency during request-pool saturation Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 18/22] block, bfq: add Early Queue Merge (EQM) Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 19/22] block, bfq: reduce idling only in symmetric scenarios Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 20/22] block, bfq: boost the throughput on NCQ-capable flash-based devices Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 21/22] block, bfq: boost the throughput with random I/O on NCQ-capable HDDs Paolo Valente
2016-07-27 16:13                           ` [PATCH RFC V8 22/22] block, bfq: handle bursts of queue activations Paolo Valente
2016-07-28 16:50                           ` [PATCH RFC V8 00/22] Replace the CFQ I/O Scheduler with BFQ Paolo
2016-02-01 22:12 ` [PATCH RFC 11/22] block, bfq: improve throughput boosting Paolo Valente
2016-02-01 22:12 ` Paolo Valente [this message]
2016-02-01 22:12 ` [PATCH RFC 13/22] block, bfq: add more fairness to boost throughput and reduce latency Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 14/22] block, bfq: improve responsiveness Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 15/22] block, bfq: reduce I/O latency for soft real-time applications Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 16/22] block, bfq: preserve a low latency also with NCQ-capable drives Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 17/22] block, bfq: reduce latency during request-pool saturation Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 18/22] block, bfq: add Early Queue Merge (EQM) Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 19/22] block, bfq: reduce idling only in symmetric scenarios Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 20/22] block, bfq: boost the throughput on NCQ-capable flash-based devices Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 21/22] block, bfq: boost the throughput with random I/O on NCQ-capable HDDs Paolo Valente
2016-02-01 22:12 ` [PATCH RFC 22/22] block, bfq: handle bursts of queue activations Paolo Valente

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1454364778-25179-13-git-send-email-paolo.valente@linaro.org \
    --to=paolo.valente@linaro.org \
    --cc=avanzini.arianna@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=broonie@kernel.org \
    --cc=fchecconi@gmail.com \
    --cc=linus.walleij@linaro.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git