linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Paolo Valente <paolo.valente@linaro.org>,
	Angelo Ruocco <angeloruocco90@gmail.com>,
	Oleksandr Natalenko <oleksandr@natalenko.name>,
	Lee Tibbert <lee.tibbert@gmail.com>,
	Mirko Montanari <mirkomontanari91@gmail.com>,
	Jens Axboe <axboe@kernel.dk>,
	Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Subject: [PATCH 4.14 17/22] block, bfq: fix wrong init of saved start time for weight raising
Date: Thu, 16 Aug 2018 20:45:18 +0200	[thread overview]
Message-ID: <20180816171625.072831365@linuxfoundation.org> (raw)
In-Reply-To: <20180816171622.831964729@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Valente <paolo.valente@linaro.org>

commit 4baa8bb13f41307f3eb62fe91f93a1a798ebef53 upstream.

This commit fixes a bug that causes bfq to fail to guarantee a high
responsiveness on some drives, if there is heavy random read+write I/O
in the background. More precisely, such a failure allowed this bug to
be found [1], but the bug may well cause other yet unreported
anomalies.

BFQ raises the weight of the bfq_queues associated with soft real-time
applications, to privilege the I/O, and thus reduce latency, for these
applications. This mechanism is named soft-real-time weight raising in
BFQ. A soft real-time period may happen to be nested into an
interactive weight raising period, i.e., it may happen that, when a
bfq_queue switches to a soft real-time weight-raised state, the
bfq_queue is already being weight-raised because deemed interactive
too. In this case, BFQ saves in a special variable
wr_start_at_switch_to_srt, the time instant when the interactive
weight-raising period started for the bfq_queue, i.e., the time
instant when BFQ started to deem the bfq_queue interactive. This value
is then used to check whether the interactive weight-raising period
would still be in progress when the soft real-time weight-raising
period ends.  If so, interactive weight raising is restored for the
bfq_queue. This restore is useful, in particular, because it prevents
bfq_queues from losing their interactive weight raising prematurely,
as a consequence of spurious, short-lived soft real-time
weight-raising periods caused by wrong detections as soft real-time.

If, instead, a bfq_queue switches to soft-real-time weight raising
while it *is not* already in an interactive weight-raising period,
then the variable wr_start_at_switch_to_srt has no meaning during the
following soft real-time weight-raising period. Unfortunately the
handling of this case is wrong in BFQ: not only the variable is not
flagged somehow as meaningless, but it is also set to the time when
the switch to soft real-time weight-raising occurs. This may cause an
interactive weight-raising period to be considered mistakenly as still
in progress, and thus a spurious interactive weight-raising period to
start for the bfq_queue, at the end of the soft-real-time
weight-raising period. In particular the spurious interactive
weight-raising period will be considered as still in progress, if the
soft-real-time weight-raising period does not last very long. The
bfq_queue will then be wrongly privileged and, if I/O bound, will
unjustly steal bandwidth to truly interactive or soft real-time
bfq_queues, harming responsiveness and low latency.

This commit fixes this issue by just setting wr_start_at_switch_to_srt
to minus infinity (farthest past time instant according to jiffies
macros): when the soft-real-time weight-raising period ends, certainly
no interactive weight-raising period will be considered as still in
progress.

[1] Background I/O Type: Random - Background I/O mix: Reads and writes
- Application to start: LibreOffice Writer in
http://www.phoronix.com/scan.php?page=news_item&px=Linux-4.13-IO-Laptop

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Tested-by: Mirko Montanari <mirkomontanari91@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 block/bfq-iosched.c |   50 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 19 deletions(-)

--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1203,6 +1203,24 @@ static unsigned int bfq_wr_duration(stru
 	return dur;
 }
 
+/*
+ * Return the farthest future time instant according to jiffies
+ * macros.
+ */
+static unsigned long bfq_greatest_from_now(void)
+{
+	return jiffies + MAX_JIFFY_OFFSET;
+}
+
+/*
+ * Return the farthest past time instant according to jiffies
+ * macros.
+ */
+static unsigned long bfq_smallest_from_now(void)
+{
+	return jiffies - MAX_JIFFY_OFFSET;
+}
+
 static void bfq_update_bfqq_wr_on_rq_arrival(struct bfq_data *bfqd,
 					     struct bfq_queue *bfqq,
 					     unsigned int old_wr_coeff,
@@ -1217,7 +1235,19 @@ static void bfq_update_bfqq_wr_on_rq_arr
 			bfqq->wr_coeff = bfqd->bfq_wr_coeff;
 			bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
 		} else {
-			bfqq->wr_start_at_switch_to_srt = jiffies;
+			/*
+			 * No interactive weight raising in progress
+			 * here: assign minus infinity to
+			 * wr_start_at_switch_to_srt, to make sure
+			 * that, at the end of the soft-real-time
+			 * weight raising periods that is starting
+			 * now, no interactive weight-raising period
+			 * may be wrongly considered as still in
+			 * progress (and thus actually started by
+			 * mistake).
+			 */
+			bfqq->wr_start_at_switch_to_srt =
+				bfq_smallest_from_now();
 			bfqq->wr_coeff = bfqd->bfq_wr_coeff *
 				BFQ_SOFTRT_WEIGHT_FACTOR;
 			bfqq->wr_cur_max_time =
@@ -2896,24 +2926,6 @@ static unsigned long bfq_bfqq_softrt_nex
 		   jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
 }
 
-/*
- * Return the farthest future time instant according to jiffies
- * macros.
- */
-static unsigned long bfq_greatest_from_now(void)
-{
-	return jiffies + MAX_JIFFY_OFFSET;
-}
-
-/*
- * Return the farthest past time instant according to jiffies
- * macros.
- */
-static unsigned long bfq_smallest_from_now(void)
-{
-	return jiffies - MAX_JIFFY_OFFSET;
-}
-
 /**
  * bfq_bfqq_expire - expire a queue.
  * @bfqd: device owning the queue.



  parent reply	other threads:[~2018-08-16 18:46 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-16 18:45 [PATCH 4.14 00/22] 4.14.64-stable review Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 01/22] x86/l1tf: Fix build error seen if CONFIG_KVM_INTEL is disabled Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 02/22] x86: i8259: Add missing include file Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 03/22] x86/mm: Disable ioremap free page handling on x86-PAE Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 04/22] kbuild: verify that $DEPMOD is installed Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 05/22] crypto: x86/sha256-mb - fix digest copy in sha256_mb_mgr_get_comp_job_avx2() Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 06/22] crypto: vmac - require a block cipher with 128-bit block size Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 07/22] crypto: vmac - separate tfm and request context Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 08/22] crypto: blkcipher - fix crash flushing dcache in error path Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 09/22] crypto: ablkcipher " Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 10/22] crypto: skcipher - fix aligning block size in skcipher_copy_iv() Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 11/22] crypto: skcipher - fix crash flushing dcache in error path Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 12/22] ACPI / APEI: Remove ghes_ioremap_area Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 13/22] sched/debug: Fix task state recording/printout Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 15/22] ASoC: rsnd: fix ADG flags Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 16/22] clk: sunxi-ng: Fix missing CLK_SET_RATE_PARENT in ccu-sun4i-a10.c Greg Kroah-Hartman
2018-08-16 18:45 ` Greg Kroah-Hartman [this message]
2018-08-16 18:45 ` [PATCH 4.14 19/22] ASoC: Intel: cht_bsw_max98090_ti: Fix jack initialization Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 20/22] Bluetooth: hidp: buffer overflow in hidp_process_report Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 21/22] ioremap: Update pgtable free interfaces with addr Greg Kroah-Hartman
2018-08-16 18:45 ` [PATCH 4.14 22/22] x86/mm: Add TLB purge to free pmd/pte page interfaces Greg Kroah-Hartman
2018-08-17 17:17 ` [PATCH 4.14 00/22] 4.14.64-stable review Guenter Roeck
2018-08-18 14:36 ` Rafael David Tinoco

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180816171625.072831365@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=angeloruocco90@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=lee.tibbert@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mirkomontanari91@gmail.com \
    --cc=oleksandr@natalenko.name \
    --cc=paolo.valente@linaro.org \
    --cc=stable@vger.kernel.org \
    --cc=sudipm.mukherjee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).