* [PATCH BUGFIX IMPROVEMENT 0/1] block, bfq: eliminate latency regression with fast drives
@ 2019-07-15  9:31 Paolo Valente
  2019-07-15  9:31 ` [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging Paolo Valente
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Valente @ 2019-07-15  9:31 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij,
	bfq-iosched, oleksandr, Paolo Valente

Hi Jens,
I've spotted a regression on a fast SSD: a loss of I/O-latency control
for interactive tasks (such as the application start-up test I usually
run). Details are in the commit message.

I do hope that, after proper review, this commit makes it into 5.3.

Thanks,
Paolo

Paolo Valente (1):
  block, bfq: check also in-flight I/O in dispatch plugging

 block/bfq-iosched.c | 67 +++++++++++++++++++++++++++++----------------
 1 file changed, 43 insertions(+), 24 deletions(-)

--
2.20.1


* [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging
  2019-07-15  9:31 [PATCH BUGFIX IMPROVEMENT 0/1] block, bfq: eliminate latency regression with fast drives Paolo Valente
@ 2019-07-15  9:31 ` Paolo Valente
  2019-07-15 10:16   ` Holger Hoffstätte
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Valente @ 2019-07-15  9:31 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij,
	bfq-iosched, oleksandr, Paolo Valente

Consider a sync bfq_queue Q that remains empty while in service, and
suppose that, when this happens, there is a fair amount of already
in-flight I/O not belonging to Q. This situation is currently not taken
into account when deciding whether to plug I/O dispatching (until new
I/O arrives for Q). Yet it has to be checked, for the following reason.

The drive may decide to serve the in-flight requests not belonging to
Q before Q's own requests, thereby delaying the arrival of new I/O
requests for Q (recall that Q is sync). If I/O dispatching is not
plugged, then, while Q remains empty, a basically uncontrolled amount
of I/O from other queues may be dispatched too, possibly causing the
service of Q's I/O to be delayed even longer in the drive. This
problem gets more and more serious as the speed and the queue depth of
the drive grow, because, as these two quantities grow, so does the
probability of finding no queue busy but many requests in flight.

This commit performs I/O-dispatch plugging in this scenario.  Plugging
minimizes the delay induced by already in-flight I/O, and enables Q to
recover the bandwidth it may lose because of this delay.
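
In short, the plugging condition checked in
idling_needed_for_service_guarantees() gains the new sub-condition on
in-flight I/O, roughly as follows (just a sketch of the resulting
logic; the actual change is in the hunk below):

	/*
	 * Plug I/O dispatching if bfqq is weight-raised and either some
	 * busy queue is not weight-raised, or there is a fair amount of
	 * in-flight I/O not belonging to bfqq (the margin of 4 extra
	 * requests is a heuristic threshold).
	 */
	bool asymmetric_scenario = (bfqq->wr_coeff > 1 &&
				    (bfqd->wr_busy_queues <
				     bfq_tot_busy_queues(bfqd) ||
				     bfqd->rq_in_driver >=
				     bfqq->dispatched + 4)) ||
		bfq_asymmetric_scenario(bfqd, bfqq);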

As a practical example, under write load on a Samsung SSD 970 PRO,
gnome-terminal starts in 1.5 seconds after this fix, against 15
seconds before the fix (as a reference, gnome-terminal takes about 35
seconds to start with any of the other I/O schedulers).

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 67 +++++++++++++++++++++++++++++----------------
 1 file changed, 43 insertions(+), 24 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index ddf042b36549..c719020fa121 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3744,38 +3744,57 @@ static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
  * there is no active group, then the primary expectation for
  * this device is probably a high throughput.
  *
- * We are now left only with explaining the additional
- * compound condition that is checked below for deciding
- * whether the scenario is asymmetric. To explain this
- * compound condition, we need to add that the function
+ * We are now left only with explaining the two sub-conditions in the
+ * additional compound condition that is checked below for deciding
+ * whether the scenario is asymmetric. To explain the first
+ * sub-condition, we need to add that the function
  * bfq_asymmetric_scenario checks the weights of only
- * non-weight-raised queues, for efficiency reasons (see
- * comments on bfq_weights_tree_add()). Then the fact that
- * bfqq is weight-raised is checked explicitly here. More
- * precisely, the compound condition below takes into account
- * also the fact that, even if bfqq is being weight-raised,
- * the scenario is still symmetric if all queues with requests
- * waiting for completion happen to be
- * weight-raised. Actually, we should be even more precise
- * here, and differentiate between interactive weight raising
- * and soft real-time weight raising.
+ * non-weight-raised queues, for efficiency reasons (see comments on
+ * bfq_weights_tree_add()). Then the fact that bfqq is weight-raised
+ * is checked explicitly here. More precisely, the compound condition
+ * below takes into account also the fact that, even if bfqq is being
+ * weight-raised, the scenario is still symmetric if all queues with
+ * requests waiting for completion happen to be
+ * weight-raised. Actually, we should be even more precise here, and
+ * differentiate between interactive weight raising and soft real-time
+ * weight raising.
+ *
+ * The second sub-condition checked in the compound condition is
+ * whether there is a fair amount of already in-flight I/O not
+ * belonging to bfqq. If so, I/O dispatching is to be plugged, for the
+ * following reason. The drive may decide to serve in-flight
+ * non-bfqq's I/O requests before bfqq's ones, thereby delaying the
+ * arrival of new I/O requests for bfqq (recall that bfqq is sync). If
+ * I/O-dispatching is not plugged, then, while bfqq remains empty, a
+ * basically uncontrolled amount of I/O from other queues may be
+ * dispatched too, possibly causing the service of bfqq's I/O to be
+ * delayed even longer in the drive. This problem gets more and more
+ * serious as the speed and the queue depth of the drive grow,
+ * because, as these two quantities grow, the probability to find no
+ * queue busy but many requests in flight grows too. By contrast,
+ * plugging I/O dispatching minimizes the delay induced by already
+ * in-flight I/O, and enables bfqq to recover the bandwidth it may
+ * lose because of this delay.
  *
  * As a side note, it is worth considering that the above
- * device-idling countermeasures may however fail in the
- * following unlucky scenario: if idling is (correctly)
- * disabled in a time period during which all symmetry
- * sub-conditions hold, and hence the device is allowed to
- * enqueue many requests, but at some later point in time some
- * sub-condition stops to hold, then it may become impossible
- * to let requests be served in the desired order until all
- * the requests already queued in the device have been served.
+ * device-idling countermeasures may however fail in the following
+ * unlucky scenario: if I/O-dispatch plugging is (correctly) disabled
+ * in a time period during which all symmetry sub-conditions hold, and
+ * therefore the device is allowed to enqueue many requests, but at
+ * some later point in time some sub-condition stops to hold, then it
+ * may become impossible to make requests be served in the desired
+ * order until all the requests already queued in the device have been
+ * served. The last sub-condition commented above somewhat mitigates
+ * this problem for weight-raised queues.
  */
 static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
 						 struct bfq_queue *bfqq)
 {
 	bool asymmetric_scenario = (bfqq->wr_coeff > 1 &&
-				    bfqd->wr_busy_queues <
-				    bfq_tot_busy_queues(bfqd)) ||
+				    (bfqd->wr_busy_queues <
+				     bfq_tot_busy_queues(bfqd) ||
+				     bfqd->rq_in_driver >=
+				     bfqq->dispatched + 4)) ||
 		bfq_asymmetric_scenario(bfqd, bfqq);
 
 	bfq_log_bfqq(bfqd, bfqq,
-- 
2.20.1



* Re: [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging
  2019-07-15  9:31 ` [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging Paolo Valente
@ 2019-07-15 10:16   ` Holger Hoffstätte
  2019-07-15 10:18     ` Paolo Valente
  0 siblings, 1 reply; 6+ messages in thread
From: Holger Hoffstätte @ 2019-07-15 10:16 UTC (permalink / raw)
  To: Paolo Valente, Jens Axboe
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij,
	bfq-iosched, oleksandr


Paolo,

The function idling_needed_for_service_guarantees() was just removed in 5.3-commit
3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging").
See [1].

cheers
Holger

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/block/bfq-iosched.c?id=3726112ec7316068625a1adefa101b9522c588ba


* Re: [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging
  2019-07-15 10:16   ` Holger Hoffstätte
@ 2019-07-15 10:18     ` Paolo Valente
  2019-07-15 10:49       ` Holger Hoffstätte
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Valente @ 2019-07-15 10:18 UTC (permalink / raw)
  To: Holger Hoffstätte
  Cc: Jens Axboe, linux-block, kernel list, Ulf Hansson, linus.walleij,
	bfq-iosched, oleksandr

Didn't I simply move it forward in that commit?

> On 15 Jul 2019, at 12:16, Holger Hoffstätte <holger@applied-asynchrony.com> wrote:
> 
> 
> Paolo,
> 
> The function idling_needed_for_service_guarantees() was just removed in 5.3-commit
> 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging").
> See [1].
> 
> cheers
> Holger
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/block/bfq-iosched.c?id=3726112ec7316068625a1adefa101b9522c588ba



* Re: [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging
  2019-07-15 10:18     ` Paolo Valente
@ 2019-07-15 10:49       ` Holger Hoffstätte
  2019-07-15 10:53         ` Paolo Valente
  0 siblings, 1 reply; 6+ messages in thread
From: Holger Hoffstätte @ 2019-07-15 10:49 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, linux-block, kernel list, Ulf Hansson, linus.walleij,
	bfq-iosched, oleksandr

On 7/15/19 12:18 PM, Paolo Valente wrote:
> Didn't I simply move it forward in that commit?
> 
>> On 15 Jul 2019, at 12:16, Holger Hoffstätte <holger@applied-asynchrony.com> wrote:
>> Paolo,
>>
>> The function idling_needed_for_service_guarantees() was just removed in 5.3-commit
>> 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging").

Argh, sorry - yes, you did. The patch failed to apply here because the
asymmetric_scenario variable was already gone; fixing the patch to just
return the expression, so that it matches 3726112ec731, made it work.
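
Roughly, the adapted function in my tree now looks like the following
(just a sketch of my local rebase on top of 3726112ec731, not a hunk
to apply):

	static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
							 struct bfq_queue *bfqq)
	{
		/*
		 * Return the compound condition directly, since the
		 * asymmetric_scenario local variable is gone after
		 * 3726112ec731.
		 */
		return (bfqq->wr_coeff > 1 &&
			(bfqd->wr_busy_queues <
			 bfq_tot_busy_queues(bfqd) ||
			 bfqd->rq_in_driver >=
			 bfqq->dispatched + 4)) ||
			bfq_asymmetric_scenario(bfqd, bfqq);
	}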

All good!
Holger


* Re: [PATCH BUGFIX IMPROVEMENT 1/1] block, bfq: check also in-flight I/O in dispatch plugging
  2019-07-15 10:49       ` Holger Hoffstätte
@ 2019-07-15 10:53         ` Paolo Valente
  0 siblings, 0 replies; 6+ messages in thread
From: Paolo Valente @ 2019-07-15 10:53 UTC (permalink / raw)
  To: Holger Hoffstätte
  Cc: Jens Axboe, linux-block, kernel list, Ulf Hansson, linus.walleij,
	bfq-iosched, oleksandr

Oops, my fault of course. I submitted the patch I made on top of the dev version of bfq. Preparing a V2. Sorry.

> On 15 Jul 2019, at 12:49, Holger Hoffstätte <holger@applied-asynchrony.com> wrote:
> 
> On 7/15/19 12:18 PM, Paolo Valente wrote:
>> Didn't I simply move it forward in that commit?
>>> On 15 Jul 2019, at 12:16, Holger Hoffstätte <holger@applied-asynchrony.com> wrote:
>>> Paolo,
>>> 
>>> The function idling_needed_for_service_guarantees() was just removed in 5.3-commit
>>> 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging").
> 
> Argh, sorry - yes you did, it failed to apply because the asymmetric_scenario
> variable was already gone. Fixing the patch to just return the expression so that
> it matches 3726112ec731 worked.
> 
> All good!
> Holger


