All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <Bart.VanAssche@wdc.com>
To: "ming.lei@redhat.com" <ming.lei@redhat.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>
Cc: "dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"osandov@fb.com" <osandov@fb.com>
Subject: Re: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle)
Date: Tue, 23 Jan 2018 16:43:42 +0000	[thread overview]
Message-ID: <1516725821.3339.11.camel@wdc.com> (raw)
In-Reply-To: <20180123092204.GA39002@redhat.com>

T24gVHVlLCAyMDE4LTAxLTIzIGF0IDEwOjIyICswMTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+
IE9uIFRodSwgSmFuIDE4IDIwMTggYXQgIDU6MjBwbSAtMDUwMCwNCj4gQmFydCBWYW4gQXNzY2hl
IDxCYXJ0LlZhbkFzc2NoZUB3ZGMuY29tPiB3cm90ZToNCj4gDQo+ID4gT24gVGh1LCAyMDE4LTAx
LTE4IGF0IDE3OjAxIC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+ID4gPiBBbmQgeWV0IExh
dXJlbmNlIGNhbm5vdCByZXByb2R1Y2UgYW55IHN1Y2ggbG9ja3VwcyB3aXRoIHlvdXIgdGVzdC4u
Lg0KPiA+IA0KPiA+IEhtbSAuLi4gbWF5YmUgSSBtaXN1bmRlcnN0b29kIExhdXJlbmNlIGJ1dCBJ
IGRvbid0IHRoaW5rIHRoYXQgTGF1cmVuY2UgaGFzDQo+ID4gYWxyZWFkeSBzdWNjZWVkZWQgYXQg
cnVubmluZyBhbiB1bm1vZGlmaWVkIHZlcnNpb24gb2YgbXkgdGVzdHMuIEluIG9uZSBvZiB0aGUN
Cj4gPiBlLW1haWxzIExhdXJlbmNlIHNlbnQgbWUgdGhpcyBtb3JuaW5nIEkgcmVhZCB0aGF0IGhl
IG1vZGlmaWVkIHRoZXNlIHNjcmlwdHMNCj4gPiB0byBnZXQgcGFzdCBhIGtlcm5lbCBtb2R1bGUg
dW5sb2FkIGZhaWx1cmUgdGhhdCB3YXMgcmVwb3J0ZWQgd2hpbGUgc3RhcnRpbmcNCj4gPiB0aGVz
ZSB0ZXN0cy4gU28gdGhlIG5leHQgc3RlcCBpcyB0byBjaGVjayB3aGljaCBjaGFuZ2VzIHdlcmUg
bWFkZSB0byB0aGUgdGVzdA0KPiA+IHNjcmlwdHMgYW5kIGFsc28gd2hldGhlciB0aGUgdGVzdCBy
ZXN1bHRzIGFyZSBzdGlsbCB2YWxpZC4NCj4gPiANCj4gPiA+IEFyZSB5b3UgYWJzb2x1dGVseSBj
ZXJ0YWluIHRoaXMgcGF0Y2ggZG9lc24ndCBoZWxwIHlvdT8NCj4gPiA+IGh0dHBzOi8vcGF0Y2h3
b3JrLmtlcm5lbC5vcmcvcGF0Y2gvMTAxNzQwMzcvDQo+ID4gPiANCj4gPiA+IElmIGl0IGRvZXNu
J3QgdGhlbiB0aGF0IGlzIGFjdHVhbGx5IHZlcnkgdXNlZnVsIHRvIGtub3cuDQo+ID4gDQo+ID4g
VGhlIGZpcnN0IEkgdHJpZWQgdGhpcyBtb3JuaW5nIGlzIHRvIHJ1biB0aGUgc3JwLXRlc3Qgc29m
dHdhcmUgYWdhaW5zdCBhIG1lcmdlDQo+ID4gb2YgSmVucycgZm9yLW5leHQgYnJhbmNoIGFuZCB5
b3VyIGRtLTQuMTYgYnJhbmNoLiBTaW5jZSBJIG5vdGljZWQgdGhhdCB0aGUgZG0NCj4gPiBxdWV1
ZSBsb2NrZWQgdXAgSSByZWluc2VydGVkIGEgYmxrX21xX2RlbGF5X3J1bl9od19xdWV1ZSgpIGNh
bGwgaW4gdGhlIGRtIGNvZGUuDQo+ID4gU2luY2UgZXZlbiB0aGF0IHdhcyBub3Qgc3VmZmljaWVu
dCBJIHRyaWVkIHRvIGtpY2sgdGhlIHF1ZXVlcyB2aWEgZGVidWdmcyAoZm9yDQo+ID4gcyBpbiAv
c3lzL2tlcm5lbC9kZWJ1Zy9ibG9jay8qL3N0YXRlOyBkbyBlY2hvIGtpY2sgPiRzOyBkb25lKS4g
U2luY2UgdGhhdCB3YXMNCj4gPiBub3Qgc3VmZmljaWVudCB0byByZXNvbHZlIHRoZSBxdWV1ZSBz
dGFsbCBJIHJldmVydGVkIHRoZSBmb2xsb3dpbmcgdHJlZSBwYXRjaGVzDQo+ID4gdGhhdCBhcmUg
aW4gSmVucycgdHJlZToNCj4gPiAqICJibGstbXE6IGltcHJvdmUgRE0ncyBibGstbXEgSU8gbWVy
Z2luZyB2aWEgYmxrX2luc2VydF9jbG9uZWRfcmVxdWVzdCBmZWVkYmFjayINCj4gPiAqICJibGst
bXEtc2NoZWQ6IHJlbW92ZSB1bnVzZWQgJ2Nhbl9ibG9jaycgYXJnIGZyb20gYmxrX21xX3NjaGVk
X2luc2VydF9yZXF1ZXN0Ig0KPiA+ICogImJsay1tcTogZG9uJ3QgZGlzcGF0Y2ggcmVxdWVzdCBp
biBibGtfbXFfcmVxdWVzdF9kaXJlY3RfaXNzdWUgaWYgcXVldWUgaXMgYnVzeSINCj4gPiANCj4g
PiBPbmx5IGFmdGVyIEkgaGFkIGRvbmUgdGhpcyB0aGUgc3JwLXRlc3Qgc29mdHdhcmUgcmFuIGFn
YWluIHdpdGhvdXQgdHJpZ2dlcmluZw0KPiA+IGRtIHF1ZXVlIGxvY2t1cHMuDQoNCkhlbGxvIE1p
a2UsDQoNCkkgd2FudCB0byB0YWtlIGJhY2sgd2hhdCBJIHdyb3RlIGFib3V0IHRoZSByZXZlcnRz
LiBJIGhhdmUgbm90IHlldCB0cmllZCB0bw0KYW5hbHl6ZSBleGFjdGx5IHdoaWNoIGNoYW5nZSBt
YWRlIGJsa19pbnNlcnRfY2xvbmVkX3JlcXVlc3QoKSB3b3JrIHJlbGlhYmx5DQpmb3IgbWUgYnV0
IHdpdGggSmVucycgbGF0ZXN0IGZvci1uZXh0IGJyYW5jaCBhbmQgeW91ciBmb3ItNC4xNiBicmFu
Y2ggbWVyZ2VkDQphbGwgSSBuZWVkIHRvIGF2b2lkIHF1ZXVlIHN0YWxscyBpcyB0aGUgZm9sbG93
aW5nOg0KDQpkaWZmIC0tZ2l0IGEvZHJpdmVycy9tZC9kbS1ycS5jIGIvZHJpdmVycy9tZC9kbS1y
cS5jDQppbmRleCBmMTYwOTZhZjg3OWEuLmM1OWM1OWNmZDJhNSAxMDA2NDQNCi0tLSBhL2RyaXZl
cnMvbWQvZG0tcnEuYw0KKysrIGIvZHJpdmVycy9tZC9kbS1ycS5jDQpAQCAtNzYxLDYgKzc2MSw3
IEBAIHN0YXRpYyBibGtfc3RhdHVzX3QgZG1fbXFfcXVldWVfcnEoc3RydWN0IGJsa19tcV9od19j
dHggKmhjdHgsDQogCQkvKiBVbmRvIGRtX3N0YXJ0X3JlcXVlc3QoKSBiZWZvcmUgcmVxdWV1aW5n
ICovDQogCQlycV9lbmRfc3RhdHMobWQsIHJxKTsNCiAJCXJxX2NvbXBsZXRlZChtZCwgcnFfZGF0
YV9kaXIocnEpLCBmYWxzZSk7DQorCQlibGtfbXFfZGVsYXlfcnVuX2h3X3F1ZXVlKGhjdHgsIDEw
MC8qbXMqLyk7DQogCQlyZXR1cm4gQkxLX1NUU19SRVNPVVJDRTsNCiAJfQ0KIA0KQmFydC4=

WARNING: multiple messages have this Message-ID (diff)
From: Bart Van Assche <Bart.VanAssche@wdc.com>
To: "ming.lei@redhat.com" <ming.lei@redhat.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>
Cc: "dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"osandov@fb.com" <osandov@fb.com>
Subject: Re: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle)
Date: Tue, 23 Jan 2018 16:43:42 +0000	[thread overview]
Message-ID: <1516725821.3339.11.camel@wdc.com> (raw)
In-Reply-To: <20180123092204.GA39002@redhat.com>

On Tue, 2018-01-23 at 10:22 +0100, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at  5:20pm -0500,
> Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> 
> > On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > > And yet Laurence cannot reproduce any such lockups with your test...
> > 
> > Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
> > already succeeded at running an unmodified version of my tests. In one of the
> > e-mails Laurence sent me this morning I read that he modified these scripts
> > to get past a kernel module unload failure that was reported while starting
> > these tests. So the next step is to check which changes were made to the test
> > scripts and also whether the test results are still valid.
> > 
> > > Are you absolutely certain this patch doesn't help you?
> > > https://patchwork.kernel.org/patch/10174037/
> > > 
> > > If it doesn't then that is actually very useful to know.
> > 
> > The first I tried this morning is to run the srp-test software against a merge
> > of Jens' for-next branch and your dm-4.16 branch. Since I noticed that the dm
> > queue locked up I reinserted a blk_mq_delay_run_hw_queue() call in the dm code.
> > Since even that was not sufficient I tried to kick the queues via debugfs (for
> > s in /sys/kernel/debug/block/*/state; do echo kick >$s; done). Since that was
> > not sufficient to resolve the queue stall I reverted the following tree patches
> > that are in Jens' tree:
> > * "blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback"
> > * "blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request"
> > * "blk-mq: don't dispatch request in blk_mq_request_direct_issue if queue is busy"
> > 
> > Only after I had done this the srp-test software ran again without triggering
> > dm queue lockups.

Hello Mike,

I want to take back what I wrote about the reverts. I have not yet tried to
analyze exactly which change made blk_insert_cloned_request() work reliably
for me but with Jens' latest for-next branch and your for-4.16 branch merged
all I need to avoid queue stalls is the following:

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index f16096af879a..c59c59cfd2a5 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		/* Undo dm_start_request() before requeuing */
 		rq_end_stats(md, rq);
 		rq_completed(md, rq_data_dir(rq), false);
+		blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
 		return BLK_STS_RESOURCE;
 	}
 
Bart.

  parent reply	other threads:[~2018-01-23 16:43 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-18  2:41 [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle Ming Lei
2018-01-18 16:50 ` Bart Van Assche
2018-01-18 17:03   ` Mike Snitzer
2018-01-18 17:03     ` Mike Snitzer
2018-01-18 17:20     ` Bart Van Assche
2018-01-18 17:20       ` Bart Van Assche
2018-01-18 18:30       ` Mike Snitzer
2018-01-18 18:47         ` Bart Van Assche
2018-01-18 18:47           ` Bart Van Assche
2018-01-18 20:11           ` Jens Axboe
2018-01-18 20:11             ` Jens Axboe
2018-01-18 20:48             ` Mike Snitzer
2018-01-18 20:58               ` Bart Van Assche
2018-01-18 20:58                 ` Bart Van Assche
2018-01-18 21:23                 ` Mike Snitzer
2018-01-18 21:23                   ` Mike Snitzer
2018-01-18 21:37                   ` Laurence Oberman
2018-01-18 21:39                   ` [dm-devel] " Bart Van Assche
2018-01-18 21:39                     ` Bart Van Assche
2018-01-18 21:45                     ` Laurence Oberman
2018-01-18 21:45                       ` Laurence Oberman
2018-01-18 22:01                     ` Mike Snitzer
2018-01-18 22:18                       ` Laurence Oberman
2018-01-18 22:20                         ` Laurence Oberman
2018-01-18 22:20                           ` Laurence Oberman
2018-01-18 22:24                         ` Bart Van Assche
2018-01-18 22:24                           ` Bart Van Assche
2018-01-18 22:35                           ` Laurence Oberman
2018-01-18 22:39                             ` Jens Axboe
2018-01-18 22:55                               ` Bart Van Assche
2018-01-18 22:55                                 ` Bart Van Assche
2018-01-18 22:20                       ` Bart Van Assche
2018-01-18 22:20                         ` Bart Van Assche
2018-01-23  9:22                         ` [PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle) Mike Snitzer
2018-01-23 10:53                           ` Ming Lei
2018-01-23 12:15                             ` Mike Snitzer
2018-01-23 12:17                               ` Ming Lei
2018-01-23 12:43                                 ` Mike Snitzer
2018-01-23 16:43                           ` Bart Van Assche [this message]
2018-01-23 16:43                             ` [PATCH] " Bart Van Assche
2018-01-19  2:32             ` [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle Ming Lei
2018-01-19  4:02               ` Jens Axboe
2018-01-19  7:26                 ` Ming Lei
2018-01-19 15:20                   ` Bart Van Assche
2018-01-19 15:20                     ` Bart Van Assche
2018-01-19 15:25                     ` Jens Axboe
2018-01-19 15:33                     ` Ming Lei
2018-01-19 16:06                       ` Bart Van Assche
2018-01-19 16:06                         ` Bart Van Assche
2018-01-19 15:24                   ` Jens Axboe
2018-01-19 15:40                     ` Ming Lei
2018-01-19 15:40                       ` Ming Lei
2018-01-19 15:48                       ` Jens Axboe
2018-01-19 16:05                         ` Ming Lei
2018-01-19 16:19                           ` Jens Axboe
2018-01-19 16:26                             ` Ming Lei
2018-01-19 16:27                               ` Jens Axboe
2018-01-19 16:37                                 ` Ming Lei
2018-01-19 16:41                                   ` Jens Axboe
2018-01-19 16:41                                     ` Jens Axboe
2018-01-19 16:47                                     ` Mike Snitzer
2018-01-19 16:52                                       ` Jens Axboe
2018-01-19 17:05                                         ` Ming Lei
2018-01-19 17:09                                           ` Jens Axboe
2018-01-19 17:20                                             ` Ming Lei
2018-01-19 17:38                                   ` Jens Axboe
2018-01-19 18:24                                     ` Ming Lei
2018-01-19 18:24                                       ` Ming Lei
2018-01-19 18:33                                     ` Mike Snitzer
2018-01-19 23:52                                     ` Ming Lei
2018-01-20  4:27                                       ` Jens Axboe
2018-01-19 16:13                         ` Mike Snitzer
2018-01-19 16:23                           ` Jens Axboe
2018-01-19 23:57                             ` Ming Lei
2018-01-29 22:37                     ` Bart Van Assche
2018-01-19  5:09               ` Bart Van Assche
2018-01-19  5:09                 ` Bart Van Assche
2018-01-19  7:34                 ` Ming Lei
2018-01-19 19:47                   ` Bart Van Assche
2018-01-19 19:47                     ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1516725821.3339.11.camel@wdc.com \
    --to=bart.vanassche@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=hch@infradead.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=osandov@fb.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.