[PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-05  7:52 Ming Lei
  2017-12-05 14:29 ` Johannes Thumshirn
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-05  7:52 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig, linux-scsi,
	Martin K . Petersen, James E . J . Bottomley
  Cc: Bart Van Assche, linux-kernel, Hannes Reinecke,
	Holger Hoffstätte, Ming Lei

Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
for blk-mq"), we run queue after 3ms if queue is idle and SCSI device
queue isn't ready, which is done in handling BLK_STS_RESOURCE. After
commit 0df21c86bdbf is introduced, queue won't be run any more under
this situation.

IO hang is observed when timeout happened, and this patch fixes the IO
hang issue by running queue after delay in scsi_dev_queue_ready, just like
non-mq. This issue can be triggered by the following script[1].

There is another issue which can be covered by running idle queue:
when .get_budget() is called on request coming from hctx->dispatch_list,
if one request just completes during .get_budget(), we can't depend on
SCSI's restart to make progress any more. This patch fixes the race too.

With this patch, we basically recover to previous behaviour(before commit
0df21c86bdbf) of handling idle queue when running out of resource.

[1] script for test/verify SCSI timeout
rmmod scsi_debug
modprobe scsi_debug max_queue=1

DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`

echo "using scsi device $DEVICE"
echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
echo "temporary write through" >$DISK_DIR/cache_type
echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
echo none > /sys/block/$DEVICE/queue/scheduler
dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
sleep 5
echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
wait
echo "SUCCESS"

Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index db9556662e27..1816dd8259b3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
 out_put_device:
 	put_device(&sdev->sdev_gendev);
 out:
+	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
+		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
 	return false;
 }

-- 
2.9.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05  7:52 [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle Ming Lei
@ 2017-12-05 14:29 ` Johannes Thumshirn
  2017-12-05 16:16     ` Bart Van Assche
  2017-12-05 16:08   ` Bart Van Assche
  2017-12-06 23:10 ` Holger Hoffstätte
  2 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2017-12-05 14:29 UTC (permalink / raw)
  To: Ming Lei, Omar Sandoval
  Cc: Jens Axboe, linux-block, Christoph Hellwig, linux-scsi,
	Martin K . Petersen, James E . J . Bottomley, Bart Van Assche,
	linux-kernel, Hannes Reinecke, Holger Hoffstätte

[ +Cc Omar ]

Ming Lei <ming.lei@redhat.com> writes:
> Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
> for blk-mq"), we run queue after 3ms if queue is idle and SCSI device
> queue isn't ready, which is done in handling BLK_STS_RESOURCE. After
> commit 0df21c86bdbf is introduced, queue won't be run any more under
> this situation.
>
> IO hang is observed when timeout happened, and this patch fixes the IO
> hang issue by running queue after delay in scsi_dev_queue_ready, just like
> non-mq. This issue can be triggered by the following script[1].
>
> There is another issue which can be covered by running idle queue:
> when .get_budget() is called on request coming from hctx->dispatch_list,
> if one request just completes during .get_budget(), we can't depend on
> SCSI's restart to make progress any more. This patch fixes the race too.
>
> With this patch, we basically recover to previous behaviour(before commit
> 0df21c86bdbf) of handling idle queue when running out of resource.
>
> [1] script for test/verify SCSI timeout
> rmmod scsi_debug
> modprobe scsi_debug max_queue=1
>
> DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
> DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
>
> echo "using scsi device $DEVICE"
> echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> echo "temporary write through" >$DISK_DIR/cache_type
> echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> echo none > /sys/block/$DEVICE/queue/scheduler
> dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> sleep 5
> echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> wait
> echo "SUCCESS"

SO I turned the above into a blktest but have found some shortcommings
of my bash skills. Maybe you or Omar has a soution for it:

--- 8< ---
>From 80e5810011d52bc188cd858962ce202bfd4dbee5 Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn@suse.de>
Date: Tue, 5 Dec 2017 15:21:08 +0100
Subject: [PATCH blktests] block/013: add test for scsi_device queue starvation

Add a test for Ming Lei's patch titled "SCSI: run queue if SCSI device
queue isn't ready and queue is idle"

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>

---
This test case has two shortcommings, which need to be addressed I'm
just lacking a bit of the shell magic to address them properly.

1) Testing without the patch applied hangs the test forever as it
   doesn't get killed after a specific timeout (I think this should be
   solved in a common function).
2) It has a nasty sleep at it's end to wait for scsi_debug's refcounts
   to drop to 0 before removing the module or removing will fail and thus
   the test case. This as well should be solved in a more generic way.
---
 tests/block/013     | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/block/013.out |  2 ++
 2 files changed, 65 insertions(+)
 create mode 100755 tests/block/013
 create mode 100644 tests/block/013.out

diff --git a/tests/block/013 b/tests/block/013
new file mode 100755
index 000000000000..f73724fc9ed2
--- /dev/null
+++ b/tests/block/013
@@ -0,0 +1,63 @@
+#!/bin/bash
+#
+# Regression test for patch "SCSI: delay run queue if device is
+# blocked in scsi_dev_queue_ready()"
+#
+# Copyright (C) 2017 Johannes Thumshirn
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. common/scsi_debug
+
+DESCRIPTION="Test if a SCSI device's queue can be run if it isn't ready but the device is idle"
+TIMED=1
+
+requires() {
+	_have_scsi_debug && _have_module sd_mod && \
+		grep -q Y /sys/module/scsi_mod/parameters/use_blk_mq
+}
+
+test_one_device()
+{
+    local device=$1
+
+    echo "-1" > /sys/bus/pseudo/drivers/scsi_debug/every_nth
+    echo "temporary write through" > \
+	/sys/block/"${device}"/device/scsi_disk/"$(basename $(readlink /sys/block/${device}/device))"/cache_type
+    echo "128" > /sys/bus/pseudo/drivers/scsi_debug/opts
+    echo "none" > /sys/block/${device}/queue/scheduler
+    dd if=/dev/"${device}" of=/dev/null bs=1M iflag=direct \
+	count=1 2> /dev/null &
+    sleep 5
+    echo 0 > /sys/bus/pseudo/drivers/scsi_debug/opts
+    wait
+}
+
+test() {
+	echo "Running ${TEST_NAME}"
+
+	if ! _init_scsi_debug statistics=1 max_queue=1; then
+		return
+	fi
+
+	local device
+	for device in "${SCSI_DEBUG_DEVICES[@]}"; do
+	    test_one_device ${device}	    
+	done
+
+	sleep 5 # to free up all scsi_debug refcnts
+	_exit_scsi_debug
+
+	echo "Test complete"
+}
diff --git a/tests/block/013.out b/tests/block/013.out
new file mode 100644
index 000000000000..947bd04e2104
--- /dev/null
+++ b/tests/block/013.out
@@ -0,0 +1,2 @@
+Running block/013
+Test complete
-- 
2.13.6



-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05  7:52 [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle Ming Lei
@ 2017-12-05 16:08   ` Bart Van Assche
  2017-12-05 16:08   ` Bart Van Assche
  2017-12-06 23:10 ` Holger Hoffstätte
  2 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-05 16:08 UTC (permalink / raw)
  To: linux-scsi, hch, jejb, linux-block, axboe, ming.lei, martin.petersen
  Cc: linux-kernel, hare, holger

T24gVHVlLCAyMDE3LTEyLTA1IGF0IDE1OjUyICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gZGlm
ZiAtLWdpdCBhL2RyaXZlcnMvc2NzaS9zY3NpX2xpYi5jIGIvZHJpdmVycy9zY3NpL3Njc2lfbGli
LmMNCj4gaW5kZXggZGI5NTU2NjYyZTI3Li4xODE2ZGQ4MjU5YjMgMTAwNjQ0DQo+IC0tLSBhL2Ry
aXZlcnMvc2NzaS9zY3NpX2xpYi5jDQo+ICsrKyBiL2RyaXZlcnMvc2NzaS9zY3NpX2xpYi5jDQo+
IEBAIC0xOTY3LDYgKzE5NjcsOCBAQCBzdGF0aWMgYm9vbCBzY3NpX21xX2dldF9idWRnZXQoc3Ry
dWN0IGJsa19tcV9od19jdHggKmhjdHgpDQo+ICBvdXRfcHV0X2RldmljZToNCj4gIAlwdXRfZGV2
aWNlKCZzZGV2LT5zZGV2X2dlbmRldik7DQo+ICBvdXQ6DQo+ICsJaWYgKGF0b21pY19yZWFkKCZz
ZGV2LT5kZXZpY2VfYnVzeSkgPT0gMCAmJiAhc2NzaV9kZXZpY2VfYmxvY2tlZChzZGV2KSkNCj4g
KwkJYmxrX21xX2RlbGF5X3J1bl9od19xdWV1ZShoY3R4LCBTQ1NJX1FVRVVFX0RFTEFZKTsNCj4g
IAlyZXR1cm4gZmFsc2U7DQo+ICB9DQoNClRoaXMgY2Fubm90IHdvcmsgc2luY2UgbXVsdGlwbGUg
dGhyZWFkcyBjYW4gY2FsbCBzY3NpX21xX2dldF9idWRnZXQoKQ0KY29uY3VycmVudGx5IGFuZCBo
ZW5jZSBpdCBjYW4gaGFwcGVuIHRoYXQgbm9uZSBvZiB0aGVtIHNlZXMNCmF0b21pY19yZWFkKCZz
ZGV2LT5kZXZpY2VfYnVzeSkgPT0gMC4gQlRXLCB0aGUgYWJvdmUgcGF0Y2ggaXMgc29tZXRoaW5n
IEkNCmNvbnNpZGVyIGFzIGEgd29ya2Fyb3VuZC4gSGF2ZSB5b3UgY29uc2lkZXJlZCB0byBmaXgg
dGhlIHJvb3QgY2F1c2UgYW5kIHRvDQphZGQgYmxrX21xX3NjaGVkX21hcmtfcmVzdGFydF9oY3R4
KCkgd2hlcmUgdGhlc2UgYXJlIG1pc3NpbmcsIGUuZy4gaW4gDQpibGtfbXFfc2NoZWRfZGlzcGF0
Y2hfcmVxdWVzdHMoKT8gVGhlIHBhdGNoIGJlbG93IGlzIG5vdCBhIGZ1bGwgc29sdXRpb24NCmJ1
dCByZXN1bHRlZCBpbiBhIHNpZ25pZmljYW50IGltcHJvdmVtZW50IGluIG15IHRlc3RzOg0KDQpk
aWZmIC0tZ2l0IGEvYmxvY2svYmxrLW1xLXNjaGVkLmMgYi9ibG9jay9ibGstbXEtc2NoZWQuYw0K
aW5kZXggNjllMzIyNmU2NmNhLi45ZDg2ODc2ZWM1MDMgMTAwNjQ0DQotLS0gYS9ibG9jay9ibGst
bXEtc2NoZWQuYw0KKysrIGIvYmxvY2svYmxrLW1xLXNjaGVkLmMNCkBAIC0yMjYsNiArMjI2LDcg
QEAgdm9pZCBibGtfbXFfc2NoZWRfZGlzcGF0Y2hfcmVxdWVzdHMoc3RydWN0IGJsa19tcV9od19j
dHggKmhjdHgpDQogCQkgKiBUT0RPOiBnZXQgbW9yZSBidWRnZXRzLCBhbmQgZGVxdWV1ZSBtb3Jl
IHJlcXVlc3RzIGluDQogCQkgKiBvbmUgdGltZS4NCiAJCSAqLw0KKwkJYmxrX21xX3NjaGVkX21h
cmtfcmVzdGFydF9oY3R4KGhjdHgpOw0KIAkJYmxrX21xX2RvX2Rpc3BhdGNoX2N0eChoY3R4KTsN
CiAJfSBlbHNlIHsNCiAJCWJsa19tcV9mbHVzaF9idXN5X2N0eHMoaGN0eCwgJnJxX2xpc3QpOw0K
DQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-05 16:08   ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-05 16:08 UTC (permalink / raw)
  To: linux-scsi, hch, jejb, linux-block, axboe, ming.lei, martin.petersen
  Cc: linux-kernel, hare, holger

On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index db9556662e27..1816dd8259b3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
>  out_put_device:
>  	put_device(&sdev->sdev_gendev);
>  out:
> +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
>  	return false;
>  }

This cannot work since multiple threads can call scsi_mq_get_budget()
concurrently and hence it can happen that none of them sees
atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I
consider as a workaround. Have you considered to fix the root cause and to
add blk_mq_sched_mark_restart_hctx() where these are missing, e.g. in 
blk_mq_sched_dispatch_requests()? The patch below is not a full solution
but resulted in a significant improvement in my tests:

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 69e3226e66ca..9d86876ec503 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 		 * TODO: get more budgets, and dequeue more requests in
 		 * one time.
 		 */
+		blk_mq_sched_mark_restart_hctx(hctx);
 		blk_mq_do_dispatch_ctx(hctx);
 	} else {
 		blk_mq_flush_busy_ctxs(hctx, &rq_list);

Bart.

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05 14:29 ` Johannes Thumshirn
@ 2017-12-05 16:16     ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-05 16:16 UTC (permalink / raw)
  To: osandov, jthumshirn, ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

T24gVHVlLCAyMDE3LTEyLTA1IGF0IDE1OjI5ICswMTAwLCBKb2hhbm5lcyBUaHVtc2hpcm4gd3Jv
dGU6DQo+IDEpIFRlc3Rpbmcgd2l0aG91dCB0aGUgcGF0Y2ggYXBwbGllZCBoYW5ncyB0aGUgdGVz
dCBmb3JldmVyIGFzIGl0DQo+ICAgIGRvZXNuJ3QgZ2V0IGtpbGxlZCBhZnRlciBhIHNwZWNpZmlj
IHRpbWVvdXQgKEkgdGhpbmsgdGhpcyBzaG91bGQgYmUNCj4gICAgc29sdmVkIGluIGEgY29tbW9u
IGZ1bmN0aW9uKS4NCg0KSGVsbG8gSm9oYW5uZXMsDQoNCklmIGEgcmVxdWVzdCBxdWV1ZSBnb3Qg
c3R1Y2sgdGhlbiB0aGUgcHJvY2Vzc2VzIHRoYXQgc3VibWl0dGVkIHRoZSByZXF1ZXN0cw0Kb24g
dGhhdCBxdWV1ZSBhcmUgdW5raWxsYWJsZS4gVGhlIG9ubHkgYXBwcm9hY2ggSSBrbm93IG9mIHRv
IHN0b3AgdGhlc2UNCnByb2Nlc3NlcyBpcyB0byBzZW5kIGEga2lsbCBzaWduYWwgYW5kIG5leHQg
dG8gdHJpZ2dlciBhIHF1ZXVlIHJ1biBmcm9tIHVzZXINCnNwYWNlLiBPbmUgcG9zc2libGUgYXBw
cm9hY2ggaXMgdG8gcnVuIHRoZSBmb2xsb3dpbmcgY29tbWFuZDoNCg0KICAgIGZvciBkIGluIC9z
eXMva2VybmVsL2RlYnVnL2Jsb2NrLyo7IGRvIGVjaG8ga2ljayA+JGQvc3RhdGU7IGRvbmUNCg0K
QmFydC4=

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-05 16:16     ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-05 16:16 UTC (permalink / raw)
  To: osandov, jthumshirn, ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Tue, 2017-12-05 at 15:29 +0100, Johannes Thumshirn wrote:
> 1) Testing without the patch applied hangs the test forever as it
>    doesn't get killed after a specific timeout (I think this should be
>    solved in a common function).

Hello Johannes,

If a request queue got stuck then the processes that submitted the requests
on that queue are unkillable. The only approach I know of to stop these
processes is to send a kill signal and next to trigger a queue run from user
space. One possible approach is to run the following command:

    for d in /sys/kernel/debug/block/*; do echo kick >$d/state; done

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05 16:08   ` Bart Van Assche
  (?)
@ 2017-12-05 16:28   ` Ming Lei
  2017-12-05 16:41       ` Bart Van Assche
                       ` (2 more replies)
  -1 siblings, 3 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-05 16:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, hch, jejb, linux-block, axboe, martin.petersen,
	linux-kernel, hare, holger

On Tue, Dec 05, 2017 at 04:08:20PM +0000, Bart Van Assche wrote:
> On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index db9556662e27..1816dd8259b3 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
> >  out_put_device:
> >  	put_device(&sdev->sdev_gendev);
> >  out:
> > +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> >  	return false;
> >  }
> 
> This cannot work since multiple threads can call scsi_mq_get_budget()

That is exactly the way we are handling these cases before 0df21c86bdbf(scsi:
implement .get_budget and .put_budget for blk-mq), so if it can't work,
that is not fault of commit 0df21c86bdbf.

> concurrently and hence it can happen that none of them sees
> atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I

If there is concurrent .get_budget(), one of them will see the counter
becoming zero finally because each sdev->device_busy is inc/dec
atomically. Or scsi_dev_queue_ready() return true.

Anyway, we need this patch to avoid possible regression. If you think
there are bugs in blk-mq RESTART, just root cause and and fix it.

> consider as a workaround. Have you considered to fix the root cause and to
> add blk_mq_sched_mark_restart_hctx() where these are missing, e.g. in 
> blk_mq_sched_dispatch_requests()? The patch below is not a full solution
> but resulted in a significant improvement in my tests:
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 69e3226e66ca..9d86876ec503 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  		 * TODO: get more budgets, and dequeue more requests in
>  		 * one time.
>  		 */
> +		blk_mq_sched_mark_restart_hctx(hctx);
>  		blk_mq_do_dispatch_ctx(hctx);
>  	} else {
>  		blk_mq_flush_busy_ctxs(hctx, &rq_list);

This is still a workaround for RESTART, see my comment before:

	https://marc.info/?l=linux-block&m=151217500929341&w=2

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05 16:28   ` Ming Lei
@ 2017-12-05 16:41       ` Bart Van Assche
  2017-12-06  1:52     ` Ming Lei
  2017-12-07 21:06       ` Bart Van Assche
  2 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-05 16:41 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

T24gV2VkLCAyMDE3LTEyLTA2IGF0IDAwOjI4ICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gVGhp
cyBpcyBzdGlsbCBhIHdvcmthcm91bmQgZm9yIFJFU1RBUlQsIHNlZSBteSBjb21tZW50IGJlZm9y
ZToNCj4gDQo+IAlodHRwczovL21hcmMuaW5mby8/bD1saW51eC1ibG9jayZtPTE1MTIxNzUwMDky
OTM0MSZ3PTINCg0KQSBxdW90ZSBmcm9tIHRoYXQgZS1tYWlsOiAiVGhlIHRoZW9yeSBhYm91dCB1
c2luZyBCTEtfTVFfU19TQ0hFRF9SRVNUQVJUIGluDQpjdXJyZW50IHdheSBpcyB0aGF0IHdlIG1h
cmsgaXQgYWZ0ZXIgcmVxdWVzdHMgYXJlIGFkZGVkIHRvIGhjdHgtPmRpc3BhdGNoIi4NClJlYWRp
bmcgdGhhdCBtYWtlcyBtZSB3b25kZXIgd2hldGhlciB5b3UgdW5kZXJzdGFuZCB0aGUgcHVycG9z
ZSBvZiB0aGUNCkJMS19NUV9TX1NDSEVEX1JFU1RBUlQgZmxhZz8gVGhhdCBmbGFnIGlzIG5vdCBz
ZXQgYWZ0ZXIgcmVxdWVzdHMgYXJlIGFkZGVkDQp0byB0aGUgZGlzcGF0Y2ggbGlzdCBidXQgYWZ0
ZXIgcmVxdWVzdHMgaGF2ZSBiZWVuICpyZW1vdmVkKi4gVGhlIHB1cnBvc2Ugb2YNCnRoYXQgZmxh
ZyBpcyB0byBkZXRlY3Qgd2hldGhlciBhbm90aGVyIHRocmVhZCBoYXMgcnVuIHRoZSBxdWV1ZSBh
ZnRlcg0KcmVxdWVzdHMgd2VyZSByZW1vdmVkIGZyb20gdGhlIGRpc3BhdGNoIGxpc3QgYW5kIGJl
Zm9yZSB0aGVzZSB3ZXJlIHJlYWRkZWQuDQpJZiBzbywgdGhlIHF1ZXVlIG5lZWRzIHRvIGJlIHJl
cnVuLg0KDQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-05 16:41       ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-05 16:41 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Wed, 2017-12-06 at 00:28 +0800, Ming Lei wrote:
> This is still a workaround for RESTART, see my comment before:
> 
> 	https://marc.info/?l=linux-block&m=151217500929341&w=2

A quote from that e-mail: "The theory about using BLK_MQ_S_SCHED_RESTART in
current way is that we mark it after requests are added to hctx->dispatch".
Reading that makes me wonder whether you understand the purpose of the
BLK_MQ_S_SCHED_RESTART flag? That flag is not set after requests are added
to the dispatch list but after requests have been *removed*. The purpose of
that flag is to detect whether another thread has run the queue after
requests were removed from the dispatch list and before these were readded.
If so, the queue needs to be rerun.

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05 16:41       ` Bart Van Assche
  (?)
@ 2017-12-05 16:45       ` Ming Lei
  -1 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-05 16:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Tue, Dec 05, 2017 at 04:41:46PM +0000, Bart Van Assche wrote:
> On Wed, 2017-12-06 at 00:28 +0800, Ming Lei wrote:
> > This is still a workaround for RESTART, see my comment before:
> > 
> > 	https://marc.info/?l=linux-block&m=151217500929341&w=2
> 
> A quote from that e-mail: "The theory about using BLK_MQ_S_SCHED_RESTART in
> current way is that we mark it after requests are added to hctx->dispatch".
> Reading that makes me wonder whether you understand the purpose of the
> BLK_MQ_S_SCHED_RESTART flag? That flag is not set after requests are added
> to the dispatch list but after requests have been *removed*. The purpose of
> that flag is to detect whether another thread has run the queue after
> requests were removed from the dispatch list and before these were readded.
> If so, the queue needs to be rerun.

If you want to discuss that, please reply on that thread of '[PATCH 4/7] blk-mq:
Avoid that request processing sta', I will reply on you there too.

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05 16:28   ` Ming Lei
  2017-12-05 16:41       ` Bart Van Assche
@ 2017-12-06  1:52     ` Ming Lei
  2017-12-06 16:07         ` Bart Van Assche
  2017-12-07 21:06       ` Bart Van Assche
  2 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2017-12-06  1:52 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, hch, jejb, linux-block, axboe, martin.petersen,
	linux-kernel, hare, holger

On Wed, Dec 06, 2017 at 12:28:25AM +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:08:20PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index db9556662e27..1816dd8259b3 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
> > >  out_put_device:
> > >  	put_device(&sdev->sdev_gendev);
> > >  out:
> > > +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > > +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> > >  	return false;
> > >  }
> > 
> > This cannot work since multiple threads can call scsi_mq_get_budget()
> 
> That is exactly the way we are handling these cases before 0df21c86bdbf(scsi:
> implement .get_budget and .put_budget for blk-mq), so if it can't work,
> that is not fault of commit 0df21c86bdbf.
> 
> > concurrently and hence it can happen that none of them sees
> > atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I
> 
> If there is concurrent .get_budget(), one of them will see the counter
> becoming zero finally because each sdev->device_busy is inc/dec
> atomically. Or scsi_dev_queue_ready() return true.
> 
> Anyway, we need this patch to avoid possible regression. If you think
> there are bugs in blk-mq RESTART, just root cause and and fix it.
> 
> > consider as a workaround. Have you considered to fix the root cause and to
> > add blk_mq_sched_mark_restart_hctx() where these are missing, e.g. in 
> > blk_mq_sched_dispatch_requests()? The patch below is not a full solution
> > but resulted in a significant improvement in my tests:
> > 
> > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> > index 69e3226e66ca..9d86876ec503 100644
> > --- a/block/blk-mq-sched.c
> > +++ b/block/blk-mq-sched.c
> > @@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> >  		 * TODO: get more budgets, and dequeue more requests in
> >  		 * one time.
> >  		 */
> > +		blk_mq_sched_mark_restart_hctx(hctx);
> >  		blk_mq_do_dispatch_ctx(hctx);
> >  	} else {
> >  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
> 

BTW, this kind of change can't cover scsi_set_blocked() which is
triggered by timeout, scsi dispatch failure. You will see that
easily if you run the SCSI test script I provided in the commit log.

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-06  1:52     ` Ming Lei
@ 2017-12-06 16:07         ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-06 16:07 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

T24gV2VkLCAyMDE3LTEyLTA2IGF0IDA5OjUyICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gT24g
V2VkLCBEZWMgMDYsIDIwMTcgYXQgMTI6Mjg6MjVBTSArMDgwMCwgTWluZyBMZWkgd3JvdGU6DQo+
ID4gT24gVHVlLCBEZWMgMDUsIDIwMTcgYXQgMDQ6MDg6MjBQTSArMDAwMCwgQmFydCBWYW4gQXNz
Y2hlIHdyb3RlOg0KPiA+ID4gVGhlIHBhdGNoIGJlbG93IGlzIG5vdCBhIGZ1bGwgc29sdXRpb24g
YnV0IHJlc3VsdGVkIGluIGEgc2lnbmlmaWNhbnQNCj4gPiA+IGltcHJvdmVtZW50IGluIG15IHRl
c3RzOg0KPiA+ID4gDQo+ID4gPiBkaWZmIC0tZ2l0IGEvYmxvY2svYmxrLW1xLXNjaGVkLmMgYi9i
bG9jay9ibGstbXEtc2NoZWQuYw0KPiA+ID4gaW5kZXggNjllMzIyNmU2NmNhLi45ZDg2ODc2ZWM1
MDMgMTAwNjQ0DQo+ID4gPiAtLS0gYS9ibG9jay9ibGstbXEtc2NoZWQuYw0KPiA+ID4gKysrIGIv
YmxvY2svYmxrLW1xLXNjaGVkLmMNCj4gPiA+IEBAIC0yMjYsNiArMjI2LDcgQEAgdm9pZCBibGtf
bXFfc2NoZWRfZGlzcGF0Y2hfcmVxdWVzdHMoc3RydWN0IGJsa19tcV9od19jdHggKmhjdHgpDQo+
ID4gPiAgCQkgKiBUT0RPOiBnZXQgbW9yZSBidWRnZXRzLCBhbmQgZGVxdWV1ZSBtb3JlIHJlcXVl
c3RzIGluDQo+ID4gPiAgCQkgKiBvbmUgdGltZS4NCj4gPiA+ICAJCSAqLw0KPiA+ID4gKwkJYmxr
X21xX3NjaGVkX21hcmtfcmVzdGFydF9oY3R4KGhjdHgpOw0KPiA+ID4gIAkJYmxrX21xX2RvX2Rp
c3BhdGNoX2N0eChoY3R4KTsNCj4gPiA+ICAJfSBlbHNlIHsNCj4gPiA+ICAJCWJsa19tcV9mbHVz
aF9idXN5X2N0eHMoaGN0eCwgJnJxX2xpc3QpOw0KPiANCj4gQlRXLCB0aGlzIGtpbmQgb2YgY2hh
bmdlIGNhbid0IGNvdmVyIHNjc2lfc2V0X2Jsb2NrZWQoKSB3aGljaCBpcw0KPiB0cmlnZ2VyZWQg
YnkgdGltZW91dCwgc2NzaSBkaXNwYXRjaCBmYWlsdXJlLiBZb3Ugd2lsbCBzZWUgdGhhdA0KPiBl
YXNpbHkgaWYgeW91IHJ1biB0aGUgU0NTSSB0ZXN0IHNjcmlwdCBJIHByb3ZpZGVkIGluIHRoZSBj
b21taXQgbG9nLg0KDQpIZWxsbyBNaW5nLA0KDQpJIGFtIGF3YXJlIHRoYXQgdGhlIGFib3ZlIGNo
YW5nZSBkb2VzIG5vdCBjb3ZlciBhbGwgY2FzZXMuIFRoYXQncyB3aHkgSSB3cm90ZQ0KaW4gbXkg
cHJldmlvdXMgZS1tYWlsIHRoYXQgdGhhdCBwYXRjaCBpcyBub3QgYSBmdWxsIHNvbHV0aW9uLiBU
aGUgcmVhc29uIEkNCnBvc3RlZCB0aGF0IGNoYW5nZSBhbnl3YXkgaXMgYmVjYXVzZSBJIHByZWZl
ciBhIHNvbHV0aW9uIHRoYXQgaXMgbm90IGJhc2VkIG9uDQpkZWxheWVkIHF1ZXVlIHJ1bnMgb3Zl
ciBhIHNvbHV0aW9uIHRoYXQgaXMgYmFzZWQgb24gZGVsYXllZCBxdWV1ZSBydW5zDQooYmxrX21x
X2RlbGF5X3J1bl9od19xdWV1ZSgpKS4gTXkgY29uY2VybiBpcyB0aGF0IHBlcmZvcm1hbmNlIG9m
IGEgc29sdXRpb24NCmJhc2VkIG9uIGRlbGF5ZWQgcXVldWUgcnVucyB3aWxsIGJlIHN1Ym9wdGlt
YWwuDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-06 16:07         ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-06 16:07 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Wed, 2017-12-06 at 09:52 +0800, Ming Lei wrote:
> On Wed, Dec 06, 2017 at 12:28:25AM +0800, Ming Lei wrote:
> > On Tue, Dec 05, 2017 at 04:08:20PM +0000, Bart Van Assche wrote:
> > > The patch below is not a full solution but resulted in a significant
> > > improvement in my tests:
> > > 
> > > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> > > index 69e3226e66ca..9d86876ec503 100644
> > > --- a/block/blk-mq-sched.c
> > > +++ b/block/blk-mq-sched.c
> > > @@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> > >  		 * TODO: get more budgets, and dequeue more requests in
> > >  		 * one time.
> > >  		 */
> > > +		blk_mq_sched_mark_restart_hctx(hctx);
> > >  		blk_mq_do_dispatch_ctx(hctx);
> > >  	} else {
> > >  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
> 
> BTW, this kind of change can't cover scsi_set_blocked() which is
> triggered by timeout, scsi dispatch failure. You will see that
> easily if you run the SCSI test script I provided in the commit log.

Hello Ming,

I am aware that the above change does not cover all cases. That's why I wrote
in my previous e-mail that that patch is not a full solution. The reason I
posted that change anyway is because I prefer a solution that is not based on
delayed queue runs over a solution that is based on delayed queue runs
(blk_mq_delay_run_hw_queue()). My concern is that performance of a solution
based on delayed queue runs will be suboptimal.

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05  7:52 [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle Ming Lei
  2017-12-05 14:29 ` Johannes Thumshirn
  2017-12-05 16:08   ` Bart Van Assche
@ 2017-12-06 23:10 ` Holger Hoffstätte
  2017-12-07  1:40     ` Ming Lei
  2 siblings, 1 reply; 24+ messages in thread
From: Holger Hoffstätte @ 2017-12-06 23:10 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Christoph Hellwig, linux-scsi,
	Martin K . Petersen, James E . J . Bottomley
  Cc: Bart Van Assche, linux-kernel, Hannes Reinecke

On 12/05/17 08:52, Ming Lei wrote:
> Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
> for blk-mq"), we run queue after 3ms if queue is idle and SCSI device
> queue isn't ready, which is done in handling BLK_STS_RESOURCE. After
> commit 0df21c86bdbf is introduced, queue won't be run any more under
> this situation.
> 
> IO hang is observed when timeout happened, and this patch fixes the IO
> hang issue by running queue after delay in scsi_dev_queue_ready, just like
> non-mq. This issue can be triggered by the following script[1].
> 
> There is another issue which can be covered by running idle queue:
> when .get_budget() is called on request coming from hctx->dispatch_list,
> if one request just completes during .get_budget(), we can't depend on
> SCSI's restart to make progress any more. This patch fixes the race too.
> 
> With this patch, we basically recover to previous behaviour(before commit
> 0df21c86bdbf) of handling idle queue when running out of resource.
> 
> [1] script for test/verify SCSI timeout
> rmmod scsi_debug
> modprobe scsi_debug max_queue=1
> 
> DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
> DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
> 
> echo "using scsi device $DEVICE"
> echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> echo "temporary write through" >$DISK_DIR/cache_type
> echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> echo none > /sys/block/$DEVICE/queue/scheduler
> dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> sleep 5
> echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> wait
> echo "SUCCESS"
> 
> Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/scsi_lib.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index db9556662e27..1816dd8259b3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
>  out_put_device:
>  	put_device(&sdev->sdev_gendev);
>  out:
> +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
>  	return false;
>  }

So just to follow up on this: with this patch I haven't encountered any
new hangs with blk-mq, regardless of medium (SSD/rotating disk) or scheduler.
I cannot speak for other hangs that may be reproducible by other means,
but for now here's my:

Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>

cheers,
Holger

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-06 16:07         ` Bart Van Assche
  (?)
@ 2017-12-07  1:31         ` Ming Lei
  2017-12-07 21:11             ` Bart Van Assche
  -1 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2017-12-07  1:31 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Wed, Dec 06, 2017 at 04:07:17PM +0000, Bart Van Assche wrote:
> On Wed, 2017-12-06 at 09:52 +0800, Ming Lei wrote:
> > On Wed, Dec 06, 2017 at 12:28:25AM +0800, Ming Lei wrote:
> > > On Tue, Dec 05, 2017 at 04:08:20PM +0000, Bart Van Assche wrote:
> > > > The patch below is not a full solution but resulted in a significant
> > > > improvement in my tests:
> > > > 
> > > > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> > > > index 69e3226e66ca..9d86876ec503 100644
> > > > --- a/block/blk-mq-sched.c
> > > > +++ b/block/blk-mq-sched.c
> > > > @@ -226,6 +226,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> > > >  		 * TODO: get more budgets, and dequeue more requests in
> > > >  		 * one time.
> > > >  		 */
> > > > +		blk_mq_sched_mark_restart_hctx(hctx);
> > > >  		blk_mq_do_dispatch_ctx(hctx);
> > > >  	} else {
> > > >  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
> > 
> > BTW, this kind of change can't cover scsi_set_blocked() which is
> > triggered by timeout, scsi dispatch failure. You will see that
> > easily if you run the SCSI test script I provided in the commit log.
> 
> Hello Ming,
> 
> I am aware that the above change does not cover all cases. That's why I wrote
> in my previous e-mail that that patch is not a full solution. The reason I
> posted that change anyway is because I prefer a solution that is not based on
> delayed queue runs over a solution that is based on delayed queue runs
> (blk_mq_delay_run_hw_queue()). My concern is that performance of a solution
> based on delayed queue runs will be suboptimal.

Hi,

The patch I posted won't cause any performance regression because it is
only triggered when queue is becoming idle, also that is exact the way for us
to deal with these cases before.

But if you always call blk_mq_sched_mark_restart_hctx() before a new
dispatch, that may affect performance on NVMe which may never trigger
BLK_STS_RESOURCE.

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-06 23:10 ` Holger Hoffstätte
@ 2017-12-07  1:40     ` Ming Lei
  0 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-07  1:40 UTC (permalink / raw)
  To: Holger Hoffstätte, Jens Axboe, Martin K. Petersen
  Cc: linux-block, Christoph Hellwig, linux-scsi,
	James E . J . Bottomley, Bart Van Assche, linux-kernel,
	Hannes Reinecke

On Thu, Dec 07, 2017 at 12:10:51AM +0100, Holger Hoffstï¿½tte wrote:
> On 12/05/17 08:52, Ming Lei wrote:
> > Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
> > for blk-mq"), we run queue after 3ms if queue is idle and SCSI device
> > queue isn't ready, which is done in handling BLK_STS_RESOURCE. After
> > commit 0df21c86bdbf is introduced, queue won't be run any more under
> > this situation.
> > 
> > IO hang is observed when timeout happened, and this patch fixes the IO
> > hang issue by running queue after delay in scsi_dev_queue_ready, just like
> > non-mq. This issue can be triggered by the following script[1].
> > 
> > There is another issue which can be covered by running idle queue:
> > when .get_budget() is called on request coming from hctx->dispatch_list,
> > if one request just completes during .get_budget(), we can't depend on
> > SCSI's restart to make progress any more. This patch fixes the race too.
> > 
> > With this patch, we basically recover to previous behaviour(before commit
> > 0df21c86bdbf) of handling idle queue when running out of resource.
> > 
> > [1] script for test/verify SCSI timeout
> > rmmod scsi_debug
> > modprobe scsi_debug max_queue=1
> > 
> > DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
> > DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
> > 
> > echo "using scsi device $DEVICE"
> > echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> > echo "temporary write through" >$DISK_DIR/cache_type
> > echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> > echo none > /sys/block/$DEVICE/queue/scheduler
> > dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> > sleep 5
> > echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> > wait
> > echo "SUCCESS"
> > 
> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/scsi/scsi_lib.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index db9556662e27..1816dd8259b3 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
> >  out_put_device:
> >  	put_device(&sdev->sdev_gendev);
> >  out:
> > +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> >  	return false;
> >  }
> 
> So just to follow up on this: with this patch I haven't encountered any
> new hangs with blk-mq, regardless of medium (SSD/rotating disk) or scheduler.
> I cannot speak for other hangs that may be reproducible by other means,
> but for now here's my:
> 
> Tested-by: Holger Hoffstï¿½tte <holger@applied-asynchrony.com>

Hi Holger,

That is great to see this patch fixes your issue, and thanks for your
test!

Jens, Martin, would any of you mind making this patch in V4.15? Since
it fixes real use cases and this way is exact what we do before
0df21c86bdbf("scsi: implement .get_budget and .put_budget for blk-mq").


Thanks,
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-07  1:40     ` Ming Lei
  0 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-07  1:40 UTC (permalink / raw)
  To: Holger Hoffstätte, Jens Axboe, Martin K. Petersen
  Cc: linux-block, Christoph Hellwig, linux-scsi,
	James E . J . Bottomley, Bart Van Assche, linux-kernel,
	Hannes Reinecke

On Thu, Dec 07, 2017 at 12:10:51AM +0100, Holger Hoffstätte wrote:
> On 12/05/17 08:52, Ming Lei wrote:
> > Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
> > for blk-mq"), we run queue after 3ms if queue is idle and SCSI device
> > queue isn't ready, which is done in handling BLK_STS_RESOURCE. After
> > commit 0df21c86bdbf is introduced, queue won't be run any more under
> > this situation.
> > 
> > IO hang is observed when timeout happened, and this patch fixes the IO
> > hang issue by running queue after delay in scsi_dev_queue_ready, just like
> > non-mq. This issue can be triggered by the following script[1].
> > 
> > There is another issue which can be covered by running idle queue:
> > when .get_budget() is called on request coming from hctx->dispatch_list,
> > if one request just completes during .get_budget(), we can't depend on
> > SCSI's restart to make progress any more. This patch fixes the race too.
> > 
> > With this patch, we basically recover to previous behaviour(before commit
> > 0df21c86bdbf) of handling idle queue when running out of resource.
> > 
> > [1] script for test/verify SCSI timeout
> > rmmod scsi_debug
> > modprobe scsi_debug max_queue=1
> > 
> > DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
> > DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
> > 
> > echo "using scsi device $DEVICE"
> > echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> > echo "temporary write through" >$DISK_DIR/cache_type
> > echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> > echo none > /sys/block/$DEVICE/queue/scheduler
> > dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> > sleep 5
> > echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> > wait
> > echo "SUCCESS"
> > 
> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/scsi/scsi_lib.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index db9556662e27..1816dd8259b3 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
> >  out_put_device:
> >  	put_device(&sdev->sdev_gendev);
> >  out:
> > +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> >  	return false;
> >  }
> 
> So just to follow up on this: with this patch I haven't encountered any
> new hangs with blk-mq, regardless of medium (SSD/rotating disk) or scheduler.
> I cannot speak for other hangs that may be reproducible by other means,
> but for now here's my:
> 
> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>

Hi Holger,

That is great to see this patch fixes your issue, and thanks for your
test!

Jens, Martin, would any of you mind making this patch in V4.15? Since
it fixes real use cases and this way is exact what we do before
0df21c86bdbf("scsi: implement .get_budget and .put_budget for blk-mq").


Thanks,
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-05 16:28   ` Ming Lei
@ 2017-12-07 21:06       ` Bart Van Assche
  2017-12-06  1:52     ` Ming Lei
  2017-12-07 21:06       ` Bart Van Assche
  2 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-07 21:06 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

T24gV2VkLCAyMDE3LTEyLTA2IGF0IDAwOjI4ICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gT24g
VHVlLCBEZWMgMDUsIDIwMTcgYXQgMDQ6MDg6MjBQTSArMDAwMCwgQmFydCBWYW4gQXNzY2hlIHdy
b3RlOg0KPiA+IE9uIFR1ZSwgMjAxNy0xMi0wNSBhdCAxNTo1MiArMDgwMCwgTWluZyBMZWkgd3Jv
dGU6DQo+ID4gPiBkaWZmIC0tZ2l0IGEvZHJpdmVycy9zY3NpL3Njc2lfbGliLmMgYi9kcml2ZXJz
L3Njc2kvc2NzaV9saWIuYw0KPiA+ID4gaW5kZXggZGI5NTU2NjYyZTI3Li4xODE2ZGQ4MjU5YjMg
MTAwNjQ0DQo+ID4gPiAtLS0gYS9kcml2ZXJzL3Njc2kvc2NzaV9saWIuYw0KPiA+ID4gKysrIGIv
ZHJpdmVycy9zY3NpL3Njc2lfbGliLmMNCj4gPiA+IEBAIC0xOTY3LDYgKzE5NjcsOCBAQCBzdGF0
aWMgYm9vbCBzY3NpX21xX2dldF9idWRnZXQoc3RydWN0IGJsa19tcV9od19jdHggKmhjdHgpDQo+
ID4gPiAgb3V0X3B1dF9kZXZpY2U6DQo+ID4gPiAgCXB1dF9kZXZpY2UoJnNkZXYtPnNkZXZfZ2Vu
ZGV2KTsNCj4gPiA+ICBvdXQ6DQo+ID4gPiArCWlmIChhdG9taWNfcmVhZCgmc2Rldi0+ZGV2aWNl
X2J1c3kpID09IDAgJiYgIXNjc2lfZGV2aWNlX2Jsb2NrZWQoc2RldikpDQo+ID4gPiArCQlibGtf
bXFfZGVsYXlfcnVuX2h3X3F1ZXVlKGhjdHgsIFNDU0lfUVVFVUVfREVMQVkpOw0KPiA+ID4gIAly
ZXR1cm4gZmFsc2U7DQo+ID4gPiAgfQ0KPiA+IA0KPiA+IFRoaXMgY2Fubm90IHdvcmsgc2luY2Ug
bXVsdGlwbGUgdGhyZWFkcyBjYW4gY2FsbCBzY3NpX21xX2dldF9idWRnZXQoKQ0KPiANCj4gVGhh
dCBpcyBleGFjdGx5IHRoZSB3YXkgd2UgYXJlIGhhbmRsaW5nIHRoZXNlIGNhc2VzIGJlZm9yZSAw
ZGYyMWM4NmJkYmYoc2NzaToNCj4gaW1wbGVtZW50IC5nZXRfYnVkZ2V0IGFuZCAucHV0X2J1ZGdl
dCBmb3IgYmxrLW1xKSwgc28gaWYgaXQgY2FuJ3Qgd29yaywNCj4gdGhhdCBpcyBub3QgZmF1bHQg
b2YgY29tbWl0IDBkZjIxYzg2YmRiZi4NCj4gDQo+ID4gY29uY3VycmVudGx5IGFuZCBoZW5jZSBp
dCBjYW4gaGFwcGVuIHRoYXQgbm9uZSBvZiB0aGVtIHNlZXMNCj4gPiBhdG9taWNfcmVhZCgmc2Rl
di0+ZGV2aWNlX2J1c3kpID09IDAuIEJUVywgdGhlIGFib3ZlIHBhdGNoIGlzIHNvbWV0aGluZyBJ
DQo+IA0KPiBJZiB0aGVyZSBpcyBjb25jdXJyZW50IC5nZXRfYnVkZ2V0KCksIG9uZSBvZiB0aGVt
IHdpbGwgc2VlIHRoZSBjb3VudGVyDQo+IGJlY29taW5nIHplcm8gZmluYWxseSBiZWNhdXNlIGVh
Y2ggc2Rldi0+ZGV2aWNlX2J1c3kgaXMgaW5jL2RlYw0KPiBhdG9taWNhbGx5LiBPciBzY3NpX2Rl
dl9xdWV1ZV9yZWFkeSgpIHJldHVybiB0cnVlLg0KPiANCj4gQW55d2F5LCB3ZSBuZWVkIHRoaXMg
cGF0Y2ggdG8gYXZvaWQgcG9zc2libGUgcmVncmVzc2lvbi4gSWYgeW91IHRoaW5rDQo+IHRoZXJl
IGFyZSBidWdzIGluIGJsay1tcSBSRVNUQVJULCBqdXN0IHJvb3QgY2F1c2UgYW5kIGFuZCBmaXgg
aXQuDQoNCkhlbGxvIE1pbmcsDQoNCldoZW4gSSBsb29rZWQgYXQgdGhlIHBhdGNoIGF0IHRoZSBz
dGFydCBvZiB0aGlzIHRocmVhZCBmb3IgdGhlIGZpcnN0IHRpbWUgSQ0KZ290IGZydXN0cmF0ZWQg
YmVjYXVzZSBJIGRpZG4ndCBzZWUgaG93IHRoaXMgcGF0Y2ggY291bGQgZml4IHRoZSBxdWV1ZSBz
dGFsbA0KSSByYW4gaW50byBteXNlbGYuIFRvZGF5IEkgc3RhcnRlZCByZWFsaXppbmcgdGhhdCB3
aGF0IEhvbGdlciByZXBvcnRlZCBpcw0KcHJvYmFibHkgYW5vdGhlciBpc3N1ZSB0aGFuIHdoYXQg
SSByYW4gaW50byBteXNlbGYuIFNpbmNlIHRoaXMgcGF0Y2ggYnkNCml0c2VsZiBsb29rcyBub3cg
dXNlZnVsIHRvIG1lOg0KDQpSZXZpZXdlZC1ieTogQmFydCBWYW4gQXNzY2hlIDxiYXJ0LnZhbmFz
c2NoZUB3ZGMuY29tPg0KDQpCVFcsIGRvIHlvdSB0aGluayB0aGUgcGF0Y2ggYXQgdGhlIHN0YXJ0
IG9mIHRoaXMgdGhyZWFkIGFsc28gZml4ZXMgdGhlIGlzc3VlDQp0aGF0IHJlc3VsdGVkIGluIGNv
bW1pdCA4MjZhNzBhMDhiMTIgKCJTQ1NJOiBkb24ndCBnZXQgdGFyZ2V0L2hvc3QgYnVzeV9jb3Vu
dA0KaW4gc2NzaV9tcV9nZXRfYnVkZ2V0KCkiKT8gRG8geW91IHRoaW5rIHdlIHN0aWxsIG5lZWQg
dGhhdCBwYXRjaD8NCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-07 21:06       ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-07 21:06 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Wed, 2017-12-06 at 00:28 +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:08:20PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index db9556662e27..1816dd8259b3 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
> > >  out_put_device:
> > >  	put_device(&sdev->sdev_gendev);
> > >  out:
> > > +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > > +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> > >  	return false;
> > >  }
> > 
> > This cannot work since multiple threads can call scsi_mq_get_budget()
> 
> That is exactly the way we are handling these cases before 0df21c86bdbf(scsi:
> implement .get_budget and .put_budget for blk-mq), so if it can't work,
> that is not fault of commit 0df21c86bdbf.
> 
> > concurrently and hence it can happen that none of them sees
> > atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I
> 
> If there is concurrent .get_budget(), one of them will see the counter
> becoming zero finally because each sdev->device_busy is inc/dec
> atomically. Or scsi_dev_queue_ready() return true.
> 
> Anyway, we need this patch to avoid possible regression. If you think
> there are bugs in blk-mq RESTART, just root cause and and fix it.

Hello Ming,

When I looked at the patch at the start of this thread for the first time I
got frustrated because I didn't see how this patch could fix the queue stall
I ran into myself. Today I started realizing that what Holger reported is
probably another issue than what I ran into myself. Since this patch by
itself looks now useful to me:

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>

BTW, do you think the patch at the start of this thread also fixes the issue
that resulted in commit 826a70a08b12 ("SCSI: don't get target/host busy_count
in scsi_mq_get_budget()")? Do you think we still need that patch?

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-07  1:31         ` Ming Lei
@ 2017-12-07 21:11             ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-07 21:11 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

T24gVGh1LCAyMDE3LTEyLTA3IGF0IDA5OjMxICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gQnV0
IGlmIHlvdSBhbHdheXMgY2FsbCBibGtfbXFfc2NoZWRfbWFya19yZXN0YXJ0X2hjdHgoKSBiZWZv
cmUgYSBuZXcNCj4gZGlzcGF0Y2gsIHRoYXQgbWF5IGFmZmVjdCBwZXJmb3JtYW5jZSBvbiBOVk1l
IHdoaWNoIG1heSBuZXZlciB0cmlnZ2VyDQo+IEJMS19TVFNfUkVTT1VSQ0UuDQoNCkhtbSAuLi4g
b25seSB0aGUgU0NTSSBjb3JlIGltcGxlbWVudHMgLmdldF9idWRnZXQoKSBhbmQgLnB1dF9idWRn
ZXQoKSBhbmQNCkkgcHJvcG9zZWQgdG8gaW5zZXJ0IGEgYmxrX21xX3NjaGVkX21hcmtfcmVzdGFy
dF9oY3R4KCkgY2FsbCB1bmRlciAiaWYNCihxLT5tcV9vcHMtPmdldF9idWRnZXQpIi4gSW4gb3Ro
ZXIgd29yZHMsIEkgcHJvcG9zZWQgdG8gaW5zZXJ0IGENCmJsa19tcV9zY2hlZF9tYXJrX3Jlc3Rh
cnRfaGN0eCgpIGNhbGwgaW4gYSBjb2RlIHBhdGggdGhhdCBpcyBuZXZlciB0cmlnZ2VyZWQNCmJ5
IHRoZSBOVk1lIGRyaXZlci4gU28gSSBkb24ndCBzZWUgaG93IHRoZSBjaGFuZ2UgSSBwcm9wb3Nl
ZCBjb3VsZCBhZmZlY3QNCnRoZSBwZXJmb3JtYW5jZSBvZiB0aGUgTlZNZSBkcml2ZXI/DQoNCkJh
cnQu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
@ 2017-12-07 21:11             ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2017-12-07 21:11 UTC (permalink / raw)
  To: ming.lei
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Thu, 2017-12-07 at 09:31 +0800, Ming Lei wrote:
> But if you always call blk_mq_sched_mark_restart_hctx() before a new
> dispatch, that may affect performance on NVMe which may never trigger
> BLK_STS_RESOURCE.

Hmm ... only the SCSI core implements .get_budget() and .put_budget() and
I proposed to insert a blk_mq_sched_mark_restart_hctx() call under "if
(q->mq_ops->get_budget)". In other words, I proposed to insert a
blk_mq_sched_mark_restart_hctx() call in a code path that is never triggered
by the NVMe driver. So I don't see how the change I proposed could affect
the performance of the NVMe driver?

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-07 21:11             ` Bart Van Assche
  (?)
@ 2017-12-08  0:36             ` Ming Lei
  -1 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-08  0:36 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Thu, Dec 07, 2017 at 09:11:54PM +0000, Bart Van Assche wrote:
> On Thu, 2017-12-07 at 09:31 +0800, Ming Lei wrote:
> > But if you always call blk_mq_sched_mark_restart_hctx() before a new
> > dispatch, that may affect performance on NVMe which may never trigger
> > BLK_STS_RESOURCE.
> 
> Hmm ... only the SCSI core implements .get_budget() and .put_budget() and
> I proposed to insert a blk_mq_sched_mark_restart_hctx() call under "if
> (q->mq_ops->get_budget)". In other words, I proposed to insert a
> blk_mq_sched_mark_restart_hctx() call in a code path that is never triggered
> by the NVMe driver. So I don't see how the change I proposed could affect
> the performance of the NVMe driver?

You only add the check on none scheduler, right?

But this race isn't related with scheduler, that means it can't fix the
race with other schedulers.

I have test case to trigger this issue on both none and mq-deadline, and
my patch fixes them all.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-07 21:06       ` Bart Van Assche
  (?)
@ 2017-12-08  0:50       ` Ming Lei
  -1 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2017-12-08  0:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-kernel, linux-block, hch, martin.petersen, linux-scsi,
	axboe, hare, holger, jejb

On Thu, Dec 07, 2017 at 09:06:58PM +0000, Bart Van Assche wrote:
> On Wed, 2017-12-06 at 00:28 +0800, Ming Lei wrote:
> > On Tue, Dec 05, 2017 at 04:08:20PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-12-05 at 15:52 +0800, Ming Lei wrote:
> > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > > index db9556662e27..1816dd8259b3 100644
> > > > --- a/drivers/scsi/scsi_lib.c
> > > > +++ b/drivers/scsi/scsi_lib.c
> > > > @@ -1967,6 +1967,8 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
> > > >  out_put_device:
> > > >  	put_device(&sdev->sdev_gendev);
> > > >  out:
> > > > +	if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))
> > > > +		blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
> > > >  	return false;
> > > >  }
> > > 
> > > This cannot work since multiple threads can call scsi_mq_get_budget()
> > 
> > That is exactly the way we are handling these cases before 0df21c86bdbf(scsi:
> > implement .get_budget and .put_budget for blk-mq), so if it can't work,
> > that is not fault of commit 0df21c86bdbf.
> > 
> > > concurrently and hence it can happen that none of them sees
> > > atomic_read(&sdev->device_busy) == 0. BTW, the above patch is something I
> > 
> > If there is concurrent .get_budget(), one of them will see the counter
> > becoming zero finally because each sdev->device_busy is inc/dec
> > atomically. Or scsi_dev_queue_ready() return true.
> > 
> > Anyway, we need this patch to avoid possible regression. If you think
> > there are bugs in blk-mq RESTART, just root cause and and fix it.
> 
> Hello Ming,
> 
> When I looked at the patch at the start of this thread for the first time I
> got frustrated because I didn't see how this patch could fix the queue stall
> I ran into myself. Today I started realizing that what Holger reported is
> probably another issue than what I ran into myself. Since this patch by
> itself looks now useful to me:
> 
> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>

Hi Bart,

Thanks for your Review!

> 
> BTW, do you think the patch at the start of this thread also fixes the issue
> that resulted in commit 826a70a08b12 ("SCSI: don't get target/host busy_count
> in scsi_mq_get_budget()")? Do you think we still need that patch?

Yeah, once the patch in this thread is in, IO hang shouldn't happen any more
even without 826a70a08b12.

But that commit is still the correct thing to do, since we let blk-mq's sbitmap
queue respect per-host queue depth, which way is much efficient than the
simple atomic operations used in scsi_host_queue_ready(). So I think that commit
is still useful.

When I was figuring out patch of 826a70a08b12, I just ignore the .get_budget() for
requests from hctx->dispatch_list, because we don't need to deal with queue idle
when .get_budget() is called from both blk_mq_do_dispatch_sched() and blk_mq_do_dispatch_ctx().

Thanks,
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle
  2017-12-07  1:40     ` Ming Lei
  (?)
@ 2017-12-08  0:54     ` Martin K. Petersen
  -1 siblings, 0 replies; 24+ messages in thread
From: Martin K. Petersen @ 2017-12-08  0:54 UTC (permalink / raw)
  To: Ming Lei
  Cc: Holger Hoffstätte, Jens Axboe, Martin K. Petersen,
	linux-block, Christoph Hellwig, linux-scsi,
	James E . J . Bottomley, Bart Van Assche, linux-kernel,
	Hannes Reinecke


Ming,

> Jens, Martin, would any of you mind making this patch in V4.15? Since
> it fixes real use cases and this way is exact what we do before
> 0df21c86bdbf("scsi: implement .get_budget and .put_budget for blk-mq").

Applied to 4.15/scsi-fixes, thank you!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-12-08  0:56 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-05  7:52 [PATCH] SCSI: run queue if SCSI device queue isn't ready and queue is idle Ming Lei
2017-12-05 14:29 ` Johannes Thumshirn
2017-12-05 16:16   ` Bart Van Assche
2017-12-05 16:16     ` Bart Van Assche
2017-12-05 16:08 ` Bart Van Assche
2017-12-05 16:08   ` Bart Van Assche
2017-12-05 16:28   ` Ming Lei
2017-12-05 16:41     ` Bart Van Assche
2017-12-05 16:41       ` Bart Van Assche
2017-12-05 16:45       ` Ming Lei
2017-12-06  1:52     ` Ming Lei
2017-12-06 16:07       ` Bart Van Assche
2017-12-06 16:07         ` Bart Van Assche
2017-12-07  1:31         ` Ming Lei
2017-12-07 21:11           ` Bart Van Assche
2017-12-07 21:11             ` Bart Van Assche
2017-12-08  0:36             ` Ming Lei
2017-12-07 21:06     ` Bart Van Assche
2017-12-07 21:06       ` Bart Van Assche
2017-12-08  0:50       ` Ming Lei
2017-12-06 23:10 ` Holger Hoffstätte
2017-12-07  1:40   ` Ming Lei
2017-12-07  1:40     ` Ming Lei
2017-12-08  0:54     ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.