linux-kernel.vger.kernel.org archive mirror
* [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running
@ 2019-03-20  8:02 zhengbin
  2019-03-20  8:11 ` Ming Lei
  2019-03-20  8:15 ` jianchao.wang
  0 siblings, 2 replies; 5+ messages in thread
From: zhengbin @ 2019-03-20  8:02 UTC (permalink / raw)
  To: axboe, ming.lei, hch, linux-block, linux-kernel
  Cc: houtao1, yanaijie, yuyufen

I tested a SCSI device that uses blk-mq with dd, as follows:
1. echo "blocked" >/sys/block/sda/device/state
2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
3. echo "running" >/sys/block/sda/device/state
dd should finish after step 3, but it still hangs.

After step 2, the key code path is:
blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq
                       -->if ret is BLK_STS_RESOURCE, delay run hw queue

prep_to_mq returns BLK_STS_RESOURCE, and scsi_queue_rq converts it to
BLK_STS_DEV_RESOURCE. In this situation, we should run the hw queue after
a delay as well. This patch does that.

Fixes: 86ff7c2a80cd ("blk-mq: introduce BLK_STS_DEV_RESOURCE")
Signed-off-by: zhengbin <zhengbin13@huawei.com>
---
 block/blk-mq.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a9c1816..556d606 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1309,15 +1309,17 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 		 *   returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
 		 *   and dm-rq.
 		 *
-		 * If driver returns BLK_STS_RESOURCE and SCHED_RESTART
-		 * bit is set, run queue after a delay to avoid IO stalls
-		 * that could otherwise occur if the queue is idle.
+		 * If driver returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE
+		 * and SCHED_RESTART bit is set, run queue after a delay to
+		 * avoid IO stalls that could otherwise occur if the queue
+		 * is idle.
 		 */
 		needs_restart = blk_mq_sched_needs_restart(hctx);
 		if (!needs_restart ||
 		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
 			blk_mq_run_hw_queue(hctx, true);
-		else if (needs_restart && (ret == BLK_STS_RESOURCE))
+		else if (needs_restart && ((ret == BLK_STS_RESOURCE) ||
+					   (ret == BLK_STS_DEV_RESOURCE)))
 			blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);

 		blk_mq_update_dispatch_busy(hctx, true);
--
2.7.4


* Re: [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running
  2019-03-20  8:02 [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running zhengbin
@ 2019-03-20  8:11 ` Ming Lei
  2019-03-20  8:52   ` zhengbin (A)
  2019-03-20  8:15 ` jianchao.wang
  1 sibling, 1 reply; 5+ messages in thread
From: Ming Lei @ 2019-03-20  8:11 UTC (permalink / raw)
  To: zhengbin
  Cc: axboe, hch, linux-block, linux-kernel, houtao1, yanaijie, yuyufen

On Wed, Mar 20, 2019 at 04:02:01PM +0800, zhengbin wrote:
> I tested a SCSI device that uses blk-mq with dd, as follows:
> 1. echo "blocked" >/sys/block/sda/device/state
> 2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
> 3. echo "running" >/sys/block/sda/device/state
> dd should finish after step 3, but it still hangs.
> 
> After step 2, the key code path is:
> blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq
>                        -->if ret is BLK_STS_RESOURCE, delay run hw queue
> 
> prep_to_mq returns BLK_STS_RESOURCE, and scsi_queue_rq converts it to
> BLK_STS_DEV_RESOURCE. In this situation, we should run the hw queue after

BLK_STS_DEV_RESOURCE means that the driver will rerun hw queue, so
maybe you need to investigate why it is returned from scsi driver first.

BTW, I'd suggest you read the big comment on BLK_STS_DEV_RESOURCE first.

Thanks,
Ming
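
The distinction between the two status codes can be illustrated with a small userspace model of the rerun decision shown in the patch context. This is a sketch, not kernel code: the `toy_` names, the enum values, and the `wait_entry_empty` parameter (standing in for `list_empty_careful(&hctx->dispatch_wait.entry)`) are assumptions made for illustration. It models the unpatched branch structure, where only BLK_STS_RESOURCE triggers a delayed rerun.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel status codes; the values are arbitrary. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE, TOY_STS_DEV_RESOURCE };

/* What the dispatch path does with the hw queue after a busy return. */
enum toy_action { TOY_NO_RERUN, TOY_RUN_NOW, TOY_RUN_DELAYED };

/*
 * Model of the unpatched branch in blk_mq_dispatch_rq_list():
 *  - no restart pending, or out of tags with nobody waiting: run now
 *  - restart pending and the driver returned RESOURCE: run after a delay
 *  - otherwise (e.g. DEV_RESOURCE): do nothing here, because the driver
 *    is expected to rerun the queue itself
 */
static enum toy_action toy_rerun_decision(enum toy_status ret,
                                          bool needs_restart,
                                          bool no_tag,
                                          bool wait_entry_empty)
{
    if (!needs_restart || (no_tag && wait_entry_empty))
        return TOY_RUN_NOW;
    if (ret == TOY_STS_RESOURCE)
        return TOY_RUN_DELAYED;
    return TOY_NO_RERUN;
}
```

In this model, a DEV_RESOURCE return with a restart pending leaves the queue alone, so forward progress depends entirely on the driver side rerunning the queue later.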

* Re: [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running
  2019-03-20  8:02 [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running zhengbin
  2019-03-20  8:11 ` Ming Lei
@ 2019-03-20  8:15 ` jianchao.wang
  1 sibling, 0 replies; 5+ messages in thread
From: jianchao.wang @ 2019-03-20  8:15 UTC (permalink / raw)
  To: zhengbin, axboe, ming.lei, hch, linux-block, linux-kernel
  Cc: houtao1, yanaijie, yuyufen

Hi Bin

On 3/20/19 4:02 PM, zhengbin wrote:
> I tested a SCSI device that uses blk-mq with dd, as follows:
> 1. echo "blocked" >/sys/block/sda/device/state
> 2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
> 3. echo "running" >/sys/block/sda/device/state
> dd should finish after step 3, but it still hangs.

If this test case really matters to you, we should try to run the hw queues after the state
is set to 'running'.

Thanks
Jianchao
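
The suggestion above can be sketched with a toy userspace model. Everything here is hypothetical (the `toy_` types and functions are not kernel APIs): it only contrasts a state change that flips the state with one that also reruns the queue. Whether the real sysfs path behaves like the first variant is an assumption of the model, not a confirmed diagnosis.

```c
#include <stdbool.h>

/* Toy device states; names are illustrative only. */
enum toy_sdev_state { TOY_SDEV_RUNNING, TOY_SDEV_BLOCK };

struct toy_queue {
    enum toy_sdev_state state;
    int pending;    /* requests parked while the device was blocked */
    int completed;  /* requests that were dispatched */
};

/* Dispatch everything that is pending, unless the device is blocked. */
static void toy_run_queue(struct toy_queue *q)
{
    if (q->state == TOY_SDEV_BLOCK)
        return;
    q->completed += q->pending;
    q->pending = 0;
}

/* Submit one request and try to dispatch it immediately. */
static void toy_submit(struct toy_queue *q)
{
    q->pending++;
    toy_run_queue(q);
}

/* State change that only records the new state; nothing reruns the queue. */
static void toy_set_state(struct toy_queue *q, enum toy_sdev_state s)
{
    q->state = s;
}

/* State change that also kicks the queue once the device is running again. */
static void toy_set_state_and_kick(struct toy_queue *q, enum toy_sdev_state s)
{
    q->state = s;
    if (s == TOY_SDEV_RUNNING)
        toy_run_queue(q);
}
```

With `toy_set_state()`, a request submitted while blocked stays pending after the state returns to running, which mirrors the reported hang; with `toy_set_state_and_kick()` it is dispatched as soon as the state changes.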
* Re: [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running
  2019-03-20  8:11 ` Ming Lei
@ 2019-03-20  8:52   ` zhengbin (A)
  2019-03-20  9:29     ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: zhengbin (A) @ 2019-03-20  8:52 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, hch, linux-block, linux-kernel, houtao1, yanaijie,
	jianchao.w.wang

Thanks for your quick reply. I will study BLK_STS_DEV_RESOURCE in detail.

> BLK_STS_DEV_RESOURCE means that the driver will rerun hw queue, so
> maybe you need to investigate why it is returned from scsi driver first.
It is returned because we set the device state to blocked:
scsi_queue_rq-->prep_to_mq (returns BLK_STS_RESOURCE)
	     -->out_put_budget converts BLK_STS_RESOURCE to BLK_STS_DEV_RESOURCE
In this situation, the request is not sent to the driver.


If the device uses sq, then when we set the device state to blocked and run dd, it keeps
calling blk_delay_queue.


Jianchao wrote:
> If this test case really matters to you, we should try to run the hw queues after the state
> is set to 'running'.
Maybe we should call blk_mq_run_hw_queue in scsi_device_set_state?


* Re: [PATCH] blk-mq: fix a hung issue when set device state to blocked and restore running
  2019-03-20  8:52   ` zhengbin (A)
@ 2019-03-20  9:29     ` Ming Lei
  0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2019-03-20  9:29 UTC (permalink / raw)
  To: zhengbin (A)
  Cc: axboe, hch, linux-block, linux-kernel, houtao1, yanaijie,
	jianchao.w.wang

On Wed, Mar 20, 2019 at 04:52:40PM +0800, zhengbin (A) wrote:
> Thanks for your quick reply. I will study BLK_STS_DEV_RESOURCE in detail.
> 
> > BLK_STS_DEV_RESOURCE means that the driver will rerun hw queue, so
> > maybe you need to investigate why it is returned from scsi driver first.
> It is returned because we set the device state to blocked:
> scsi_queue_rq-->prep_to_mq (returns BLK_STS_RESOURCE)
> 	     -->out_put_budget converts BLK_STS_RESOURCE to BLK_STS_DEV_RESOURCE
> In this situation, the request is not sent to the driver.

Then the queue will be run when the scsi_device becomes un-blocked,
see scsi_internal_device_unblock_nowait().


Thanks,
Ming
