* [PATCH v2 -next] block/wbt: fix negative inflight counter when remove scsi device
@ 2021-12-14  4:42 Laibin Qiu
From: Laibin Qiu @ 2021-12-14  4:42 UTC
  To: hch, axboe; +Cc: yi.zhang, linux-block, linux-kernel

Currently, wbt is disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq. When a
scsi device is removed, wbt is re-enabled by wbt_enable_default().
If wbt goes from disabled to enabled between wbt_wait() and
wbt_track() while a write request is being submitted, the two calls
see different states for the same request.

The following is the scenario that triggered the problem.

T1                          T2                           T3
                            elevator_switch_mq
                            bfq_init_queue
                            wbt_disable_default <= Set
                            rwb->enable_state (OFF)
submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
                                                         scsi_remove_device
                                                         sd_remove
                                                         del_gendisk
                                                         blk_unregister_queue
                                                         elv_unregister_queue
                                                         wbt_enable_default
                                                         <= Set rwb->enable_state (ON)
rq_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request will be marked WBT_TRACKED without the inflight
counter having been incremented, so wbt_done() drops rqw->inflight to
-1, which triggers an IO hang.

Fix this by moving wbt_enable_default() from elv_unregister_queue() to
elevator_switch_mq(), so that wbt is only re-enabled on a scheduler switch.
Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
---
 block/elevator.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index ec98aed39c4f..de3cf1fa52fa 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -525,8 +525,6 @@ void elv_unregister_queue(struct request_queue *q)
 		kobject_del(&e->kobj);
 
 		e->registered = 0;
-		/* Re-enable throttling in case elevator disabled it */
-		wbt_enable_default(q);
 	}
 }
 
@@ -593,8 +591,11 @@ int elevator_switch_mq(struct request_queue *q,
 	lockdep_assert_held(&q->sysfs_lock);
 
 	if (q->elevator) {
-		if (q->elevator->registered)
+		if (q->elevator->registered) {
 			elv_unregister_queue(q);
+			/* Re-enable throttling in case elevator disabled it */
+			wbt_enable_default(q);
+		}
 
 		ioc_clear_queue(q);
 		blk_mq_sched_free_rqs(q);
-- 
2.22.0



* Re: [PATCH v2 -next] block/wbt: fix negative inflight counter when remove scsi device
From: Ming Lei @ 2021-12-14 12:44 UTC
  To: Laibin Qiu; +Cc: hch, axboe, yi.zhang, linux-block, linux-kernel

On Tue, Dec 14, 2021 at 12:42:59PM +0800, Laibin Qiu wrote:
> Currently, wbt is disabled by setting WBT_STATE_OFF_DEFAULT in
> wbt_disable_default() when the elevator is switched to bfq. When a
> scsi device is removed, wbt is re-enabled by wbt_enable_default().
> If wbt goes from disabled to enabled between wbt_wait() and
> wbt_track() while a write request is being submitted, the two calls
> see different states for the same request.
> 
> The following is the scenario that triggered the problem.
> 
> T1                          T2                           T3
>                             elevator_switch_mq
>                             bfq_init_queue
>                             wbt_disable_default <= Set
>                             rwb->enable_state (OFF)
> submit_bio
> blk_mq_make_request
> rq_qos_throttle
> <= rwb->enable_state (OFF)
>                                                          scsi_remove_device
>                                                          sd_remove
>                                                          del_gendisk
>                                                          blk_unregister_queue
>                                                          elv_unregister_queue
>                                                          wbt_enable_default
>                                                          <= Set rwb->enable_state (ON)
> rq_qos_track
> <= rwb->enable_state (ON)
> ^^^^^^ this request will be marked WBT_TRACKED without the inflight
> counter having been incremented, so wbt_done() drops rqw->inflight to
> -1, which triggers an IO hang.
> 
> Fix this by moving wbt_enable_default() from elv_unregister_queue() to
> elevator_switch_mq(), so that wbt is only re-enabled on a scheduler switch.
> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> ---
>  block/elevator.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/block/elevator.c b/block/elevator.c
> index ec98aed39c4f..de3cf1fa52fa 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -525,8 +525,6 @@ void elv_unregister_queue(struct request_queue *q)
>  		kobject_del(&e->kobj);
>  
>  		e->registered = 0;
> -		/* Re-enable throttling in case elevator disabled it */
> -		wbt_enable_default(q);
>  	}
>  }
>  
> @@ -593,8 +591,11 @@ int elevator_switch_mq(struct request_queue *q,
>  	lockdep_assert_held(&q->sysfs_lock);
>  
>  	if (q->elevator) {
> -		if (q->elevator->registered)
> +		if (q->elevator->registered) {
>  			elv_unregister_queue(q);
> +			/* Re-enable throttling in case elevator disabled it */
> +			wbt_enable_default(q);
> +		}

Please move wbt_enable_default() into bfq_exit_queue() instead. That
should be easier to follow and also fixes the issue, given that only
bfq disables wbt.


Thanks,
Ming

