* [PATCH v3 -next] block/wbt: fix negative inflight counter when remove scsi device
From: Laibin Qiu @ 2021-12-14 13:31 UTC
To: ming.lei, hch, axboe; +Cc: yi.zhang, linux-block, linux-kernel
wbt is currently disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq. When
a scsi device is removed, wbt is re-enabled by wbt_enable_default().
If that happens between wbt_wait() and wbt_track() while a write
request is being submitted, the request is treated as tracked even
though it was never accounted (a false positive).
The following is the scenario that triggered the problem.
T1                      T2                          T3
                        elevator_switch_mq
                        bfq_init_queue
                        wbt_disable_default <= Set
                        rwb->enable_state (OFF)
Submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
                                                    scsi_remove_device
                                                    sd_remove
                                                    del_gendisk
                                                    blk_unregister_queue
                                                    elv_unregister_queue
                                                    wbt_enable_default
                                                    <= Set rwb->enable_state (ON)
rq_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request is marked WBT_TRACKED without an inflight add, which
drops rqw->inflight to -1 in wbt_done() and triggers an IO hang.
Fix this by moving wbt_enable_default() from elv_unregister_queue() to
bfq_exit_queue(), so that wbt is only re-enabled when bfq exits.
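
The interleaving in the scenario above can be replayed in a single thread
to see where the counter goes wrong. This is only a toy model under stated
assumptions, not kernel code: toy_rwb, toy_request, toy_throttle(),
toy_track(), toy_done() and toy_replay() are hypothetical names that stand
in for the wbt state and the wbt_wait()/wbt_track()/wbt_done() hooks, and
the model assumes each hook simply checks the enable state at the moment it
runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the wbt state; not the kernel's data structures. */
struct toy_rwb {
	bool enabled;	/* stands in for rwb->enable_state */
	int inflight;	/* stands in for rqw->inflight */
};

struct toy_request {
	bool tracked;	/* stands in for the WBT_TRACKED flag */
};

/* Like wbt_wait(): account the request only if wbt is enabled right now. */
static void toy_throttle(struct toy_rwb *rwb)
{
	if (rwb->enabled)
		rwb->inflight++;
}

/* Like wbt_track(): mark the request only if wbt is enabled right now. */
static void toy_track(const struct toy_rwb *rwb, struct toy_request *rq)
{
	if (rwb->enabled)
		rq->tracked = true;
}

/* Like wbt_done(): a tracked request drops the inflight counter. */
static void toy_done(struct toy_rwb *rwb, const struct toy_request *rq)
{
	if (rq->tracked)
		rwb->inflight--;
}

/*
 * Replay the scenario and return the final inflight count.
 * reenable_on_unregister == true models the old elv_unregister_queue(),
 * which re-enabled wbt during device removal. false models the patched
 * code, where bfq is still attached at that point and wbt stays off.
 */
int toy_replay(bool reenable_on_unregister)
{
	struct toy_rwb rwb = { .enabled = true, .inflight = 0 };
	struct toy_request rq = { .tracked = false };

	rwb.enabled = false;		/* switch to bfq: wbt_disable_default() */
	toy_throttle(&rwb);		/* submit path: rq_qos_throttle, wbt off */
	assert(rwb.inflight == 0);	/* nothing was accounted for this request */

	if (reenable_on_unregister)
		rwb.enabled = true;	/* device removal: wbt_enable_default() */

	toy_track(&rwb, &rq);		/* submit path: rq_qos_track */
	toy_done(&rwb, &rq);		/* completion: wbt_done() */

	return rwb.inflight;
}
```

In this model toy_replay(true) returns -1, matching the negative inflight
counter described above, and toy_replay(false) returns 0 because the
request is never marked as tracked.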
Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
---
block/bfq-iosched.c | 4 ++++
block/elevator.c | 2 --
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0c612a911696..8b7524450835 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6996,6 +6996,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
 {
 	struct bfq_data *bfqd = e->elevator_data;
 	struct bfq_queue *bfqq, *n;
+	struct request_queue *q = bfqd->queue;
 
 	hrtimer_cancel(&bfqd->idle_slice_timer);
 
@@ -7019,6 +7020,9 @@ static void bfq_exit_queue(struct elevator_queue *e)
 #endif
 
 	kfree(bfqd);
+
+	/* Re-enable throttling in case elevator disabled it */
+	wbt_enable_default(q);
 }
 
 static void bfq_init_root_group(struct bfq_group *root_group,
diff --git a/block/elevator.c b/block/elevator.c
index ec98aed39c4f..482df2a350fc 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -525,8 +525,6 @@ void elv_unregister_queue(struct request_queue *q)
 		kobject_del(&e->kobj);
 
 		e->registered = 0;
-		/* Re-enable throttling in case elevator disabled it */
-		wbt_enable_default(q);
 	}
 }
 
--
2.22.0
* Re: [PATCH v3 -next] block/wbt: fix negative inflight counter when remove scsi device
From: Christoph Hellwig @ 2021-12-14 15:04 UTC
To: Laibin Qiu; +Cc: ming.lei, hch, axboe, yi.zhang, linux-block, linux-kernel
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH v3 -next] block/wbt: fix negative inflight counter when remove scsi device
From: Ming Lei @ 2021-12-15 2:42 UTC
To: Laibin Qiu; +Cc: hch, axboe, yi.zhang, linux-block, linux-kernel
On Tue, Dec 14, 2021 at 09:31:03PM +0800, Laibin Qiu wrote:
> wbt is currently disabled by setting WBT_STATE_OFF_DEFAULT in
> wbt_disable_default() when the elevator is switched to bfq. When
> a scsi device is removed, wbt is re-enabled by wbt_enable_default().
> If that happens between wbt_wait() and wbt_track() while a write
> request is being submitted, the request is treated as tracked even
> though it was never accounted (a false positive).
>
> The following is the scenario that triggered the problem.
>
> T1                      T2                          T3
>                         elevator_switch_mq
>                         bfq_init_queue
>                         wbt_disable_default <= Set
>                         rwb->enable_state (OFF)
> Submit_bio
> blk_mq_make_request
> rq_qos_throttle
> <= rwb->enable_state (OFF)
>                                                     scsi_remove_device
>                                                     sd_remove
>                                                     del_gendisk
>                                                     blk_unregister_queue
>                                                     elv_unregister_queue
>                                                     wbt_enable_default
>                                                     <= Set rwb->enable_state (ON)
> rq_qos_track
> <= rwb->enable_state (ON)
> ^^^^^^ this request is marked WBT_TRACKED without an inflight add, which
> drops rqw->inflight to -1 in wbt_done() and triggers an IO hang.
>
> Fix this by moving wbt_enable_default() from elv_unregister_queue() to
> bfq_exit_queue(), so that wbt is only re-enabled when bfq exits.
> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> ---
> block/bfq-iosched.c | 4 ++++
> block/elevator.c | 2 --
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 0c612a911696..8b7524450835 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -6996,6 +6996,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
>  {
>  	struct bfq_data *bfqd = e->elevator_data;
>  	struct bfq_queue *bfqq, *n;
> +	struct request_queue *q = bfqd->queue;
>
>  	hrtimer_cancel(&bfqd->idle_slice_timer);
>
> @@ -7019,6 +7020,9 @@ static void bfq_exit_queue(struct elevator_queue *e)
>  #endif
>
>  	kfree(bfqd);
> +
> +	/* Re-enable throttling in case elevator disabled it */
Of course bfq has disabled it, so the above comment is useless;
otherwise this looks fine:
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Thanks,
Ming