linux-kernel.vger.kernel.org archive mirror
* [PATCH -next] block/wbt: fix negative inflight counter when remove scsi device
@ 2021-12-13  4:09 Laibin Qiu
  2021-12-13 17:16 ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Laibin Qiu @ 2021-12-13  4:09 UTC (permalink / raw)
  To: axboe; +Cc: yi.zhang, linux-block, linux-kernel

wbt is currently disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq. When the
scsi device is then removed, wbt is re-enabled by wbt_enable_default().
If this happens between wbt_wait() and wbt_track() while a write
request is being submitted, wbt appears enabled for a request that was
never accounted as inflight.

The following is the scenario that triggered the problem.

T1                          T2                           T3
                            elevator_switch_mq
                            bfq_init_queue
                            wbt_disable_default <= Set
                            rwb->enable_state (OFF)
submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
                                                         scsi_remove_device
                                                         sd_remove
                                                         del_gendisk
                                                         blk_unregister_queue
                                                         elv_unregister_queue
                                                         wbt_enable_default
                                                         <= Set rwb->enable_state (ON)
rq_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request will be marked WBT_TRACKED without inflight having
been incremented, so wbt_done() will drop rqw->inflight to -1, which
will trigger an IO hang.

Fix this by checking whether QUEUE_FLAG_REGISTERED is set before
re-enabling wbt, which distinguishes the scsi removal case.
Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
---
 block/blk-wbt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 3ed71b8da887..537f77bb1365 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
 {
 	struct rq_qos *rqos = wbt_rq_qos(q);
 
+	/* Queue not registered? Maybe shutting down... */
+	if (!blk_queue_registered(q))
+		return;
+
 	/* Throttling already enabled? */
 	if (rqos) {
 		if (RQWB(rqos)->enable_state == WBT_STATE_OFF_DEFAULT)
@@ -644,10 +648,6 @@ void wbt_enable_default(struct request_queue *q)
 		return;
 	}
 
-	/* Queue not registered? Maybe shutting down... */
-	if (!blk_queue_registered(q))
-		return;
-
 	if (queue_is_mq(q) && IS_ENABLED(CONFIG_BLK_WBT_MQ))
 		wbt_init(q);
 }
-- 
2.22.0
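The interleaving in the patch description can be replayed in a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name, the two-value state enum, and `replay_race()` are hypothetical stand-ins for `rwb->enable_state`, `QUEUE_FLAG_REGISTERED`, `rqw->inflight` and `WBT_TRACKED`, and the model assumes the registered flag is already cleared by the time `wbt_enable_default()` runs on the removal path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names echo the kernel, but nothing here is kernel code. */
enum model_state { MODEL_OFF_DEFAULT, MODEL_ON_DEFAULT };

struct model_queue {
	enum model_state enable_state; /* stands in for rwb->enable_state */
	bool registered;               /* stands in for QUEUE_FLAG_REGISTERED */
	int inflight;                  /* stands in for rqw->inflight */
};

struct model_request {
	bool tracked;                  /* stands in for WBT_TRACKED */
};

static bool model_enabled(const struct model_queue *q)
{
	return q->enable_state == MODEL_ON_DEFAULT;
}

/* Throttle step: counts the request only if wbt is enabled right now. */
static void model_wait(struct model_queue *q)
{
	if (model_enabled(q))
		q->inflight++;
}

/* Track step: marks the request only if wbt is enabled right now. */
static void model_track(const struct model_queue *q, struct model_request *rq)
{
	if (model_enabled(q))
		rq->tracked = true;
}

/* Completion: a tracked request releases the slot it is assumed to hold. */
static void model_done(struct model_queue *q, const struct model_request *rq)
{
	if (rq->tracked)
		q->inflight--;
}

/* With check_registered set, the registered test runs first, as in the patch. */
static void model_enable_default(struct model_queue *q, bool check_registered)
{
	if (check_registered && !q->registered)
		return;
	if (q->enable_state == MODEL_OFF_DEFAULT)
		q->enable_state = MODEL_ON_DEFAULT;
}

/* Replays T1/T2/T3 from the description in order; returns the final inflight. */
int replay_race(bool check_registered)
{
	struct model_queue q = { MODEL_ON_DEFAULT, true, 0 };
	struct model_request rq = { false };

	q.enable_state = MODEL_OFF_DEFAULT;          /* T2: switch to bfq */
	model_wait(&q);                              /* T1: throttle sees OFF */
	assert(q.inflight == 0);                     /* nothing was counted */
	q.registered = false;                        /* T3: queue unregistered */
	model_enable_default(&q, check_registered);  /* T3: re-enable attempt */
	model_track(&q, &rq);                        /* T1: track step */
	model_done(&q, &rq);                         /* request completes */
	return q.inflight;
}
```

In this model, `replay_race(false)` ends with the counter at -1, while `replay_race(true)` leaves it at 0 because the early return keeps the state OFF for the track step.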



* Re: [PATCH -next] block/wbt: fix negative inflight counter when remove scsi device
  2021-12-13  4:09 [PATCH -next] block/wbt: fix negative inflight counter when remove scsi device Laibin Qiu
@ 2021-12-13 17:16 ` Christoph Hellwig
  2021-12-14  1:13   ` Ming Lei
  2021-12-14  4:25   ` QiuLaibin
  0 siblings, 2 replies; 5+ messages in thread
From: Christoph Hellwig @ 2021-12-13 17:16 UTC (permalink / raw)
  To: Laibin Qiu; +Cc: axboe, yi.zhang, linux-block, linux-kernel

On Mon, Dec 13, 2021 at 12:09:07PM +0800, Laibin Qiu wrote:
> submit_bio
>                                                          scsi_remove_device
>                                                          sd_remove
>                                                          del_gendisk
>                                                          blk_unregister_queue
>                                                          elv_unregister_queue
>                                                          wbt_enable_default
>                                                          <= Set rwb->enable_state (ON)
> rq_qos_track
> <= rwb->enable_state (ON)
> ^^^^^^ this request will be marked WBT_TRACKED without inflight having
> been incremented, so wbt_done() will drop rqw->inflight to -1, which
> will trigger an IO hang.
> 
> Fix this by checking whether QUEUE_FLAG_REGISTERED is set before
> re-enabling wbt, which distinguishes the scsi removal case.
> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> ---
>  block/blk-wbt.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index 3ed71b8da887..537f77bb1365 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
>  {
>  	struct rq_qos *rqos = wbt_rq_qos(q);
>  
> +	/* Queue not registered? Maybe shutting down... */
> +	if (!blk_queue_registered(q))
> +		return;

Wouldn't it make more sense to simply not call wbt_enable_default from
elv_unregister_queue?


* Re: [PATCH -next] block/wbt: fix negative inflight counter when remove scsi device
  2021-12-13 17:16 ` Christoph Hellwig
@ 2021-12-14  1:13   ` Ming Lei
  2021-12-14  8:07     ` Christoph Hellwig
  2021-12-14  4:25   ` QiuLaibin
  1 sibling, 1 reply; 5+ messages in thread
From: Ming Lei @ 2021-12-14  1:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Laibin Qiu, axboe, yi.zhang, linux-block, linux-kernel

On Mon, Dec 13, 2021 at 09:16:51AM -0800, Christoph Hellwig wrote:
> On Mon, Dec 13, 2021 at 12:09:07PM +0800, Laibin Qiu wrote:
> > submit_bio
> >                                                          scsi_remove_device
> >                                                          sd_remove
> >                                                          del_gendisk
> >                                                          blk_unregister_queue
> >                                                          elv_unregister_queue
> >                                                          wbt_enable_default
> >                                                          <= Set rwb->enable_state (ON)
> > rq_qos_track
> > <= rwb->enable_state (ON)
> > ^^^^^^ this request will be marked WBT_TRACKED without inflight having
> > been incremented, so wbt_done() will drop rqw->inflight to -1, which
> > will trigger an IO hang.
> > 
> > Fix this by checking whether QUEUE_FLAG_REGISTERED is set before
> > re-enabling wbt, which distinguishes the scsi removal case.
> > Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> > Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> > ---
> >  block/blk-wbt.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> > index 3ed71b8da887..537f77bb1365 100644
> > --- a/block/blk-wbt.c
> > +++ b/block/blk-wbt.c
> > @@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
> >  {
> >  	struct rq_qos *rqos = wbt_rq_qos(q);
> >  
> > +	/* Queue not registered? Maybe shutting down... */
> > +	if (!blk_queue_registered(q))
> > +		return;
> 
> Wouldn't it make more sense to simply not call wbt_enable_default from
> elv_unregister_queue?

wbt_disable_default() is called in bfq_init_root_group(), so wbt_enable_default
should be moved to bfq_exit_queue()?


Thanks,
Ming



* Re: [PATCH -next] block/wbt: fix negative inflight counter when remove scsi device
  2021-12-13 17:16 ` Christoph Hellwig
  2021-12-14  1:13   ` Ming Lei
@ 2021-12-14  4:25   ` QiuLaibin
  1 sibling, 0 replies; 5+ messages in thread
From: QiuLaibin @ 2021-12-14  4:25 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, yi.zhang, linux-block, linux-kernel


On 2021/12/14 1:16, Christoph Hellwig wrote:
> On Mon, Dec 13, 2021 at 12:09:07PM +0800, Laibin Qiu wrote:
>> submit_bio
>>                                                           scsi_remove_device
>>                                                           sd_remove
>>                                                           del_gendisk
>>                                                           blk_unregister_queue
>>                                                           elv_unregister_queue
>>                                                           wbt_enable_default
>>                                                           <= Set rwb->enable_state (ON)
>> rq_qos_track
>> <= rwb->enable_state (ON)
>> ^^^^^^ this request will be marked WBT_TRACKED without inflight having
>> been incremented, so wbt_done() will drop rqw->inflight to -1, which
>> will trigger an IO hang.
>>
>> Fix this by checking whether QUEUE_FLAG_REGISTERED is set before
>> re-enabling wbt, which distinguishes the scsi removal case.
>> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
>> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
>> ---
>>   block/blk-wbt.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
>> index 3ed71b8da887..537f77bb1365 100644
>> --- a/block/blk-wbt.c
>> +++ b/block/blk-wbt.c
>> @@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
>>   {
>>   	struct rq_qos *rqos = wbt_rq_qos(q);
>>   
>> +	/* Queue not registered? Maybe shutting down... */
>> +	if (!blk_queue_registered(q))
>> +		return;
> 
> Wouldn't it make more sense to simply not call wbt_enable_default from
> elv_unregister_queue?

Following your suggestion, I will post a V2.
Please take another look.


* Re: [PATCH -next] block/wbt: fix negative inflight counter when remove scsi device
  2021-12-14  1:13   ` Ming Lei
@ 2021-12-14  8:07     ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2021-12-14  8:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Laibin Qiu, axboe, yi.zhang, linux-block,
	linux-kernel

On Tue, Dec 14, 2021 at 09:13:10AM +0800, Ming Lei wrote:
> > Wouldn't it make more sense to simply not call wbt_enable_default from
> > elv_unregister_queue?
> 
> wbt_disable_default() is called in bfq_init_root_group(), so wbt_enable_default

s/bfq_init_root_group/bfq_init_queue/

But yes, that sounds like an even better idea.  Or maybe even an
elevator feature flag.

