* [PATCH -next 0/2] block, bfq: make bfq_has_work() more accurate
@ 2022-05-10 13:16 Yu Kuai
  2022-05-10 13:16 ` [PATCH -next 1/2] block, bfq: protect 'bfqd->queued' by 'bfqd->lock' Yu Kuai
  2022-05-10 13:16 ` [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate Yu Kuai
  0 siblings, 2 replies; 8+ messages in thread
From: Yu Kuai @ 2022-05-10 13:16 UTC (permalink / raw)
  To: jack, paolo.valente, axboe; +Cc: linux-block, linux-kernel, yukuai3, yi.zhang

This patchset tries to make bfq_has_work() more accurate. Patch 1 fixes a
small problem found by code review.

BTW, I'm not sure why blk_mq_run_hw_queues() is called with 'bfqd->lock'
held; I think this is not necessary. If it were not, bfq_has_work() could
be made even more accurate after patch 2 by reading 'bfqd->queued' with
'bfqd->lock' held.

Yu Kuai (2):
  block, bfq: protect 'bfqd->queued' by 'bfqd->lock'
  block, bfq: make bfq_has_work() more accurate

 block/bfq-iosched.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

-- 
2.31.1



* [PATCH -next 1/2] block, bfq: protect 'bfqd->queued' by 'bfqd->lock'
  2022-05-10 13:16 [PATCH -next 0/2] block, bfq: make bfq_has_work() more accurate Yu Kuai
@ 2022-05-10 13:16 ` Yu Kuai
  2022-05-11 13:52   ` Jan Kara
  2022-05-10 13:16 ` [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate Yu Kuai
  1 sibling, 1 reply; 8+ messages in thread
From: Yu Kuai @ 2022-05-10 13:16 UTC (permalink / raw)
  To: jack, paolo.valente, axboe; +Cc: linux-block, linux-kernel, yukuai3, yi.zhang

If bfq_schedule_dispatch() is called from bfq_idle_slice_timer_body(),
then 'bfqd->queued' is read without holding 'bfqd->lock'. This is
wrong since it can be written concurrently.

Fix the problem by holding 'bfqd->lock' for bfq_schedule_dispatch(),
like everywhere else.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/bfq-iosched.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 272d48d8f326..61750696e87f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -456,6 +456,8 @@ static struct bfq_io_cq *bfq_bic_lookup(struct request_queue *q)
  */
 void bfq_schedule_dispatch(struct bfq_data *bfqd)
 {
+	lockdep_assert_held(&bfqd->lock);
+
 	if (bfqd->queued != 0) {
 		bfq_log(bfqd, "schedule dispatch");
 		blk_mq_run_hw_queues(bfqd->queue, true);
@@ -6898,8 +6900,8 @@ bfq_idle_slice_timer_body(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 	bfq_bfqq_expire(bfqd, bfqq, true, reason);
 
 schedule_dispatch:
-	spin_unlock_irqrestore(&bfqd->lock, flags);
 	bfq_schedule_dispatch(bfqd);
+	spin_unlock_irqrestore(&bfqd->lock, flags);
 }
 
 /*
-- 
2.31.1
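
For readers outside the kernel tree, the ordering change in this patch can be sketched in userspace C. This is only an illustration under stated assumptions: `struct sched_data`, the `sched_*` helpers and the `lock_held` flag are hypothetical stand-ins, a C11 `atomic_flag` spin loop replaces the kernel spinlock, and a plain `assert()` plays the role of `lockdep_assert_held()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct bfq_data. */
struct sched_data {
	atomic_flag lock;	/* toy spinlock, not the kernel spinlock_t */
	bool lock_held;		/* crude substitute for lockdep tracking */
	int queued;		/* requests currently queued */
	int runs_scheduled;	/* how many queue runs were requested */
};

static void sched_init(struct sched_data *sd)
{
	atomic_flag_clear(&sd->lock);
	sd->lock_held = false;
	sd->queued = 0;
	sd->runs_scheduled = 0;
}

static void sched_lock(struct sched_data *sd)
{
	while (atomic_flag_test_and_set_explicit(&sd->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
	sd->lock_held = true;
}

static void sched_unlock(struct sched_data *sd)
{
	sd->lock_held = false;
	atomic_flag_clear_explicit(&sd->lock, memory_order_release);
}

/* Analogue of bfq_schedule_dispatch(): the caller must hold the lock. */
static void sched_schedule_dispatch(struct sched_data *sd)
{
	assert(sd->lock_held);	/* stands in for lockdep_assert_held() */
	if (sd->queued != 0)
		sd->runs_scheduled++;
}

/* Analogue of the fixed timer body: dispatch first, unlock afterwards. */
static void sched_timer_body(struct sched_data *sd)
{
	sched_lock(sd);
	/* ... the in-service queue would be expired here ... */
	sched_schedule_dispatch(sd);
	sched_unlock(sd);
}
```

With the pre-patch ordering (unlock, then dispatch), the assertion in `sched_schedule_dispatch()` would fire, which is the kind of misuse the added `lockdep_assert_held()` is meant to catch.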



* [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate
  2022-05-10 13:16 [PATCH -next 0/2] block, bfq: make bfq_has_work() more accurate Yu Kuai
  2022-05-10 13:16 ` [PATCH -next 1/2] block, bfq: protect 'bfqd->queued' by 'bfqd->lock' Yu Kuai
@ 2022-05-10 13:16 ` Yu Kuai
  2022-05-11 14:08   ` Jan Kara
  1 sibling, 1 reply; 8+ messages in thread
From: Yu Kuai @ 2022-05-10 13:16 UTC (permalink / raw)
  To: jack, paolo.valente, axboe; +Cc: linux-block, linux-kernel, yukuai3, yi.zhang

bfq_has_work() currently uses busy_queues, which is not accurate because
a busy bfq_queue does not necessarily have requests. Since bfqd already
has a counter 'queued' that records how many requests are in bfq, use it
instead of busy_queues.

Note that bfq_has_work() can be called with 'bfqd->lock' held, so the
lock can't be taken in bfq_has_work() to protect 'bfqd->queued'.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/bfq-iosched.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 61750696e87f..1d2f8110c26b 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5063,11 +5063,11 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
 	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
 
 	/*
-	 * Avoiding lock: a race on bfqd->busy_queues should cause at
+	 * Avoiding lock: a race on bfqd->queued should cause at
 	 * most a call to dispatch for nothing
 	 */
 	return !list_empty_careful(&bfqd->dispatch) ||
-		bfq_tot_busy_queues(bfqd) > 0;
+		READ_ONCE(bfqd->queued);
 }
 
 static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
-- 
2.31.1



* Re: [PATCH -next 1/2] block, bfq: protect 'bfqd->queued' by 'bfqd->lock'
  2022-05-10 13:16 ` [PATCH -next 1/2] block, bfq: protect 'bfqd->queued' by 'bfqd->lock' Yu Kuai
@ 2022-05-11 13:52   ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2022-05-11 13:52 UTC (permalink / raw)
  To: Yu Kuai; +Cc: jack, paolo.valente, axboe, linux-block, linux-kernel, yi.zhang

On Tue 10-05-22 21:16:28, Yu Kuai wrote:
> If bfq_schedule_dispatch() is called from bfq_idle_slice_timer_body(),
> then 'bfqd->queued' is read without holding 'bfqd->lock'. This is
> wrong since it can be written concurrently.
> 
> Fix the problem by holding 'bfqd->lock' for bfq_schedule_dispatch(),
> like everywhere else.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  block/bfq-iosched.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 272d48d8f326..61750696e87f 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -456,6 +456,8 @@ static struct bfq_io_cq *bfq_bic_lookup(struct request_queue *q)
>   */
>  void bfq_schedule_dispatch(struct bfq_data *bfqd)
>  {
> +	lockdep_assert_held(&bfqd->lock);
> +
>  	if (bfqd->queued != 0) {
>  		bfq_log(bfqd, "schedule dispatch");
>  		blk_mq_run_hw_queues(bfqd->queue, true);
> @@ -6898,8 +6900,8 @@ bfq_idle_slice_timer_body(struct bfq_data *bfqd, struct bfq_queue *bfqq)
>  	bfq_bfqq_expire(bfqd, bfqq, true, reason);
>  
>  schedule_dispatch:
> -	spin_unlock_irqrestore(&bfqd->lock, flags);
>  	bfq_schedule_dispatch(bfqd);
> +	spin_unlock_irqrestore(&bfqd->lock, flags);
>  }
>  
>  /*
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate
  2022-05-10 13:16 ` [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate Yu Kuai
@ 2022-05-11 14:08   ` Jan Kara
  2022-05-12  1:30     ` yukuai (C)
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2022-05-11 14:08 UTC (permalink / raw)
  To: Yu Kuai; +Cc: jack, paolo.valente, axboe, linux-block, linux-kernel, yi.zhang

On Tue 10-05-22 21:16:29, Yu Kuai wrote:
> bfq_has_work() currently uses busy_queues, which is not accurate because
> a busy bfq_queue does not necessarily have requests. Since bfqd already
> has a counter 'queued' that records how many requests are in bfq, use it
> instead of busy_queues.
> 
> Note that bfq_has_work() can be called with 'bfqd->lock' held, so the
> lock can't be taken in bfq_has_work() to protect 'bfqd->queued'.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

So did you find this causing any real problem? Because bfq queue is
accounted among busy queues once bfq_add_bfqq_busy() is called. And that
happens once a new request is inserted into the queue so it should be very
similar to bfqd->queued.

								Honza

> ---
>  block/bfq-iosched.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 61750696e87f..1d2f8110c26b 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -5063,11 +5063,11 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
>  	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
>  
>  	/*
> -	 * Avoiding lock: a race on bfqd->busy_queues should cause at
> +	 * Avoiding lock: a race on bfqd->queued should cause at
>  	 * most a call to dispatch for nothing
>  	 */
>  	return !list_empty_careful(&bfqd->dispatch) ||
> -		bfq_tot_busy_queues(bfqd) > 0;
> +		READ_ONCE(bfqd->queued);
>  }
>  
>  static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate
  2022-05-11 14:08   ` Jan Kara
@ 2022-05-12  1:30     ` yukuai (C)
  2022-05-12 17:10       ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: yukuai (C) @ 2022-05-12  1:30 UTC (permalink / raw)
  To: Jan Kara; +Cc: paolo.valente, axboe, linux-block, linux-kernel, yi.zhang

On 2022/05/11 22:08, Jan Kara wrote:
> On Tue 10-05-22 21:16:29, Yu Kuai wrote:
>> bfq_has_work() currently uses busy_queues, which is not accurate because
>> a busy bfq_queue does not necessarily have requests. Since bfqd already
>> has a counter 'queued' that records how many requests are in bfq, use it
>> instead of busy_queues.
>>
>> Note that bfq_has_work() can be called with 'bfqd->lock' held, so the
>> lock can't be taken in bfq_has_work() to protect 'bfqd->queued'.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> 
> So did you find this causing any real problem? Because bfq queue is
> accounted among busy queues once bfq_add_bfqq_busy() is called. And that
> happens once a new request is inserted into the queue so it should be very
> similar to bfqd->queued.
> 
> 								Honza

Hi,

The related problem is described here:

https://lore.kernel.org/all/20220510112302.1215092-1-yukuai3@huawei.com/

The root cause of the panic is a linux-block problem; however, it can
be bypassed if bfq_has_work() is accurate. In addition, an
unnecessary run_work will be triggered if bfqq stays busy:

__blk_mq_run_hw_queue
  __blk_mq_sched_dispatch_requests
   __blk_mq_do_dispatch_sched
    if (!bfq_has_work())
     break;
    blk_mq_delay_run_hw_queues -> run again after 3ms

Thanks,
Kuai
> 
>> ---
>>   block/bfq-iosched.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 61750696e87f..1d2f8110c26b 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -5063,11 +5063,11 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
>>   	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
>>   
>>   	/*
>> -	 * Avoiding lock: a race on bfqd->busy_queues should cause at
>> +	 * Avoiding lock: a race on bfqd->queued should cause at
>>   	 * most a call to dispatch for nothing
>>   	 */
>>   	return !list_empty_careful(&bfqd->dispatch) ||
>> -		bfq_tot_busy_queues(bfqd) > 0;
>> +		READ_ONCE(bfqd->queued);
>>   }
>>   
>>   static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
>> -- 
>> 2.31.1
>>
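
The effect of the call chain above can be illustrated with a toy model. This is a sketch under stated assumptions, not the blk-mq code: `struct sched_state`, `dispatch_pass()` and both `has_work_*` helpers are hypothetical, and the model only captures the case where a queue stays busy (for example while idling) although no request is left to dispatch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified scheduler state. */
struct sched_state {
	int busy_queues;	/* queues marked busy, possibly idling */
	int queued;		/* requests actually available */
};

typedef bool (*has_work_fn)(const struct sched_state *);

/* Old check: a busy queue counts as work even with nothing queued. */
static bool has_work_busy_queues(const struct sched_state *s)
{
	return s->busy_queues > 0;
}

/* New check: only queued requests count as work. */
static bool has_work_queued(const struct sched_state *s)
{
	return s->queued != 0;
}

/*
 * One pass of a dispatch loop shaped like the call chain above.
 * Returns true if a delayed rerun would be requested.
 */
static bool dispatch_pass(struct sched_state *s, has_work_fn has_work)
{
	assert(s->busy_queues >= 0 && s->queued >= 0);

	for (;;) {
		if (!has_work(s))
			return false;	/* no work claimed, no rerun */
		if (s->queued == 0)
			return true;	/* work claimed, nothing dispatched */
		s->queued--;		/* dispatch one request */
	}
}
```

In this model, a state with `busy_queues == 1` and `queued == 0` requests a rerun under the old check and none under the new one, which corresponds to the unnecessary delayed run described in the message.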


* Re: [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate
  2022-05-12  1:30     ` yukuai (C)
@ 2022-05-12 17:10       ` Jan Kara
  2022-05-13  1:08         ` yukuai (C)
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2022-05-12 17:10 UTC (permalink / raw)
  To: yukuai (C)
  Cc: Jan Kara, paolo.valente, axboe, linux-block, linux-kernel, yi.zhang

On Thu 12-05-22 09:30:16, yukuai (C) wrote:
> On 2022/05/11 22:08, Jan Kara wrote:
> > On Tue 10-05-22 21:16:29, Yu Kuai wrote:
> > > bfq_has_work() currently uses busy_queues, which is not accurate because
> > > a busy bfq_queue does not necessarily have requests. Since bfqd already
> > > has a counter 'queued' that records how many requests are in bfq, use it
> > > instead of busy_queues.
> > >
> > > Note that bfq_has_work() can be called with 'bfqd->lock' held, so the
> > > lock can't be taken in bfq_has_work() to protect 'bfqd->queued'.
> > > 
> > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > 
> > So did you find this causing any real problem? Because bfq queue is
> > accounted among busy queues once bfq_add_bfqq_busy() is called. And that
> > happens once a new request is inserted into the queue so it should be very
> > similar to bfqd->queued.
> > 
> > 								Honza
> 
> Hi,
> 
> The related problem is described here:
> 
> https://lore.kernel.org/all/20220510112302.1215092-1-yukuai3@huawei.com/
> 
> The root cause of the panic is a linux-block problem; however, it can
> be bypassed if bfq_has_work() is accurate. In addition, an
> unnecessary run_work will be triggered if bfqq stays busy:
> 
> __blk_mq_run_hw_queue
>  __blk_mq_sched_dispatch_requests
>   __blk_mq_do_dispatch_sched
>    if (!bfq_has_work())
>     break;
>    blk_mq_delay_run_hw_queues -> run again after 3ms

Ah, I see. So it is the other way around than I thought. Due to idling
bfq_tot_busy_queues() can be greater than 0 even if there are no requests
to dispatch. Indeed. OK, the patch makes sense. But please use WRITE_ONCE
for the updates of bfqd->queued. Otherwise the READ_ONCE does not really
make sense (it can still result in some bogus value due to compiler
optimizations on the write side).

								Honza

> > > ---
> > >   block/bfq-iosched.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> > > index 61750696e87f..1d2f8110c26b 100644
> > > --- a/block/bfq-iosched.c
> > > +++ b/block/bfq-iosched.c
> > > @@ -5063,11 +5063,11 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
> > >   	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
> > >   	/*
> > > -	 * Avoiding lock: a race on bfqd->busy_queues should cause at
> > > +	 * Avoiding lock: a race on bfqd->queued should cause at
> > >   	 * most a call to dispatch for nothing
> > >   	 */
> > >   	return !list_empty_careful(&bfqd->dispatch) ||
> > > -		bfq_tot_busy_queues(bfqd) > 0;
> > > +		READ_ONCE(bfqd->queued);
> > >   }
> > >   static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
> > > -- 
> > > 2.31.1
> > > 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
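
The pairing requested in this review can be sketched in userspace C. The macros below are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() (the real definitions carry additional type checks) and rely on the GNU C `__typeof__` extension; `struct sched_data` and the helper names are hypothetical. The sketch assumes, as in BFQ, that writers are serialized by a lock while the reader takes none.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel macros (GNU C). */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the 'queued' counter in struct bfq_data. */
struct sched_data {
	int queued;
};

/* Writer side: in BFQ these updates would run with bfqd->lock held. */
static void request_added(struct sched_data *sd)
{
	WRITE_ONCE(sd->queued, sd->queued + 1);
}

static void request_removed(struct sched_data *sd)
{
	assert(sd->queued > 0);	/* removing from an empty count is a bug */
	WRITE_ONCE(sd->queued, sd->queued - 1);
}

/* Lockless reader, analogous to the check in bfq_has_work(). */
static bool has_work(struct sched_data *sd)
{
	return READ_ONCE(sd->queued) != 0;
}
```

Marking both sides tells the compiler to perform each access to the counter exactly once, so the lockless reader cannot observe a value produced by a split or otherwise rewritten store.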


* Re: [PATCH -next 2/2] block, bfq: make bfq_has_work() more accurate
  2022-05-12 17:10       ` Jan Kara
@ 2022-05-13  1:08         ` yukuai (C)
  0 siblings, 0 replies; 8+ messages in thread
From: yukuai (C) @ 2022-05-13  1:08 UTC (permalink / raw)
  To: Jan Kara; +Cc: paolo.valente, axboe, linux-block, linux-kernel, yi.zhang

On 2022/05/13 1:10, Jan Kara wrote:
> On Thu 12-05-22 09:30:16, yukuai (C) wrote:
>> On 2022/05/11 22:08, Jan Kara wrote:
>>> On Tue 10-05-22 21:16:29, Yu Kuai wrote:
>>>> bfq_has_work() currently uses busy_queues, which is not accurate because
>>>> a busy bfq_queue does not necessarily have requests. Since bfqd already
>>>> has a counter 'queued' that records how many requests are in bfq, use it
>>>> instead of busy_queues.
>>>>
>>>> Note that bfq_has_work() can be called with 'bfqd->lock' held, so the
>>>> lock can't be taken in bfq_has_work() to protect 'bfqd->queued'.
>>>>
>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>
>>> So did you find this causing any real problem? Because bfq queue is
>>> accounted among busy queues once bfq_add_bfqq_busy() is called. And that
>>> happens once a new request is inserted into the queue so it should be very
>>> similar to bfqd->queued.
>>>
>>> 								Honza
>>
>> Hi,
>>
>> The related problem is described here:
>>
>> https://lore.kernel.org/all/20220510112302.1215092-1-yukuai3@huawei.com/
>>
>> The root cause of the panic is a linux-block problem; however, it can
>> be bypassed if bfq_has_work() is accurate. In addition, an
>> unnecessary run_work will be triggered if bfqq stays busy:
>>
>> __blk_mq_run_hw_queue
>>   __blk_mq_sched_dispatch_requests
>>    __blk_mq_do_dispatch_sched
>>     if (!bfq_has_work())
>>      break;
>>     blk_mq_delay_run_hw_queues -> run again after 3ms
> 
> Ah, I see. So it is the other way around than I thought. Due to idling
> bfq_tot_busy_queues() can be greater than 0 even if there are no requests
> to dispatch. Indeed. OK, the patch makes sense. But please use WRITE_ONCE
> for the updates of bfqd->queued. Otherwise the READ_ONCE does not really
> make sense (it can still result in some bogus value due to compiler
> optimizations on the write side).

Thanks for your advice, I'll send a new version.

Kuai
