linux-block.vger.kernel.org archive mirror
* [PATCH] blk-mq: fix sbitmap ws_active for shared tags
@ 2019-03-25 16:22 Jens Axboe
  2019-03-25 16:39 ` Bart Van Assche
  2019-03-25 18:56 ` Omar Sandoval
  0 siblings, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2019-03-25 16:22 UTC (permalink / raw)
  To: linux-block; +Cc: Omar Sandoval

We now wrap sbitmap waitqueues in an active counter, so we can avoid
iterating wakeups unless we have waiters there. This works as long as
everyone that's manipulating the waitqueues uses the proper helpers. For
the tag wait case for shared tags, however, we add ourselves to the
waitqueue without incrementing/decrementing the ->ws_active count. This
means that wakeups can take a long time to happen.

Fix this by manually doing the inc/dec as needed for the wait queue
handling.

Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

Got a bug report on raid5 on 4 USB-3 attached drives which is slow with
5.0, but works fine with the ->ws_active check bypassed. Waiting for
confirmation that this patch fixes it, but I think it will.
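
For readers following along, the failure mode described above can be
sketched as a small user-space model. This is not kernel code: struct
model_sbq and the model_*() functions are hypothetical names invented
for illustration, the "wait queue" is reduced to a plain counter, and
the model is single-threaded with no locking. It only assumes what the
description states, namely that the wakeup path skips its scan when
->ws_active reads zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sbitmap_queue; not the kernel layout. */
struct model_sbq {
	atomic_int ws_active;	/* waiters the wakeup path believes exist */
	int queued;		/* waiters actually sitting on the wait queue */
};

static void model_init(struct model_sbq *sbq)
{
	assert(sbq != NULL);
	atomic_init(&sbq->ws_active, 0);
	sbq->queued = 0;
}

/*
 * Queue one waiter. With account == false this mimics the pre-patch
 * shared-tag path: the waiter is queued but ->ws_active is not bumped.
 */
static void model_add_waiter(struct model_sbq *sbq, bool account)
{
	assert(sbq != NULL);
	if (account)
		atomic_fetch_add(&sbq->ws_active, 1);
	sbq->queued++;
}

/* Wakeup path: returns true only if a waiter was actually woken. */
static bool model_wake_up(struct model_sbq *sbq)
{
	assert(sbq != NULL);
	/* Fast path: counter says nobody is waiting, so skip the scan. */
	if (atomic_load(&sbq->ws_active) == 0)
		return false;
	if (sbq->queued == 0)
		return false;
	sbq->queued--;
	atomic_fetch_sub(&sbq->ws_active, 1);
	return true;
}
```

In this model, a waiter added with account == false is never seen by
model_wake_up(), which is the shape of the stall reported here. Adding
it with account == true lets the same wakeup call find it.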

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 28080b0235f0..3ff3d7b49969 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1072,7 +1072,13 @@ static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 	hctx = container_of(wait, struct blk_mq_hw_ctx, dispatch_wait);
 
 	spin_lock(&hctx->dispatch_wait_lock);
-	list_del_init(&wait->entry);
+	if (!list_empty(&wait->entry)) {
+		struct sbitmap_queue *sbq;
+
+		list_del_init(&wait->entry);
+		sbq = &hctx->tags->bitmap_tags;
+		atomic_dec(&sbq->ws_active);
+	}
 	spin_unlock(&hctx->dispatch_wait_lock);
 
 	blk_mq_run_hw_queue(hctx, true);
@@ -1088,6 +1094,7 @@ static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx *hctx,
 				 struct request *rq)
 {
+	struct sbitmap_queue *sbq = &hctx->tags->bitmap_tags;
 	struct wait_queue_head *wq;
 	wait_queue_entry_t *wait;
 	bool ret;
@@ -1110,7 +1117,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx *hctx,
 	if (!list_empty_careful(&wait->entry))
 		return false;
 
-	wq = &bt_wait_ptr(&hctx->tags->bitmap_tags, hctx)->wait;
+	wq = &bt_wait_ptr(sbq, hctx)->wait;
 
 	spin_lock_irq(&wq->lock);
 	spin_lock(&hctx->dispatch_wait_lock);
@@ -1120,6 +1127,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx *hctx,
 		return false;
 	}
 
+	atomic_inc(&sbq->ws_active);
 	wait->flags &= ~WQ_FLAG_EXCLUSIVE;
 	__add_wait_queue(wq, wait);
 
@@ -1140,6 +1148,7 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx *hctx,
 	 * someone else gets the wakeup.
 	 */
 	list_del_init(&wait->entry);
+	atomic_dec(&sbq->ws_active);
 	spin_unlock(&hctx->dispatch_wait_lock);
 	spin_unlock_irq(&wq->lock);
 

-- 
Jens Axboe
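
A note on the list_empty() guard in the first hunk: the entry can be
taken off the wait queue from two places (the wake callback, and the
recheck in blk_mq_mark_tag_wait() when a tag was obtained after all),
so the decrement presumably has to be tied to whichever path really
removes the entry. The sketch below models that pairing in user space.
The names struct model_wait, struct model_queue, model_enqueue() and
model_dequeue() are hypothetical, a bool stands in for the list
membership test, and the locking the kernel uses around these steps is
left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter entry; on_list plays the role of !list_empty(). */
struct model_wait {
	bool on_list;
};

/* Hypothetical queue carrying only the active-waiter counter. */
struct model_queue {
	atomic_int ws_active;
};

/* Add the entry once, incrementing the counter together with the add. */
static void model_enqueue(struct model_queue *q, struct model_wait *w)
{
	assert(q != NULL && w != NULL);
	if (w->on_list)
		return;		/* already queued, nothing to account */
	w->on_list = true;
	atomic_fetch_add(&q->ws_active, 1);
}

/*
 * Remove the entry if it is still queued. Only the caller that really
 * removes it decrements the counter; a second caller is a no-op.
 */
static bool model_dequeue(struct model_queue *q, struct model_wait *w)
{
	assert(q != NULL && w != NULL);
	if (!w->on_list)
		return false;
	w->on_list = false;
	atomic_fetch_sub(&q->ws_active, 1);
	return true;
}
```

Calling model_dequeue() twice for the same entry leaves the counter at
zero instead of driving it negative, which is the property the guarded
atomic_dec() in the patch appears to be after.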



* Re: [PATCH] blk-mq: fix sbitmap ws_active for shared tags
  2019-03-25 16:22 [PATCH] blk-mq: fix sbitmap ws_active for shared tags Jens Axboe
@ 2019-03-25 16:39 ` Bart Van Assche
  2019-03-25 16:42   ` Jens Axboe
  2019-03-25 18:56 ` Omar Sandoval
  1 sibling, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2019-03-25 16:39 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Omar Sandoval

On Mon, 2019-03-25 at 10:22 -0600, Jens Axboe wrote:
> We now wrap sbitmap waitqueues in an active counter, so we can avoid
> iterating wakeups unless we have waiters there. This works as long as
> everyone that's manipulating the waitqueues use the proper helpers. For
> the tag wait case for shared tags, however, we add ourselves to the
> waitqueue without incrementing/decrementing the ->ws_active count. This
> means that wakeups can take a long time to happen.
> 
> Fix this by manually doing the inc/dec as needed for the wait queue
> handling.
> 
> Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Hi Jens,

Since commit 5d2ee7122c73 went upstream in kernel v5.0, does this patch need
a "Cc: stable" tag?

Thanks,

Bart.


* Re: [PATCH] blk-mq: fix sbitmap ws_active for shared tags
  2019-03-25 16:39 ` Bart Van Assche
@ 2019-03-25 16:42   ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2019-03-25 16:42 UTC (permalink / raw)
  To: Bart Van Assche, linux-block; +Cc: Omar Sandoval

On 3/25/19 10:39 AM, Bart Van Assche wrote:
> On Mon, 2019-03-25 at 10:22 -0600, Jens Axboe wrote:
>> We now wrap sbitmap waitqueues in an active counter, so we can avoid
>> iterating wakeups unless we have waiters there. This works as long as
>> everyone that's manipulating the waitqueues use the proper helpers. For
>> the tag wait case for shared tags, however, we add ourselves to the
>> waitqueue without incrementing/decrementing the ->ws_active count. This
>> means that wakeups can take a long time to happen.
>>
>> Fix this by manually doing the inc/dec as needed for the wait queue
>> handling.
>>
>> Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Hi Jens,
> 
> Since commit 5d2ee7122c73 went upstream in kernel v5.0, does this patch need
> a "Cc: stable" tag?

I guess it does, I'll add it.

-- 
Jens Axboe



* Re: [PATCH] blk-mq: fix sbitmap ws_active for shared tags
  2019-03-25 16:22 [PATCH] blk-mq: fix sbitmap ws_active for shared tags Jens Axboe
  2019-03-25 16:39 ` Bart Van Assche
@ 2019-03-25 18:56 ` Omar Sandoval
  2019-03-25 18:58   ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Omar Sandoval @ 2019-03-25 18:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Omar Sandoval

On Mon, Mar 25, 2019 at 10:22:50AM -0600, Jens Axboe wrote:
> We now wrap sbitmap waitqueues in an active counter, so we can avoid
> iterating wakeups unless we have waiters there. This works as long as
> everyone that's manipulating the waitqueues use the proper helpers. For
> the tag wait case for shared tags, however, we add ourselves to the
> waitqueue without incrementing/decrementing the ->ws_active count. This
> means that wakeups can take a long time to happen.
> 
> Fix this by manually doing the inc/dec as needed for the wait queue
> handling.
> 
> Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Can this use the helpers we added in 9f6b7ef6c3eb ("sbitmap: add helpers
for add/del wait queue handling")?


* Re: [PATCH] blk-mq: fix sbitmap ws_active for shared tags
  2019-03-25 18:56 ` Omar Sandoval
@ 2019-03-25 18:58   ` Jens Axboe
  2019-03-25 19:04     ` Omar Sandoval
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2019-03-25 18:58 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-block, Omar Sandoval

On 3/25/19 12:56 PM, Omar Sandoval wrote:
> On Mon, Mar 25, 2019 at 10:22:50AM -0600, Jens Axboe wrote:
>> We now wrap sbitmap waitqueues in an active counter, so we can avoid
>> iterating wakeups unless we have waiters there. This works as long as
>> everyone that's manipulating the waitqueues use the proper helpers. For
>> the tag wait case for shared tags, however, we add ourselves to the
>> waitqueue without incrementing/decrementing the ->ws_active count. This
>> means that wakeups can take a long time to happen.
>>
>> Fix this by manually doing the inc/dec as needed for the wait queue
>> handling.
>>
>> Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Can this use the helpers we added in 9f6b7ef6c3eb ("sbitmap: add helpers
> for add/del wait queue handling")?

I don't think so without adding more, which seems kind of silly for this
very specialized use case of openly manipulating the wait queues. The
blk-mq setup there is very special cased.

-- 
Jens Axboe



* Re: [PATCH] blk-mq: fix sbitmap ws_active for shared tags
  2019-03-25 18:58   ` Jens Axboe
@ 2019-03-25 19:04     ` Omar Sandoval
  2019-03-25 19:05       ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Omar Sandoval @ 2019-03-25 19:04 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Omar Sandoval

On Mon, Mar 25, 2019 at 12:58:47PM -0600, Jens Axboe wrote:
> On 3/25/19 12:56 PM, Omar Sandoval wrote:
> > On Mon, Mar 25, 2019 at 10:22:50AM -0600, Jens Axboe wrote:
> >> We now wrap sbitmap waitqueues in an active counter, so we can avoid
> >> iterating wakeups unless we have waiters there. This works as long as
> >> everyone that's manipulating the waitqueues use the proper helpers. For
> >> the tag wait case for shared tags, however, we add ourselves to the
> >> waitqueue without incrementing/decrementing the ->ws_active count. This
> >> means that wakeups can take a long time to happen.
> >>
> >> Fix this by manually doing the inc/dec as needed for the wait queue
> >> handling.
> >>
> >> Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
> >> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > 
> > Can this use the helpers we added in 9f6b7ef6c3eb ("sbitmap: add helpers
> > for add/del wait queue handling")?
> 
> I don't think so without adding more, which seems kind of silly for this
> very specialized use case of openly manipulating the wait queues. The
> blk-mq setup there is very special cased.

Yup, I see. Assuming it fixes the issue,

Reviewed-by: Omar Sandoval <osandov@fb.com>


* Re: [PATCH] blk-mq: fix sbitmap ws_active for shared tags
  2019-03-25 19:04     ` Omar Sandoval
@ 2019-03-25 19:05       ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2019-03-25 19:05 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-block, Omar Sandoval

On 3/25/19 1:04 PM, Omar Sandoval wrote:
> On Mon, Mar 25, 2019 at 12:58:47PM -0600, Jens Axboe wrote:
>> On 3/25/19 12:56 PM, Omar Sandoval wrote:
>>> On Mon, Mar 25, 2019 at 10:22:50AM -0600, Jens Axboe wrote:
>>>> We now wrap sbitmap waitqueues in an active counter, so we can avoid
>>>> iterating wakeups unless we have waiters there. This works as long as
>>>> everyone that's manipulating the waitqueues use the proper helpers. For
>>>> the tag wait case for shared tags, however, we add ourselves to the
>>>> waitqueue without incrementing/decrementing the ->ws_active count. This
>>>> means that wakeups can take a long time to happen.
>>>>
>>>> Fix this by manually doing the inc/dec as needed for the wait queue
>>>> handling.
>>>>
>>>> Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check")
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>
>>> Can this use the helpers we added in 9f6b7ef6c3eb ("sbitmap: add helpers
>>> for add/del wait queue handling")?
>>
>> I don't think so without adding more, which seems kind of silly for this
>> very specialized use case of openly manipulating the wait queues. The
>> blk-mq setup there is very special cased.
> 
> Yup, I see. Assuming it fixes the issue,
> 
> Reviewed-by: Omar Sandoval <osandov@fb.com>

Thanks - it does fix the issue, the original reporter has since tested and
confirmed that his abysmally slow raid5 now works at full speed again.

-- 
Jens Axboe

