linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] blk-core: fix queue stuck on attempt to submit request from unplug
@ 2015-09-27 20:44 Roman Pen
  2015-09-28 17:27 ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: Roman Pen @ 2015-09-27 20:44 UTC (permalink / raw)
  Cc: Roman Pen, Jens Axboe, linux-kernel, stable

In case of several stacked block devices, which both were inited by
blk_init_queue call, you can catch the queue stuck, if first device
in stack makes bio submit being in a flush of a plug list.

Let's consider this regular scenario taking readahead into account
(readahead.c:read_pages):

1. Start plug
2. Read pages in loop
3. Finish plug

This example generates backtrace as follows:

1. blk_start_plug
2. generic_make_request
        q->make_request_fn
        [blk_queue_bio]
            if (current->plug)
                list_add_tail(&req->queuelist, &plug->list);
3. blk_finish_plug
        blk_flush_plug_list
            queue_unplugged
                __blk_run_queue
                    XXX_request_fn [some request handler of block device]
                        generic_make_request
                            q->make_request_fn
                            [blk_queue_bio]
                                if (current->plug)
                                    list_add_tail(&req->queuelist, &plug->list);

So the problem is, that on step 3. XXX_request_fn makes
another request, which again will be put to plug list,
because plug is till active, thus new request will be
stuck forever in the queue.

How to fix?
Do flush plug list till it becomes empty.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
---
 block/blk-core.c | 10 ++++++++++
 block/blk-mq.c   | 13 +++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2eb722d..36b3bd2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -3151,6 +3151,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	LIST_HEAD(list);
 	unsigned int depth;
 
+repeat:
 	flush_plug_callbacks(plug, from_schedule);
 
 	if (!list_empty(&plug->mq_list))
@@ -3212,6 +3213,15 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 		queue_unplugged(q, depth, from_schedule);
 
 	local_irq_restore(flags);
+
+	/*
+	 * We have to repeat the whole flush till list becomes
+	 * empty, because underlying block device can submit
+	 * another bio which again will be put to plug list.
+	 * To avoid stuck of these subsequent bios in the queue
+	 * we have to flush till the end.
+	 */
+	goto repeat;
 }
 
 void blk_finish_plug(struct blk_plug *plug)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f2d67b4..ced83eb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1091,6 +1091,10 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	LIST_HEAD(ctx_list);
 	unsigned int depth;
 
+repeat:
+	if (list_empty(&plug->mq_list))
+		return;
+
 	list_splice_init(&plug->mq_list, &list);
 
 	list_sort(NULL, &list, plug_ctx_cmp);
@@ -1127,6 +1131,15 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 		blk_mq_insert_requests(this_q, this_ctx, &ctx_list, depth,
 				       from_schedule);
 	}
+
+	/*
+	 * We have to repeat the whole flush till list becomes
+	 * empty, because underlying block device can submit
+	 * another bio which again will be put to plug list.
+	 * To avoid stuck of these subsequent bios in the queue
+	 * we have to flush till the end.
+	 */
+	goto repeat;
 }
 
 static void blk_mq_bio_to_request(struct request *rq, struct bio *bio)
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/1] blk-core: fix queue stuck on attempt to submit request from unplug
  2015-09-27 20:44 [PATCH 1/1] blk-core: fix queue stuck on attempt to submit request from unplug Roman Pen
@ 2015-09-28 17:27 ` Jens Axboe
  2015-09-28 17:36   ` Roman Peniaev
  0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2015-09-28 17:27 UTC (permalink / raw)
  To: Roman Pen; +Cc: linux-kernel

On 09/27/2015 02:44 PM, Roman Pen wrote:
> In case of several stacked block devices, which both were inited by
> blk_init_queue call, you can catch the queue stuck, if first device
> in stack makes bio submit being in a flush of a plug list.
>
> Let's consider this regular scenario taking readahead into account
> (readahead.c:read_pages):
>
> 1. Start plug
> 2. Read pages in loop
> 3. Finish plug
>
> This example generates backtrace as follows:
>
> 1. blk_start_plug
> 2. generic_make_request
>          q->make_request_fn
>          [blk_queue_bio]
>              if (current->plug)
>                  list_add_tail(&req->queuelist, &plug->list);
> 3. blk_finish_plug
>          blk_flush_plug_list
>              queue_unplugged
>                  __blk_run_queue
>                      XXX_request_fn [some request handler of block device]
>                          generic_make_request
>                              q->make_request_fn
>                              [blk_queue_bio]
>                                  if (current->plug)
>                                      list_add_tail(&req->queuelist, &plug->list);
>
> So the problem is, that on step 3. XXX_request_fn makes
> another request, which again will be put to plug list,
> because plug is till active, thus new request will be
> stuck forever in the queue.
>
> How to fix?
> Do flush plug list till it becomes empty.

What devices submit IO from request_fn? Seems like something that would 
be better handled out of a make_request_fn hook for that device, which 
would also avoid this corner case.


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/1] blk-core: fix queue stuck on attempt to submit request from unplug
  2015-09-28 17:27 ` Jens Axboe
@ 2015-09-28 17:36   ` Roman Peniaev
  0 siblings, 0 replies; 3+ messages in thread
From: Roman Peniaev @ 2015-09-28 17:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

On Mon, Sep 28, 2015 at 7:27 PM, Jens Axboe <axboe@kernel.dk> wrote:
> On 09/27/2015 02:44 PM, Roman Pen wrote:
>>
>> In case of several stacked block devices, which both were inited by
>> blk_init_queue call, you can catch the queue stuck, if first device
>> in stack makes bio submit being in a flush of a plug list.
>>
>> Let's consider this regular scenario taking readahead into account
>> (readahead.c:read_pages):
>>
>> 1. Start plug
>> 2. Read pages in loop
>> 3. Finish plug
>>
>> This example generates backtrace as follows:
>>
>> 1. blk_start_plug
>> 2. generic_make_request
>>          q->make_request_fn
>>          [blk_queue_bio]
>>              if (current->plug)
>>                  list_add_tail(&req->queuelist, &plug->list);
>> 3. blk_finish_plug
>>          blk_flush_plug_list
>>              queue_unplugged
>>                  __blk_run_queue
>>                      XXX_request_fn [some request handler of block device]
>>                          generic_make_request
>>                              q->make_request_fn
>>                              [blk_queue_bio]
>>                                  if (current->plug)
>>                                      list_add_tail(&req->queuelist,
>> &plug->list);
>>
>> So the problem is, that on step 3. XXX_request_fn makes
>> another request, which again will be put to plug list,
>> because plug is till active, thus new request will be
>> stuck forever in the queue.
>>
>> How to fix?
>> Do flush plug list till it becomes empty.
>
>
> What devices submit IO from request_fn? Seems like something that would be
> better handled out of a make_request_fn hook for that device, which would
> also avoid this corner case.

Practically, this stuck was observed while doing some experiments with md,
which was switched from bio mode to request mode.

But what drawbacks has this loop-till-empty? Because it seems that those
corner cases with plugging can be easily fixed. Or I am missing something?

Also e.g. flush_plug_callbacks() does exactly the same:

while (!list_empty(&plug->cb_list)) {
      list_splice_init(&plug->cb_list, &callbacks);

      while (!list_empty(&callbacks)) {
       [...]
      }
}


--
Roman



>
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-09-28 17:36 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-27 20:44 [PATCH 1/1] blk-core: fix queue stuck on attempt to submit request from unplug Roman Pen
2015-09-28 17:27 ` Jens Axboe
2015-09-28 17:36   ` Roman Peniaev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).