* [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
From: Ming Lei @ 2018-01-18  2:41 UTC
  To: Jens Axboe, linux-block, Mike Snitzer, dm-devel
  Cc: Christoph Hellwig, Bart Van Assche, linux-kernel, Omar Sandoval,
	Ming Lei

A driver can return BLK_STS_RESOURCE when it runs out of any resource,
and that resource may have nothing to do with tags, for example a
failed kmalloc(GFP_ATOMIC). If the queue is idle when this kind of
BLK_STS_RESOURCE is returned, RESTART can't work any more, because no
in-flight request will complete and rerun the queue, so an IO hang may
result.

Most drivers may call kmalloc(GFP_ATOMIC) in the IO path, and almost
all of them return BLK_STS_RESOURCE when it fails. For dm-mpath the
issue may be a bit easier to trigger, since the request pool of the
underlying queue can be exhausted much more easily. In practice it is
still not easy to trigger: I ran all kinds of tests on
dm-mpath/scsi-debug with all kinds of scsi_debug parameters and could
not trigger this issue at all. It was finally triggered by Bart's SRP
test, which seems to have been made by a genius :-)

This patch deals with this situation by running the queue again when
the queue is found idle in the timeout handler.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---

Another approach is to do the check right after BLK_STS_RESOURCE is
returned from .queue_rq() and BLK_MQ_S_SCHED_RESTART is set. That may
add a bit of cost to the hot path. It was actually V1 of this patch,
which can be found at the following link:

	https://github.com/ming1/linux/commit/68a66900f3647ea6751aab2848b1e5eef508feaa

Or are there other, better ways?
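
To make the failure mode concrete, here is a small userspace model of it.
This is only a sketch under stated assumptions: struct model_hctx and all
model_* functions are hypothetical names invented for this illustration,
not kernel APIs, and the single alloc_fails flag stands in for any non-tag
resource such as kmalloc(GFP_ATOMIC). It shows a queue run that sets
RESTART while nothing is in flight, so that no completion will ever rerun
the queue, and a periodic fixup in the spirit of this patch that reruns it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_status { MODEL_STS_OK, MODEL_STS_RESOURCE };

struct model_hctx {
	bool sched_restart;	/* models BLK_MQ_S_SCHED_RESTART */
	int in_flight;		/* requests currently owned by the driver */
	int pending;		/* requests still waiting for dispatch */
	bool alloc_fails;	/* models a failing kmalloc(GFP_ATOMIC) */
};

/* Models ->queue_rq(): it can fail for a reason unrelated to tags. */
static enum model_status model_queue_rq(struct model_hctx *h)
{
	if (h->alloc_fails)
		return MODEL_STS_RESOURCE;
	h->in_flight++;
	return MODEL_STS_OK;
}

/* Models a queue run: dispatch until the driver pushes back. */
static void model_run_queue(struct model_hctx *h)
{
	h->sched_restart = false;
	while (h->pending > 0) {
		if (model_queue_rq(h) == MODEL_STS_RESOURCE) {
			/* Rely on a later completion to rerun the queue. */
			h->sched_restart = true;
			return;
		}
		h->pending--;
	}
}

/* Models a completion, the only normal trigger for a restart. */
static void model_complete_one(struct model_hctx *h)
{
	assert(h->in_flight > 0);
	h->in_flight--;
	if (h->sched_restart)
		model_run_queue(h);
}

/*
 * Models the periodic fixup: if RESTART is still set and nothing is
 * in flight, no completion can arrive, so rerun the queue here.
 * Returns true if the queue was rerun.
 */
static bool model_fixup_restart(struct model_hctx *h)
{
	if (h->sched_restart && h->in_flight == 0) {
		model_run_queue(h);
		return true;
	}
	return false;
}
```

For example, with pending = 2 and alloc_fails = true, model_run_queue()
leaves the model stuck with RESTART set and nothing in flight. Once
alloc_fails is cleared, model_fixup_restart() dispatches both requests.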

 block/blk-mq.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 82 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6e3f77829dcc..4d4af8d712da 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -896,6 +896,85 @@ static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
 		blk_mq_rq_timed_out(rq, reserved);
 }
 
+struct hctx_busy_data {
+	struct blk_mq_hw_ctx *hctx;
+	bool reserved;
+	bool busy;
+};
+
+static bool check_busy_hctx(struct sbitmap *sb, unsigned int bitnr, void *data)
+{
+	struct hctx_busy_data *busy_data = data;
+	struct blk_mq_hw_ctx *hctx = busy_data->hctx;
+	struct request *rq;
+
+	if (!busy_data->reserved)
+		bitnr += hctx->tags->nr_reserved_tags;
+
+	rq = hctx->tags->static_rqs[bitnr];
+	if (blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT) {
+		busy_data->busy = true;
+		return false;
+	}
+	return true;
+}
+
+/* Check if there is any in-flight request */
+static bool blk_mq_hctx_is_busy(struct blk_mq_hw_ctx *hctx)
+{
+	struct hctx_busy_data data = {
+		.hctx = hctx,
+		.busy = false,
+		.reserved = true,
+	};
+
+	sbitmap_for_each_set(&hctx->tags->breserved_tags.sb,
+			check_busy_hctx, &data);
+	if (data.busy)
+		return true;
+
+	data.reserved = false;
+	sbitmap_for_each_set(&hctx->tags->bitmap_tags.sb,
+			check_busy_hctx, &data);
+	if (data.busy)
+		return true;
+
+	return false;
+}
+
+static void blk_mq_fixup_restart(struct blk_mq_hw_ctx *hctx)
+{
+	if (test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state)) {
+		bool busy;
+
+		/*
+		 * If this hctx is still marked as RESTART and there
+		 * are no in-flight requests, we have to run the queue
+		 * here to prevent IO from hanging.
+		 *
+		 * A driver can return BLK_STS_RESOURCE when it runs out
+		 * of any resource, and that resource may be unrelated to
+		 * tags, such as kmalloc(GFP_ATOMIC). If the queue is idle
+		 * when this kind of BLK_STS_RESOURCE is returned, restart
+		 * can't work any more, and an IO hang may result.
+		 *
+		 * The barrier below pairs with the one in
+		 * blk_mq_put_driver_tag() after BLK_STS_RESOURCE is
+		 * returned from ->queue_rq().
+		 */
+		smp_mb();
+
+		busy = blk_mq_hctx_is_busy(hctx);
+		if (!busy) {
+			printk(KERN_WARNING "blk-mq: fixup RESTART\n");
+			printk(KERN_WARNING "\t If this message is shown"
+			       " a bit often, please report the issue to"
+			       " linux-block@vger.kernel.org\n");
+			blk_mq_run_hw_queue(hctx, true);
+		}
+	}
+}
+
 static void blk_mq_timeout_work(struct work_struct *work)
 {
 	struct request_queue *q =
@@ -966,8 +1045,10 @@ static void blk_mq_timeout_work(struct work_struct *work)
 		 */
 		queue_for_each_hw_ctx(q, hctx, i) {
 			/* the hctx may be unmapped, so check it here */
-			if (blk_mq_hw_queue_mapped(hctx))
+			if (blk_mq_hw_queue_mapped(hctx)) {
 				blk_mq_tag_idle(hctx);
+				blk_mq_fixup_restart(hctx);
+			}
 		}
 	}
 	blk_queue_exit(q);
-- 
2.9.5


Thread overview: 80+ messages
2018-01-18  2:41 [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle Ming Lei
2018-01-18 16:50 ` Bart Van Assche
2018-01-18 17:03   ` Mike Snitzer
2018-01-18 17:03     ` Mike Snitzer
2018-01-18 17:20     ` Bart Van Assche
2018-01-18 17:20       ` Bart Van Assche
2018-01-18 18:30       ` Mike Snitzer
2018-01-18 18:47         ` Bart Van Assche
2018-01-18 18:47           ` Bart Van Assche
2018-01-18 20:11           ` Jens Axboe
2018-01-18 20:11             ` Jens Axboe
2018-01-18 20:48             ` Mike Snitzer
2018-01-18 20:58               ` Bart Van Assche
2018-01-18 20:58                 ` Bart Van Assche
2018-01-18 21:23                 ` Mike Snitzer
2018-01-18 21:23                   ` Mike Snitzer
2018-01-18 21:37                   ` Laurence Oberman
2018-01-18 21:39                   ` [dm-devel] " Bart Van Assche
2018-01-18 21:39                     ` Bart Van Assche
2018-01-18 21:45                     ` Laurence Oberman
2018-01-18 21:45                       ` Laurence Oberman
2018-01-18 22:01                     ` Mike Snitzer
2018-01-18 22:18                       ` Laurence Oberman
2018-01-18 22:20                         ` Laurence Oberman
2018-01-18 22:20                           ` Laurence Oberman
2018-01-18 22:24                         ` Bart Van Assche
2018-01-18 22:24                           ` Bart Van Assche
2018-01-18 22:35                           ` Laurence Oberman
2018-01-18 22:39                             ` Jens Axboe
2018-01-18 22:55                               ` Bart Van Assche
2018-01-18 22:55                                 ` Bart Van Assche
2018-01-18 22:20                       ` Bart Van Assche
2018-01-18 22:20                         ` Bart Van Assche
2018-01-23  9:22                         ` [PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle) Mike Snitzer
2018-01-23 10:53                           ` Ming Lei
2018-01-23 12:15                             ` Mike Snitzer
2018-01-23 12:17                               ` Ming Lei
2018-01-23 12:43                                 ` Mike Snitzer
2018-01-23 16:43                           ` [PATCH] " Bart Van Assche
2018-01-23 16:43                             ` Bart Van Assche
2018-01-19  2:32             ` [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle Ming Lei
2018-01-19  4:02               ` Jens Axboe
2018-01-19  7:26                 ` Ming Lei
2018-01-19 15:20                   ` Bart Van Assche
2018-01-19 15:20                     ` Bart Van Assche
2018-01-19 15:25                     ` Jens Axboe
2018-01-19 15:33                     ` Ming Lei
2018-01-19 16:06                       ` Bart Van Assche
2018-01-19 16:06                         ` Bart Van Assche
2018-01-19 15:24                   ` Jens Axboe
2018-01-19 15:40                     ` Ming Lei
2018-01-19 15:40                       ` Ming Lei
2018-01-19 15:48                       ` Jens Axboe
2018-01-19 16:05                         ` Ming Lei
2018-01-19 16:19                           ` Jens Axboe
2018-01-19 16:26                             ` Ming Lei
2018-01-19 16:27                               ` Jens Axboe
2018-01-19 16:37                                 ` Ming Lei
2018-01-19 16:41                                   ` Jens Axboe
2018-01-19 16:41                                     ` Jens Axboe
2018-01-19 16:47                                     ` Mike Snitzer
2018-01-19 16:52                                       ` Jens Axboe
2018-01-19 17:05                                         ` Ming Lei
2018-01-19 17:09                                           ` Jens Axboe
2018-01-19 17:20                                             ` Ming Lei
2018-01-19 17:38                                   ` Jens Axboe
2018-01-19 18:24                                     ` Ming Lei
2018-01-19 18:24                                       ` Ming Lei
2018-01-19 18:33                                     ` Mike Snitzer
2018-01-19 23:52                                     ` Ming Lei
2018-01-20  4:27                                       ` Jens Axboe
2018-01-19 16:13                         ` Mike Snitzer
2018-01-19 16:23                           ` Jens Axboe
2018-01-19 23:57                             ` Ming Lei
2018-01-29 22:37                     ` Bart Van Assche
2018-01-19  5:09               ` Bart Van Assche
2018-01-19  5:09                 ` Bart Van Assche
2018-01-19  7:34                 ` Ming Lei
2018-01-19 19:47                   ` Bart Van Assche
2018-01-19 19:47                     ` Bart Van Assche
