* [PATCH v3 0/4] block: misc changes
@ 2017-03-27 12:06 Ming Lei
  2017-03-27 12:06 ` [PATCH v3 1/4] blk-mq: comment on races related with timeout handler Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Ming Lei @ 2017-03-27 12:06 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

Hi,

The 1st patch adds comments on blk-mq races with the timeout handler.

The other 3 patches improve handling of a dying queue:
	- the 2nd one adds a barrier in blk_queue_enter() to
	avoid a hang caused by reordered reads
	- the 3rd and 4th patches block new I/O from entering the queue
	after the queue is set as dying

V3:
	- tweak comments as suggested by Bart Van Assche

V2:
	- add one missing barrier in blk_queue_enter()

V1:
	- add comments on races related with timeout handler
	- add Tested-by & Reviewed-by tag

thanks,
Ming

Ming Lei (4):
  blk-mq: comment on races related with timeout handler
  block: add a read barrier in blk_queue_enter()
  block: rename blk_mq_freeze_queue_start()
  block: block new I/O just after queue is set as dying

 block/blk-core.c                  | 16 ++++++++++++++++
 block/blk-mq.c                    | 32 +++++++++++++++++++++++++++-----
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 drivers/nvme/host/core.c          |  2 +-
 include/linux/blk-mq.h            |  2 +-
 5 files changed, 46 insertions(+), 8 deletions(-)

-- 
2.9.3

* [PATCH v3 1/4] blk-mq: comment on races related with timeout handler
  2017-03-27 12:06 [PATCH v3 0/4] block: misc changes Ming Lei
@ 2017-03-27 12:06 ` Ming Lei
  2017-03-27 12:11   ` Johannes Thumshirn
  2017-03-27 12:06 ` [PATCH v3 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2017-03-27 12:06 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

This patch adds comments on two races related to the
timeout handler:

- requeue from queue busy vs. timeout
- rq free & reallocation vs. timeout

Neither the races themselves nor the current solution is
explicit enough, so add comments on them.
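
For readers who want to experiment with the second race outside the
kernel, the ordering argument can be modelled in plain C11 atomics
roughly as follows. This is only an illustrative sketch: the struct,
the helper names and the acquire/release operations are stand-ins for
the real blk-mq code and its barriers, not the implementation itself.

#include <stdatomic.h>
#include <stdbool.h>

struct req {
	_Atomic unsigned long deadline;	/* models rq->deadline        */
	_Atomic bool complete;		/* models REQ_ATOM_COMPLETE   */
};

/*
 * Reuse path: publish the new deadline before clearing "complete",
 * mirroring the barrier blk_mq_start_request() places between the two
 * stores (see the second comment added below).
 */
static void start_request(struct req *rq, unsigned long now, unsigned long tmo)
{
	atomic_store_explicit(&rq->deadline, now + tmo, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&rq->complete, false, memory_order_relaxed);
}

/*
 * Timeout path: fire only if the deadline looks expired and we win the
 * race to set "complete".  Per the reasoning in the comment added below,
 * observing the old deadline means the reuse path cannot have cleared
 * "complete" yet, so the exchange finds it still set and we back off.
 */
static bool try_time_out(struct req *rq, unsigned long now)
{
	if (now < atomic_load_explicit(&rq->deadline, memory_order_acquire))
		return false;
	return !atomic_exchange_explicit(&rq->complete, true,
					 memory_order_acq_rel);
}

int main(void)
{
	struct req rq = { 0 };

	start_request(&rq, 100, 30);
	return try_time_out(&rq, 110) ? 1 : 0;	/* 110 < 130: no timeout */
}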

Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-mq.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c212b9644a9f..b36f0481ba0e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -523,6 +523,15 @@ void blk_mq_start_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_start_request);
 
+/*
+ * When we reach here because the queue is busy, the REQ_ATOM_COMPLETE
+ * flag isn't set yet, so there may be a race with the timeout handler.
+ * But given that rq->deadline has just been set in .queue_rq() in
+ * this situation, the race is not possible in practice because
+ * rq->timeout should be big enough to cover the window between
+ * blk_mq_start_request() being called from .queue_rq() and
+ * REQ_ATOM_STARTED being cleared here.
+ */
 static void __blk_mq_requeue_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
@@ -696,6 +705,19 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
 		return;
 
+	/*
+	 * The rq being checked may have been freed and reallocated
+	 * already by the time we get here; we avoid this race by
+	 * checking rq->deadline and the REQ_ATOM_COMPLETE flag together:
+	 *
+	 * - if rq->deadline is observed as the new value because of
+	 *   reuse, the rq won't be timed out, simply for timing reasons.
+	 * - if rq->deadline is observed as the previous value, the
+	 *   REQ_ATOM_COMPLETE flag can't have been cleared by the reuse
+	 *   path yet, because a barrier sits between setting rq->deadline
+	 *   and clearing the flag in blk_mq_start_request(), so this
+	 *   rq won't be timed out either.
+	 */
 	if (time_after_eq(jiffies, rq->deadline)) {
 		if (!blk_mark_rq_complete(rq))
 			blk_mq_rq_timed_out(rq, reserved);
-- 
2.9.3

* [PATCH v3 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-27 12:06 [PATCH v3 0/4] block: misc changes Ming Lei
  2017-03-27 12:06 ` [PATCH v3 1/4] blk-mq: comment on races related with timeout handler Ming Lei
@ 2017-03-27 12:06 ` Ming Lei
  2017-03-27 12:14   ` Johannes Thumshirn
  2017-03-27 15:44   ` Bart Van Assche
  2017-03-27 12:06 ` [PATCH v3 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Ming Lei @ 2017-03-27 12:06 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

Without the barrier, reading the DEAD flag of .q_usage_counter
and reading .mq_freeze_depth may be reordered, in which case the
following wait_event_interruptible() may never return.
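
For readers who want the pairing spelled out, here is a minimal C11
model of the ordering this patch relies on. The names are invented and
the release/acquire operations below merely stand in for the barrier
implied by blk_mq_freeze_queue_start() and the smp_rmb() added here;
it is a sketch, not the kernel code.

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static _Atomic bool dead;		/* models __PERCPU_REF_DEAD    */
static _Atomic int freeze_depth;	/* models q->mq_freeze_depth   */

/* Freeze side: bump the depth, then mark the counter dead (release). */
static void freeze_queue_start(void)
{
	atomic_fetch_add_explicit(&freeze_depth, 1, memory_order_relaxed);
	atomic_store_explicit(&dead, true, memory_order_release);
}

/*
 * Enter side: once DEAD has been observed, the acquire load (playing
 * the role of the smp_rmb()) guarantees the depth bump is observed as
 * well, so a frozen queue can never be mistaken for an idle one.
 */
static bool queue_looks_frozen(void)
{
	if (!atomic_load_explicit(&dead, memory_order_acquire))
		return false;
	return atomic_load_explicit(&freeze_depth, memory_order_relaxed) > 0;
}

int main(void)
{
	freeze_queue_start();
	assert(queue_looks_frozen());	/* must hold once DEAD is seen */
	return 0;
}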

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad388d5e309a..5e8963bc98d9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -669,6 +669,15 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 		if (nowait)
 			return -EBUSY;
 
+		/*
+		 * read pair of barrier in blk_mq_freeze_queue_start(),
+		 * we need to order reading __PERCPU_REF_DEAD flag of
+		 * .q_usage_counter and reading .mq_freeze_depth,
+		 * otherwise the following wait may never return if the
+		 * two reads are reordered.
+		 */
+		smp_rmb();
+
 		ret = wait_event_interruptible(q->mq_freeze_wq,
 				!atomic_read(&q->mq_freeze_depth) ||
 				blk_queue_dying(q));
-- 
2.9.3

* [PATCH v3 3/4] block: rename blk_mq_freeze_queue_start()
  2017-03-27 12:06 [PATCH v3 0/4] block: misc changes Ming Lei
  2017-03-27 12:06 ` [PATCH v3 1/4] blk-mq: comment on races related with timeout handler Ming Lei
  2017-03-27 12:06 ` [PATCH v3 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
@ 2017-03-27 12:06 ` Ming Lei
  2017-03-27 12:15   ` Johannes Thumshirn
  2017-03-27 12:06 ` [PATCH v3 4/4] block: block new I/O just after queue is set as dying Ming Lei
  2017-03-29  2:11 ` [PATCH v3 0/4] block: misc changes Ming Lei
  4 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2017-03-27 12:06 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

As .q_usage_counter is used by both the legacy and the
mq path, we need to block new I/O in blk_queue_enter()
if the queue becomes dead.

Hence rename blk_mq_freeze_queue_start() so that it can
be used in both paths.

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-core.c                  |  2 +-
 block/blk-mq.c                    | 10 +++++-----
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 drivers/nvme/host/core.c          |  2 +-
 include/linux/blk-mq.h            |  2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 5e8963bc98d9..60f364e1d36b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -670,7 +670,7 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 			return -EBUSY;
 
 		/*
-		 * read pair of barrier in blk_mq_freeze_queue_start(),
+		 * read pair of barrier in blk_freeze_queue_start(),
 		 * we need to order reading __PERCPU_REF_DEAD flag of
 		 * .q_usage_counter and reading .mq_freeze_depth,
 		 * otherwise the following wait may never return if the
diff --git a/block/blk-mq.c b/block/blk-mq.c
index b36f0481ba0e..5370b4f750ff 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -68,7 +68,7 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 	sbitmap_clear_bit(&hctx->ctx_map, ctx->index_hw);
 }
 
-void blk_mq_freeze_queue_start(struct request_queue *q)
+void blk_freeze_queue_start(struct request_queue *q)
 {
 	int freeze_depth;
 
@@ -78,7 +78,7 @@ void blk_mq_freeze_queue_start(struct request_queue *q)
 		blk_mq_run_hw_queues(q, false);
 	}
 }
-EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_start);
+EXPORT_SYMBOL_GPL(blk_freeze_queue_start);
 
 void blk_mq_freeze_queue_wait(struct request_queue *q)
 {
@@ -108,7 +108,7 @@ void blk_freeze_queue(struct request_queue *q)
 	 * no blk_unfreeze_queue(), and blk_freeze_queue() is not
 	 * exported to drivers as the only user for unfreeze is blk_mq.
 	 */
-	blk_mq_freeze_queue_start(q);
+	blk_freeze_queue_start(q);
 	blk_mq_freeze_queue_wait(q);
 }
 
@@ -746,7 +746,7 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	 * percpu_ref_tryget directly, because we need to be able to
 	 * obtain a reference even in the short window between the queue
 	 * starting to freeze, by dropping the first reference in
-	 * blk_mq_freeze_queue_start, and the moment the last request is
+	 * blk_freeze_queue_start, and the moment the last request is
 	 * consumed, marked by the instant q_usage_counter reaches
 	 * zero.
 	 */
@@ -2376,7 +2376,7 @@ static void blk_mq_queue_reinit_work(void)
 	 * take place in parallel.
 	 */
 	list_for_each_entry(q, &all_q_list, all_q_node)
-		blk_mq_freeze_queue_start(q);
+		blk_freeze_queue_start(q);
 	list_for_each_entry(q, &all_q_list, all_q_node)
 		blk_mq_freeze_queue_wait(q);
 
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index f96ab717534c..c96c35ab39df 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -4162,7 +4162,7 @@ static int mtip_block_remove(struct driver_data *dd)
 		dev_info(&dd->pdev->dev, "device %s surprise removal\n",
 						dd->disk->disk_name);
 
-	blk_mq_freeze_queue_start(dd->queue);
+	blk_freeze_queue_start(dd->queue);
 	blk_mq_stop_hw_queues(dd->queue);
 	blk_mq_tagset_busy_iter(&dd->tags, mtip_no_dev_cleanup, dd);
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9b3b57fef446..4a6d7f408769 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2386,7 +2386,7 @@ void nvme_start_freeze(struct nvme_ctrl *ctrl)
 
 	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry(ns, &ctrl->namespaces, list)
-		blk_mq_freeze_queue_start(ns->queue);
+		blk_freeze_queue_start(ns->queue);
 	mutex_unlock(&ctrl->namespaces_mutex);
 }
 EXPORT_SYMBOL_GPL(nvme_start_freeze);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5b3e201c8d4f..ea2e9dcd3aef 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -243,7 +243,7 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv);
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_unfreeze_queue(struct request_queue *q);
-void blk_mq_freeze_queue_start(struct request_queue *q);
+void blk_freeze_queue_start(struct request_queue *q);
 void blk_mq_freeze_queue_wait(struct request_queue *q);
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout);
-- 
2.9.3

* [PATCH v3 4/4] block: block new I/O just after queue is set as dying
  2017-03-27 12:06 [PATCH v3 0/4] block: misc changes Ming Lei
                   ` (2 preceding siblings ...)
  2017-03-27 12:06 ` [PATCH v3 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
@ 2017-03-27 12:06 ` Ming Lei
  2017-03-27 12:16   ` Johannes Thumshirn
  2017-03-27 15:49   ` Bart Van Assche
  2017-03-29  2:11 ` [PATCH v3 0/4] block: misc changes Ming Lei
  4 siblings, 2 replies; 13+ messages in thread
From: Ming Lei @ 2017-03-27 12:06 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei, Tejun Heo

Before commit 780db2071a ("blk-mq: decouble blk-mq freezing
from generic bypassing"), the dying flag was checked before
entering the queue; Tejun's commit converted that check into one
on .mq_freeze_depth, assuming the counter is increased just after
the dying flag is set. Unfortunately we don't do that in
blk_set_queue_dying().

This patch calls blk_freeze_queue_start() in blk_set_queue_dying(),
so that we can block new I/O once the queue is set as dying.

Given that blk_set_queue_dying() is only called in the remove path
of a block device, and the queue will be cleaned up later, we don't
need to worry about undoing the counter.
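
To illustrate how this slots into the picture from patch 2, here is a
rough, self-contained sketch of the intended teardown-vs-submission
interplay, again with made-up names and C11 atomics standing in for
the kernel primitives; the real blk_queue_enter() sleeps on
mq_freeze_wq instead of spinning as the loop below does.

#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

static _Atomic bool dying;		/* models QUEUE_FLAG_DYING     */
static _Atomic bool dead;		/* models __PERCPU_REF_DEAD    */
static _Atomic int freeze_depth;	/* models q->mq_freeze_depth   */

/*
 * Teardown: set the dying flag, then start the freeze; the freeze is
 * never undone because queue cleanup follows shortly afterwards.
 */
static void set_queue_dying(void)
{
	atomic_store_explicit(&dying, true, memory_order_relaxed);
	atomic_fetch_add_explicit(&freeze_depth, 1, memory_order_relaxed);
	atomic_store_explicit(&dead, true, memory_order_release);
}

/*
 * Submission: once DEAD is observed, the acquire load (the smp_rmb()
 * in blk_queue_enter()) also makes the dying flag visible, so new I/O
 * either fails with -ENODEV or keeps waiting for the freeze to end.
 */
static int queue_enter(void)
{
	for (;;) {
		if (!atomic_load_explicit(&dead, memory_order_acquire))
			return 0;	/* queue alive, reference taken */
		if (atomic_load_explicit(&dying, memory_order_relaxed))
			return -ENODEV;	/* dying: reject new I/O */
		/* frozen but not dying: the real code sleeps here until
		 * freeze_depth drops back to zero */
	}
}

int main(void)
{
	set_queue_dying();
	return queue_enter() == -ENODEV ? 0 : 1; /* dying queue rejects I/O */
}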

Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-core.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 60f364e1d36b..e22c4ea002ec 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -500,6 +500,13 @@ void blk_set_queue_dying(struct request_queue *q)
 	queue_flag_set(QUEUE_FLAG_DYING, q);
 	spin_unlock_irq(q->queue_lock);
 
+	/*
+	 * When queue DYING flag is set, we need to block new req
+	 * entering queue, so we call blk_freeze_queue_start() to
+	 * prevent I/O from crossing blk_queue_enter().
+	 */
+	blk_freeze_queue_start(q);
+
 	if (q->mq_ops)
 		blk_mq_wake_waiters(q);
 	else {
@@ -672,9 +679,9 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 		/*
 		 * read pair of barrier in blk_freeze_queue_start(),
 		 * we need to order reading __PERCPU_REF_DEAD flag of
-		 * .q_usage_counter and reading .mq_freeze_depth,
-		 * otherwise the following wait may never return if the
-		 * two reads are reordered.
+		 * .q_usage_counter and reading .mq_freeze_depth or
+		 * queue dying flag, otherwise the following wait may
+		 * never return if the two reads are reordered.
 		 */
 		smp_rmb();
 
-- 
2.9.3

* Re: [PATCH v3 1/4] blk-mq: comment on races related with timeout handler
  2017-03-27 12:06 ` [PATCH v3 1/4] blk-mq: comment on races related with timeout handler Ming Lei
@ 2017-03-27 12:11   ` Johannes Thumshirn
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Thumshirn @ 2017-03-27 12:11 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche,
	Hannes Reinecke

On Mon, Mar 27, 2017 at 08:06:55PM +0800, Ming Lei wrote:
> This patch adds comments on two races related to the
> timeout handler:
> 
> - requeue from queue busy vs. timeout
> - rq free & reallocation vs. timeout
> 
> Neither the races themselves nor the current solution is
> explicit enough, so add comments on them.
> 
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

* Re: [PATCH v3 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-27 12:06 ` [PATCH v3 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
@ 2017-03-27 12:14   ` Johannes Thumshirn
  2017-03-27 15:44   ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Johannes Thumshirn @ 2017-03-27 12:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche,
	Hannes Reinecke

On Mon, Mar 27, 2017 at 08:06:56PM +0800, Ming Lei wrote:
> Without the barrier, reading the DEAD flag of .q_usage_counter
> and reading .mq_freeze_depth may be reordered, in which case the
> following wait_event_interruptible() may never return.
> 
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

* Re: [PATCH v3 3/4] block: rename blk_mq_freeze_queue_start()
  2017-03-27 12:06 ` [PATCH v3 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
@ 2017-03-27 12:15   ` Johannes Thumshirn
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Thumshirn @ 2017-03-27 12:15 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche,
	Hannes Reinecke

On Mon, Mar 27, 2017 at 08:06:57PM +0800, Ming Lei wrote:
> As .q_usage_counter is used by both the legacy and the
> mq path, we need to block new I/O in blk_queue_enter()
> if the queue becomes dead.
> 
> Hence rename blk_mq_freeze_queue_start() so that it can
> be used in both paths.
> 
> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

* Re: [PATCH v3 4/4] block: block new I/O just after queue is set as dying
  2017-03-27 12:06 ` [PATCH v3 4/4] block: block new I/O just after queue is set as dying Ming Lei
@ 2017-03-27 12:16   ` Johannes Thumshirn
  2017-03-27 15:49   ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Johannes Thumshirn @ 2017-03-27 12:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Bart Van Assche,
	Hannes Reinecke, Tejun Heo

On Mon, Mar 27, 2017 at 08:06:58PM +0800, Ming Lei wrote:
> Before commit 780db2071a ("blk-mq: decouble blk-mq freezing
> from generic bypassing"), the dying flag was checked before
> entering the queue; Tejun's commit converted that check into one
> on .mq_freeze_depth, assuming the counter is increased just after
> the dying flag is set. Unfortunately we don't do that in
> blk_set_queue_dying().
> 
> This patch calls blk_freeze_queue_start() in blk_set_queue_dying(),
> so that we can block new I/O once the queue is set as dying.
> 
> Given that blk_set_queue_dying() is only called in the remove path
> of a block device, and the queue will be cleaned up later, we don't
> need to worry about undoing the counter.
> 
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Tejun Heo <tj@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

* Re: [PATCH v3 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-27 12:06 ` [PATCH v3 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
  2017-03-27 12:14   ` Johannes Thumshirn
@ 2017-03-27 15:44   ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2017-03-27 15:44 UTC (permalink / raw)
  To: hch, linux-block, tom.leiming, axboe; +Cc: hare, jthumshirn

On Mon, 2017-03-27 at 20:06 +0800, Ming Lei wrote:
> diff --git a/block/blk-core.c b/block/blk-core.c
> index ad388d5e309a..5e8963bc98d9 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -669,6 +669,15 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
>  		if (nowait)
>  			return -EBUSY;
> 
> +		/*
> +		 * read pair of barrier in blk_mq_freeze_queue_start(),
> +		 * we need to order reading __PERCPU_REF_DEAD flag of
> +		 * .q_usage_counter and reading .mq_freeze_depth,
> +		 * otherwise the following wait may never return if the
> +		 * two reads are reordered.
> +		 */
> +		smp_rmb();
> +
>  		ret = wait_event_interruptible(q->mq_freeze_wq,
>  				!atomic_read(&q->mq_freeze_depth) ||
>  				blk_queue_dying(q));

Since patch 4/4 modifies the comment introduced by this patch, I would have
preferred that patches 2/4 and 4/4 had been combined into a single
patch. Anyway:

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>

* Re: [PATCH v3 4/4] block: block new I/O just after queue is set as dying
  2017-03-27 12:06 ` [PATCH v3 4/4] block: block new I/O just after queue is set as dying Ming Lei
  2017-03-27 12:16   ` Johannes Thumshirn
@ 2017-03-27 15:49   ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2017-03-27 15:49 UTC (permalink / raw)
  To: hch, linux-block, tom.leiming, axboe; +Cc: tj, hare

On Mon, 2017-03-27 at 20:06 +0800, Ming Lei wrote:
> Before commit 780db2071a ("blk-mq: decouble blk-mq freezing
> from generic bypassing"), the dying flag was checked before
> entering the queue; Tejun's commit converted that check into one
> on .mq_freeze_depth, assuming the counter is increased just after
> the dying flag is set. Unfortunately we don't do that in
> blk_set_queue_dying().
> 
> This patch calls blk_freeze_queue_start() in blk_set_queue_dying(),
> so that we can block new I/O once the queue is set as dying.
> 
> Given that blk_set_queue_dying() is only called in the remove path
> of a block device, and the queue will be cleaned up later, we don't
> need to worry about undoing the counter.
> 
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Tejun Heo <tj@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/blk-core.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 60f364e1d36b..e22c4ea002ec 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -500,6 +500,13 @@ void blk_set_queue_dying(struct request_queue *q)
>  	queue_flag_set(QUEUE_FLAG_DYING, q);
>  	spin_unlock_irq(q->queue_lock);
> 
> +	/*
> +	 * When queue DYING flag is set, we need to block new req
> +	 * entering queue, so we call blk_freeze_queue_start() to
> +	 * prevent I/O from crossing blk_queue_enter().
> +	 */
> +	blk_freeze_queue_start(q);
> +
> 	if (q->mq_ops)
>  		blk_mq_wake_waiters(q);
>  	else {
> @@ -672,9 +679,9 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
>  		/*
>  		 * read pair of barrier in blk_freeze_queue_start(),
>  		 * we need to order reading __PERCPU_REF_DEAD flag of
> -		 * .q_usage_counter and reading .mq_freeze_depth,
> -		 * otherwise the following wait may never return if the
> -		 * two reads are reordered.
> +		 * .q_usage_counter and reading .mq_freeze_depth or
> +		 * queue dying flag, otherwise the following wait may
> +		 * never return if the two reads are reordered.
>  		 */
>  		smp_rmb();
> 

An explanation of why that crossing can happen is still missing above the
blk_freeze_queue_start() call. Additionally, I'm still wondering whether
or not we need "Cc: stable" tags for the patches in this series. But since
the code looks fine:

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>

* Re: [PATCH v3 0/4] block: misc changes
  2017-03-27 12:06 [PATCH v3 0/4] block: misc changes Ming Lei
                   ` (3 preceding siblings ...)
  2017-03-27 12:06 ` [PATCH v3 4/4] block: block new I/O just after queue is set as dying Ming Lei
@ 2017-03-29  2:11 ` Ming Lei
  2017-03-29 14:00   ` Jens Axboe
  4 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2017-03-29  2:11 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke

On Mon, Mar 27, 2017 at 08:06:54PM +0800, Ming Lei wrote:
> Hi,
> 
> The 1st patch adds comments on blk-mq races with the timeout handler.
> 
> The other 3 patches improve handling of a dying queue:
> 	- the 2nd one adds a barrier in blk_queue_enter() to
> 	avoid a hang caused by reordered reads
> 	- the 3rd and 4th patches block new I/O from entering the queue
> 	after the queue is set as dying
> 
> V3:
> 	- tweak comments as suggested by Bart Van Assche

Hi Jens,

This patchset has been posted several times without real code changes
for a while now; could you consider it for v4.12?

Thanks,
Ming

* Re: [PATCH v3 0/4] block: misc changes
  2017-03-29  2:11 ` [PATCH v3 0/4] block: misc changes Ming Lei
@ 2017-03-29 14:00   ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2017-03-29 14:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Christoph Hellwig, Bart Van Assche, Hannes Reinecke

On Wed, Mar 29 2017, Ming Lei wrote:
> On Mon, Mar 27, 2017 at 08:06:54PM +0800, Ming Lei wrote:
> > Hi,
> > 
> > The 1st patch adds comments on blk-mq races with the timeout handler.
> > 
> > The other 3 patches improve handling of a dying queue:
> > 	- the 2nd one adds a barrier in blk_queue_enter() to
> > 	avoid a hang caused by reordered reads
> > 	- the 3rd and 4th patches block new I/O from entering the queue
> > 	after the queue is set as dying
> > 
> > V3:
> > 	- tweak comments as suggested by Bart Van Assche
> 
> Hi Jens,
> 
> This patchset has been posted several times without real code changes
> for a while now; could you consider it for v4.12?

We can queue it up now. I do agree that patch 2+4 should have been
folded into one, but that's not a huge deal.

-- 
Jens Axboe
