All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] block: misc changes
@ 2017-03-24 12:36 Ming Lei
  2017-03-24 12:36 ` [PATCH v2 1/4] blk-mq: comment on races related with timeout handler Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Ming Lei @ 2017-03-24 12:36 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

Hi,

The 1st  patch add comments on blk-mq races with timeout handler.

The other 3 patches improves handling for dying queue:
	- the 2nd one adds one barrier in blk_queue_enter() for
	avoiding hanging caused by out-of-order
	- the 3rd and 4th patches block new I/O entering queue
	after queue is set as dying

V1:
	- add comments on races related with timeout handler
	- add Tested-by & Reviewed-by tag

thanks,
Ming

Ming Lei (4):
  blk-mq: comment on races related with timeout handler
  block: add a read barrier in blk_queue_enter()
  block: rename blk_mq_freeze_queue_start()
  block: block new I/O just after queue is set as dying

 block/blk-core.c                  | 12 ++++++++++++
 block/blk-mq.c                    | 32 +++++++++++++++++++++++++++-----
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 drivers/nvme/host/core.c          |  2 +-
 include/linux/blk-mq.h            |  2 +-
 5 files changed, 42 insertions(+), 8 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/4] blk-mq: comment on races related with timeout handler
  2017-03-24 12:36 [PATCH v2 0/4] block: misc changes Ming Lei
@ 2017-03-24 12:36 ` Ming Lei
  2017-03-24 12:36 ` [PATCH v2 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2017-03-24 12:36 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

This patch adds comment on two races related with
timeout handler:

- requeue from queue busy vs. timeout
- rq free & reallocation vs. timeout

Both the races themselves and current solution aren't
explicit enough, so add comments on them.

Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-mq.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c212b9644a9f..b36f0481ba0e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -523,6 +523,15 @@ void blk_mq_start_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_start_request);
 
+/*
+ * When we reach here because queue is busy, REQ_ATOM_COMPLETE
+ * flag isn't set yet, so there may be race with timeout hanlder,
+ * but given rq->deadline is just set in .queue_rq() under
+ * this situation, the race won't be possible in reality because
+ * rq->timeout should be set as big enough to cover the window
+ * between blk_mq_start_request() called from .queue_rq() and
+ * clearing REQ_ATOM_STARTED here.
+ */
 static void __blk_mq_requeue_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
@@ -696,6 +705,19 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
 		return;
 
+	/*
+	 * The rq being checked may have been freed and reallocated
+	 * out already here, we avoid this race by checking rq->deadline
+	 * and REQ_ATOM_COMPLETE flag together:
+	 *
+	 * - if rq->deadline is observed as new value because of
+	 *   reusing, the rq won't be timed out because of timing.
+	 * - if rq->deadline is observed as previous value,
+	 *   REQ_ATOM_COMPLETE flag won't be cleared in reuse path
+	 *   because we put a barrier between setting rq->deadline
+	 *   and clearing the flag in blk_mq_start_request(), so
+	 *   this rq won't be timed out too.
+	 */
 	if (time_after_eq(jiffies, rq->deadline)) {
 		if (!blk_mark_rq_complete(rq))
 			blk_mq_rq_timed_out(rq, reserved);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-24 12:36 [PATCH v2 0/4] block: misc changes Ming Lei
  2017-03-24 12:36 ` [PATCH v2 1/4] blk-mq: comment on races related with timeout handler Ming Lei
@ 2017-03-24 12:36 ` Ming Lei
  2017-03-24 15:18   ` Hannes Reinecke
  2017-03-24 17:24   ` Bart Van Assche
  2017-03-24 12:36 ` [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
  2017-03-24 12:36 ` [PATCH v2 4/4] block: block new I/O just after queue is set as dying Ming Lei
  3 siblings, 2 replies; 14+ messages in thread
From: Ming Lei @ 2017-03-24 12:36 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

Without the barrier, reading DEAD flag of .q_usage_counter
and reading .mq_freeze_depth may be reordered, then the
following wait_event_interruptible() may never return.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad388d5e309a..44eed17319c0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -669,6 +669,14 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 		if (nowait)
 			return -EBUSY;
 
+		/*
+		 * read pair of barrier in blk_mq_freeze_queue_start(),
+		 * we need to order reading DEAD flag of .q_usage_counter
+		 * and reading .mq_freeze_depth, otherwise the following
+		 * wait may never return if the two read are reordered.
+		 */
+		smp_rmb();
+
 		ret = wait_event_interruptible(q->mq_freeze_wq,
 				!atomic_read(&q->mq_freeze_depth) ||
 				blk_queue_dying(q));
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start()
  2017-03-24 12:36 [PATCH v2 0/4] block: misc changes Ming Lei
  2017-03-24 12:36 ` [PATCH v2 1/4] blk-mq: comment on races related with timeout handler Ming Lei
  2017-03-24 12:36 ` [PATCH v2 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
@ 2017-03-24 12:36 ` Ming Lei
  2017-03-24 15:20   ` Hannes Reinecke
  2017-03-24 17:29   ` Bart Van Assche
  2017-03-24 12:36 ` [PATCH v2 4/4] block: block new I/O just after queue is set as dying Ming Lei
  3 siblings, 2 replies; 14+ messages in thread
From: Ming Lei @ 2017-03-24 12:36 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei

As the .q_usage_counter is used by both legacy and
mq path, we need to block new I/O if queue becomes
dead in blk_queue_enter().

So rename it and we can use this function in both
pathes.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-core.c                  |  2 +-
 block/blk-mq.c                    | 10 +++++-----
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 drivers/nvme/host/core.c          |  2 +-
 include/linux/blk-mq.h            |  2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 44eed17319c0..5901133d105f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -670,7 +670,7 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 			return -EBUSY;
 
 		/*
-		 * read pair of barrier in blk_mq_freeze_queue_start(),
+		 * read pair of barrier in blk_freeze_queue_start(),
 		 * we need to order reading DEAD flag of .q_usage_counter
 		 * and reading .mq_freeze_depth, otherwise the following
 		 * wait may never return if the two read are reordered.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index b36f0481ba0e..5370b4f750ff 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -68,7 +68,7 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 	sbitmap_clear_bit(&hctx->ctx_map, ctx->index_hw);
 }
 
-void blk_mq_freeze_queue_start(struct request_queue *q)
+void blk_freeze_queue_start(struct request_queue *q)
 {
 	int freeze_depth;
 
@@ -78,7 +78,7 @@ void blk_mq_freeze_queue_start(struct request_queue *q)
 		blk_mq_run_hw_queues(q, false);
 	}
 }
-EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_start);
+EXPORT_SYMBOL_GPL(blk_freeze_queue_start);
 
 void blk_mq_freeze_queue_wait(struct request_queue *q)
 {
@@ -108,7 +108,7 @@ void blk_freeze_queue(struct request_queue *q)
 	 * no blk_unfreeze_queue(), and blk_freeze_queue() is not
 	 * exported to drivers as the only user for unfreeze is blk_mq.
 	 */
-	blk_mq_freeze_queue_start(q);
+	blk_freeze_queue_start(q);
 	blk_mq_freeze_queue_wait(q);
 }
 
@@ -746,7 +746,7 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	 * percpu_ref_tryget directly, because we need to be able to
 	 * obtain a reference even in the short window between the queue
 	 * starting to freeze, by dropping the first reference in
-	 * blk_mq_freeze_queue_start, and the moment the last request is
+	 * blk_freeze_queue_start, and the moment the last request is
 	 * consumed, marked by the instant q_usage_counter reaches
 	 * zero.
 	 */
@@ -2376,7 +2376,7 @@ static void blk_mq_queue_reinit_work(void)
 	 * take place in parallel.
 	 */
 	list_for_each_entry(q, &all_q_list, all_q_node)
-		blk_mq_freeze_queue_start(q);
+		blk_freeze_queue_start(q);
 	list_for_each_entry(q, &all_q_list, all_q_node)
 		blk_mq_freeze_queue_wait(q);
 
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index f96ab717534c..c96c35ab39df 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -4162,7 +4162,7 @@ static int mtip_block_remove(struct driver_data *dd)
 		dev_info(&dd->pdev->dev, "device %s surprise removal\n",
 						dd->disk->disk_name);
 
-	blk_mq_freeze_queue_start(dd->queue);
+	blk_freeze_queue_start(dd->queue);
 	blk_mq_stop_hw_queues(dd->queue);
 	blk_mq_tagset_busy_iter(&dd->tags, mtip_no_dev_cleanup, dd);
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9b3b57fef446..4a6d7f408769 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2386,7 +2386,7 @@ void nvme_start_freeze(struct nvme_ctrl *ctrl)
 
 	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry(ns, &ctrl->namespaces, list)
-		blk_mq_freeze_queue_start(ns->queue);
+		blk_freeze_queue_start(ns->queue);
 	mutex_unlock(&ctrl->namespaces_mutex);
 }
 EXPORT_SYMBOL_GPL(nvme_start_freeze);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5b3e201c8d4f..ea2e9dcd3aef 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -243,7 +243,7 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv);
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_unfreeze_queue(struct request_queue *q);
-void blk_mq_freeze_queue_start(struct request_queue *q);
+void blk_freeze_queue_start(struct request_queue *q);
 void blk_mq_freeze_queue_wait(struct request_queue *q);
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/4] block: block new I/O just after queue is set as dying
  2017-03-24 12:36 [PATCH v2 0/4] block: misc changes Ming Lei
                   ` (2 preceding siblings ...)
  2017-03-24 12:36 ` [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
@ 2017-03-24 12:36 ` Ming Lei
  2017-03-24 17:45   ` Bart Van Assche
  3 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2017-03-24 12:36 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Christoph Hellwig
  Cc: Bart Van Assche, Hannes Reinecke, Ming Lei, Tejun Heo

Before commit 780db2071a(blk-mq: decouble blk-mq freezing
from generic bypassing), the dying flag is checked before
entering queue, and Tejun converts the checking into .mq_freeze_depth,
and assumes the counter is increased just after dying flag
is set. Unfortunately we doesn't do that in blk_set_queue_dying().

This patch calls blk_freeze_queue_start() in blk_set_queue_dying(),
so that we can block new I/O coming once the queue is set as dying.

Given blk_set_queue_dying() is always called in remove path
of block device, and queue will be cleaned up later, we don't
need to worry about undoing the counter.

Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 5901133d105f..f0dd9b0054ed 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -500,6 +500,9 @@ void blk_set_queue_dying(struct request_queue *q)
 	queue_flag_set(QUEUE_FLAG_DYING, q);
 	spin_unlock_irq(q->queue_lock);
 
+	/* block new I/O coming */
+	blk_freeze_queue_start(q);
+
 	if (q->mq_ops)
 		blk_mq_wake_waiters(q);
 	else {
@@ -672,8 +675,9 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 		/*
 		 * read pair of barrier in blk_freeze_queue_start(),
 		 * we need to order reading DEAD flag of .q_usage_counter
-		 * and reading .mq_freeze_depth, otherwise the following
-		 * wait may never return if the two read are reordered.
+		 * and reading .mq_freeze_depth or dying flag, otherwise
+		 * the following wait may never return if the two read
+		 * are reordered.
 		 */
 		smp_rmb();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-24 12:36 ` [PATCH v2 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
@ 2017-03-24 15:18   ` Hannes Reinecke
  2017-03-24 17:24   ` Bart Van Assche
  1 sibling, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2017-03-24 15:18 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Christoph Hellwig; +Cc: Bart Van Assche

On 03/24/2017 01:36 PM, Ming Lei wrote:
> Without the barrier, reading DEAD flag of .q_usage_counter
> and reading .mq_freeze_depth may be reordered, then the
> following wait_event_interruptible() may never return.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/blk-core.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index ad388d5e309a..44eed17319c0 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -669,6 +669,14 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
>  		if (nowait)
>  			return -EBUSY;
>  
> +		/*
> +		 * read pair of barrier in blk_mq_freeze_queue_start(),
> +		 * we need to order reading DEAD flag of .q_usage_counter
> +		 * and reading .mq_freeze_depth, otherwise the following
> +		 * wait may never return if the two read are reordered.
> +		 */
> +		smp_rmb();
> +
>  		ret = wait_event_interruptible(q->mq_freeze_wq,
>  				!atomic_read(&q->mq_freeze_depth) ||
>  				blk_queue_dying(q));
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start()
  2017-03-24 12:36 ` [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
@ 2017-03-24 15:20   ` Hannes Reinecke
  2017-03-24 17:29   ` Bart Van Assche
  1 sibling, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2017-03-24 15:20 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Christoph Hellwig; +Cc: Bart Van Assche

On 03/24/2017 01:36 PM, Ming Lei wrote:
> As the .q_usage_counter is used by both legacy and
> mq path, we need to block new I/O if queue becomes
> dead in blk_queue_enter().
> 
> So rename it and we can use this function in both
> pathes.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/blk-core.c                  |  2 +-
>  block/blk-mq.c                    | 10 +++++-----
>  drivers/block/mtip32xx/mtip32xx.c |  2 +-
>  drivers/nvme/host/core.c          |  2 +-
>  include/linux/blk-mq.h            |  2 +-
>  5 files changed, 9 insertions(+), 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-24 12:36 ` [PATCH v2 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
  2017-03-24 15:18   ` Hannes Reinecke
@ 2017-03-24 17:24   ` Bart Van Assche
  2017-03-24 17:38     ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2017-03-24 17:24 UTC (permalink / raw)
  To: hch, linux-block, tom.leiming, axboe; +Cc: hare

On Fri, 2017-03-24 at 20:36 +0800, Ming Lei wrote:
> Without the barrier, reading DEAD flag of .q_usage_counter
> and reading .mq_freeze_depth may be reordered, then the
> following wait_event_interruptible() may never return.
>=20
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/blk-core.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>=20
> diff --git a/block/blk-core.c b/block/blk-core.c
> index ad388d5e309a..44eed17319c0 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -669,6 +669,14 @@ int blk_queue_enter(struct request_queue *q, bool no=
wait)
>  		if (nowait)
>  			return -EBUSY;
> =20
> +		/*
> +		 * read pair of barrier in blk_mq_freeze_queue_start(),
> +		 * we need to order reading DEAD flag of .q_usage_counter
> +		 * and reading .mq_freeze_depth, otherwise the following
> +		 * wait may never return if the two read are reordered.
> +		 */
> +		smp_rmb();
> +
>  		ret =3D wait_event_interruptible(q->mq_freeze_wq,
>  				!atomic_read(&q->mq_freeze_depth) ||
>  				blk_queue_dying(q));

Hello Ming,

The code looks fine to me but the comment not. You probably wanted to refer
to the "dying" flag instead of the "dead" flag? The read order has to be
enforced for the "dying" flag and q_usage_counter because of the order in
which these are set by blk_set_queue_dying().

Thanks,

Bart.=

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start()
  2017-03-24 12:36 ` [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
  2017-03-24 15:20   ` Hannes Reinecke
@ 2017-03-24 17:29   ` Bart Van Assche
  2017-03-24 17:52     ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2017-03-24 17:29 UTC (permalink / raw)
  To: hch, linux-block, tom.leiming, axboe; +Cc: hare

On Fri, 2017-03-24 at 20:36 +0800, Ming Lei wrote:
> As the .q_usage_counter is used by both legacy and
> mq path, we need to block new I/O if queue becomes
> dead in blk_queue_enter().
>=20
> So rename it and we can use this function in both
> pathes.

Should "pathes" be changed into "paths" in the commit message? Additionally=
,
this patch breaks the symmetry the comment in blk_mq_freeze_queue() refers
to. Anyway:

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>=

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-24 17:24   ` Bart Van Assche
@ 2017-03-24 17:38     ` Ming Lei
  2017-03-24 18:45       ` Bart Van Assche
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2017-03-24 17:38 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, linux-block, axboe, hare

On Sat, Mar 25, 2017 at 1:24 AM, Bart Van Assche
<Bart.VanAssche@sandisk.com> wrote:
> On Fri, 2017-03-24 at 20:36 +0800, Ming Lei wrote:
>> Without the barrier, reading DEAD flag of .q_usage_counter
>> and reading .mq_freeze_depth may be reordered, then the
>> following wait_event_interruptible() may never return.
>>
>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>> ---
>>  block/blk-core.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index ad388d5e309a..44eed17319c0 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -669,6 +669,14 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
>>               if (nowait)
>>                       return -EBUSY;
>>
>> +             /*
>> +              * read pair of barrier in blk_mq_freeze_queue_start(),
>> +              * we need to order reading DEAD flag of .q_usage_counter
>> +              * and reading .mq_freeze_depth, otherwise the following
>> +              * wait may never return if the two read are reordered.
>> +              */
>> +             smp_rmb();
>> +
>>               ret = wait_event_interruptible(q->mq_freeze_wq,
>>                               !atomic_read(&q->mq_freeze_depth) ||
>>                               blk_queue_dying(q));
>
> Hello Ming,
>
> The code looks fine to me but the comment not. You probably wanted to refer
> to the "dying" flag instead of the "dead" flag? The read order has to be

No, looks you misunderstand the issue.

I mean the order between reading __PERCPU_REF_DEAD of .q_usage_counter
and reading .mq_freeze_depth should be enhanced, especially it is in
blk_queue_enter() vs. blk_mq_freeze_queue_start().

In the last patch, you will find the dying flag is mentioned in above comment
after we call blk_freeze_queue_start() just after the dying flag is set.

> enforced for the "dying" flag and q_usage_counter because of the order in
> which these are set by blk_set_queue_dying().

As I explained, the dying flag should only be mentioned after we change
the code in blk_set_queue_dying().


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 4/4] block: block new I/O just after queue is set as dying
  2017-03-24 12:36 ` [PATCH v2 4/4] block: block new I/O just after queue is set as dying Ming Lei
@ 2017-03-24 17:45   ` Bart Van Assche
  0 siblings, 0 replies; 14+ messages in thread
From: Bart Van Assche @ 2017-03-24 17:45 UTC (permalink / raw)
  To: hch, linux-block, tom.leiming, axboe; +Cc: tj, hare

On Fri, 2017-03-24 at 20:36 +0800, Ming Lei wrote:=20
> +	/* block new I/O coming */
> +	blk_freeze_queue_start(q);

As I have already mentioned two times, the comment above
blk_freeze_queue_start() should be made more clear. It should mention that
without that call blk_queue_enter() won't check the "dying" flag after it
has been set. If that is not mentioned in a comment the next person who
reads the blk_set_queue_dying() function will wonder why the
blk_freeze_queue_start() call is really needed and whether it can be remove=
d.

>  		/*
>  		 * read pair of barrier in blk_freeze_queue_start(),
>  		 * we need to order reading DEAD flag of .q_usage_counter
> -		 * and reading .mq_freeze_depth, otherwise the following
> -		 * wait may never return if the two read are reordered.
> +		 * and reading .mq_freeze_depth or dying flag, otherwise
> +		 * the following wait may never return if the two read
> +		 * are reordered.
>  		 */
>  		smp_rmb();

Please fix the spelling in the above comment ("two read").

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start()
  2017-03-24 17:29   ` Bart Van Assche
@ 2017-03-24 17:52     ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2017-03-24 17:52 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, linux-block, axboe, hare

On Sat, Mar 25, 2017 at 1:29 AM, Bart Van Assche
<Bart.VanAssche@sandisk.com> wrote:
> On Fri, 2017-03-24 at 20:36 +0800, Ming Lei wrote:
>> As the .q_usage_counter is used by both legacy and
>> mq path, we need to block new I/O if queue becomes
>> dead in blk_queue_enter().
>>
>> So rename it and we can use this function in both
>> pathes.
>
> Should "pathes" be changed into "paths" in the commit message? Additionally,
> this patch breaks the symmetry the comment in blk_mq_freeze_queue() refers
> to. Anyway:

Really? Is there one function named as blk_mq_freeze_queue_end()?

The comment means blk_mq_freeze_queue() vs. blk_mq_unfreeze_queue(), which can't
be affected by this patch.


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-24 17:38     ` Ming Lei
@ 2017-03-24 18:45       ` Bart Van Assche
  2017-03-27 11:31         ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2017-03-24 18:45 UTC (permalink / raw)
  To: tom.leiming; +Cc: hch, linux-block, hare, axboe

On Sat, 2017-03-25 at 01:38 +0800, Ming Lei wrote:
> As I explained, the dying flag should only be mentioned after we change
> the code in blk_set_queue_dying().

Hello Ming,

If patches 2 and 4 would be combined into a single patch then it wouldn't
be necessary anymore to update the comment introduced in patch 2 in patch 4=
.
I think that would make this patch series easier to review.

Since the issues fixed by your patches are longstanding issues, have you
considered to add a "Cc: stable" tag?

Thanks,

Bart.=

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] block: add a read barrier in blk_queue_enter()
  2017-03-24 18:45       ` Bart Van Assche
@ 2017-03-27 11:31         ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2017-03-27 11:31 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, linux-block, hare, axboe

On Sat, Mar 25, 2017 at 2:45 AM, Bart Van Assche
<Bart.VanAssche@sandisk.com> wrote:
> On Sat, 2017-03-25 at 01:38 +0800, Ming Lei wrote:
>> As I explained, the dying flag should only be mentioned after we change
>> the code in blk_set_queue_dying().
>
> Hello Ming,
>
> If patches 2 and 4 would be combined into a single patch then it wouldn't
> be necessary anymore to update the comment introduced in patch 2 in patch 4.
> I think that would make this patch series easier to review.

I think it is better to not do that, because we know patch 2 and patch
4 are doing
two different things, and patch 2 can be thought as one fix, but patch 4 should
be a improvement.  And it is normal to update comment after code changes.

>
> Since the issues fixed by your patches are longstanding issues, have you
> considered to add a "Cc: stable" tag?

I don't know because there isn't real report, and maybe never, so I leave it
alone.


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-03-27 12:04 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-24 12:36 [PATCH v2 0/4] block: misc changes Ming Lei
2017-03-24 12:36 ` [PATCH v2 1/4] blk-mq: comment on races related with timeout handler Ming Lei
2017-03-24 12:36 ` [PATCH v2 2/4] block: add a read barrier in blk_queue_enter() Ming Lei
2017-03-24 15:18   ` Hannes Reinecke
2017-03-24 17:24   ` Bart Van Assche
2017-03-24 17:38     ` Ming Lei
2017-03-24 18:45       ` Bart Van Assche
2017-03-27 11:31         ` Ming Lei
2017-03-24 12:36 ` [PATCH v2 3/4] block: rename blk_mq_freeze_queue_start() Ming Lei
2017-03-24 15:20   ` Hannes Reinecke
2017-03-24 17:29   ` Bart Van Assche
2017-03-24 17:52     ` Ming Lei
2017-03-24 12:36 ` [PATCH v2 4/4] block: block new I/O just after queue is set as dying Ming Lei
2017-03-24 17:45   ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.