linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] blk-mq: allow to unfreeze queue when io isn't drained
@ 2018-09-18 10:19 Ming Lei
  2018-09-18 10:19 ` [PATCH 1/4] percpu-refcount: move zeroing of percpu part into percpu_ref_switch_to_atomic_rcu Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-18 10:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ming Lei

Hi,

The 1st patch is a cleanup that prepares for introducing percpu_ref_resurge().

The 2nd patch introduces percpu_ref_resurge() for implementing
blk_mq_unfreeze_queue_no_drain_io().

The 3rd patch introduces blk_mq_unfreeze_queue_no_drain_io() for
cases in which the queue can be unfrozen without draining IO.

The 4th patch applies blk_mq_unfreeze_queue_no_drain_io() to the
NVMe PCI timeout path, so that an IO hang can be avoided inside
nvme_reset_work() when a new IO timeout is triggered.

Part of the idea comes from Jianchao's earlier work:

https://marc.info/?l=linux-kernel&m=153612052611020&w=2
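
As an illustration only (not code from this series), the intended usage in
a driver recovery path might look like the sketch below; the foo_* names
are hypothetical, while blk_mq_unfreeze_queue_no_drain_io() is the helper
added by patch 3:

	/* hypothetical reset handler, assuming this series is applied */
	static void foo_reset_work(struct foo_ctrl *ctrl)
	{
		/* queue freezing was started before the controller was disabled */
		foo_recover_controller(ctrl);	/* hypothetical HW recovery */

		/*
		 * Timed-out requests can only complete after being retried,
		 * so don't wait for the frozen queue to drain; unfreeze the
		 * queue and let the pending IO be retried against the
		 * recovered controller.
		 */
		blk_mq_unfreeze_queue_no_drain_io(ctrl->queue);
	}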


Ming Lei (4):
  percpu-refcount: move zeroing of percpu part into
    percpu_ref_switch_to_atomic_rcu
  lib/percpu-refcount: introduce percpu_ref_resurge()
  blk-mq: introduce blk_mq_unfreeze_queue_no_drain_io
  nvme: don't drain IO in nvme_reset_work()

 block/blk-mq.c                  | 25 +++++++++++++--
 drivers/nvme/host/core.c        | 12 ++++---
 drivers/nvme/host/nvme.h        |  2 +-
 drivers/nvme/host/pci.c         |  3 +-
 include/linux/blk-mq.h          |  1 +
 include/linux/percpu-refcount.h |  1 +
 lib/percpu-refcount.c           | 71 +++++++++++++++++++++++++++++++----------
 7 files changed, 90 insertions(+), 25 deletions(-)

-- 
2.9.5



* [PATCH 1/4] percpu-refcount: move zeroing of percpu part into percpu_ref_switch_to_atomic_rcu
  2018-09-18 10:19 [PATCH 0/4] blk-mq: allow to unfreeze queue when io isn't drained Ming Lei
@ 2018-09-18 10:19 ` Ming Lei
  2018-09-18 10:19 ` [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge() Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-18 10:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ming Lei, Tejun Heo, Jianchao Wang, Kent Overstreet, linux-block

The percpu parts are added to the atomic part in percpu_ref_switch_to_atomic_rcu(),
so it is more readable and reasonable to zero them there as well.

Cc: Tejun Heo <tj@kernel.org>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-block@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 lib/percpu-refcount.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 9f96fa7bc000..a220b717f6bb 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -130,8 +130,10 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
 	unsigned long count = 0;
 	int cpu;
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
 		count += *per_cpu_ptr(percpu_count, cpu);
+		*per_cpu_ptr(percpu_count, cpu) = 0;
+	}
 
 	pr_debug("global %ld percpu %ld",
 		 atomic_long_read(&ref->count), (long)count);
@@ -187,7 +189,6 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
 {
 	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
-	int cpu;
 
 	BUG_ON(!percpu_count);
 
@@ -202,9 +203,6 @@ static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
 	 * zeroing is visible to all percpu accesses which can see the
 	 * following __PERCPU_REF_ATOMIC clearing.
 	 */
-	for_each_possible_cpu(cpu)
-		*per_cpu_ptr(percpu_count, cpu) = 0;
-
 	smp_store_release(&ref->percpu_count_ptr,
 			  ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC);
 }
-- 
2.9.5



* [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge()
  2018-09-18 10:19 [PATCH 0/4] blk-mq: allow to unfreeze queue when io isn't drained Ming Lei
  2018-09-18 10:19 ` [PATCH 1/4] percpu-refcount: move zeroing of percpu part into percpu_ref_switch_to_atomic_rcu Ming Lei
@ 2018-09-18 10:19 ` Ming Lei
  2018-09-18 12:50   ` Tejun Heo
  2018-09-19  5:19   ` jianchao.wang
  2018-09-18 10:19 ` [PATCH 3/4] blk-mq: introduce blk_mq_unfreeze_queue_no_drain_io Ming Lei
  2018-09-18 10:19 ` [PATCH 4/4] nvme: don't drain IO in nvme_reset_work() Ming Lei
  3 siblings, 2 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-18 10:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ming Lei, Tejun Heo, Jianchao Wang, Kent Overstreet, linux-block

Currently percpu_ref_reinit() can only be done on a percpu refcount
after it has dropped to zero. This limit needn't be so strict: the
reinit can also be done while the refcount hasn't dropped to zero,
because the ref is already in atomic mode at that point.

This patch introduces percpu_ref_resurge(), in which the above limit is
relaxed, so the extra change[1] for the NVMe timeout requirement can be
avoided.

[1] https://marc.info/?l=linux-kernel&m=153612052611020&w=2
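
For illustration only (this snippet is not part of the patch), the
difference from percpu_ref_reinit() is that the ref does not have to
reach zero before being brought back; q_usage_counter below is blk-mq's
queue ref, used here as an example:

	/* the ref was killed earlier, e.g. when the queue freeze was started */
	percpu_ref_kill(&q->q_usage_counter);

	/* percpu_ref_reinit() would WARN here unless every ref had been dropped */

	/*
	 * percpu_ref_resurge() may be called while references are still held,
	 * e.g. for requests that timed out and will only be retried later.
	 */
	percpu_ref_resurge(&q->q_usage_counter);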

Cc: Tejun Heo <tj@kernel.org>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-block@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/percpu-refcount.h |  1 +
 lib/percpu-refcount.c           | 63 ++++++++++++++++++++++++++++++++++-------
 2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 009cdf3d65b6..641841e26256 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -109,6 +109,7 @@ void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
 void percpu_ref_reinit(struct percpu_ref *ref);
+void percpu_ref_resurge(struct percpu_ref *ref);
 
 /**
  * percpu_ref_kill - drop the initial ref
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index a220b717f6bb..3e385a1401af 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -341,6 +341,42 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 }
 EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
 
+/*
+ * If @need_drop_zero isn't set, clear the DEAD & ATOMIC flags and reinit
+ * the ref without checking whether its ref value has dropped to zero.
+ */
+static void __percpu_ref_reinit(struct percpu_ref *ref, bool need_drop_zero)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&percpu_ref_switch_lock, flags);
+
+	if (need_drop_zero) {
+		WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+	} else {
+		unsigned long __percpu *percpu_count;
+
+		WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
+
+		/* get one extra ref for avoiding race with .release */
+		rcu_read_lock_sched();
+		atomic_long_add(1, &ref->count);
+		rcu_read_unlock_sched();
+	}
+
+	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
+	percpu_ref_get(ref);
+	__percpu_ref_switch_mode(ref, NULL);
+
+	if (!need_drop_zero) {
+		rcu_read_lock_sched();
+		atomic_long_sub(1, &ref->count);
+		rcu_read_unlock_sched();
+	}
+
+	spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+}
+
 /**
  * percpu_ref_reinit - re-initialize a percpu refcount
  * @ref: perpcu_ref to re-initialize
@@ -354,16 +390,21 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
  */
 void percpu_ref_reinit(struct percpu_ref *ref)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&percpu_ref_switch_lock, flags);
-
-	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
-
-	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
-	percpu_ref_get(ref);
-	__percpu_ref_switch_mode(ref, NULL);
-
-	spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+	__percpu_ref_reinit(ref, true);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_reinit);
+
+/**
+ * percpu_ref_resurge - resurge a percpu refcount
+ * @ref: percpu_ref to resurge
+ *
+ * Resurge @ref so that it's in the same state as before it was killed.
+ *
+ * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
+ * this function is in progress.
+ */
+void percpu_ref_resurge(struct percpu_ref *ref)
+{
+	__percpu_ref_reinit(ref, false);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_resurge);
-- 
2.9.5



* [PATCH 3/4] blk-mq: introduce blk_mq_unfreeze_queue_no_drain_io
  2018-09-18 10:19 [PATCH 0/4] blk-mq: allow to unfreeze queue when io isn't drained Ming Lei
  2018-09-18 10:19 ` [PATCH 1/4] percpu-refcount: move zeroing of percpu part into percpu_ref_switch_to_atomic_rcu Ming Lei
  2018-09-18 10:19 ` [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge() Ming Lei
@ 2018-09-18 10:19 ` Ming Lei
  2018-09-18 10:19 ` [PATCH 4/4] nvme: don't drain IO in nvme_reset_work() Ming Lei
  3 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-18 10:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ming Lei, Tejun Heo, Jianchao Wang, Kent Overstreet, linux-block,
	Christoph Hellwig, linux-nvme, Keith Busch

This patch introduces blk_mq_unfreeze_queue_no_drain_io() so that it
can be used when it isn't necessary to check whether IO is drained,
such as during NVMe PCI resetting (nvme_reset_work()).

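For example (illustration only, not part of this patch), a driver recovery
path might pair the existing freeze-start helper with the new unfreeze
variant roughly as follows:

	blk_freeze_queue_start(q);	/* existing API: start freezing the queue */

	/* reset the hardware here; timed-out requests can't complete yet */

	/* unfreeze without waiting for the in-flight IO to be drained */
	blk_mq_unfreeze_queue_no_drain_io(q);
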
Cc: Tejun Heo <tj@kernel.org>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-block@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Cc: Keith Busch <keith.busch@intel.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c         | 25 +++++++++++++++++++++++--
 include/linux/blk-mq.h |  1 +
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 85a1c1a59c72..a22f82061b93 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -191,20 +191,41 @@ void blk_mq_freeze_queue(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_mq_freeze_queue);
 
-void blk_mq_unfreeze_queue(struct request_queue *q)
+static void __blk_mq_unfreeze_queue(struct request_queue *q,
+		bool need_drop_zero)
 {
 	int freeze_depth;
 
 	freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
-		percpu_ref_reinit(&q->q_usage_counter);
+		if (need_drop_zero)
+			percpu_ref_reinit(&q->q_usage_counter);
+		else
+			percpu_ref_resurge(&q->q_usage_counter);
 		wake_up_all(&q->mq_freeze_wq);
 	}
 }
+
+void blk_mq_unfreeze_queue(struct request_queue *q)
+{
+	__blk_mq_unfreeze_queue(q, true);
+}
 EXPORT_SYMBOL_GPL(blk_mq_unfreeze_queue);
 
 /*
+ * Compared with blk_mq_unfreeze_queue(), the _no_drain_io version
+ * doesn't require that the queue is really frozen. It is useful for
+ * timeout handling, in which IO can't be drained and has to be
+ * retried after the controller is recovered.
+ */
+void blk_mq_unfreeze_queue_no_drain_io(struct request_queue *q)
+{
+	__blk_mq_unfreeze_queue(q, false);
+}
+EXPORT_SYMBOL_GPL(blk_mq_unfreeze_queue_no_drain_io);
+
+/*
  * FIXME: replace the scsi_internal_device_*block_nowait() calls in the
  * mpt3sas driver such that this function can be removed.
  */
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1da59c16f637..5e0740ec407f 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -279,6 +279,7 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv);
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_unfreeze_queue(struct request_queue *q);
+void blk_mq_unfreeze_queue_no_drain_io(struct request_queue *q);
 void blk_freeze_queue_start(struct request_queue *q);
 void blk_mq_freeze_queue_wait(struct request_queue *q);
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
-- 
2.9.5



* [PATCH 4/4] nvme: don't drain IO in nvme_reset_work()
  2018-09-18 10:19 [PATCH 0/4] blk-mq: allow to unfreeze queue when io isn't drained Ming Lei
                   ` (2 preceding siblings ...)
  2018-09-18 10:19 ` [PATCH 3/4] blk-mq: introduce blk_mq_unfreeze_queue_no_drain_io Ming Lei
@ 2018-09-18 10:19 ` Ming Lei
  3 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-18 10:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ming Lei, Tejun Heo, Jianchao Wang, Kent Overstreet, linux-block,
	Christoph Hellwig, linux-nvme, Keith Busch

After the controller is recovered, it isn't necessary to wait for the
completion of all in-flight IO. More importantly, it is easy to trigger
a deadlock if a new IO timeout fires during that wait, since the
timed-out request can only be completed after the controller is
recovered again.

Cc: Tejun Heo <tj@kernel.org>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-block@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Cc: Keith Busch <keith.busch@intel.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/core.c | 12 ++++++++----
 drivers/nvme/host/nvme.h |  2 +-
 drivers/nvme/host/pci.c  |  3 +--
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dd8ec1dd9219..cf6a2267d44e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1203,7 +1203,7 @@ static void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects)
 	if (effects & NVME_CMD_EFFECTS_LBCC)
 		nvme_update_formats(ctrl);
 	if (effects & (NVME_CMD_EFFECTS_LBCC | NVME_CMD_EFFECTS_CSE_MASK))
-		nvme_unfreeze(ctrl);
+		nvme_unfreeze(ctrl, true);
 	if (effects & NVME_CMD_EFFECTS_CCC)
 		nvme_init_identify(ctrl);
 	if (effects & (NVME_CMD_EFFECTS_NIC | NVME_CMD_EFFECTS_NCC))
@@ -3602,13 +3602,17 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_kill_queues);
 
-void nvme_unfreeze(struct nvme_ctrl *ctrl)
+void nvme_unfreeze(struct nvme_ctrl *ctrl, bool check_io_drained)
 {
 	struct nvme_ns *ns;
 
 	down_read(&ctrl->namespaces_rwsem);
-	list_for_each_entry(ns, &ctrl->namespaces, list)
-		blk_mq_unfreeze_queue(ns->queue);
+	list_for_each_entry(ns, &ctrl->namespaces, list) {
+		if (check_io_drained)
+			blk_mq_unfreeze_queue(ns->queue);
+		else
+			blk_mq_unfreeze_queue_no_drain_io(ns->queue);
+	}
 	up_read(&ctrl->namespaces_rwsem);
 }
 EXPORT_SYMBOL_GPL(nvme_unfreeze);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bb4a2003c097..fd56270637d1 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -432,7 +432,7 @@ void nvme_complete_async_event(struct nvme_ctrl *ctrl, __le16 status,
 void nvme_stop_queues(struct nvme_ctrl *ctrl);
 void nvme_start_queues(struct nvme_ctrl *ctrl);
 void nvme_kill_queues(struct nvme_ctrl *ctrl);
-void nvme_unfreeze(struct nvme_ctrl *ctrl);
+void nvme_unfreeze(struct nvme_ctrl *ctrl, bool check_io_drained);
 void nvme_wait_freeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
 void nvme_start_freeze(struct nvme_ctrl *ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d668682f91df..1c26a2e92063 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2328,11 +2328,10 @@ static void nvme_reset_work(struct work_struct *work)
 		new_state = NVME_CTRL_ADMIN_ONLY;
 	} else {
 		nvme_start_queues(&dev->ctrl);
-		nvme_wait_freeze(&dev->ctrl);
 		/* hit this only when allocate tagset fails */
 		if (nvme_dev_add(dev))
 			new_state = NVME_CTRL_ADMIN_ONLY;
-		nvme_unfreeze(&dev->ctrl);
+		nvme_unfreeze(&dev->ctrl, false);
 	}
 
 	/*
-- 
2.9.5



* Re: [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge()
  2018-09-18 10:19 ` [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge() Ming Lei
@ 2018-09-18 12:50   ` Tejun Heo
  2018-09-19  3:09     ` Ming Lei
  2018-09-19  5:19   ` jianchao.wang
  1 sibling, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2018-09-18 12:50 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel, Jianchao Wang, Kent Overstreet, linux-block

On Tue, Sep 18, 2018 at 06:19:44PM +0800, Ming Lei wrote:
> Now percpu_ref_reinit() can only be done on one percpu refcounter when it
> drops zero. And the limit shouldn't be so strict, and it is quite
> straightforward that we can do it when the refcount doesn't drop zero
> because it is at atomic mode.
> 
> This patch introduces percpu_ref_resurge() in which the above limit is
> relaxed, so we may avoid extra change[1] for NVMe timeout's requirement.

For now,

Nacked-by: Tejun Heo <tj@kernel.org>

Please see the original discussion thread.

Thanks.

-- 
tejun


* Re: [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge()
  2018-09-18 12:50   ` Tejun Heo
@ 2018-09-19  3:09     ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-19  3:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Jianchao Wang, Kent Overstreet, linux-block

On Tue, Sep 18, 2018 at 05:50:21AM -0700, Tejun Heo wrote:
> On Tue, Sep 18, 2018 at 06:19:44PM +0800, Ming Lei wrote:
> > Now percpu_ref_reinit() can only be done on one percpu refcounter when it
> > drops zero. And the limit shouldn't be so strict, and it is quite
> > straightforward that we can do it when the refcount doesn't drop zero
> > because it is at atomic mode.
> > 
> > This patch introduces percpu_ref_resurge() in which the above limit is
> > relaxed, so we may avoid extra change[1] for NVMe timeout's requirement.
> 
> For now,
> 
> Nacked-by: Tejun Heo <tj@kernel.org>
> 
> Please see the original discussion thread.

Your comment in that thread assumes that synchronize_rcu() is used for
avoiding the race with .release().

But this patchset doesn't use that approach at all for avoiding this
race.

Thanks,
Ming


* Re: [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge()
  2018-09-18 10:19 ` [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge() Ming Lei
  2018-09-18 12:50   ` Tejun Heo
@ 2018-09-19  5:19   ` jianchao.wang
  2018-09-19  7:55     ` Ming Lei
  1 sibling, 1 reply; 10+ messages in thread
From: jianchao.wang @ 2018-09-19  5:19 UTC (permalink / raw)
  To: Ming Lei, linux-kernel; +Cc: Tejun Heo, Kent Overstreet, linux-block

Hi Ming

On 09/18/2018 06:19 PM, Ming Lei wrote:
> +		unsigned long __percpu *percpu_count;
> +
> +		WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
> +
> +		/* get one extra ref for avoiding race with .release */
> +		rcu_read_lock_sched();
> +		atomic_long_add(1, &ref->count);
> +		rcu_read_unlock_sched();
> +	}

The rcu_read_lock_sched() here is redundant: we are already in the
critical section of spin_lock_irqsave().

The atomic_long_add(1, &ref->count) may have two results:
1. ref->count > 1
   It will not drop to zero any more.
2. ref->count == 1
   It has already dropped to zero and .release may be running.

Thanks
Jianchao




* Re: [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge()
  2018-09-19  5:19   ` jianchao.wang
@ 2018-09-19  7:55     ` Ming Lei
  2018-09-19  8:01       ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2018-09-19  7:55 UTC (permalink / raw)
  To: jianchao.wang; +Cc: linux-kernel, Tejun Heo, Kent Overstreet, linux-block

On Wed, Sep 19, 2018 at 01:19:10PM +0800, jianchao.wang wrote:
> Hi Ming
> 
> On 09/18/2018 06:19 PM, Ming Lei wrote:
> > +		unsigned long __percpu *percpu_count;
> > +
> > +		WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
> > +
> > +		/* get one extra ref for avoiding race with .release */
> > +		rcu_read_lock_sched();
> > +		atomic_long_add(1, &ref->count);
> > +		rcu_read_unlock_sched();
> > +	}
> 
> The rcu_read_lock_sched here is redundant. We have been in the critical section
> of a spin_lock_irqsave.

Right.

> 
> The atomic_long_add(1, &ref->count) may have two result.
> 1. ref->count > 1
>    it will not drop to zero any more.
> 2. ref->count == 1
>    it has dropped to zero and .release may be running.

IMO, both cases are fine and supported. Or do you have another concern
about this approach?

thanks, 
Ming


* Re: [PATCH 2/4] lib/percpu-refcount: introduce percpu_ref_resurge()
  2018-09-19  7:55     ` Ming Lei
@ 2018-09-19  8:01       ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2018-09-19  8:01 UTC (permalink / raw)
  To: jianchao.wang; +Cc: linux-kernel, Tejun Heo, Kent Overstreet, linux-block

On Wed, Sep 19, 2018 at 03:55:07PM +0800, Ming Lei wrote:
> On Wed, Sep 19, 2018 at 01:19:10PM +0800, jianchao.wang wrote:
> > Hi Ming
> > 
> > On 09/18/2018 06:19 PM, Ming Lei wrote:
> > > +		unsigned long __percpu *percpu_count;
> > > +
> > > +		WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
> > > +
> > > +		/* get one extra ref for avoiding race with .release */
> > > +		rcu_read_lock_sched();
> > > +		atomic_long_add(1, &ref->count);
> > > +		rcu_read_unlock_sched();
> > > +	}
> > 
> > The rcu_read_lock_sched here is redundant. We have been in the critical section
> > of a spin_lock_irqsave.
> 
> Right.
> 
> > 
> > The atomic_long_add(1, &ref->count) may have two result.
> > 1. ref->count > 1
> >    it will not drop to zero any more.
> > 2. ref->count == 1
> >    it has dropped to zero and .release may be running.
> 
> IMO, both the two cases are fine and supported, or do you have other
> concern about this way?

My previous reply was too quick, :-)

Yeah, the .release() may be running.

For the blk-mq/NVMe use case, it won't be an issue. We may document this
race in a comment and let users handle it if it is a problem.


thanks,
Ming

