All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] nvme fabrics patches
@ 2017-01-12 10:31 Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 1/5] nvmet: delete controllers deletion upon subsystem release Sagi Grimberg
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Sagi Grimberg @ 2017-01-12 10:31 UTC (permalink / raw)


This is a resend of some patches I sent a while ago
and just now got some time to revisit.

I added a fix to a possible race condition where we
free a controller without making sure that fatal_err
and/or async works are cancelled/flushed.

Sagi Grimberg (5):
  nvmet: delete controllers deletion upon subsystem release
  nvmet: Make cntlid globally unique
  nvme: Make controller state visible via sysfs
  nvmet: cancel fatal error and flush async work before free controller
  nvmet: Call fatal_error from keep-alive timout expiration

 drivers/nvme/host/core.c          | 24 ++++++++++++++++++++++++
 drivers/nvme/target/configfs.c    |  1 +
 drivers/nvme/target/core.c        | 25 ++++++++++++++++++-------
 drivers/nvme/target/fabrics-cmd.c |  4 ++--
 drivers/nvme/target/nvmet.h       |  2 +-
 5 files changed, 46 insertions(+), 10 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/5] nvmet: delete controllers deletion upon subsystem release
  2017-01-12 10:31 [PATCH 0/5] nvme fabrics patches Sagi Grimberg
@ 2017-01-12 10:31 ` Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 2/5] nvmet: Make cntlid globally unique Sagi Grimberg
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2017-01-12 10:31 UTC (permalink / raw)


No reason for them to be kept around if we are
deleting the subsystem, so instead of passively
wait for the host to disconnect, actively delete
the controllers.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
Reviewed-by: Max Gurtovoy <maxg at mellanox.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
---
 drivers/nvme/target/configfs.c |  1 +
 drivers/nvme/target/core.c     | 10 ++++++++++
 drivers/nvme/target/nvmet.h    |  1 +
 3 files changed, 12 insertions(+)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 6f5074153dcd..be8c800078e2 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -631,6 +631,7 @@ static void nvmet_subsys_release(struct config_item *item)
 {
 	struct nvmet_subsys *subsys = to_subsys(item);
 
+	nvmet_subsys_del_ctrls(subsys);
 	nvmet_subsys_put(subsys);
 }
 
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index b1d66ed655c9..4a367549eb93 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -935,6 +935,16 @@ static void nvmet_subsys_free(struct kref *ref)
 	kfree(subsys);
 }
 
+void nvmet_subsys_del_ctrls(struct nvmet_subsys *subsys)
+{
+	struct nvmet_ctrl *ctrl;
+
+	mutex_lock(&subsys->lock);
+	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry)
+		ctrl->ops->delete_ctrl(ctrl);
+	mutex_unlock(&subsys->lock);
+}
+
 void nvmet_subsys_put(struct nvmet_subsys *subsys)
 {
 	kref_put(&subsys->ref, nvmet_subsys_free);
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 23d5eb1c944f..cc7ad06b43a7 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -282,6 +282,7 @@ void nvmet_ctrl_put(struct nvmet_ctrl *ctrl);
 struct nvmet_subsys *nvmet_subsys_alloc(const char *subsysnqn,
 		enum nvme_subsys_type type);
 void nvmet_subsys_put(struct nvmet_subsys *subsys);
+void nvmet_subsys_del_ctrls(struct nvmet_subsys *subsys);
 
 struct nvmet_ns *nvmet_find_namespace(struct nvmet_ctrl *ctrl, __le32 nsid);
 void nvmet_put_namespace(struct nvmet_ns *ns);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/5] nvmet: Make cntlid globally unique
  2017-01-12 10:31 [PATCH 0/5] nvme fabrics patches Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 1/5] nvmet: delete controllers deletion upon subsystem release Sagi Grimberg
@ 2017-01-12 10:31 ` Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 3/5] nvme: Make controller state visible via sysfs Sagi Grimberg
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2017-01-12 10:31 UTC (permalink / raw)


We usually log the cntlid which is confusing in case
we have multiple subsystems each with it's own cntlid ida.
Instead make cntlid ida globally unique and log the initial
association.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
Reviewed-by: Max Gurtovoy <maxg at mellanox.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
---
 drivers/nvme/target/core.c        | 10 ++++------
 drivers/nvme/target/fabrics-cmd.c |  4 ++--
 drivers/nvme/target/nvmet.h       |  1 -
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 4a367549eb93..16a1c5e85826 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -17,6 +17,7 @@
 #include "nvmet.h"
 
 static struct nvmet_fabrics_ops *nvmet_transports[NVMF_TRTYPE_MAX];
+static DEFINE_IDA(cntlid_ida);
 
 /*
  * This read/write semaphore is used to synchronize access to configuration
@@ -749,7 +750,7 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
 	if (!ctrl->sqs)
 		goto out_free_cqs;
 
-	ret = ida_simple_get(&subsys->cntlid_ida,
+	ret = ida_simple_get(&cntlid_ida,
 			     NVME_CNTLID_MIN, NVME_CNTLID_MAX,
 			     GFP_KERNEL);
 	if (ret < 0) {
@@ -816,7 +817,7 @@ static void nvmet_ctrl_free(struct kref *ref)
 	list_del(&ctrl->subsys_entry);
 	mutex_unlock(&subsys->lock);
 
-	ida_simple_remove(&subsys->cntlid_ida, ctrl->cntlid);
+	ida_simple_remove(&cntlid_ida, ctrl->cntlid);
 	nvmet_subsys_put(subsys);
 
 	kfree(ctrl->sqs);
@@ -915,9 +916,6 @@ struct nvmet_subsys *nvmet_subsys_alloc(const char *subsysnqn,
 	mutex_init(&subsys->lock);
 	INIT_LIST_HEAD(&subsys->namespaces);
 	INIT_LIST_HEAD(&subsys->ctrls);
-
-	ida_init(&subsys->cntlid_ida);
-
 	INIT_LIST_HEAD(&subsys->hosts);
 
 	return subsys;
@@ -930,7 +928,6 @@ static void nvmet_subsys_free(struct kref *ref)
 
 	WARN_ON_ONCE(!list_empty(&subsys->namespaces));
 
-	ida_destroy(&subsys->cntlid_ida);
 	kfree(subsys->subsysnqn);
 	kfree(subsys);
 }
@@ -973,6 +970,7 @@ static void __exit nvmet_exit(void)
 {
 	nvmet_exit_configfs();
 	nvmet_exit_discovery();
+	ida_destroy(&cntlid_ida);
 
 	BUILD_BUG_ON(sizeof(struct nvmf_disc_rsp_page_entry) != 1024);
 	BUILD_BUG_ON(sizeof(struct nvmf_disc_rsp_page_hdr) != 1024);
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index f4088198cd0d..18dc314601c3 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -153,8 +153,8 @@ static void nvmet_execute_admin_connect(struct nvmet_req *req)
 		goto out;
 	}
 
-	pr_info("creating controller %d for NQN %s.\n",
-			ctrl->cntlid, ctrl->hostnqn);
+	pr_info("creating controller %d for subsystem %s for NQN %s.\n",
+		ctrl->cntlid, ctrl->subsys->subsysnqn, ctrl->hostnqn);
 	req->rsp->result.u16 = cpu_to_le16(ctrl->cntlid);
 
 out:
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index cc7ad06b43a7..1370eee0a3c0 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -142,7 +142,6 @@ struct nvmet_subsys {
 	unsigned int		max_nsid;
 
 	struct list_head	ctrls;
-	struct ida		cntlid_ida;
 
 	struct list_head	hosts;
 	bool			allow_any_host;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/5] nvme: Make controller state visible via sysfs
  2017-01-12 10:31 [PATCH 0/5] nvme fabrics patches Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 1/5] nvmet: delete controllers deletion upon subsystem release Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 2/5] nvmet: Make cntlid globally unique Sagi Grimberg
@ 2017-01-12 10:31 ` Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 4/5] nvmet: cancel fatal error and flush async work before free controller Sagi Grimberg
  2017-01-12 10:31 ` [PATCH 5/5] nvmet: Call fatal_error from keep-alive timout expiration Sagi Grimberg
  4 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2017-01-12 10:31 UTC (permalink / raw)


Easier for debugging and testing state machine
transitions.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
Reviewed-by: Christoph Hellwig <hch at lst.de>
---
 drivers/nvme/host/core.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 8a3c3e32a704..439aebf4473c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1521,6 +1521,29 @@ static ssize_t nvme_sysfs_show_transport(struct device *dev,
 }
 static DEVICE_ATTR(transport, S_IRUGO, nvme_sysfs_show_transport, NULL);
 
+static ssize_t nvme_sysfs_show_state(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+	static const char *const state_name[] = {
+		[NVME_CTRL_NEW]		= "new",
+		[NVME_CTRL_LIVE]	= "live",
+		[NVME_CTRL_RESETTING]	= "resetting",
+		[NVME_CTRL_RECONNECTING]= "reconnecting",
+		[NVME_CTRL_DELETING]	= "deleting",
+		[NVME_CTRL_DEAD]	= "dead",
+	};
+
+	if ((unsigned)ctrl->state < ARRAY_SIZE(state_name) &&
+	    state_name[ctrl->state])
+		return sprintf(buf, "%s\n", state_name[ctrl->state]);
+
+	return sprintf(buf, "unknown state\n");
+}
+
+static DEVICE_ATTR(state, S_IRUGO, nvme_sysfs_show_state, NULL);
+
 static ssize_t nvme_sysfs_show_subsysnqn(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
@@ -1553,6 +1576,7 @@ static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_transport.attr,
 	&dev_attr_subsysnqn.attr,
 	&dev_attr_address.attr,
+	&dev_attr_state.attr,
 	NULL
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/5] nvmet: cancel fatal error and flush async work before free controller
  2017-01-12 10:31 [PATCH 0/5] nvme fabrics patches Sagi Grimberg
                   ` (2 preceding siblings ...)
  2017-01-12 10:31 ` [PATCH 3/5] nvme: Make controller state visible via sysfs Sagi Grimberg
@ 2017-01-12 10:31 ` Sagi Grimberg
  2017-01-13  7:21   ` Christoph Hellwig
  2017-01-12 10:31 ` [PATCH 5/5] nvmet: Call fatal_error from keep-alive timout expiration Sagi Grimberg
  4 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2017-01-12 10:31 UTC (permalink / raw)


Make sure they are not running and we can free the controller
safely.

Signed-off-by: Roy Shterman <roys at lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/target/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 16a1c5e85826..dcdc2f6146f1 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -817,6 +817,9 @@ static void nvmet_ctrl_free(struct kref *ref)
 	list_del(&ctrl->subsys_entry);
 	mutex_unlock(&subsys->lock);
 
+	flush_work(&ctrl->async_event_work);
+	cancel_work_sync(&ctrl->fatal_err_work);
+
 	ida_simple_remove(&cntlid_ida, ctrl->cntlid);
 	nvmet_subsys_put(subsys);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 5/5] nvmet: Call fatal_error from keep-alive timout expiration
  2017-01-12 10:31 [PATCH 0/5] nvme fabrics patches Sagi Grimberg
                   ` (3 preceding siblings ...)
  2017-01-12 10:31 ` [PATCH 4/5] nvmet: cancel fatal error and flush async work before free controller Sagi Grimberg
@ 2017-01-12 10:31 ` Sagi Grimberg
  2017-01-13  7:22   ` Christoph Hellwig
  4 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2017-01-12 10:31 UTC (permalink / raw)


We only need to call delete_ctrl once, so given that both
keep-alive timeout and any other fatal error can trigger it,
just make sure we only call delete_ctrl once.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/target/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index dcdc2f6146f1..5267ce20c12d 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -201,7 +201,7 @@ static void nvmet_keep_alive_timer(struct work_struct *work)
 	pr_err("ctrl %d keep-alive timer (%d seconds) expired!\n",
 		ctrl->cntlid, ctrl->kato);
 
-	ctrl->ops->delete_ctrl(ctrl);
+	nvmet_ctrl_fatal_error(ctrl);
 }
 
 static void nvmet_start_keep_alive_timer(struct nvmet_ctrl *ctrl)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/5] nvmet: cancel fatal error and flush async work before free controller
  2017-01-12 10:31 ` [PATCH 4/5] nvmet: cancel fatal error and flush async work before free controller Sagi Grimberg
@ 2017-01-13  7:21   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2017-01-13  7:21 UTC (permalink / raw)


On Thu, Jan 12, 2017@12:31:18PM +0200, Sagi Grimberg wrote:
> Make sure they are not running and we can free the controller
> safely.
> 
> Signed-off-by: Roy Shterman <roys at lightbitslabs.com>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>

Looks good,

Reviewed-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 5/5] nvmet: Call fatal_error from keep-alive timout expiration
  2017-01-12 10:31 ` [PATCH 5/5] nvmet: Call fatal_error from keep-alive timout expiration Sagi Grimberg
@ 2017-01-13  7:22   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2017-01-13  7:22 UTC (permalink / raw)


On Thu, Jan 12, 2017@12:31:19PM +0200, Sagi Grimberg wrote:
> We only need to call delete_ctrl once, so given that both
> keep-alive timeout and any other fatal error can trigger it,
> just make sure we only call delete_ctrl once.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>  drivers/nvme/target/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
> index dcdc2f6146f1..5267ce20c12d 100644
> --- a/drivers/nvme/target/core.c
> +++ b/drivers/nvme/target/core.c
> @@ -201,7 +201,7 @@ static void nvmet_keep_alive_timer(struct work_struct *work)
>  	pr_err("ctrl %d keep-alive timer (%d seconds) expired!\n",
>  		ctrl->cntlid, ctrl->kato);
>  
> -	ctrl->ops->delete_ctrl(ctrl);
> +	nvmet_ctrl_fatal_error(ctrl);

Looks good,

Reviewed-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-01-13  7:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-12 10:31 [PATCH 0/5] nvme fabrics patches Sagi Grimberg
2017-01-12 10:31 ` [PATCH 1/5] nvmet: delete controllers deletion upon subsystem release Sagi Grimberg
2017-01-12 10:31 ` [PATCH 2/5] nvmet: Make cntlid globally unique Sagi Grimberg
2017-01-12 10:31 ` [PATCH 3/5] nvme: Make controller state visible via sysfs Sagi Grimberg
2017-01-12 10:31 ` [PATCH 4/5] nvmet: cancel fatal error and flush async work before free controller Sagi Grimberg
2017-01-13  7:21   ` Christoph Hellwig
2017-01-12 10:31 ` [PATCH 5/5] nvmet: Call fatal_error from keep-alive timout expiration Sagi Grimberg
2017-01-13  7:22   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.