* [RFC PATCH 0/4] nvme async-connect and discovery uevents
@ 2018-08-21 13:43 Hannes Reinecke
  2018-08-21 13:43 ` [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect Hannes Reinecke
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-21 13:43 UTC (permalink / raw)


Hi all,

here's my next attempt to solve the auto-connect issue. The overall idea
is to add an 'async_connect' option to the nvme CLI, which makes the
connect ioctl return immediately without waiting for the actual connect
to complete. With that it is safe to call 'nvme connect' or 'nvme
discover' from udev rules, without the risk of stalling udev because the
nvme CLI call never returns.
As we then can't return the discovery log anymore (the connect hasn't
happened yet by the time the 'connect' ioctl returns), I've also
implemented discovery uevents, one for each entry in the discovery log
page. And while I was at it I've also implemented a 'discovery' sysfs
attribute for the discovery controller, holding all discovery records
found.

With these patches we can implement a simple autoconnect:

1) Call 'nvme discover --async_connect --trtype ...'.
2) The kernel fetches the discovery log and generates a uevent for each
discovery record:

UDEV  [1920.740092] change   /devices/virtual/nvme-fabrics/ctl/nvme0 (nvme)
ACTION=change
DEVNAME=/dev/nvme0
DEVPATH=/devices/virtual/nvme-fabrics/ctl/nvme0
MAJOR=244
MINOR=0
NVME_DISCOVERY_ADRFAM=fc
NVME_DISCOVERY_HOST_TRADDR=nn-0x20000090fae06325:pn-0x10000090fae06325
NVME_DISCOVERY_SUBNQN=blktests-subsys
NVME_DISCOVERY_TRADDR=nn-0x200200a09890f5bf:pn-0x200300a09890f5bf
NVME_DISCOVERY_TRSVCID=none
NVME_DISCOVERY_TRTYPE=fc
NVME_EVENT=discovery
SEQNUM=6940
SUBSYSTEM=nvme
USEC_INITIALIZED=1920739942

3) The uevent is evaluated with a simple udev rule, which essentially just
calls 'nvme connect --async_connect' with the parameters from the uevent:

SUBSYSTEM=="nvme", ACTION=="change", RUN+="/usr/sbin/nvme connect --async_connect --transport=$env{NVME_DISCOVERY_TRTYPE} --traddr=$env{NVME_DISCOVERY_TRADDR} --host-traddr=$env{NVME_DISCOVERY_HOST_TRADDR} --nqn=$env{NVME_DISCOVERY_SUBNQN}"

4) The connection is established and all namespaces are scanned, etc.
5) After two minutes the discovery controller is removed.
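
For anyone who'd rather consume these uevents from a daemon instead of
a udev rule, here is a minimal sketch using libudev; the property names
match the uevent shown above, everything else (names, the trivial main
loop) is just illustration and not part of this series:

/* gcc -o nvme-watch nvme-watch.c -ludev */
#include <stdio.h>
#include <string.h>
#include <poll.h>
#include <libudev.h>

int main(void)
{
	struct udev *udev = udev_new();
	struct udev_monitor *mon;
	struct pollfd pfd;

	mon = udev_monitor_new_from_netlink(udev, "udev");
	udev_monitor_filter_add_match_subsystem_devtype(mon, "nvme", NULL);
	udev_monitor_enable_receiving(mon);
	pfd.fd = udev_monitor_get_fd(mon);
	pfd.events = POLLIN;

	for (;;) {
		struct udev_device *dev;
		const char *ev;

		if (poll(&pfd, 1, -1) <= 0)
			continue;
		dev = udev_monitor_receive_device(mon);
		if (!dev)
			continue;
		ev = udev_device_get_property_value(dev, "NVME_EVENT");
		if (ev && !strcmp(ev, "discovery"))
			printf("discovery record: trtype=%s traddr=%s subnqn=%s\n",
			       udev_device_get_property_value(dev, "NVME_DISCOVERY_TRTYPE"),
			       udev_device_get_property_value(dev, "NVME_DISCOVERY_TRADDR"),
			       udev_device_get_property_value(dev, "NVME_DISCOVERY_SUBNQN"));
		udev_device_unref(dev);
	}
	/* not reached */
	udev_monitor_unref(mon);
	udev_unref(udev);
	return 0;
}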

As usual, comments and reviews are welcome.

Hannes Reinecke (4):
  nvme-rdma: use reconnect_work for initial connect
  nvme: implement 'async_connect' cli option
  nvme: implement 'discovery' sysfs entry and discovery uevents
  nvme: delete discovery controller after 2 minutes

 drivers/nvme/host/core.c    | 168 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/fabrics.c |   9 ++-
 drivers/nvme/host/fabrics.h |   6 ++
 drivers/nvme/host/fc.c      |   3 +-
 drivers/nvme/host/nvme.h    |   2 +
 drivers/nvme/host/rdma.c    |  21 ++++--
 6 files changed, 198 insertions(+), 11 deletions(-)

-- 
2.16.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect
  2018-08-21 13:43 [RFC PATCH 0/4] nvme async-connect and discovery uevents Hannes Reinecke
@ 2018-08-21 13:43 ` Hannes Reinecke
  2018-08-21 14:10   ` Max Gurtovoy
  2018-08-21 13:43 ` [PATCH 2/4] nvme: implement 'async_connect' cli option Hannes Reinecke
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-21 13:43 UTC (permalink / raw)


The 'reconnect_work' function does exactly the same work as the
initial connect, so we can just as well schedule the workqueue item
here and wait for it to complete.
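
In isolation, this is just the usual schedule-then-wait pattern for a
(delayed) work item; a toy illustration (names invented, not driver
code) of what the create path below reduces to:

static void toy_connect_work(struct work_struct *work)
{
	/* the actual connect logic lives here */
}

static DECLARE_DELAYED_WORK(toy_work, toy_connect_work);

static int toy_connect(bool async)
{
	/* queue_delayed_work() returns false if the item is already pending */
	if (!queue_delayed_work(system_wq, &toy_work, 0))
		return -EBUSY;
	if (!async)
		/* kick a pending timer and wait for the work fn to finish */
		flush_delayed_work(&toy_work);
	return 0;
}

The 'async' flag is what a later patch in this series hangs off the
'async_connect' option.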

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/rdma.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index dc042017c293..0906acbb800e 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1990,19 +1990,25 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING);
 	WARN_ON_ONCE(!changed);
 
-	ret = nvme_rdma_setup_ctrl(ctrl, true);
-	if (ret)
-		goto out_uninit_ctrl;
-
-	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
-		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
-
 	nvme_get_ctrl(&ctrl->ctrl);
 
 	mutex_lock(&nvme_rdma_ctrl_mutex);
 	list_add_tail(&ctrl->list, &nvme_rdma_ctrl_list);
 	mutex_unlock(&nvme_rdma_ctrl_mutex);
 
+	if (!queue_delayed_work(nvme_wq, &ctrl->reconnect_work, 0)) {
+		nvme_put_ctrl(&ctrl->ctrl);
+		dev_err(ctrl->ctrl.device,
+			"failed to schedule initial connect\n");
+		goto out_uninit_ctrl;
+	}
+
+	flush_delayed_work(&ctrl->reconnect_work);
+
+	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
+		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
+
+
 	return &ctrl->ctrl;
 
 out_uninit_ctrl:
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/4] nvme: implement 'async_connect' cli option
  2018-08-21 13:43 [RFC PATCH 0/4] nvme async-connect and discovery uevents Hannes Reinecke
  2018-08-21 13:43 ` [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect Hannes Reinecke
@ 2018-08-21 13:43 ` Hannes Reinecke
  2018-08-21 21:01   ` James Smart
  2018-08-21 13:43 ` [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents Hannes Reinecke
  2018-08-21 13:43 ` [PATCH 4/4] nvme: delete discovery controller after 2 minutes Hannes Reinecke
  3 siblings, 1 reply; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-21 13:43 UTC (permalink / raw)


Implement an option 'async_connect' to make the CLI return immediately
without waiting for the actual connect to complete.
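
For reference, nvme-cli effectively passes these options as a
comma-separated string written to /dev/nvme-fabrics, which is what
nvmf_parse_options() below tokenizes; a rough userspace sketch
(addresses and NQN are made-up example values):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const char *opts =
		"transport=rdma,traddr=192.168.1.1,trsvcid=4420,"
		"nqn=nqn.2014-08.org.nvmexpress.discovery,async_connect";
	int fd = open("/dev/nvme-fabrics", O_RDWR);

	if (fd < 0 || write(fd, opts, strlen(opts)) < 0) {
		perror("nvme-fabrics");
		return 1;
	}
	/* nvme-cli would now read back the instance number here */
	close(fd);
	return 0;
}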

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/fabrics.c | 7 ++++++-
 drivers/nvme/host/fabrics.h | 6 ++++++
 drivers/nvme/host/fc.c      | 3 ++-
 drivers/nvme/host/rdma.c    | 3 ++-
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 206d63cb1afc..e484205b4cad 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -604,6 +604,7 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_HOST_TRADDR,		"host_traddr=%s"	},
 	{ NVMF_OPT_HOST_ID,		"hostid=%s"		},
 	{ NVMF_OPT_DUP_CONNECT,		"duplicate_connect"	},
+	{ NVMF_OPT_ASYNC_CONNECT,	"async_connect"		},
 	{ NVMF_OPT_ERR,			NULL			}
 };
 
@@ -814,6 +815,9 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 		case NVMF_OPT_DUP_CONNECT:
 			opts->duplicate_connect = true;
 			break;
+		case NVMF_OPT_ASYNC_CONNECT:
+			opts->async_connect = true;
+			break;
 		default:
 			pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
 				p);
@@ -900,7 +904,8 @@ EXPORT_SYMBOL_GPL(nvmf_free_options);
 #define NVMF_REQUIRED_OPTS	(NVMF_OPT_TRANSPORT | NVMF_OPT_NQN)
 #define NVMF_ALLOWED_OPTS	(NVMF_OPT_QUEUE_SIZE | NVMF_OPT_NR_IO_QUEUES | \
 				 NVMF_OPT_KATO | NVMF_OPT_HOSTNQN | \
-				 NVMF_OPT_HOST_ID | NVMF_OPT_DUP_CONNECT)
+				 NVMF_OPT_HOST_ID | NVMF_OPT_DUP_CONNECT | \
+				 NVMF_OPT_ASYNC_CONNECT)
 
 static struct nvme_ctrl *
 nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index aa2fdb2a2e8f..930305a17b88 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -58,6 +58,7 @@ enum {
 	NVMF_OPT_CTRL_LOSS_TMO	= 1 << 11,
 	NVMF_OPT_HOST_ID	= 1 << 12,
 	NVMF_OPT_DUP_CONNECT	= 1 << 13,
+	NVMF_OPT_ASYNC_CONNECT	= 1 << 14,
 };
 
 /**
@@ -80,6 +81,10 @@ enum {
  * @nr_io_queues: Number of controller IO queues that will be established.
  * @reconnect_delay: Time between two consecutive reconnect attempts.
  * @discovery_nqn: indicates if the subsysnqn is the well-known discovery NQN.
+ * @duplicate_connect: indicates if the controller supports duplicate
+ *		connections.
+ * @async_connect: indicates if the connection should be established
+ *		asynchronously via a workqueue.
  * @kato:	Keep-alive timeout.
  * @host:	Virtual NVMe host, contains the NQN and Host ID.
  * @max_reconnects: maximum number of allowed reconnect attempts before removing
@@ -98,6 +103,7 @@ struct nvmf_ctrl_options {
 	unsigned int		reconnect_delay;
 	bool			discovery_nqn;
 	bool			duplicate_connect;
+	bool			async_connect;
 	unsigned int		kato;
 	struct nvmf_host	*host;
 	int			max_reconnects;
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 611e70cae754..90b4d8df275b 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3081,7 +3081,8 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 		goto fail_ctrl;
 	}
 
-	flush_delayed_work(&ctrl->connect_work);
+	if (!opts->async_connect)
+		flush_delayed_work(&ctrl->connect_work);
 
 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: new ctrl: NQN \"%s\"\n",
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 0906acbb800e..82917dba9452 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2003,7 +2003,8 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 		goto out_uninit_ctrl;
 	}
 
-	flush_delayed_work(&ctrl->reconnect_work);
+	if (!opts->async_connect)
+		flush_delayed_work(&ctrl->reconnect_work);
 
 	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
 		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents
  2018-08-21 13:43 [RFC PATCH 0/4] nvme async-connect and discovery uevents Hannes Reinecke
  2018-08-21 13:43 ` [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect Hannes Reinecke
  2018-08-21 13:43 ` [PATCH 2/4] nvme: implement 'async_connect' cli option Hannes Reinecke
@ 2018-08-21 13:43 ` Hannes Reinecke
  2018-08-21 21:01   ` James Smart
  2018-08-21 13:43 ` [PATCH 4/4] nvme: delete discovery controller after 2 minutes Hannes Reinecke
  3 siblings, 1 reply; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-21 13:43 UTC (permalink / raw)


When establishing a connection to the discovery controller with the
'async_connect' option the CLI has no means of actually retrieving the
discovery log page entries.
This patch implements a 'discovery' sysfs attribute for the controller
which holds the contents of the discovery log page. Additionally,
an uevent is generated for each discovery log page entry.
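
For context on the buffer sizing below: the discovery log page is a
1024-byte header followed by one 1024-byte entry per record, so the
initial 4k read covers the header plus three entries (hence the
'numrec > 3' check) and a full read needs (numrec + 1) * 1024 bytes.
The header, roughly as defined in include/linux/nvme.h:

struct nvmf_disc_rsp_page_hdr {
	__le64	genctr;		/* bumped whenever the log contents change */
	__le64	numrec;		/* number of records that follow */
	__le16	recfmt;
	__u8	resv14[1006];	/* pad the header to 1024 bytes */
	struct nvmf_disc_rsp_page_entry entries[0];
};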

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h |   1 +
 2 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dd8ec1dd9219..358be6d217d9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2325,6 +2325,133 @@ static int nvme_get_effects_log(struct nvme_ctrl *ctrl)
 	return ret;
 }
 
+static int nvme_get_discovery_log(struct nvme_ctrl *ctrl)
+{
+	int ret, disc_log_len = 4096;
+
+retry:
+	if (!ctrl->discovery)
+		ctrl->discovery = kzalloc(disc_log_len, GFP_KERNEL);
+
+	if (!ctrl->discovery)
+		return 0;
+
+	ret = nvme_get_log(ctrl, NVME_NSID_ALL, NVME_LOG_DISC, 0,
+			   ctrl->discovery, disc_log_len, 0);
+	if (ret) {
+		kfree(ctrl->discovery);
+		ctrl->discovery = NULL;
+		return ret;
+	}
+	if (ctrl->discovery->numrec > 3 && disc_log_len == 4096) {
+		disc_log_len = (ctrl->discovery->numrec + 1) * 1024;
+		kfree(ctrl->discovery);
+		ctrl->discovery = NULL;
+		goto retry;
+	}
+	return ret;
+}
+
+static void nvme_discovery_uevent(struct nvme_ctrl *ctrl,
+				  struct nvmf_disc_rsp_page_entry *entry)
+
+{
+	char *envp[8], *envstr;
+	size_t envlen = 0;
+	const char *trtype, *adrfam;
+
+	envstr = kzalloc(4096, GFP_KERNEL);
+	if (!envstr)
+		return;
+
+	envp[0] = envstr;
+	envlen += snprintf(envstr, 4096, "NVME_EVENT=discovery");
+	envlen++;
+	switch (entry->trtype) {
+	case NVMF_TRTYPE_RDMA:
+		trtype = "rdma";
+		break;
+	case NVMF_TRTYPE_FC:
+		trtype = "fc";
+		break;
+	case NVMF_TRTYPE_LOOP:
+		trtype = "loop";
+		break;
+	default:
+		trtype = "unknown";
+		break;
+	}
+	envp[1] = envstr + envlen;
+	envlen += snprintf(envstr + envlen, 4096 - envlen,
+			   "NVME_DISCOVERY_TRTYPE=%s", trtype);
+	envlen++;
+	switch (entry->adrfam) {
+	case NVMF_ADDR_FAMILY_IP4:
+		adrfam = "ipv4";
+		break;
+	case NVMF_ADDR_FAMILY_IP6:
+		adrfam = "ipv6";
+		break;
+	case NVMF_ADDR_FAMILY_IB:
+		adrfam = "ib";
+		break;
+	case NVMF_ADDR_FAMILY_FC:
+		adrfam = "fc";
+		break;
+	default:
+		adrfam = "unknown";
+		break;
+	}
+	envp[2] = envstr + envlen;
+	envlen += snprintf(envstr + envlen, 4096 - envlen,
+			   "NVME_DISCOVERY_ADRFAM=%s", adrfam);
+	envlen++;
+	envp[3] = envstr + envlen;
+	envlen += snprintf(envstr + envlen, 4096 - envlen,
+			   "NVME_DISCOVERY_TRADDR=%s", strim(entry->traddr));
+	envlen++;
+	envp[4] = envstr + envlen;
+	envlen += snprintf(envstr + envlen, 4096 - envlen,
+			   "NVME_DISCOVERY_TRSVCID=%s", strim(entry->trsvcid));
+	envlen++;
+	envp[5] = envstr + envlen;
+	envlen += snprintf(envstr + envlen, 4096 - envlen,
+			   "NVME_DISCOVERY_HOST_TRADDR=%s",
+			   ctrl->opts->host_traddr);
+	envlen++;
+	envp[6] = envstr + envlen;
+	envlen += snprintf(envstr + envlen, 4096 - envlen,
+			   "NVME_DISCOVERY_SUBNQN=%s", strim(entry->subnqn));
+	envlen++;
+	envp[7] = NULL;
+	kobject_uevent_env(&ctrl->device->kobj, KOBJ_CHANGE, envp);
+	kfree(envstr);
+}
+
+static int nvme_configure_discovery(struct nvme_ctrl *ctrl)
+{
+	int ret = 0;
+
+	if (!ctrl->opts || !ctrl->opts->discovery_nqn)
+		return 0;
+
+	if (!ctrl->opts->async_connect)
+		return 0;
+
+	ret = nvme_get_discovery_log(ctrl);
+	if (!ret && ctrl->discovery) {
+		int i;
+		struct nvmf_disc_rsp_page_entry *entry =
+			ctrl->discovery->entries;
+
+		for (i = 0; i < ctrl->discovery->numrec; i++) {
+			nvme_discovery_uevent(ctrl, entry);
+			entry++;
+		}
+	}
+	return ret;
+}
+
 /*
  * Initialize the cached copies of the Identify data and various controller
  * register in our nvme_ctrl structure.  This should be called as soon as
@@ -2491,6 +2618,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
 	if (ret < 0)
 		return ret;
 
+	ret = nvme_configure_discovery(ctrl);
+	if (ret < 0)
+		return ret;
+
 	ctrl->identified = true;
 
 	return 0;
@@ -2825,6 +2956,32 @@ static ssize_t nvme_sysfs_show_address(struct device *dev,
 }
 static DEVICE_ATTR(address, S_IRUGO, nvme_sysfs_show_address, NULL);
 
+static ssize_t nvme_discovery_show(struct device *dev,
+				   struct device_attribute *attr,
+				   char *buf)
+{
+	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+	struct nvmf_disc_rsp_page_entry *entry;
+	int i;
+	ssize_t len = 0;
+
+	if (!ctrl->discovery)
+		return -EINVAL;
+
+	entry = ctrl->discovery->entries;
+	for (i = 0; i < ctrl->discovery->numrec; i++) {
+		len += snprintf(buf + len, PAGE_SIZE - len,
+				"trtype %d adrfam %d traddr %s "
+				"trsvcid %s portid %d subnqn %s\n",
+				entry->trtype, entry->adrfam, entry->traddr,
+				entry->trsvcid, le16_to_cpu(entry->portid),
+				entry->subnqn);
+		entry++;
+	}
+	return len;
+}
+static DEVICE_ATTR(discovery, S_IRUGO, nvme_discovery_show, NULL);
+
 static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_reset_controller.attr,
 	&dev_attr_rescan_controller.attr,
@@ -2836,6 +2993,7 @@ static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_transport.attr,
 	&dev_attr_subsysnqn.attr,
 	&dev_attr_address.attr,
+	&dev_attr_discovery.attr,
 	&dev_attr_state.attr,
 	NULL
 };
@@ -2850,7 +3008,9 @@ static umode_t nvme_dev_attrs_are_visible(struct kobject *kobj,
 		return 0;
 	if (a == &dev_attr_address.attr && !ctrl->ops->get_address)
 		return 0;
-
+	if (a == &dev_attr_discovery.attr && ctrl->opts &&
+	    !ctrl->opts->discovery_nqn)
+		return 0;
 	return a->mode;
 }
 
@@ -3496,6 +3656,7 @@ static void nvme_free_ctrl(struct device *dev)
 
 	ida_simple_remove(&nvme_instance_ida, ctrl->instance);
 	kfree(ctrl->effects);
+	kfree(ctrl->discovery);
 	nvme_mpath_uninit(ctrl);
 
 	if (subsys) {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bb4a2003c097..8a4ed46b986b 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -199,6 +199,7 @@ struct nvme_ctrl {
 	unsigned long quirks;
 	struct nvme_id_power_state psd[32];
 	struct nvme_effects_log *effects;
+	struct nvmf_disc_rsp_page_hdr *discovery;
 	struct work_struct scan_work;
 	struct work_struct async_event_work;
 	struct delayed_work ka_work;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/4] nvme: delete discovery controller after 2 minutes
  2018-08-21 13:43 [RFC PATCH 0/4] nvme async-connect and discovery uevents Hannes Reinecke
                   ` (2 preceding siblings ...)
  2018-08-21 13:43 ` [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents Hannes Reinecke
@ 2018-08-21 13:43 ` Hannes Reinecke
  2018-08-21 21:01   ` James Smart
  3 siblings, 1 reply; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-21 13:43 UTC (permalink / raw)


If the CLI crashes before the 'disconnect' command is issued, or when
the 'async_connect' option is used, the discovery controller is never
removed. This patch cleans up stale discovery controllers after 2 minutes.
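
This works by piggy-backing on the existing keep-alive machinery: with
a non-zero kato the core arms ka_work, roughly like this (sketch of
the existing nvme_start_keep_alive() path, not part of this patch):

	if (unlikely(ctrl->kato == 0))
		return;
	schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);

so for a discovery controller the work item now fires once, two
minutes in, and deletes the controller instead of sending a keep-alive
command.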

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/core.c    | 5 +++++
 drivers/nvme/host/fabrics.c | 2 +-
 drivers/nvme/host/nvme.h    | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 358be6d217d9..b3738b327731 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -866,6 +866,11 @@ static void nvme_keep_alive_work(struct work_struct *work)
 	struct nvme_ctrl *ctrl = container_of(to_delayed_work(work),
 			struct nvme_ctrl, ka_work);
 
+	if (ctrl->opts && ctrl->opts->discovery_nqn) {
+		nvme_delete_ctrl(ctrl);
+		return;
+	}
+
 	if (nvme_keep_alive(ctrl)) {
 		/* allocation failure, reset the controller */
 		dev_err(ctrl->device, "keep-alive failed\n");
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index e484205b4cad..b98662760051 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -827,7 +827,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	}
 
 	if (opts->discovery_nqn) {
-		opts->kato = 0;
+		opts->kato = NVME_DISCOVERY_TIMEOUT;
 		opts->nr_io_queues = 0;
 		opts->duplicate_connect = true;
 	}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 8a4ed46b986b..551a6b1dbc8c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -32,6 +32,7 @@ extern unsigned int admin_timeout;
 
 #define NVME_DEFAULT_KATO	5
 #define NVME_KATO_GRACE		10
+#define NVME_DISCOVERY_TIMEOUT	120
 
 extern struct workqueue_struct *nvme_wq;
 extern struct workqueue_struct *nvme_reset_wq;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect
  2018-08-21 13:43 ` [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect Hannes Reinecke
@ 2018-08-21 14:10   ` Max Gurtovoy
  2018-08-21 14:18     ` Hannes Reinecke
  0 siblings, 1 reply; 13+ messages in thread
From: Max Gurtovoy @ 2018-08-21 14:10 UTC (permalink / raw)




On 8/21/2018 4:43 PM, Hannes Reinecke wrote:
> The 'reconnect_work' function does exactly the same work as the
> initial connect, so we can just as well schedule the workqueue item
> here and wait for it to complete.

If we do this we should rename it to connect_work.

> 
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/rdma.c | 20 +++++++++++++-------
>   1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index dc042017c293..0906acbb800e 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1990,19 +1990,25 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
>   	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING);
>   	WARN_ON_ONCE(!changed);
>   
> -	ret = nvme_rdma_setup_ctrl(ctrl, true);

We call nvme_rdma_setup_ctrl() with "new == true" here and with
"new == false" in the reconnect work, so how does this work?

> -	if (ret)
> -		goto out_uninit_ctrl;
> -
> -	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
> -		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
> -
>   	nvme_get_ctrl(&ctrl->ctrl);
>   
>   	mutex_lock(&nvme_rdma_ctrl_mutex);
>   	list_add_tail(&ctrl->list, &nvme_rdma_ctrl_list);
>   	mutex_unlock(&nvme_rdma_ctrl_mutex);
>   
> +	if (!queue_delayed_work(nvme_wq, &ctrl->reconnect_work, 0)) {
> +		nvme_put_ctrl(&ctrl->ctrl);
> +		dev_err(ctrl->ctrl.device,
> +			"failed to schedule initial connect\n");
> +		goto out_uninit_ctrl;
> +	}
> +
> +	flush_delayed_work(&ctrl->reconnect_work);
> +
> +	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
> +		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
> +
> +
>   	return &ctrl->ctrl;
>   
>   out_uninit_ctrl:
> 

What is the benefit here?
We have more lines of code now :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] nvme-rdma: use reconnect_work for initial connect
  2018-08-21 14:10   ` Max Gurtovoy
@ 2018-08-21 14:18     ` Hannes Reinecke
  0 siblings, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-21 14:18 UTC (permalink / raw)


On 08/21/2018 04:10 PM, Max Gurtovoy wrote:
> 
> 
> On 8/21/2018 4:43 PM, Hannes Reinecke wrote:
>> The 'reconnect_work' function does exactly the same work as the
>> initial connect, so we can just as well schedule the workqueue item
>> here and wait for it to complete.
> 
> If we do this we should rename it to connect_work.
> 
Sure, can do.

>>
>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>> ---
>>   drivers/nvme/host/rdma.c | 20 +++++++++++++-------
>>   1 file changed, 13 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index dc042017c293..0906acbb800e 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -1990,19 +1990,25 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
>>  	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING);
>>  	WARN_ON_ONCE(!changed);
>>
>> -	ret = nvme_rdma_setup_ctrl(ctrl, true);
> 
> We call nvme_rdma_setup_ctrl() with "new == true" here and with
> "new == false" in the reconnect work, so how does this work?
> 
Ah. Overlooked this. Will be fixing it up.

>> -	if (ret)
>> -		goto out_uninit_ctrl;
>> -
>> -	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
>> -		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
>> -
>>  	nvme_get_ctrl(&ctrl->ctrl);
>>
>>  	mutex_lock(&nvme_rdma_ctrl_mutex);
>>  	list_add_tail(&ctrl->list, &nvme_rdma_ctrl_list);
>>  	mutex_unlock(&nvme_rdma_ctrl_mutex);
>>
>> +	if (!queue_delayed_work(nvme_wq, &ctrl->reconnect_work, 0)) {
>> +		nvme_put_ctrl(&ctrl->ctrl);
>> +		dev_err(ctrl->ctrl.device,
>> +			"failed to schedule initial connect\n");
>> +		goto out_uninit_ctrl;
>> +	}
>> +
>> +	flush_delayed_work(&ctrl->reconnect_work);
>> +
>> +	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
>> +		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
>> +
>> +
>>  	return &ctrl->ctrl;
>>
>>  out_uninit_ctrl:
>>
> 
> What is the benefit here?
> We have more lines of code now :)

The benefit comes with the next patches, as moving things out to a
workqueue allows us to implement the 'async_connect' option.


Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/4] nvme: implement 'async_connect' cli option
  2018-08-21 13:43 ` [PATCH 2/4] nvme: implement 'async_connect' cli option Hannes Reinecke
@ 2018-08-21 21:01   ` James Smart
  0 siblings, 0 replies; 13+ messages in thread
From: James Smart @ 2018-08-21 21:01 UTC (permalink / raw)




On 8/21/2018 6:43 AM, Hannes Reinecke wrote:
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 611e70cae754..90b4d8df275b 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3081,7 +3081,8 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
>   		goto fail_ctrl;
>   	}
>   
> -	flush_delayed_work(&ctrl->connect_work);
> +	if (!opts->async_connect)
> +		flush_delayed_work(&ctrl->connect_work);
>   
>   	dev_info(ctrl->ctrl.device,
>   		"NVME-FC{%d}: new ctrl: NQN \"%s\"\n",
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 0906acbb800e..82917dba9452 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -2003,7 +2003,8 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
>   		goto out_uninit_ctrl;
>   	}
>   
> -	flush_delayed_work(&ctrl->reconnect_work);
> +	if (!opts->async_connect)
> +		flush_delayed_work(&ctrl->reconnect_work);
>   
>   	dev_info(ctrl->ctrl.device, "new ctrl: NQN \"%s\", addr %pISpcs\n",
>   		ctrl->ctrl.opts->subsysnqn, &ctrl->addr);

I would prefer that the check and flush occur in the common
nvmf_create_ctrl() routine.  However, I know this means the core layer
has to become aware of the work element, which complicates things.  If
you're willing to figure that out....
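
One hypothetical shape: keep the work item transport-private but let
the core wait through a callback (the name is invented purely for
illustration):

	/* in struct nvmf_transport_ops */
	void	(*flush_connect)(struct nvme_ctrl *ctrl);

	/* in nvmf_create_ctrl(), after ops->create_ctrl() succeeds */
	if (!opts->async_connect && ops->flush_connect)
		ops->flush_connect(ctrl);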

if not...
Reviewed-by: James Smart <james.smart at broadcom.com>

-- james

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents
  2018-08-21 13:43 ` [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents Hannes Reinecke
@ 2018-08-21 21:01   ` James Smart
  2018-08-22  7:23     ` Hannes Reinecke
  0 siblings, 1 reply; 13+ messages in thread
From: James Smart @ 2018-08-21 21:01 UTC (permalink / raw)




On 8/21/2018 6:43 AM, Hannes Reinecke wrote:
> When establishing a connection to the discovery controller with the
> 'async_connect' option the CLI has no means of actually retrieving the
> discovery log page entries.
> This patch implements a 'discovery' sysfs attribute for the controller
> which holds the contents of the discovery log page. Additionally,
> an uevent is generated for each discovery log page entry.
>
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++-
>   drivers/nvme/host/nvme.h |   1 +
>   2 files changed, 163 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index dd8ec1dd9219..358be6d217d9 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2325,6 +2325,133 @@ static int nvme_get_effects_log(struct nvme_ctrl *ctrl)
>   	return ret;
>   }
>   
> +static int nvme_get_discovery_log(struct nvme_ctrl *ctrl)
> +{
> +	int ret, disc_log_len = 4096;
> +
> +retry:
> +	if (!ctrl->discovery)
> +		ctrl->discovery = kzalloc(disc_log_len, GFP_KERNEL);
> +
> +	if (!ctrl->discovery)
> +		return 0;
> +
> +	ret = nvme_get_log(ctrl, NVME_NSID_ALL, NVME_LOG_DISC, 0,
> +			   ctrl->discovery, disc_log_len, 0);
> +	if (ret) {
> +		kfree(ctrl->discovery);
> +		ctrl->discovery = NULL;
> +		return ret;
> +	}
> +	if (ctrl->discovery->numrec > 3 && disc_log_len == 4096) {
> +		disc_log_len = (ctrl->discovery->numrec + 1) * 1024;
> +		kfree(ctrl->discovery);
> +		ctrl->discovery = NULL;
> +		goto retry;

This is missing logic to detect a change in GENCTR and restart the log read.
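
Something along these lines, i.e. a hypothetical re-read loop that
snapshots GENCTR and starts over if the controller bumped it while we
were reading:

	u64 genctr = le64_to_cpu(ctrl->discovery->genctr);

	/* ... fetch the full log page ... */

	if (le64_to_cpu(ctrl->discovery->genctr) != genctr)
		goto retry;	/* log changed under us, start over */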


> +	}
> +	return ret;
> +}
> +
> ...
> +static int nvme_configure_discovery(struct nvme_ctrl *ctrl)
> +{
> +	int ret = 0;
> +
> +	if (!ctrl->opts || !ctrl->opts->discovery_nqn)
> +		return 0;
> +
> +	if (!ctrl->opts->async_connect)
> +		return 0;

I don't understand why the async_connect parameter means anything in
this context... why is this looked at?

Is it strictly to avoid delay while reading the discovery log?  I don't
know that waiting for yet another command or two is a big deal, so I
would think just proceeding with reading the discovery log is fine.

> +
> +	ret = nvme_get_discovery_log(ctrl);
> +	if (!ret && ctrl->discovery) {
> +		int i;
> +		struct nvmf_disc_rsp_page_entry *entry =
> +			ctrl->discovery->entries;
> +
> +		for (i = 0; i < ctrl->discovery->numrec; i++) {
> +			nvme_discovery_uevent(ctrl, entry);
> +			entry++;
> +		}
> +	}
> +	return ret;
> +}
> +
>   /*
>    * Initialize the cached copies of the Identify data and various controller
>    * register in our nvme_ctrl structure.  This should be called as soon as
> @@ -2491,6 +2618,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
>   	if (ret < 0)
>   		return ret;
>   
> +	ret = nvme_configure_discovery(ctrl);
> +	if (ret < 0)
> +		return ret;
> +
>   	ctrl->identified = true;
>   
>   	return 0;
> @@ -2825,6 +2956,32 @@ static ssize_t nvme_sysfs_show_address(struct device *dev,
>   }
>   static DEVICE_ATTR(address, S_IRUGO, nvme_sysfs_show_address, NULL);
>   
> +static ssize_t nvme_discovery_show(struct device *dev,
> +				   struct device_attribute *attr,
> +				   char *buf)
> +{
> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
> +	struct nvmf_disc_rsp_page_entry *entry;
> +	int i;
> +	ssize_t len = 0;
> +
> +	if (!ctrl->discovery)
> +		return -EINVAL;
> +
> +	entry = ctrl->discovery->entries;
> +	for (i = 0; i < ctrl->discovery->numrec; i++) {
> +		len += snprintf(buf + len, PAGE_SIZE - len,
> +				"trtype %d adrfam %d traddr %s "
> +				"trsvcid %s portid %d subnqn %s\n",
> +				entry->trtype, entry->adrfam, entry->traddr,
> +				entry->trsvcid, le16_to_cpu(entry->portid),
> +				entry->subnqn);
> +		entry++;
> +	}
> +	return len;
> +}
> +static DEVICE_ATTR(discovery, S_IRUGO, nvme_discovery_show, NULL);
> +

I really don't like that it's reporting a cached log.  It may be
completely inconsistent with what's on the discovery controller now
(consider a nvmetcli done then a reconfig done).  There may not even
be connectivity to the discovery controller at the time the sysfs entry
is read - so the validity of the content is in question.  I would much
rather have the attribute read the log and then display the contents -
so it's always current.  Of course that does make it interesting if you
read it while the controller is paused doing a reconnect.


>   static struct attribute *nvme_dev_attrs[] = {
>   	&dev_attr_reset_controller.attr,
>   	&dev_attr_rescan_controller.attr,
> @@ -2836,6 +2993,7 @@ static struct attribute *nvme_dev_attrs[] = {
>   	&dev_attr_transport.attr,
>   	&dev_attr_subsysnqn.attr,
>   	&dev_attr_address.attr,
> +	&dev_attr_discovery.attr,
>   	&dev_attr_state.attr,
>   	NULL
>   };
> @@ -2850,7 +3008,9 @@ static umode_t nvme_dev_attrs_are_visible(struct kobject *kobj,
>   		return 0;
>   	if (a == &dev_attr_address.attr && !ctrl->ops->get_address)
>   		return 0;
> -
> +	if (a == &dev_attr_discovery.attr && ctrl->opts &&
> +	    !ctrl->opts->discovery_nqn)
> +		return 0;
>   	return a->mode;
>   }
>   
> @@ -3496,6 +3656,7 @@ static void nvme_free_ctrl(struct device *dev)
>   
>   	ida_simple_remove(&nvme_instance_ida, ctrl->instance);
>   	kfree(ctrl->effects);
> +	kfree(ctrl->discovery);
>   	nvme_mpath_uninit(ctrl);
>   
>   	if (subsys) {
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index bb4a2003c097..8a4ed46b986b 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -199,6 +199,7 @@ struct nvme_ctrl {
>   	unsigned long quirks;
>   	struct nvme_id_power_state psd[32];
>   	struct nvme_effects_log *effects;
> +	struct nvmf_disc_rsp_page_hdr *discovery;
>   	struct work_struct scan_work;
>   	struct work_struct async_event_work;
>   	struct delayed_work ka_work;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4/4] nvme: delete discovery controller after 2 minutes
  2018-08-21 13:43 ` [PATCH 4/4] nvme: delete discovery controller after 2 minutes Hannes Reinecke
@ 2018-08-21 21:01   ` James Smart
  2018-08-22  7:32     ` Hannes Reinecke
  0 siblings, 1 reply; 13+ messages in thread
From: James Smart @ 2018-08-21 21:01 UTC (permalink / raw)




On 8/21/2018 6:43 AM, Hannes Reinecke wrote:
> If the CLI crashes before the 'disconnect' command is issued, or when
> the 'async_connect' option is used, the discovery controller is never
> removed. This patch cleans up stale discovery controllers after 2 minutes.
>
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/core.c    | 5 +++++
>   drivers/nvme/host/fabrics.c | 2 +-
>   drivers/nvme/host/nvme.h    | 1 +
>   3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 358be6d217d9..b3738b327731 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -866,6 +866,11 @@ static void nvme_keep_alive_work(struct work_struct *work)
>   	struct nvme_ctrl *ctrl = container_of(to_delayed_work(work),
>   			struct nvme_ctrl, ka_work);
>   
> +	if (ctrl->opts && ctrl->opts->discovery_nqn) {
> +		nvme_delete_ctrl(ctrl);
> +		return;
> +	}
> +
>   	if (nvme_keep_alive(ctrl)) {
>   		/* allocation failure, reset the controller */
>   		dev_err(ctrl->device, "keep-alive failed\n");
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index e484205b4cad..b98662760051 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -827,7 +827,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   	}
>   
>   	if (opts->discovery_nqn) {
> -		opts->kato = 0;
> +		opts->kato = NVME_DISCOVERY_TIMEOUT;
>   		opts->nr_io_queues = 0;
>   		opts->duplicate_connect = true;
>   	}
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 8a4ed46b986b..551a6b1dbc8c 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -32,6 +32,7 @@ extern unsigned int admin_timeout;
>   
>   #define NVME_DEFAULT_KATO	5
>   #define NVME_KATO_GRACE		10
> +#define NVME_DISCOVERY_TIMEOUT	120
>   
>   extern struct workqueue_struct *nvme_wq;
>   extern struct workqueue_struct *nvme_reset_wq;

This doesn't necessarily track to the new TP that adds kato support to
discovery controllers, nor the fabric spec update that has the host
tracking kato and deleting the controller (whether a discovery
controller or not), using the actual kato as set on the controller plus
the grace period.

I would rather have this be a generic timer on the host that tracks the
kato timeout and deletes the controller (doesn't matter if discovery or
not) if kato times out.  If the controller is an older discovery
controller that doesn't support kato - then the 1st kato timeout should
fail and remove the controller.  If it's newer and supports kato, it
would be assumed the controller stays live for an extended period - just
like a storage controller.

-- james

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents
  2018-08-21 21:01   ` James Smart
@ 2018-08-22  7:23     ` Hannes Reinecke
  2018-08-22  8:51       ` Hannes Reinecke
  0 siblings, 1 reply; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-22  7:23 UTC (permalink / raw)


On 08/21/2018 11:01 PM, James Smart wrote:
> 
> 
> On 8/21/2018 6:43 AM, Hannes Reinecke wrote:
>> When establishing a connection to the discovery controller with the
>> 'async_connect' option the CLI has no means of actually retrieving the
>> discovery log page entries.
>> This patch implements a 'discovery' sysfs attribute for the controller
>> which holds the contents of the discovery log page. Additionally,
>> an uevent is generated for each discovery log page entry.
>>
>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>> ---
>>   drivers/nvme/host/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   drivers/nvme/host/nvme.h |   1 +
>>   2 files changed, 163 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index dd8ec1dd9219..358be6d217d9 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -2325,6 +2325,133 @@ static int nvme_get_effects_log(struct nvme_ctrl *ctrl)
>>  	return ret;
>>  }
>>
>> +static int nvme_get_discovery_log(struct nvme_ctrl *ctrl)
>> +{
>> +	int ret, disc_log_len = 4096;
>> +
>> +retry:
>> +	if (!ctrl->discovery)
>> +		ctrl->discovery = kzalloc(disc_log_len, GFP_KERNEL);
>> +
>> +	if (!ctrl->discovery)
>> +		return 0;
>> +
>> +	ret = nvme_get_log(ctrl, NVME_NSID_ALL, NVME_LOG_DISC, 0,
>> +			   ctrl->discovery, disc_log_len, 0);
>> +	if (ret) {
>> +		kfree(ctrl->discovery);
>> +		ctrl->discovery = NULL;
>> +		return ret;
>> +	}
>> +	if (ctrl->discovery->numrec > 3 && disc_log_len == 4096) {
>> +		disc_log_len = (ctrl->discovery->numrec + 1) * 1024;
>> +		kfree(ctrl->discovery);
>> +		ctrl->discovery = NULL;
>> +		goto retry;
> 
> This is missing logic to detect a change in GENCTR and restart the log
> read.
> 
Yeah, I'll be implementing this for the next round.

>> +	}
>> +	return ret;
>> +}
>> +
>> ...
>> +static int nvme_configure_discovery(struct nvme_ctrl *ctrl)
>> +{
>> +	int ret = 0;
>> +
>> +	if (!ctrl->opts || !ctrl->opts->discovery_nqn)
>> +		return 0;
>> +
>> +	if (!ctrl->opts->async_connect)
>> +		return 0;
> 
> I don't understand why the async_connect parameter means anything in
> this context... why is this looked at?
>
> Is it strictly to avoid delay while reading the discovery log?  I don't
> know that waiting for yet another command or two is a big deal, so I
> would think just proceeding with reading the discovery log is fine.
> 
Originally I did that so as to restore the original functionality, and
tie discovery events to the use of the 'async_connect' parameter.
But you are right, the two are not necessarily connected.
I'll drop this for the next round.

>> +
>> +	ret = nvme_get_discovery_log(ctrl);
>> +	if (!ret && ctrl->discovery) {
>> +		int i;
>> +		struct nvmf_disc_rsp_page_entry *entry =
>> +			ctrl->discovery->entries;
>> +
>> +		for (i = 0; i < ctrl->discovery->numrec; i++) {
>> +			nvme_discovery_uevent(ctrl, entry);
>> +			entry++;
>> +		}
>> +	}
>> +	return ret;
>> +}
>> +
>>  /*
>>   * Initialize the cached copies of the Identify data and various controller
>>   * register in our nvme_ctrl structure.  This should be called as soon as
>> @@ -2491,6 +2618,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
>>  	if (ret < 0)
>>  		return ret;
>>
>> +	ret = nvme_configure_discovery(ctrl);
>> +	if (ret < 0)
>> +		return ret;
>> +
>>  	ctrl->identified = true;
>>
>>  	return 0;
>> @@ -2825,6 +2956,32 @@ static ssize_t nvme_sysfs_show_address(struct device *dev,
>>  }
>>  static DEVICE_ATTR(address, S_IRUGO, nvme_sysfs_show_address, NULL);
>>
>> +static ssize_t nvme_discovery_show(struct device *dev,
>> +				   struct device_attribute *attr,
>> +				   char *buf)
>> +{
>> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>> +	struct nvmf_disc_rsp_page_entry *entry;
>> +	int i;
>> +	ssize_t len = 0;
>> +
>> +	if (!ctrl->discovery)
>> +		return -EINVAL;
>> +
>> +	entry = ctrl->discovery->entries;
>> +	for (i = 0; i < ctrl->discovery->numrec; i++) {
>> +		len += snprintf(buf + len, PAGE_SIZE - len,
>> +				"trtype %d adrfam %d traddr %s "
>> +				"trsvcid %s portid %d subnqn %s\n",
>> +				entry->trtype, entry->adrfam, entry->traddr,
>> +				entry->trsvcid, le16_to_cpu(entry->portid),
>> +				entry->subnqn);
>> +		entry++;
>> +	}
>> +	return len;
>> +}
>> +static DEVICE_ATTR(discovery, S_IRUGO, nvme_discovery_show, NULL);
>> +
> 
> I really don't like that it's reporting a cached log.  It may be
> completely inconsistent with what's on the discovery controller now
> (consider a nvmetcli done then a reconfig done).  There may not even
> be connectivity to the discovery controller at the time the sysfs entry
> is read - so the validity of the content is in question.  I would much
> rather have the attribute read the log and then display the contents -
> so it's always current.  Of course that does make it interesting if you
> read it while the controller is paused doing a reconnect.
> 
For the next round I'll split this patch in two: one for getting the
discovery log and issuing uevents, and the next one for the sysfs
attribute. After all, I've implemented the sysfs attribute just for
completeness, to show the contents of the internal discovery buffer.
Re-fetching the log on every read kind of defeats that purpose IMO.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4/4] nvme: delete discovery controller after 2 minutes
  2018-08-21 21:01   ` James Smart
@ 2018-08-22  7:32     ` Hannes Reinecke
  0 siblings, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-22  7:32 UTC (permalink / raw)


On 08/21/2018 11:01 PM, James Smart wrote:
> 
> 
> On 8/21/2018 6:43 AM, Hannes Reinecke wrote:
>> If the CLI crashes before the 'disconnect' command is issued, or when
>> the 'async_connect' option is used, the discovery controller is never
>> removed. This patch cleans up stale discovery controllers after 2 minutes.
>>
>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>> ---
>>   drivers/nvme/host/core.c    | 5 +++++
>>   drivers/nvme/host/fabrics.c | 2 +-
>>   drivers/nvme/host/nvme.h    | 1 +
>>   3 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 358be6d217d9..b3738b327731 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -866,6 +866,11 @@ static void nvme_keep_alive_work(struct work_struct *work)
>>  	struct nvme_ctrl *ctrl = container_of(to_delayed_work(work),
>>  			struct nvme_ctrl, ka_work);
>>
>> +	if (ctrl->opts && ctrl->opts->discovery_nqn) {
>> +		nvme_delete_ctrl(ctrl);
>> +		return;
>> +	}
>> +
>>  	if (nvme_keep_alive(ctrl)) {
>>  		/* allocation failure, reset the controller */
>>  		dev_err(ctrl->device, "keep-alive failed\n");
>> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
>> index e484205b4cad..b98662760051 100644
>> --- a/drivers/nvme/host/fabrics.c
>> +++ b/drivers/nvme/host/fabrics.c
>> @@ -827,7 +827,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>>  	}
>>
>>  	if (opts->discovery_nqn) {
>> -		opts->kato = 0;
>> +		opts->kato = NVME_DISCOVERY_TIMEOUT;
>>  		opts->nr_io_queues = 0;
>>  		opts->duplicate_connect = true;
>>  	}
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index 8a4ed46b986b..551a6b1dbc8c 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -32,6 +32,7 @@ extern unsigned int admin_timeout;
>>
>>  #define NVME_DEFAULT_KATO	5
>>  #define NVME_KATO_GRACE		10
>> +#define NVME_DISCOVERY_TIMEOUT	120
>>
>>  extern struct workqueue_struct *nvme_wq;
>>  extern struct workqueue_struct *nvme_reset_wq;
> 
> This doesn't necessarily track to the new TP that adds kato support to
> discovery controllers, nor the fabric spec update that has the host
> tracking kato and deleting the controller (whether a discovery
> controller or not), using the actual kato as set on the controller plus
> the grace period.
> 
Correct, it doesn't.
Both of the TPs referred to here are not ratified yet, so I cannot make
assumptions about them, nor code against them.

> I would rather have this be a generic timer on the host that tracks the
> kato timeout and deletes the controller (doesn't matter if discovery or
> not) if kato times out.  If the controller is an older discovery
> controller that doesn't support kato - then the 1st kato timeout should
> fail and remove the controller.  If it's newer and supports kato, it
> would be assumed the controller stays live for an extended period - just
> like a storage controller.
> 

I've tried to implement something like this, but yeah, I'll be updating
the patch along these lines.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/4] nvme: implement 'discovery' sysfs entry and discovery uevents
  2018-08-22  7:23     ` Hannes Reinecke
@ 2018-08-22  8:51       ` Hannes Reinecke
  0 siblings, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2018-08-22  8:51 UTC (permalink / raw)


On 08/22/2018 09:23 AM, Hannes Reinecke wrote:
> On 08/21/2018 11:01 PM, James Smart wrote:
>>
>>
>> On 8/21/2018 6:43 AM, Hannes Reinecke wrote:
>>> When establishing a connection to the discovery controller with the
>>> 'async_connect' option the CLI has no means of actually retrieving the
>>> discovery log page entries.
>>> This patch implements a 'discovery' sysfs attribute for the controller
>>> which holds the contents of the discovery log page. Additionally,
>>> an uevent is generated for each discovery log page entry.
>>>
>>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>>> ---
>>>   drivers/nvme/host/core.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>   drivers/nvme/host/nvme.h |   1 +
>>>   2 files changed, 163 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index dd8ec1dd9219..358be6d217d9 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -2325,6 +2325,133 @@ static int nvme_get_effects_log(struct nvme_ctrl *ctrl)
>>>  	return ret;
>>>  }
>>>
>>> +static int nvme_get_discovery_log(struct nvme_ctrl *ctrl)
>>> +{
>>> +	int ret, disc_log_len = 4096;
>>> +
>>> +retry:
>>> +	if (!ctrl->discovery)
>>> +		ctrl->discovery = kzalloc(disc_log_len, GFP_KERNEL);
>>> +
>>> +	if (!ctrl->discovery)
>>> +		return 0;
>>> +
>>> +	ret = nvme_get_log(ctrl, NVME_NSID_ALL, NVME_LOG_DISC, 0,
>>> +			   ctrl->discovery, disc_log_len, 0);
>>> +	if (ret) {
>>> +		kfree(ctrl->discovery);
>>> +		ctrl->discovery = NULL;
>>> +		return ret;
>>> +	}
>>> +	if (ctrl->discovery->numrec > 3 && disc_log_len == 4096) {
>>> +		disc_log_len = (ctrl->discovery->numrec + 1) * 1024;
>>> +		kfree(ctrl->discovery);
>>> +		ctrl->discovery = NULL;
>>> +		goto retry;
>>
>> This is missing logic to detect a change in GENCTR and restart the log
>> read.
>>
> Yeah, I'll be implementing this for the next round.
> 
>>> +	}
>>> +	return ret;
>>> +}
>>> +
>>> ...
>>> +static int nvme_configure_discovery(struct nvme_ctrl *ctrl)
>>> +{
>>> +	int ret = 0;
>>> +
>>> +	if (!ctrl->opts || !ctrl->opts->discovery_nqn)
>>> +		return 0;
>>> +
>>> +	if (!ctrl->opts->async_connect)
>>> +		return 0;
>>
>> I don't understand why the async_connect parameter means anything in
>> this context... why is this looked at?
>>
>> Is it strictly to avoid delay while reading the discovery log?  I don't
>> know that waiting for yet another command or two is a big deal, so I
>> would think just proceeding with reading the discovery log is fine.
>>
> Originally I did that so as to restore the original functionality, and
> tie discovery events to the use of the 'async_connect' parameter.
> But you are right, the two are not necessarily connected.
> I'll drop this for the next round.
> 
>>> +
>>> +	ret = nvme_get_discovery_log(ctrl);
>>> +	if (!ret && ctrl->discovery) {
>>> +		int i;
>>> +		struct nvmf_disc_rsp_page_entry *entry =
>>> +			ctrl->discovery->entries;
>>> +
>>> +		for (i = 0; i < ctrl->discovery->numrec; i++) {
>>> +			nvme_discovery_uevent(ctrl, entry);
>>> +			entry++;
>>> +		}
>>> +	}
>>> +	return ret;
>>> +}
>>> +
>>>  /*
>>>   * Initialize the cached copies of the Identify data and various controller
>>>   * register in our nvme_ctrl structure.  This should be called as soon as
>>> @@ -2491,6 +2618,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
>>>  	if (ret < 0)
>>>  		return ret;
>>>
>>> +	ret = nvme_configure_discovery(ctrl);
>>> +	if (ret < 0)
>>> +		return ret;
>>> +
>>>  	ctrl->identified = true;
>>>
>>>  	return 0;
>>> @@ -2825,6 +2956,32 @@ static ssize_t nvme_sysfs_show_address(struct device *dev,
>>>  }
>>>  static DEVICE_ATTR(address, S_IRUGO, nvme_sysfs_show_address, NULL);
>>>
>>> +static ssize_t nvme_discovery_show(struct device *dev,
>>> +				   struct device_attribute *attr,
>>> +				   char *buf)
>>> +{
>>> +	struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
>>> +	struct nvmf_disc_rsp_page_entry *entry;
>>> +	int i;
>>> +	ssize_t len = 0;
>>> +
>>> +	if (!ctrl->discovery)
>>> +		return -EINVAL;
>>> +
>>> +	entry = ctrl->discovery->entries;
>>> +	for (i = 0; i < ctrl->discovery->numrec; i++) {
>>> +		len += snprintf(buf + len, PAGE_SIZE - len,
>>> +				"trtype %d adrfam %d traddr %s "
>>> +				"trsvcid %s portid %d subnqn %s\n",
>>> +				entry->trtype, entry->adrfam, entry->traddr,
>>> +				entry->trsvcid, le16_to_cpu(entry->portid),
>>> +				entry->subnqn);
>>> +		entry++;
>>> +	}
>>> +	return len;
>>> +}
>>> +static DEVICE_ATTR(discovery, S_IRUGO, nvme_discovery_show, NULL);
>>> +
>>
>> I really don't like that it's reporting a cached log.  It may be
>> completely inconsistent with what's on the discovery controller now
>> (consider a nvmetcli done then a reconfig done).  There may not even
>> be connectivity to the discovery controller at the time the sysfs entry
>> is read - so the validity of the content is in question.  I would much
>> rather have the attribute read the log and then display the contents -
>> so it's always current.  Of course that does make it interesting if you
>> read it while the controller is paused doing a reconnect.
>>
> For the next round I'll split this patch in two: one for getting the
> discovery log and issuing uevents, and the next one for the sysfs
> attribute. After all, I've implemented the sysfs attribute just for
> completeness, to show the contents of the internal discovery buffer.
> Re-fetching the log on every read kind of defeats that purpose IMO.
> 
Actually, I think it would be better to make 'discovery' a _bin_ attribute.
With that the nvme CLI could parse it directly without having to implement
yet another parser, we'd avoid this awkward multi-entry sysfs attribute,
and it would actually provide additional value.

I'll give it a go.
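
Roughly what I have in mind - an untested sketch, reusing the existing
buffer and the standard sysfs binary-attribute helpers:

static ssize_t discovery_read(struct file *f, struct kobject *kobj,
			      struct bin_attribute *attr, char *buf,
			      loff_t off, size_t count)
{
	struct nvme_ctrl *ctrl = dev_get_drvdata(kobj_to_dev(kobj));
	size_t len;

	if (!ctrl->discovery)
		return -EINVAL;
	/* 1k header plus 1k per record */
	len = (le64_to_cpu(ctrl->discovery->numrec) + 1) * 1024;
	return memory_read_from_buffer(buf, count, &off,
				       ctrl->discovery, len);
}
static BIN_ATTR_RO(discovery, 0);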

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 13+ messages in thread
