All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rfc 0/3] add support to discovery async event notifications
@ 2018-10-04 21:23 Sagi Grimberg
  2018-10-04 21:23 ` [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato Sagi Grimberg
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-04 21:23 UTC (permalink / raw)


In order to support discovery AENs we need to have two things:
1. Support long lived discovery controller sessions that would allow to
   receive discovery AENs. This means we need to specify kato in connect

2. Support discovery log change events. The choice here is to simply fire
   a udev event and let userspace to:
   1. issue an updated discovery log page via our discovery controller (1)
   2. connect to every new subsystem/port that exists in the system.

This patch set was developed against the nvmet support patch set from Jay.

nvme-cli patches are also included in this series.

the udev rule that ultimately achieves dynamic discovery enumeration (TP 8002)
is:

SUBSYSTEM=="nvme", ACTION=="change", \
  RUN+="/usr/sbin/nvme connect-all --device=nvme$env{NVME_INSTANCE} --transport=$env{NVME_TRTYPE} \
  --traddr=$env{NVME_TRADDR} --trsvcid=$env{NVME_TRSVCID} & disown

Sagi Grimberg (3):
  nvme-fabrics: allow discovery subsystems accept a kato
  nvme: enable aen also for discovery controllers
  nvme: fire discovery log page change events to userspace

 drivers/nvme/host/core.c    | 36 +++++++++++++++++++++++++++++++++---
 drivers/nvme/host/fabrics.c | 16 +++-------------
 2 files changed, 36 insertions(+), 16 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato
  2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
@ 2018-10-04 21:23 ` Sagi Grimberg
  2018-10-05  5:50   ` Hannes Reinecke
  2018-10-04 21:23 ` [PATCH rfc 2/3] nvme: enable aen also for discovery controllers Sagi Grimberg
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-04 21:23 UTC (permalink / raw)


This modifies the behavior of discovery subsystems to accept
a kato as a preparation to support discovery log change
events. This also means that now every discovery controller
will have a default kato value, and for non-persistent connections
the host needs to pass in a zero kato value (keep_alive_tmo=0).

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/fabrics.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 40f7e6feeadc..863f83052c95 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -389,8 +389,8 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 	 * Set keep-alive timeout in seconds granularity (ms * 1000)
 	 * and add a grace period for controller kato enforcement
 	 */
-	cmd.connect.kato = ctrl->opts->discovery_nqn ? 0 :
-		cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000);
+	cmd.connect.kato = ctrl->kato ?
+		cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000) : 0;
 
 	if (ctrl->opts->disable_sqflow)
 		cmd.connect.cattr |= NVME_CONNECT_DISABLE_SQFLOW;
@@ -743,13 +743,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 				pr_warn("keep_alive_tmo 0 won't execute keep alives!!!\n");
 			}
 			opts->kato = token;
-
-			if (opts->discovery_nqn && opts->kato) {
-				pr_err("Discovery controllers cannot accept KATO != 0\n");
-				ret = -EINVAL;
-				goto out;
-			}
-
 			break;
 		case NVMF_OPT_CTRL_LOSS_TMO:
 			if (match_int(args, &token)) {
@@ -845,11 +838,8 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 		}
 	}
 
-	if (opts->discovery_nqn) {
-		opts->kato = 0;
+	if (opts->discovery_nqn)
 		opts->nr_io_queues = 0;
-		opts->duplicate_connect = true;
-	}
 	if (ctrl_loss_tmo < 0)
 		opts->max_reconnects = -1;
 	else
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH rfc 2/3] nvme: enable aen also for discovery controllers
  2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
  2018-10-04 21:23 ` [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato Sagi Grimberg
@ 2018-10-04 21:23 ` Sagi Grimberg
  2018-10-05  5:51   ` Hannes Reinecke
  2018-10-04 21:23 ` [PATCH rfc 3/3] nvme: fire discovery log page change events to userspace Sagi Grimberg
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-04 21:23 UTC (permalink / raw)


If the controller supports discovery log page change events,
we want to enable it on the host side as well.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index de745f94b698..efdbfc07de32 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1089,6 +1089,8 @@ static void nvme_enable_aen(struct nvme_ctrl *ctrl)
 	if (!supported_aens)
 		return;
 
+	queue_work(nvme_wq, &ctrl->async_event_work);
+
 	status = nvme_set_features(ctrl, NVME_FEAT_ASYNC_EVENT, supported_aens,
 			NULL, 0, &result);
 	if (status)
@@ -3489,10 +3491,13 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 	if (ctrl->kato)
 		nvme_start_keep_alive(ctrl);
 
+	if (ctrl->queue_count > 1 ||
+	    (ctrl->ops->flags & NVME_F_FABRICS &&
+	     ctrl->opts->discovery_nqn))
+		nvme_enable_aen(ctrl);
+
 	if (ctrl->queue_count > 1) {
 		nvme_queue_scan(ctrl);
-		nvme_enable_aen(ctrl);
-		queue_work(nvme_wq, &ctrl->async_event_work);
 		nvme_start_queues(ctrl);
 	}
 }
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH rfc 3/3] nvme: fire discovery log page change events to userspace
  2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
  2018-10-04 21:23 ` [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato Sagi Grimberg
  2018-10-04 21:23 ` [PATCH rfc 2/3] nvme: enable aen also for discovery controllers Sagi Grimberg
@ 2018-10-04 21:23 ` Sagi Grimberg
  2018-10-05  5:51   ` Hannes Reinecke
  2018-10-04 21:23 ` [PATCH 4/3 rfc nvme-cli] fabrics: support persistent connections to a discovery controller Sagi Grimberg
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-04 21:23 UTC (permalink / raw)


Provide userspace with nvme discovery controller device instance,
controller traddr and trsvcid. We'd expect userspace to handle
this event by issuing a discovery + connect.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index efdbfc07de32..495bfbeea068 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1079,7 +1079,8 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
 EXPORT_SYMBOL_GPL(nvme_set_queue_count);
 
 #define NVME_AEN_SUPPORTED \
-	(NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_FW_ACT | NVME_AEN_CFG_ANA_CHANGE)
+	(NVME_AEN_CFG_NS_ATTR | NVME_AEN_CFG_FW_ACT | NVME_AEN_CFG_ANA_CHANGE | \
+	 NVME_AEN_CFG_DISC_CHANGE)
 
 static void nvme_enable_aen(struct nvme_ctrl *ctrl)
 {
@@ -3351,6 +3352,27 @@ static void nvme_aen_uevent(struct nvme_ctrl *ctrl)
 	kfree(envp[0]);
 }
 
+static void nvme_disc_aen_uevent(struct nvme_ctrl *ctrl)
+{
+	struct nvmf_ctrl_options *opts = ctrl->opts;
+	char *envp[16];
+	int i, envloc = 0;
+
+	envp[envloc++] = kasprintf(GFP_KERNEL, "NVME_EVENT=discovery");
+	envp[envloc++] = kasprintf(GFP_KERNEL, "NVME_INSTANCE=%d", ctrl->instance);
+	envp[envloc++] = kasprintf(GFP_KERNEL, "NVME_TRTYPE=%s", opts->transport);
+	envp[envloc++] = kasprintf(GFP_KERNEL, "NVME_TRADDR=%s", opts->traddr);
+	envp[envloc++] = kasprintf(GFP_KERNEL, "NVME_TRSVCID=%s", opts->trsvcid);
+	/* TODO: propagate HOST_TRADDR? */
+	envp[envloc] = NULL;
+
+	for (i = 0; i < envloc; i++)
+		dev_dbg(ctrl->device, "%s\n", envp[i]);
+	kobject_uevent_env(&ctrl->device->kobj, KOBJ_CHANGE, envp);
+	for (i = 0; i < envloc; i++)
+		kfree(envp[i]);
+}
+
 static void nvme_async_event_work(struct work_struct *work)
 {
 	struct nvme_ctrl *ctrl =
@@ -3442,6 +3464,9 @@ static void nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result)
 		queue_work(nvme_wq, &ctrl->ana_work);
 		break;
 #endif
+	case NVME_AER_NOTICE_DISC_CHANGED:
+		nvme_disc_aen_uevent(ctrl);
+		break;
 	default:
 		dev_warn(ctrl->device, "async event result %08x\n", result);
 	}
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/3 rfc nvme-cli] fabrics: support persistent connections to a discovery controller
  2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
                   ` (2 preceding siblings ...)
  2018-10-04 21:23 ` [PATCH rfc 3/3] nvme: fire discovery log page change events to userspace Sagi Grimberg
@ 2018-10-04 21:23 ` Sagi Grimberg
  2018-10-04 21:23 ` [PATCH 5/3 rfc nvme-cli] fabrics: allow user to retrieve discovery log from existing " Sagi Grimberg
  2018-10-05  5:50 ` [PATCH rfc 0/3] add support to discovery async event notifications Hannes Reinecke
  5 siblings, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-04 21:23 UTC (permalink / raw)


Simply don't destroy the discovery controller after getting the
log pages. Note that persistent connection to a discovery subsystem
require to pass in a non-zero kato value, so if not provided we
simply use a default of 30 seconds kato.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 fabrics.c | 10 +++++++++-
 fabrics.h |  2 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/fabrics.c b/fabrics.c
index 1c25ccb0af6c..f5b822902aa2 100644
--- a/fabrics.c
+++ b/fabrics.c
@@ -62,6 +62,7 @@ static struct config {
 	char *raw;
 	char *device;
 	int  duplicate_connect;
+	bool persistent;
 } cfg = { NULL };
 
 #define BUF_SIZE		4096
@@ -762,7 +763,8 @@ static int do_discover(char *argstr, bool connect)
 		return errno;
 	ret = nvmf_get_log_page_discovery(dev_name, &log, &numrec);
 	free(dev_name);
-	remove_ctrl(instance);
+	if (!cfg.persistent)
+		remove_ctrl(instance);
 
 	switch (ret) {
 	case DISC_OK:
@@ -838,6 +840,9 @@ static int discover_from_conf_file(const char *desc, char *argstr,
 		if (err)
 			continue;
 
+		if (cfg.persistent && !cfg.keep_alive_tmo)
+			cfg.keep_alive_tmo = NVMF_DEF_DISC_TMO;
+
 		err = build_options(argstr, BUF_SIZE);
 		if (err) {
 			ret = err;
@@ -874,6 +879,7 @@ int discover(const char *desc, int argc, char **argv, bool connect)
 		{"keep-alive-tmo",  'k', "LIST", CFG_INT, &cfg.keep_alive_tmo,  required_argument, "keep alive timeout period in seconds" },
 		{"reconnect-delay", 'c', "LIST", CFG_INT, &cfg.reconnect_delay, required_argument, "reconnect timeout period in seconds" },
 		{"ctrl-loss-tmo",   'l', "LIST", CFG_INT, &cfg.ctrl_loss_tmo,   required_argument, "controller loss timeout period in seconds" },
+		{"persistent",  'p', "LIST", CFG_NONE, &cfg.persistent,  no_argument, "persistent discovery connection" },
 		{NULL},
 	};
 
@@ -888,6 +894,8 @@ int discover(const char *desc, int argc, char **argv, bool connect)
 		return discover_from_conf_file(desc, argstr,
 				command_line_options, connect);
 	} else {
+		if (cfg.persistent && !cfg.keep_alive_tmo)
+			cfg.keep_alive_tmo = NVMF_DEF_DISC_TMO;
 		ret = build_options(argstr, BUF_SIZE);
 		if (ret)
 			return ret;
diff --git a/fabrics.h b/fabrics.h
index 988f3ef2fbc4..7c1664b80d51 100644
--- a/fabrics.h
+++ b/fabrics.h
@@ -1,6 +1,8 @@
 #ifndef _DISCOVER_H
 #define _DISCOVER_H
 
+#define NVMF_DEF_DISC_TMO	30
+
 extern int discover(const char *desc, int argc, char **argv, bool connect);
 extern int connect(const char *desc, int argc, char **argv);
 extern int disconnect(const char *desc, int argc, char **argv);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/3 rfc nvme-cli] fabrics: allow user to retrieve discovery log from existing discovery controller
  2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
                   ` (3 preceding siblings ...)
  2018-10-04 21:23 ` [PATCH 4/3 rfc nvme-cli] fabrics: support persistent connections to a discovery controller Sagi Grimberg
@ 2018-10-04 21:23 ` Sagi Grimberg
  2018-10-05  5:50 ` [PATCH rfc 0/3] add support to discovery async event notifications Hannes Reinecke
  5 siblings, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-04 21:23 UTC (permalink / raw)


Simply allow discover to receive an existing discovery device node name.
Also centralize extraction of controller instance from the controller
name to a common helper.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 fabrics.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/fabrics.c b/fabrics.c
index f5b822902aa2..d937df47b68e 100644
--- a/fabrics.c
+++ b/fabrics.c
@@ -173,6 +173,19 @@ static const char *cms_str(__u8 cm)
 
 static int do_discover(char *argstr, bool connect);
 
+static int ctrl_instance(char *device)
+{
+	int ret, instance;
+
+	device = basename(device);
+	ret = sscanf(device, "nvme%d", &instance);
+	if (ret < 0)
+		return ret;
+	if (!ret)
+		return -1;
+	return instance;
+}
+
 static int add_ctrl(const char *argstr)
 {
 	substring_t args[MAX_OPT_ARGS];
@@ -755,7 +768,10 @@ static int do_discover(char *argstr, bool connect)
 	char *dev_name;
 	int instance, numrec = 0, ret;
 
-	instance = add_ctrl(argstr);
+	if (!cfg.device)
+		instance = add_ctrl(argstr);
+	else
+		instance = ctrl_instance(cfg.device);
 	if (instance < 0)
 		return instance;
 
@@ -763,7 +779,7 @@ static int do_discover(char *argstr, bool connect)
 		return errno;
 	ret = nvmf_get_log_page_discovery(dev_name, &log, &numrec);
 	free(dev_name);
-	if (!cfg.persistent)
+	if (!cfg.device && !cfg.persistent)
 		remove_ctrl(instance);
 
 	switch (ret) {
@@ -876,6 +892,7 @@ int discover(const char *desc, int argc, char **argv, bool connect)
 		{"hostnqn",     'q', "LIST", CFG_STRING, &cfg.hostnqn,     required_argument, "user-defined hostnqn (if default not used)" },
 		{"hostid",      'I', "LIST", CFG_STRING, &cfg.hostid,      required_argument, "user-defined hostid (if default not used)"},
 		{"raw",         'r', "LIST", CFG_STRING, &cfg.raw,         required_argument, "raw output file" },
+		{"device",      'd', "LIST", CFG_STRING, &cfg.device, required_argument, "use existing discovery contoller device" },
 		{"keep-alive-tmo",  'k', "LIST", CFG_INT, &cfg.keep_alive_tmo,  required_argument, "keep alive timeout period in seconds" },
 		{"reconnect-delay", 'c', "LIST", CFG_INT, &cfg.reconnect_delay, required_argument, "reconnect timeout period in seconds" },
 		{"ctrl-loss-tmo",   'l', "LIST", CFG_INT, &cfg.ctrl_loss_tmo,   required_argument, "controller loss timeout period in seconds" },
@@ -1020,15 +1037,10 @@ static int disconnect_by_nqn(char *nqn)
 static int disconnect_by_device(char *device)
 {
 	int instance;
-	int ret;
-
-	device = basename(device);
-	ret = sscanf(device, "nvme%d", &instance);
-	if (ret < 0)
-		return ret;
-	if (!ret)
-		return -1;
 
+	instance = ctrl_instance(device);
+	if (instance < 0)
+		return instance;
 	return remove_ctrl(instance);
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
                   ` (4 preceding siblings ...)
  2018-10-04 21:23 ` [PATCH 5/3 rfc nvme-cli] fabrics: allow user to retrieve discovery log from existing " Sagi Grimberg
@ 2018-10-05  5:50 ` Hannes Reinecke
  2018-10-05  7:08   ` Sagi Grimberg
  5 siblings, 1 reply; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-05  5:50 UTC (permalink / raw)


On 10/4/18 11:23 PM, Sagi Grimberg wrote:
> In order to support discovery AENs we need to have two things:
> 1. Support long lived discovery controller sessions that would allow to
>     receive discovery AENs. This means we need to specify kato in connect
> 
> 2. Support discovery log change events. The choice here is to simply fire
>     a udev event and let userspace to:
>     1. issue an updated discovery log page via our discovery controller (1)
>     2. connect to every new subsystem/port that exists in the system.
> 
> This patch set was developed against the nvmet support patch set from Jay.
> 
> nvme-cli patches are also included in this series.
> 
> the udev rule that ultimately achieves dynamic discovery enumeration (TP 8002)
> is:
> 
> SUBSYSTEM=="nvme", ACTION=="change", \
>    RUN+="/usr/sbin/nvme connect-all --device=nvme$env{NVME_INSTANCE} --transport=$env{NVME_TRTYPE} \
>    --traddr=$env{NVME_TRADDR} --trsvcid=$env{NVME_TRSVCID} & disown
> 
Not sure that'll work; IIRC udev doesn't follow the bash syntax of '&' 
here. Have you checked?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato
  2018-10-04 21:23 ` [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato Sagi Grimberg
@ 2018-10-05  5:50   ` Hannes Reinecke
  0 siblings, 0 replies; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-05  5:50 UTC (permalink / raw)


On 10/4/18 11:23 PM, Sagi Grimberg wrote:
> This modifies the behavior of discovery subsystems to accept
> a kato as a preparation to support discovery log change
> events. This also means that now every discovery controller
> will have a default kato value, and for non-persistent connections
> the host needs to pass in a zero kato value (keep_alive_tmo=0).
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/fabrics.c | 16 +++-------------
>   1 file changed, 3 insertions(+), 13 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 2/3] nvme: enable aen also for discovery controllers
  2018-10-04 21:23 ` [PATCH rfc 2/3] nvme: enable aen also for discovery controllers Sagi Grimberg
@ 2018-10-05  5:51   ` Hannes Reinecke
  0 siblings, 0 replies; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-05  5:51 UTC (permalink / raw)


On 10/4/18 11:23 PM, Sagi Grimberg wrote:
> If the controller supports discovery log page change events,
> we want to enable it on the host side as well.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/core.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 3/3] nvme: fire discovery log page change events to userspace
  2018-10-04 21:23 ` [PATCH rfc 3/3] nvme: fire discovery log page change events to userspace Sagi Grimberg
@ 2018-10-05  5:51   ` Hannes Reinecke
  0 siblings, 0 replies; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-05  5:51 UTC (permalink / raw)


On 10/4/18 11:23 PM, Sagi Grimberg wrote:
> Provide userspace with nvme discovery controller device instance,
> controller traddr and trsvcid. We'd expect userspace to handle
> this event by issuing a discovery + connect.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/host/core.c | 27 ++++++++++++++++++++++++++-
>   1 file changed, 26 insertions(+), 1 deletion(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-05  5:50 ` [PATCH rfc 0/3] add support to discovery async event notifications Hannes Reinecke
@ 2018-10-05  7:08   ` Sagi Grimberg
  2018-10-05  7:36     ` Hannes Reinecke
  0 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-05  7:08 UTC (permalink / raw)



>> the udev rule that ultimately achieves dynamic discovery enumeration 
>> (TP 8002)
>> is:
>>
>> SUBSYSTEM=="nvme", ACTION=="change", \
>> ?? RUN+="/usr/sbin/nvme connect-all --device=nvme$env{NVME_INSTANCE} 
>> --transport=$env{NVME_TRTYPE} \
>> ?? --traddr=$env{NVME_TRADDR} --trsvcid=$env{NVME_TRSVCID} & disown
>>
> Not sure that'll work; IIRC udev doesn't follow the bash syntax of '&' 
> here. Have you checked?

This works.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-05  7:08   ` Sagi Grimberg
@ 2018-10-05  7:36     ` Hannes Reinecke
  2018-10-05 19:34       ` Sagi Grimberg
  0 siblings, 1 reply; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-05  7:36 UTC (permalink / raw)


On 10/5/18 9:08 AM, Sagi Grimberg wrote:
> 
>>> the udev rule that ultimately achieves dynamic discovery enumeration 
>>> (TP 8002)
>>> is:
>>>
>>> SUBSYSTEM=="nvme", ACTION=="change", \
>>> ?? RUN+="/usr/sbin/nvme connect-all --device=nvme$env{NVME_INSTANCE} 
>>> --transport=$env{NVME_TRTYPE} \
>>> ?? --traddr=$env{NVME_TRADDR} --trsvcid=$env{NVME_TRSVCID} & disown
>>>
>> Not sure that'll work; IIRC udev doesn't follow the bash syntax of '&' 
>> here. Have you checked?
> 
> This works.
Oh, I don't doubt that it'll work in the sense that 'nvme connect-all' 
is executed.
I _do_ doubt that the 'disown' is doing anything at all; both '&' and 
'disown' will be used as arguments to '/usr/sbin/nvme', and _not_ 
interpreted by any shell (as no shell is invoked in the first place).
Internally udev will call 'execve' to call the new program, which will 
invoke a shell only for shell scripts; binary programs will be executed 
directly with not shell whatsoever.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-05  7:36     ` Hannes Reinecke
@ 2018-10-05 19:34       ` Sagi Grimberg
  2018-10-16  1:11         ` Sagi Grimberg
  0 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-05 19:34 UTC (permalink / raw)



>>>> the udev rule that ultimately achieves dynamic discovery enumeration 
>>>> (TP 8002)
>>>> is:
>>>>
>>>> SUBSYSTEM=="nvme", ACTION=="change", \
>>>> ?? RUN+="/usr/sbin/nvme connect-all --device=nvme$env{NVME_INSTANCE} 
>>>> --transport=$env{NVME_TRTYPE} \
>>>> ?? --traddr=$env{NVME_TRADDR} --trsvcid=$env{NVME_TRSVCID} & disown
>>>>
>>> Not sure that'll work; IIRC udev doesn't follow the bash syntax of 
>>> '&' here. Have you checked?
>>
>> This works.
> Oh, I don't doubt that it'll work in the sense that 'nvme connect-all' 
> is executed.
> I _do_ doubt that the 'disown' is doing anything at all; both '&' and 
> 'disown' will be used as arguments to '/usr/sbin/nvme', and _not_ 
> interpreted by any shell (as no shell is invoked in the first place).
> Internally udev will call 'execve' to call the new program, which will 
> invoke a shell only for shell scripts; binary programs will be executed 
> directly with not shell whatsoever.

I see, we can run it with /bin/sh and it should do the trick?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-05 19:34       ` Sagi Grimberg
@ 2018-10-16  1:11         ` Sagi Grimberg
  2018-10-16  5:59           ` Hannes Reinecke
  0 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-16  1:11 UTC (permalink / raw)



>> Oh, I don't doubt that it'll work in the sense that 'nvme connect-all' 
>> is executed.
>> I _do_ doubt that the 'disown' is doing anything at all; both '&' and 
>> 'disown' will be used as arguments to '/usr/sbin/nvme', and _not_ 
>> interpreted by any shell (as no shell is invoked in the first place).
>> Internally udev will call 'execve' to call the new program, which will 
>> invoke a shell only for shell scripts; binary programs will be 
>> executed directly with not shell whatsoever.
> 
> I see, we can run it with /bin/sh and it should do the trick?

Hannes,

Do you agree that it would suffice? Or, Do you think we should take
a different approach?

Would like to get some feedback from others given that this is not
a trivial decision about which way we go here..

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-16  1:11         ` Sagi Grimberg
@ 2018-10-16  5:59           ` Hannes Reinecke
  2018-10-16 17:59             ` James Smart
  2018-10-17  0:38             ` Sagi Grimberg
  0 siblings, 2 replies; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-16  5:59 UTC (permalink / raw)


On 10/16/18 3:11 AM, Sagi Grimberg wrote:
> 
>>> Oh, I don't doubt that it'll work in the sense that 'nvme 
>>> connect-all' is executed.
>>> I _do_ doubt that the 'disown' is doing anything at all; both '&' and 
>>> 'disown' will be used as arguments to '/usr/sbin/nvme', and _not_ 
>>> interpreted by any shell (as no shell is invoked in the first place).
>>> Internally udev will call 'execve' to call the new program, which 
>>> will invoke a shell only for shell scripts; binary programs will be 
>>> executed directly with not shell whatsoever.
>>
>> I see, we can run it with /bin/sh and it should do the trick?
> 
> Hannes,
> 
> Do you agree that it would suffice? Or, Do you think we should take
> a different approach?
> 
> Would like to get some feedback from others given that this is not
> a trivial decision about which way we go here..

Actually, I have looked at the original solution from James Smart for 
FC-NVMe autoconnect.

Essentially he starts a system service from a udev rule, which in itself 
is triggered by the event.
With that we're out of the udev program timeout (as systemd services can 
run for arbitrarily long) _and_ we also have resolved the problem of a 
uevent storm, as systemd will manage that for us.

The only problem here is that we need to pass arguments to the systemd 
service, but if we can use the standard udev trick of placing the 
arguments into the program environment and having the program looking 
for that we should be able to solve that, too.

Would you be okay with this approach?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-16  5:59           ` Hannes Reinecke
@ 2018-10-16 17:59             ` James Smart
  2018-10-17 15:29               ` Hannes Reinecke
  2018-10-17  0:38             ` Sagi Grimberg
  1 sibling, 1 reply; 20+ messages in thread
From: James Smart @ 2018-10-16 17:59 UTC (permalink / raw)




On 10/15/2018 10:59 PM, Hannes Reinecke wrote:
> On 10/16/18 3:11 AM, Sagi Grimberg wrote:
>>
>>>> Oh, I don't doubt that it'll work in the sense that 'nvme 
>>>> connect-all' is executed.
>>>> I _do_ doubt that the 'disown' is doing anything at all; both '&' 
>>>> and 'disown' will be used as arguments to '/usr/sbin/nvme', and 
>>>> _not_ interpreted by any shell (as no shell is invoked in the first 
>>>> place).
>>>> Internally udev will call 'execve' to call the new program, which 
>>>> will invoke a shell only for shell scripts; binary programs will be 
>>>> executed directly with not shell whatsoever.
>>>
>>> I see, we can run it with /bin/sh and it should do the trick?
>>
>> Hannes,
>>
>> Do you agree that it would suffice? Or, Do you think we should take
>> a different approach?
>>
>> Would like to get some feedback from others given that this is not
>> a trivial decision about which way we go here..
>
> Actually, I have looked at the original solution from James Smart for 
> FC-NVMe autoconnect.
>
> Essentially he starts a system service from a udev rule, which in 
> itself is triggered by the event.
> With that we're out of the udev program timeout (as systemd services 
> can run for arbitrarily long) _and_ we also have resolved the problem 
> of a uevent storm, as systemd will manage that for us.
>
> The only problem here is that we need to pass arguments to the systemd 
> service, but if we can use the standard udev trick of placing the 
> arguments into the program environment and having the program looking 
> for that we should be able to solve that, too.

fyi - the udev script fc uses passes arguments (HOST_TRADDR and TRADDR) 
from the event into the systemd service. Actually, it creates an 
instance of the service based on the names built from those arguments.

Passing different arguments beyond that seemed rather difficult. For 
that I was assuming the instance arguments would have to be parsed, and 
a system file extracted, to obtain any variable values (such as 
device-specific connect timeouts, or queue count limits, etc).


-- jmes

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-16  5:59           ` Hannes Reinecke
  2018-10-16 17:59             ` James Smart
@ 2018-10-17  0:38             ` Sagi Grimberg
  1 sibling, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-17  0:38 UTC (permalink / raw)



>> Hannes,
>>
>> Do you agree that it would suffice? Or, Do you think we should take
>> a different approach?
>>
>> Would like to get some feedback from others given that this is not
>> a trivial decision about which way we go here..
> 
> Actually, I have looked at the original solution from James Smart for 
> FC-NVMe autoconnect.
> 
> Essentially he starts a system service from a udev rule, which in itself 
> is triggered by the event.
> With that we're out of the udev program timeout (as systemd services can 
> run for arbitrarily long) _and_ we also have resolved the problem of a 
> uevent storm, as systemd will manage that for us.

That's a good idea!

> The only problem here is that we need to pass arguments to the systemd 
> service, but if we can use the standard udev trick of placing the 
> arguments into the program environment and having the program looking 
> for that we should be able to solve that, too.
> 
> Would you be okay with this approach?

I'm fine with this approach. There are enough ways to
pass in args to systemd:

Something like:

SUBSYSTEM=="nvme", ACTION=="change", \
    RUN+=systemctl start nvmf-disc-change@" 
--device=nvme$env{NVME_INSTANCE} --transport=$env{NVME_TRTYPE} 
--traddr=$env{NVME_TRADDR} --trsvcid=$env{NVME_TRSVCID}".service

And nvmf-disc-change at .service is:
--
[Unit]
Description=NVMe automatic discovery upon discovery log change updates

[Service]
Environment="CONNECTALL_ARGS=%I"
ExecStart=nvme connect-all $CONNECTALL_ARGS
--

We could also add EnvironmentFile for additional parameters
(nr_io_queues, etc...)

Thoughts?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-16 17:59             ` James Smart
@ 2018-10-17 15:29               ` Hannes Reinecke
  2018-10-19  0:44                 ` Sagi Grimberg
  2018-10-19 16:09                 ` James Smart
  0 siblings, 2 replies; 20+ messages in thread
From: Hannes Reinecke @ 2018-10-17 15:29 UTC (permalink / raw)


On 10/16/18 7:59 PM, James Smart wrote:
> 
> 
> On 10/15/2018 10:59 PM, Hannes Reinecke wrote:
>> On 10/16/18 3:11 AM, Sagi Grimberg wrote:
>>>
>>>>> Oh, I don't doubt that it'll work in the sense that 'nvme 
>>>>> connect-all' is executed.
>>>>> I _do_ doubt that the 'disown' is doing anything at all; both '&' 
>>>>> and 'disown' will be used as arguments to '/usr/sbin/nvme', and 
>>>>> _not_ interpreted by any shell (as no shell is invoked in the first 
>>>>> place).
>>>>> Internally udev will call 'execve' to call the new program, which 
>>>>> will invoke a shell only for shell scripts; binary programs will be 
>>>>> executed directly with not shell whatsoever.
>>>>
>>>> I see, we can run it with /bin/sh and it should do the trick?
>>>
>>> Hannes,
>>>
>>> Do you agree that it would suffice? Or, Do you think we should take
>>> a different approach?
>>>
>>> Would like to get some feedback from others given that this is not
>>> a trivial decision about which way we go here..
>>
>> Actually, I have looked at the original solution from James Smart for 
>> FC-NVMe autoconnect.
>>
>> Essentially he starts a system service from a udev rule, which in 
>> itself is triggered by the event.
>> With that we're out of the udev program timeout (as systemd services 
>> can run for arbitrarily long) _and_ we also have resolved the problem 
>> of a uevent storm, as systemd will manage that for us.
>>
>> The only problem here is that we need to pass arguments to the systemd 
>> service, but if we can use the standard udev trick of placing the 
>> arguments into the program environment and having the program looking 
>> for that we should be able to solve that, too.
> 
> fyi - the udev script fc uses passes arguments (HOST_TRADDR and TRADDR) 
> from the event into the systemd service. Actually, it creates an 
> instance of the service based on the names built from those arguments.
> 
> Passing different arguments beyond that seemed rather difficult. For 
> that I was assuming the instance arguments would have to be parsed, and 
> a system file extracted, to obtain any variable values (such as 
> device-specific connect timeouts, or queue count limits, etc).
> 
That is precisely where I'm scratching my head now.
Essentially the systemd '@' thingie allows you to pass exactly _one_ 
argument. Sadly we have several options to pass, not just one.
We could reduce the number of options by having per-transport services,
but then we require at least two arguments to be passed (host traddr and 
traddr for FC, traddr, trsvcid, and adrfam for RDMA).
 From what I can see we can try to:
1) escape the arguments so that they appear as one string to systemd, 
and then de-escaping them in the service file. Works without 
modification to the code, but then you always have to invoke the shell 
to handle escaping/de-escaping.
2) preformat the nvme option string, and pass that as argument for the 
systemd service. Which actually would get rid of the need for using 
nvme-cli, too, as we could just write the string into the device node.
Will be looking into it.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-17 15:29               ` Hannes Reinecke
@ 2018-10-19  0:44                 ` Sagi Grimberg
  2018-10-19 16:09                 ` James Smart
  1 sibling, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2018-10-19  0:44 UTC (permalink / raw)



>> fyi - the udev script fc uses passes arguments (HOST_TRADDR and 
>> TRADDR) from the event into the systemd service. Actually, it creates 
>> an instance of the service based on the names built from those arguments.
>>
>> Passing different arguments beyond that seemed rather difficult. For 
>> that I was assuming the instance arguments would have to be parsed, 
>> and a system file extracted, to obtain any variable values (such as 
>> device-specific connect timeouts, or queue count limits, etc).
>>
> That is precisely where I'm scratching my head now.
> Essentially the systemd '@' thingie allows you to pass exactly _one_ 
> argument. Sadly we have several options to pass, not just one.
> We could reduce the number of options by having per-transport services,
> but then we require at least two arguments to be passed (host traddr and 
> traddr for FC, traddr, trsvcid, and adrfam for RDMA).
>  From what I can see we can try to:
> 1) escape the arguments so that they appear as one string to systemd, 
> and then de-escaping them in the service file. Works without 
> modification to the code, but then you always have to invoke the shell 
> to handle escaping/de-escaping.

I don't see a problem with that..

> 2) preformat the nvme option string, and pass that as argument for the 
> systemd service. Which actually would get rid of the need for using 
> nvme-cli, too, as we could just write the string into the device node.
> Will be looking into it.

Where would the option string be preformatted?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH rfc 0/3] add support to discovery async event notifications
  2018-10-17 15:29               ` Hannes Reinecke
  2018-10-19  0:44                 ` Sagi Grimberg
@ 2018-10-19 16:09                 ` James Smart
  1 sibling, 0 replies; 20+ messages in thread
From: James Smart @ 2018-10-19 16:09 UTC (permalink / raw)




On 10/17/2018 8:29 AM, Hannes Reinecke wrote:
> On 10/16/18 7:59 PM, James Smart wrote:
>>
>> fyi - the udev script fc uses passes arguments (HOST_TRADDR and 
>> TRADDR) from the event into the systemd service. Actually, it creates 
>> an instance of the service based on the names built from those 
>> arguments.
>>
>> Passing different arguments beyond that seemed rather difficult. For 
>> that I was assuming the instance arguments would have to be parsed, 
>> and a system file extracted, to obtain any variable values (such as 
>> device-specific connect timeouts, or queue count limits, etc).
>>
> That is precisely where I'm scratching my head now.
> Essentially the systemd '@' thingie allows you to pass exactly _one_ 
> argument. Sadly we have several options to pass, not just one.

I know.? see below.

> We could reduce the number of options by having per-transport services,
> but then we require at least two arguments to be passed (host traddr 
> and traddr for FC, traddr, trsvcid, and adrfam for RDMA).
> From what I can see we can try to:
> 1) escape the arguments so that they appear as one string to systemd, 
> and then de-escaping them in the service file. Works without 
> modification to the code, but then you always have to invoke the shell 
> to handle escaping/de-escaping.
> 2) preformat the nvme option string, and pass that as argument for the 
> systemd service. Which actually would get rid of the need for using 
> nvme-cli, too, as we could just write the string into the device node.

this is what FC did as FC had 2 args (for now): host_traddr and traddr
A single string was created for systemd, and that single string is 
passed as an argument to connect-all.? As the argument to connect-all 
was really 2 separate arguments (--host_traddr=<value>? and 
--traddr=<value), when we built the string for systemd which wants only 
1 arg, we embedded the spaces - used \\x20 rather than a space 
character.? There should be no reason this couldn't be extended to more 
arguments.??? We had tried several different ways to pass the arguments 
and this was the only way that seemed to hold together.? So it 
effectively does both of the things above.

And as I hinted before - I still think that this process needs to be 
augmented by something that sets per-connection overrides, so that 
different connect requests could have different connect timeouts. In 
general, these differences wouldn't be known by the transport. Although 
it would be possible to define a mechanism to go in the 
kernel/transport, I think it would be far better for nvme-cli to have a 
admin-defined override that could be used to match the connect args to 
custom parameters, which would support explicit matches as well as 
wildcards, then issue the connect request.??? For example, it could 
state anything that is of transport type X to use arguments Z, or 
anything that maps to address M to use arguments N.

-- james


> Will be looking into it.
>
> Cheers,
>
> Hannes

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-10-19 16:09 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-04 21:23 [PATCH rfc 0/3] add support to discovery async event notifications Sagi Grimberg
2018-10-04 21:23 ` [PATCH rfc 1/3] nvme-fabrics: allow discovery subsystems accept a kato Sagi Grimberg
2018-10-05  5:50   ` Hannes Reinecke
2018-10-04 21:23 ` [PATCH rfc 2/3] nvme: enable aen also for discovery controllers Sagi Grimberg
2018-10-05  5:51   ` Hannes Reinecke
2018-10-04 21:23 ` [PATCH rfc 3/3] nvme: fire discovery log page change events to userspace Sagi Grimberg
2018-10-05  5:51   ` Hannes Reinecke
2018-10-04 21:23 ` [PATCH 4/3 rfc nvme-cli] fabrics: support persistent connections to a discovery controller Sagi Grimberg
2018-10-04 21:23 ` [PATCH 5/3 rfc nvme-cli] fabrics: allow user to retrieve discovery log from existing " Sagi Grimberg
2018-10-05  5:50 ` [PATCH rfc 0/3] add support to discovery async event notifications Hannes Reinecke
2018-10-05  7:08   ` Sagi Grimberg
2018-10-05  7:36     ` Hannes Reinecke
2018-10-05 19:34       ` Sagi Grimberg
2018-10-16  1:11         ` Sagi Grimberg
2018-10-16  5:59           ` Hannes Reinecke
2018-10-16 17:59             ` James Smart
2018-10-17 15:29               ` Hannes Reinecke
2018-10-19  0:44                 ` Sagi Grimberg
2018-10-19 16:09                 ` James Smart
2018-10-17  0:38             ` Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.