All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
@ 2022-09-07 14:24 Sagi Grimberg
  2022-09-07 14:24 ` [PATCH rfc 1/1] fabrics: support notify_rediscover connect parameter Sagi Grimberg
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-07 14:24 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

When a discovery controller is disconnected, no AENs will
arrive to notify the host about discovery log-page changes.

This attempts to fix it with a new connection parameter asking
the kernel to send a dedicated udev event for this case. Prior
attempt tried to use "connected" event already sent by the kernel
however this also applied on the first connected, causing undesired
side-effects when issuing a simple 'discover' command.

The patchset includes the nvme-cli/libnvme counter-parts as well.

Sagi Grimberg (1):
  fabrics: support notify_rediscover connect parameter

 drivers/nvme/host/core.c    | 4 ++++
 drivers/nvme/host/fabrics.c | 6 +++++-
 drivers/nvme/host/fabrics.h | 4 ++++
 drivers/nvme/host/fc.c      | 5 ++---
 4 files changed, 15 insertions(+), 4 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH rfc 1/1] fabrics: support notify_rediscover connect parameter
  2022-09-07 14:24 [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected Sagi Grimberg
@ 2022-09-07 14:24 ` Sagi Grimberg
  2022-09-07 14:24 ` [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection Sagi Grimberg
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-07 14:24 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

When a discovery controller is disconnected, no AENs will
arrive to notify the host about discovery log-page changes.

In order to solve this, support a notify_rediscover parameter
that userspace can pass us to request a notification via uevent when
a discovery controller reconnects.

Note, we rely on nr_reconnects controller counter, so transports
must not clear it before calling nvme_start_ctrl (fc is fixed).

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c    | 4 ++++
 drivers/nvme/host/fabrics.c | 6 +++++-
 drivers/nvme/host/fabrics.h | 4 ++++
 drivers/nvme/host/fc.c      | 5 ++---
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2429b11eb9a8..159097e91723 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4814,6 +4814,10 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 
 	nvme_enable_aen(ctrl);
 
+	if (nvme_discovery_ctrl(ctrl) && ctrl->opts &&
+	    ctrl->opts->notify_rediscover && ctrl->nr_reconnects > 0)
+		nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
+
 	if (ctrl->queue_count > 1) {
 		nvme_queue_scan(ctrl);
 		nvme_start_queues(ctrl);
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 10cc4a814602..945844ce64cb 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -594,6 +594,7 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_DISCOVERY,		"discovery"		},
 	{ NVMF_OPT_DHCHAP_SECRET,	"dhchap_secret=%s"	},
 	{ NVMF_OPT_DHCHAP_CTRL_SECRET,	"dhchap_ctrl_secret=%s"	},
+	{ NVMF_OPT_NOTIFY_REDISCOVER,	"notify_rediscover"	},
 	{ NVMF_OPT_ERR,			NULL			}
 };
 
@@ -875,6 +876,9 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 		case NVMF_OPT_DISCOVERY:
 			opts->discovery_nqn = true;
 			break;
+		case NVMF_OPT_NOTIFY_REDISCOVER:
+			opts->notify_rediscover = true;
+			break;
 		case NVMF_OPT_DHCHAP_SECRET:
 			p = match_strdup(args);
 			if (!p) {
@@ -1033,7 +1037,7 @@ EXPORT_SYMBOL_GPL(nvmf_free_options);
 				 NVMF_OPT_HOST_ID | NVMF_OPT_DUP_CONNECT |\
 				 NVMF_OPT_DISABLE_SQFLOW | NVMF_OPT_DISCOVERY |\
 				 NVMF_OPT_FAIL_FAST_TMO | NVMF_OPT_DHCHAP_SECRET |\
-				 NVMF_OPT_DHCHAP_CTRL_SECRET)
+				 NVMF_OPT_DHCHAP_CTRL_SECRET | NVMF_OPT_NOTIFY_REDISCOVER)
 
 static struct nvme_ctrl *
 nvmf_create_ctrl(struct device *dev, const char *buf)
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index a6e22116e139..a0139091fcac 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -70,6 +70,7 @@ enum {
 	NVMF_OPT_DISCOVERY	= 1 << 22,
 	NVMF_OPT_DHCHAP_SECRET	= 1 << 23,
 	NVMF_OPT_DHCHAP_CTRL_SECRET = 1 << 24,
+	NVMF_OPT_NOTIFY_REDISCOVER = 1 << 25,
 };
 
 /**
@@ -109,6 +110,8 @@ enum {
  * @nr_poll_queues: number of queues for polling I/O
  * @tos: type of service
  * @fast_io_fail_tmo: Fast I/O fail timeout in seconds
+ * @notify_rediscover: Send uevent notification when a discovery controller
+ *              reconnects to userspace as re-discovery may be required
  */
 struct nvmf_ctrl_options {
 	unsigned		mask;
@@ -135,6 +138,7 @@ struct nvmf_ctrl_options {
 	unsigned int		nr_poll_queues;
 	int			tos;
 	int			fast_io_fail_tmo;
+	bool			notify_rediscover;
 };
 
 /*
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 127abaf9ba5d..7bb8837a855e 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3195,12 +3195,11 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 		goto out_term_aen_ops;
 
 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
-
-	ctrl->ctrl.nr_reconnects = 0;
-
 	if (changed)
 		nvme_start_ctrl(&ctrl->ctrl);
 
+	ctrl->ctrl.nr_reconnects = 0;
+
 	return 0;	/* Success */
 
 out_term_aen_ops:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection
  2022-09-07 14:24 [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected Sagi Grimberg
  2022-09-07 14:24 ` [PATCH rfc 1/1] fabrics: support notify_rediscover connect parameter Sagi Grimberg
@ 2022-09-07 14:24 ` Sagi Grimberg
  2022-09-09  6:56   ` Daniel Wagner
  2022-09-07 14:24 ` [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter Sagi Grimberg
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-07 14:24 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

When reading from the nvmf misc device, a concatenated string
of all the supported opts is returned. Add a helper that detects
if a parameter string exists in the output.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 src/libnvme.map    |  1 +
 src/nvme/fabrics.c | 32 ++++++++++++++++++++++++++++++++
 src/nvme/fabrics.h | 11 +++++++++++
 3 files changed, 44 insertions(+)

diff --git a/src/libnvme.map b/src/libnvme.map
index 79b8f88ecd3a..80fbe218cb9a 100644
--- a/src/libnvme.map
+++ b/src/libnvme.map
@@ -352,6 +352,7 @@ LIBNVME_1_0 {
 		nvmf_treq_str;
 		nvmf_trtype_str;
 		nvmf_update_config;
+		nvmf_check_param_support;
 	local:
 		*;
 };
diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
index d5db5090e30b..678ca2af2561 100644
--- a/src/nvme/fabrics.c
+++ b/src/nvme/fabrics.c
@@ -526,6 +526,38 @@ static int build_options(nvme_host_t h, nvme_ctrl_t c, char **argstr)
 	return 0;
 }
 
+int nvmf_check_param_support(const char *param, bool *supported)
+{
+	int ret = 0, fd, len;
+	char buf[0x1000];
+
+	fd = open(nvmf_dev, O_RDWR);
+	if (fd < 0) {
+		nvme_msg(NULL, LOG_ERR, "Failed to open %s: %s\n",
+			 nvmf_dev, strerror(errno));
+		return -ENVME_CONNECT_OPEN;
+	}
+
+	memset(buf, 0x0, sizeof(buf));
+	len = read(fd, buf, sizeof(buf) - 1);
+	if (len < 0) {
+		nvme_msg(NULL, LOG_ERR, "Failed to read from %s: %s\n",
+			 nvmf_dev, strerror(errno));
+		ret = -ENVME_CONNECT_READ;
+		goto out_close;
+	}
+	nvme_msg(NULL, LOG_DEBUG, "supported opts: '%.*s'\n", (int)strcspn(buf, "\n"), buf);
+	buf[len] = '\0';
+
+	if (strstr(buf, param) != NULL)
+		*supported = true;
+	else
+		*supported = false;
+out_close:
+	close(fd);
+	return ret;
+}
+
 static int __nvmf_add_ctrl(nvme_root_t r, const char *argstr)
 {
 	int ret, fd, len = strlen(argstr);
diff --git a/src/nvme/fabrics.h b/src/nvme/fabrics.h
index 9e099feeeb08..83353a65c5d7 100644
--- a/src/nvme/fabrics.h
+++ b/src/nvme/fabrics.h
@@ -270,4 +270,15 @@ bool nvmf_is_registration_supported(nvme_ctrl_t c);
  */
 int nvmf_register_ctrl(nvme_ctrl_t c, enum nvmf_dim_tas tas, __u32 *result);
 
+/**
+ * nvmf_check_param_support() - Check parameter support from the kernel
+ * @param:	Parameter string
+ * @supported:	Parameter Support indicator
+ *
+ * Read from the misc device the supported parameters and search the @param string
+ *
+ * Return: 0 on success to query; on failure errno is returned
+ */
+int nvmf_check_param_support(const char *param, bool *supported);
+
 #endif /* _LIBNVME_FABRICS_H */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter
  2022-09-07 14:24 [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected Sagi Grimberg
  2022-09-07 14:24 ` [PATCH rfc 1/1] fabrics: support notify_rediscover connect parameter Sagi Grimberg
  2022-09-07 14:24 ` [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection Sagi Grimberg
@ 2022-09-07 14:24 ` Sagi Grimberg
  2022-09-09  6:58   ` Daniel Wagner
  2022-09-07 14:24 ` [PATCH nvme-cli 4/1] fabrics: re-read the discovery log page when a discovery controller reconnected Sagi Grimberg
  2022-09-10 15:47 ` [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected James Smart
  4 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-07 14:24 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

When a discovery controller is disconnected, no AENs will
arrive to notify the host about discovery log-page changes.

In order to solve this, add a notify_rediscover parameter
that we can pass the kernel to notify us via uevent when
a discovery controller reconnect.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 src/nvme/fabrics.c | 3 +++
 src/nvme/fabrics.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
index 678ca2af2561..b49bc2eda965 100644
--- a/src/nvme/fabrics.c
+++ b/src/nvme/fabrics.c
@@ -214,6 +214,7 @@ static struct nvme_fabrics_config *merge_config(nvme_ctrl_t c,
 	MERGE_CFG_OPTION(ctrl_cfg, cfg, hdr_digest, false);
 	MERGE_CFG_OPTION(ctrl_cfg, cfg, data_digest, false);
 	MERGE_CFG_OPTION(ctrl_cfg, cfg, tls, false);
+	MERGE_CFG_OPTION(ctrl_cfg, cfg, notify_rediscover, false);
 
 	return ctrl_cfg;
 }
@@ -481,6 +482,8 @@ static int build_options(nvme_host_t h, nvme_ctrl_t c, char **argstr)
 	    (hostid && add_argument(argstr, "hostid", hostid)) ||
 	    (discover && !discovery_nqn &&
 	     add_bool_argument(argstr, "discovery", true)) ||
+	     add_bool_argument(argstr, "notify_rediscover",
+			       cfg->notify_rediscover) ||
 	    (!discover && hostkey &&
 	     add_argument(argstr, "dhchap_secret", hostkey)) ||
 	    (!discover && ctrlkey &&
diff --git a/src/nvme/fabrics.h b/src/nvme/fabrics.h
index 83353a65c5d7..377475a690f6 100644
--- a/src/nvme/fabrics.h
+++ b/src/nvme/fabrics.h
@@ -40,6 +40,7 @@
  * @hdr_digest:		Generate/verify header digest (TCP)
  * @data_digest:	Generate/verify data digest (TCP)
  * @tls:		Start TLS on the connection (TCP)
+ * @notify_rediscover:  Send uevent notification for rediscovery upon reconnect
  */
 struct nvme_fabrics_config {
 	char *host_traddr;
@@ -59,6 +60,7 @@ struct nvme_fabrics_config {
 	bool hdr_digest;
 	bool data_digest;
 	bool tls;
+	bool notify_rediscover;
 };
 
 /**
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH nvme-cli 4/1] fabrics: re-read the discovery log page when a discovery controller reconnected
  2022-09-07 14:24 [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected Sagi Grimberg
                   ` (2 preceding siblings ...)
  2022-09-07 14:24 ` [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter Sagi Grimberg
@ 2022-09-07 14:24 ` Sagi Grimberg
  2022-09-10 15:47 ` [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected James Smart
  4 siblings, 0 replies; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-07 14:24 UTC (permalink / raw)
  To: linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

When using persistent discovery controllers, if the discovery
controller loses connectivity and manage to reconnect after a while,
we need to retrieve again the discovery log page in order to learn
about possible changes that may have occurred during this time as
discovery log change events were lost.

So we pass notify_rediscover for persistent discovery controllers (if
the kernel supports this argument). Upon reception of a udev
EVENT=rediscover we can kickstart discovery on the existing discovery
controller device node that generated the event.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 fabrics.c                                         | 15 ++++++++++++++-
 .../udev-rules/70-nvmf-autoconnect.rules.in       |  7 +++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/fabrics.c b/fabrics.c
index 43ca5f422d3d..0cb5d9da6cbf 100644
--- a/fabrics.c
+++ b/fabrics.c
@@ -144,6 +144,7 @@ static nvme_ctrl_t __create_discover_ctrl(nvme_root_t r, nvme_host_t h,
 {
 	nvme_ctrl_t c;
 	int tmo, ret;
+	bool supported;
 
 	c = nvme_create_ctrl(r, trcfg->subsysnqn, trcfg->transport,
 			     trcfg->traddr, trcfg->host_traddr,
@@ -154,6 +155,14 @@ static nvme_ctrl_t __create_discover_ctrl(nvme_root_t r, nvme_host_t h,
 	nvme_ctrl_set_discovery_ctrl(c, true);
 	tmo = set_discovery_kato(cfg);
 
+	ret = nvmf_check_param_support("notify_rediscover", &supported);
+	if (ret)
+		return NULL;
+	if (supported) {
+		printf("cfg->notify_rediscover %d\n", cfg->notify_rediscover);
+		cfg->notify_rediscover = persistent;
+	}
+
 	errno = 0;
 	ret = nvmf_add_ctrl(h, c, cfg);
 
@@ -390,8 +399,12 @@ static int __discover(nvme_ctrl_t c, struct nvme_fabrics_config *defcfg,
 			if (strcmp(nvme_ctrl_get_transport(c), nvmf_trtype_str(e->trtype)))
 				continue;
 
-			if (e->subtype == NVME_NQN_DISC)
+			if (e->subtype == NVME_NQN_DISC) {
 				set_discovery_kato(defcfg);
+				defcfg->notify_rediscover = true;
+			} else {
+				defcfg->notify_rediscover = false;
+			}
 
 			errno = 0;
 			child = nvmf_connect_disc_entry(h, e, defcfg,
diff --git a/nvmf-autoconnect/udev-rules/70-nvmf-autoconnect.rules.in b/nvmf-autoconnect/udev-rules/70-nvmf-autoconnect.rules.in
index 434cc080ffe3..93e438863672 100644
--- a/nvmf-autoconnect/udev-rules/70-nvmf-autoconnect.rules.in
+++ b/nvmf-autoconnect/udev-rules/70-nvmf-autoconnect.rules.in
@@ -17,3 +17,10 @@ ACTION=="change", SUBSYSTEM=="nvme", ENV{NVME_AEN}=="0x70f002",\
 ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
   ENV{NVMEFC_HOST_TRADDR}=="*",  ENV{NVMEFC_TRADDR}=="*", \
   RUN+="@SYSTEMCTL@ --no-block start nvmf-connect@--device=none\t--transport=fc\t--traddr=$env{NVMEFC_TRADDR}\t--trsvcid=none\t--host-traddr=$env{NVMEFC_HOST_TRADDR}.service"
+
+# A discovery controller just (re)connected, re-read the discovery log change to
+# check if there were any changes since it was last connected.
+ACTION=="change", SUBSYSTEM=="nvme", ENV{NVME_EVENT}=="rediscover", ATTR{cntrltype}=="discovery", \
+  ENV{NVME_TRTYPE}=="*", ENV{NVME_TRADDR}=="*", \
+  ENV{NVME_TRSVCID}=="*", ENV{NVME_HOST_TRADDR}=="*", \
+  RUN+="@SYSTEMCTL@ --no-block start nvmf-connect@--device=$kernel\t--transport=$env{NVME_TRTYPE}\t--traddr=$env{NVME_TRADDR}\t--trsvcid=$env{NVME_TRSVCID}\t--host-traddr=$env{NVME_HOST_TRADDR}.service"
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection
  2022-09-07 14:24 ` [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection Sagi Grimberg
@ 2022-09-09  6:56   ` Daniel Wagner
  2022-09-12 12:32     ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2022-09-09  6:56 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On Wed, Sep 07, 2022 at 05:24:09PM +0300, Sagi Grimberg wrote:
> --- a/src/libnvme.map
> +++ b/src/libnvme.map
> @@ -352,6 +352,7 @@ LIBNVME_1_0 {
>  		nvmf_treq_str;
>  		nvmf_trtype_str;
>  		nvmf_update_config;
> +		nvmf_check_param_support;

New entries go to the 'unreleased' section, this would be LIBNVME_1_2.

> +int nvmf_check_param_support(const char *param, bool *supported)
> +{

I was wondering if t would be better to return all supported features in
one go. Not really sure about it though.

> +	int ret = 0, fd, len;
> +	char buf[0x1000];
> +
> +	fd = open(nvmf_dev, O_RDWR);
> +	if (fd < 0) {
> +		nvme_msg(NULL, LOG_ERR, "Failed to open %s: %s\n",
> +			 nvmf_dev, strerror(errno));

'nvme_msg(NULL,' will always write to the stderr which is kind of bad
for libraries. I think it would okay just to return the error codes in
this function and have the caller do the logging part.

> +		return -ENVME_CONNECT_OPEN;
> +	}
> +
> +	memset(buf, 0x0, sizeof(buf));
> +	len = read(fd, buf, sizeof(buf) - 1);
> +	if (len < 0) {
> +		nvme_msg(NULL, LOG_ERR, "Failed to read from %s: %s\n",
> +			 nvmf_dev, strerror(errno));
> +		ret = -ENVME_CONNECT_READ;
> +		goto out_close;
> +	}
> +	nvme_msg(NULL, LOG_DEBUG, "supported opts: '%.*s'\n",
> (int)strcspn(buf, "\n"), buf);

This triggered my question above why we don't return all supported
features up and have the caller added something like this?

Thanks for doing this! This feature is rotting on my TODO list for a
while...


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter
  2022-09-07 14:24 ` [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter Sagi Grimberg
@ 2022-09-09  6:58   ` Daniel Wagner
  2022-09-12 12:33     ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2022-09-09  6:58 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On Wed, Sep 07, 2022 at 05:24:10PM +0300, Sagi Grimberg wrote:
> @@ -481,6 +482,8 @@ static int build_options(nvme_host_t h, nvme_ctrl_t c, char **argstr)
>  	    (hostid && add_argument(argstr, "hostid", hostid)) ||
>  	    (discover && !discovery_nqn &&
>  	     add_bool_argument(argstr, "discovery", true)) ||
> +	     add_bool_argument(argstr, "notify_rediscover",
> +			       cfg->notify_rediscover) ||

If notify_rediscover is not supported by the kernel, wouldn't this end
up with an EINVAL when we try to connect? I thought we need to make this
conditional.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-07 14:24 [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected Sagi Grimberg
                   ` (3 preceding siblings ...)
  2022-09-07 14:24 ` [PATCH nvme-cli 4/1] fabrics: re-read the discovery log page when a discovery controller reconnected Sagi Grimberg
@ 2022-09-10 15:47 ` James Smart
  2022-09-12 12:39   ` Sagi Grimberg
  4 siblings, 1 reply; 19+ messages in thread
From: James Smart @ 2022-09-10 15:47 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On 9/7/2022 7:24 AM, Sagi Grimberg wrote:
> When a discovery controller is disconnected, no AENs will
> arrive to notify the host about discovery log-page changes.
> 
> This attempts to fix it with a new connection parameter asking
> the kernel to send a dedicated udev event for this case. Prior
> attempt tried to use "connected" event already sent by the kernel
> however this also applied on the first connected, causing undesired
> side-effects when issuing a simple 'discover' command.
> 
> The patchset includes the nvme-cli/libnvme counter-parts as well.

Sagi,

Do we really need another parameter/option ?  seems awkward to need 
userspace to figure it out.

The transport certainly knows the difference between 1st time connect 
and nth time re-connect - perhaps this should be a simple flag on the 
nvme ctrl struct set by the transport ?  Simple thing for rdma/tcp to 
set flag at end of successful xxx_create_ctrl(), while FC, given it does 
reconnects w/o a first successful connect needs a little more logic.

-- james




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection
  2022-09-09  6:56   ` Daniel Wagner
@ 2022-09-12 12:32     ` Sagi Grimberg
  2022-09-12 14:19       ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-12 12:32 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke


>> --- a/src/libnvme.map
>> +++ b/src/libnvme.map
>> @@ -352,6 +352,7 @@ LIBNVME_1_0 {
>>   		nvmf_treq_str;
>>   		nvmf_trtype_str;
>>   		nvmf_update_config;
>> +		nvmf_check_param_support;
> 
> New entries go to the 'unreleased' section, this would be LIBNVME_1_2.

OK.

>> +int nvmf_check_param_support(const char *param, bool *supported)
>> +{
> 
> I was wondering if t would be better to return all supported features in
> one go. Not really sure about it though.

Why? it is already held in the kernel, why hold it again?

>> +	int ret = 0, fd, len;
>> +	char buf[0x1000];
>> +
>> +	fd = open(nvmf_dev, O_RDWR);
>> +	if (fd < 0) {
>> +		nvme_msg(NULL, LOG_ERR, "Failed to open %s: %s\n",
>> +			 nvmf_dev, strerror(errno));
> 
> 'nvme_msg(NULL,' will always write to the stderr which is kind of bad
> for libraries. I think it would okay just to return the error codes in
> this function and have the caller do the logging part.

OK

> 
>> +		return -ENVME_CONNECT_OPEN;
>> +	}
>> +
>> +	memset(buf, 0x0, sizeof(buf));
>> +	len = read(fd, buf, sizeof(buf) - 1);
>> +	if (len < 0) {
>> +		nvme_msg(NULL, LOG_ERR, "Failed to read from %s: %s\n",
>> +			 nvmf_dev, strerror(errno));
>> +		ret = -ENVME_CONNECT_READ;
>> +		goto out_close;
>> +	}
>> +	nvme_msg(NULL, LOG_DEBUG, "supported opts: '%.*s'\n",
>> (int)strcspn(buf, "\n"), buf);
> 
> This triggered my question above why we don't return all supported
> features up and have the caller added something like this?

Why? the caller doesn't need the entire array, it is just looking
to check if a param is supported. We can move this out if someone
actually needs the entire string or set of caps...


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter
  2022-09-09  6:58   ` Daniel Wagner
@ 2022-09-12 12:33     ` Sagi Grimberg
  2022-09-12 14:42       ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-12 12:33 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke


>> @@ -481,6 +482,8 @@ static int build_options(nvme_host_t h, nvme_ctrl_t c, char **argstr)
>>   	    (hostid && add_argument(argstr, "hostid", hostid)) ||
>>   	    (discover && !discovery_nqn &&
>>   	     add_bool_argument(argstr, "discovery", true)) ||
>> +	     add_bool_argument(argstr, "notify_rediscover",
>> +			       cfg->notify_rediscover) ||
> 
> If notify_rediscover is not supported by the kernel, wouldn't this end
> up with an EINVAL when we try to connect? I thought we need to make this
> conditional.

The check is done by the callers, which check support for it. If it
is not supported and still passed, its a bug in the caller.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-10 15:47 ` [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected James Smart
@ 2022-09-12 12:39   ` Sagi Grimberg
  2022-09-13  0:06     ` James Smart
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-12 12:39 UTC (permalink / raw)
  To: James Smart, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

>> When a discovery controller is disconnected, no AENs will
>> arrive to notify the host about discovery log-page changes.
>>
>> This attempts to fix it with a new connection parameter asking
>> the kernel to send a dedicated udev event for this case. Prior
>> attempt tried to use "connected" event already sent by the kernel
>> however this also applied on the first connected, causing undesired
>> side-effects when issuing a simple 'discover' command.
>>
>> The patchset includes the nvme-cli/libnvme counter-parts as well.
> 
> Sagi,
> 
> Do we really need another parameter/option ?  seems awkward to need 
> userspace to figure it out.

It has a rather specific use though..

I was looking for something more fine-grained because I initially
used the generic "connected" uevent which had some negative side
effects, so I decided against adding yet another generic "reconnected"
event that nvme-cli would use, and may/may-not fit other use-cases.

I thought its better to have something that is as specific as possible
like "send me an event only when a persistent discovery-controller
reconnects" and I also used the name to indicate the semantic of
the event, so in the future, a possible consumer of this event will
know exactly what it is used for before opting to use it.

> The transport certainly knows the difference between 1st time connect 
> and nth time re-connect - perhaps this should be a simple flag on the 
> nvme ctrl struct set by the transport ?  Simple thing for rdma/tcp to 
> set flag at end of successful xxx_create_ctrl(), while FC, given it does 
> reconnects w/o a first successful connect needs a little more logic.

I used ctrl->nr_reconnects to decide it, which introduced the constraint
that transports would clear it _after_ calling nvme_start_ctrl(). Open
to better suggestions though...


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection
  2022-09-12 12:32     ` Sagi Grimberg
@ 2022-09-12 14:19       ` Daniel Wagner
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-09-12 14:19 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On Mon, Sep 12, 2022 at 03:32:01PM +0300, Sagi Grimberg wrote:
> > > +int nvmf_check_param_support(const char *param, bool *supported)
> > > +{
> > 
> > I was wondering if t would be better to return all supported features in
> > one go. Not really sure about it though.
> 
> Why? it is already held in the kernel, why hold it again?

True, I am probably making it more complex than needed.

> > This triggered my question above why we don't return all supported
> > features up and have the caller added something like this?
> 
> Why? the caller doesn't need the entire array, it is just looking
> to check if a param is supported. We can move this out if someone
> actually needs the entire string or set of caps...

Right, let's address this when there is actually a user for it.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter
  2022-09-12 12:33     ` Sagi Grimberg
@ 2022-09-12 14:42       ` Daniel Wagner
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Wagner @ 2022-09-12 14:42 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On Mon, Sep 12, 2022 at 03:33:15PM +0300, Sagi Grimberg wrote:
> 
> > > @@ -481,6 +482,8 @@ static int build_options(nvme_host_t h, nvme_ctrl_t c, char **argstr)
> > >   	    (hostid && add_argument(argstr, "hostid", hostid)) ||
> > >   	    (discover && !discovery_nqn &&
> > >   	     add_bool_argument(argstr, "discovery", true)) ||
> > > +	     add_bool_argument(argstr, "notify_rediscover",
> > > +			       cfg->notify_rediscover) ||
> > 
> > If notify_rediscover is not supported by the kernel, wouldn't this end
> > up with an EINVAL when we try to connect? I thought we need to make this
> > conditional.
> 
> The check is done by the callers, which check support for it. If it
> is not supported and still passed, its a bug in the caller.

Okay.

Another thing I was wondering if we might run into a problem with
dependencies on the library via the kernel feature support. But I think
we should be safe and if not, another reason for a 2.x release :)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-12 12:39   ` Sagi Grimberg
@ 2022-09-13  0:06     ` James Smart
  2022-09-14 10:29       ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: James Smart @ 2022-09-13  0:06 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On 9/12/2022 5:39 AM, Sagi Grimberg wrote:
>>> When a discovery controller is disconnected, no AENs will
>>> arrive to notify the host about discovery log-page changes.
>>>
>>> This attempts to fix it with a new connection parameter asking
>>> the kernel to send a dedicated udev event for this case. Prior
>>> attempt tried to use "connected" event already sent by the kernel
>>> however this also applied on the first connected, causing undesired
>>> side-effects when issuing a simple 'discover' command.
>>>
>>> The patchset includes the nvme-cli/libnvme counter-parts as well.
>>
>> Sagi,
>>
>> Do we really need another parameter/option ?  seems awkward to need 
>> userspace to figure it out.
> 
> It has a rather specific use though..
> 
> I was looking for something more fine-grained because I initially
> used the generic "connected" uevent which had some negative side
> effects, so I decided against adding yet another generic "reconnected"
> event that nvme-cli would use, and may/may-not fit other use-cases.
> 
> I thought its better to have something that is as specific as possible
> like "send me an event only when a persistent discovery-controller
> reconnects" and I also used the name to indicate the semantic of
> the event, so in the future, a possible consumer of this event will
> know exactly what it is used for before opting to use it.
> 
>> The transport certainly knows the difference between 1st time connect 
>> and nth time re-connect - perhaps this should be a simple flag on the 
>> nvme ctrl struct set by the transport ?  Simple thing for rdma/tcp to 
>> set flag at end of successful xxx_create_ctrl(), while FC, given it 
>> does reconnects w/o a first successful connect needs a little more logic.
> 
> I used ctrl->nr_reconnects to decide it, which introduced the constraint
> that transports would clear it _after_ calling nvme_start_ctrl(). Open
> to better suggestions though...

I was thinking this rather than the new option... The bit would only be 
set *after* the first successful link-side connect, thus the initial 
nvme_start_ctrl would not send the reconnect event. Every reconnect 
thereafter would.

-- james


diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2429b11eb9a8..43c8b6590164 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4814,6 +4814,10 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)

  	nvme_enable_aen(ctrl);

+	if (nvme_discovery_ctrl(ctrl) &&
+	    test_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->flags))
+		nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
+
  	if (ctrl->queue_count > 1) {
  		nvme_queue_scan(ctrl);
  		nvme_start_queues(ctrl);
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 127abaf9ba5d..ff1dd8f999b0 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2947,6 +2947,7 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
  		goto out_delete_hw_queues;

  	ctrl->ioq_live = true;
+	set_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->ctrl.flags);

  	return 0;

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bdc0ff7ed9ab..9b4260d60516 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -356,6 +356,7 @@ struct nvme_ctrl {
  	unsigned long flags;
  #define NVME_CTRL_FAILFAST_EXPIRED	0
  #define NVME_CTRL_ADMIN_Q_STOPPED	1
+#define NVME_CTRL_FABRIC_CONNECTED	2
  	struct nvmf_ctrl_options *opts;





^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-13  0:06     ` James Smart
@ 2022-09-14 10:29       ` Sagi Grimberg
  2022-09-14 15:00         ` James Smart
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-14 10:29 UTC (permalink / raw)
  To: James Smart, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke


> I was thinking this rather than the new option... The bit would only be 
> set *after* the first successful link-side connect, thus the initial 
> nvme_start_ctrl would not send the reconnect event. Every reconnect 
> thereafter would.
> 
> -- james
> 
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 2429b11eb9a8..43c8b6590164 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4814,6 +4814,10 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
> 
>       nvme_enable_aen(ctrl);
> 
> +    if (nvme_discovery_ctrl(ctrl) &&
> +        test_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->flags))
> +        nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
> +
>       if (ctrl->queue_count > 1) {
>           nvme_queue_scan(ctrl);
>           nvme_start_queues(ctrl);
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 127abaf9ba5d..ff1dd8f999b0 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2947,6 +2947,7 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
>           goto out_delete_hw_queues;
> 
>       ctrl->ioq_live = true;
> +    set_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->ctrl.flags);

Isn't this set before calling nvme_start_ctrl()?

But this is orthogonal to the user argument, is there a specific
reason to why not add an explicit argument requesting that?
Because every userspace that will setup a persistent discovery
controller will need this event?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-14 10:29       ` Sagi Grimberg
@ 2022-09-14 15:00         ` James Smart
  2022-09-14 15:31           ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: James Smart @ 2022-09-14 15:00 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On 9/14/2022 3:29 AM, Sagi Grimberg wrote:
> 
>> I was thinking this rather than the new option... The bit would only 
>> be set *after* the first successful link-side connect, thus the 
>> initial nvme_start_ctrl would not send the reconnect event. Every 
>> reconnect thereafter would.
>>
>> -- james
>>
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 2429b11eb9a8..43c8b6590164 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4814,6 +4814,10 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
>>
>>       nvme_enable_aen(ctrl);
>>
>> +    if (nvme_discovery_ctrl(ctrl) &&
>> +        test_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->flags))
>> +        nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
>> +
>>       if (ctrl->queue_count > 1) {
>>           nvme_queue_scan(ctrl);
>>           nvme_start_queues(ctrl);
>> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
>> index 127abaf9ba5d..ff1dd8f999b0 100644
>> --- a/drivers/nvme/host/fc.c
>> +++ b/drivers/nvme/host/fc.c
>> @@ -2947,6 +2947,7 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
>>           goto out_delete_hw_queues;
>>
>>       ctrl->ioq_live = true;
>> +    set_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->ctrl.flags);
> 
> Isn't this set before calling nvme_start_ctrl()?

frick - sent the wrong tweak for fc:

this should be:
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 127abaf9ba5d..630d8e50f29c 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3201,6 +3201,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
         if (changed)
                 nvme_start_ctrl(&ctrl->ctrl);

+       set_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->ctrl.flags);
+
         return 0;       /* Success */

  out_term_aen_ops:


So in all cases, it's now set immediately after calling nvme_start_ctrl. 
Won't be set on 1st call but will be all on all reamaing.


> 
> But this is orthogonal to the user argument, is there a specific
> reason to why not add an explicit argument requesting that?
> Because every userspace that will setup a persistent discovery
> controller will need this event?

Well my aversion is to remove relationships between user-space and 
kernel.  It seemed very odd to go to user-space to then set a custom 
option and this handshaking looked like a lot of work vs just generating 
a reconnect event on a discovery controller.

In my mind think we should be doing one of the following:
a) generate connect and reconnect events generically. If they are 
handled, great, if not that's fine. Handler is free to filter by device 
attributes as to what it does

I assume there's some issue here that you saw with the original attempt 
as reconnects may post while connects or reconnects are still being handled.

b) go down the path of the rediscover option - but the kernel routine 
should never require it to be a discovery controller. If the option is 
set, send the rediscover event. Removes events that won't be handled.
The filtering in the lib to only set the option on devices that the lib 
cares about can continue.

-- james


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-14 15:00         ` James Smart
@ 2022-09-14 15:31           ` Sagi Grimberg
  2022-09-14 17:26             ` James Smart
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-14 15:31 UTC (permalink / raw)
  To: James Smart, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke



On 9/14/22 18:00, James Smart wrote:
> On 9/14/2022 3:29 AM, Sagi Grimberg wrote:
>>
>>> I was thinking this rather than the new option... The bit would only 
>>> be set *after* the first successful link-side connect, thus the 
>>> initial nvme_start_ctrl would not send the reconnect event. Every 
>>> reconnect thereafter would.
>>>
>>> -- james
>>>
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 2429b11eb9a8..43c8b6590164 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -4814,6 +4814,10 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
>>>
>>>       nvme_enable_aen(ctrl);
>>>
>>> +    if (nvme_discovery_ctrl(ctrl) &&
>>> +        test_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->flags))
>>> +        nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
>>> +
>>>       if (ctrl->queue_count > 1) {
>>>           nvme_queue_scan(ctrl);
>>>           nvme_start_queues(ctrl);
>>> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
>>> index 127abaf9ba5d..ff1dd8f999b0 100644
>>> --- a/drivers/nvme/host/fc.c
>>> +++ b/drivers/nvme/host/fc.c
>>> @@ -2947,6 +2947,7 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl 
>>> *ctrl)
>>>           goto out_delete_hw_queues;
>>>
>>>       ctrl->ioq_live = true;
>>> +    set_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->ctrl.flags);
>>
>> Isn't this set before calling nvme_start_ctrl()?
> 
> frick - sent the wrong tweak for fc:
> 
> this should be:
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 127abaf9ba5d..630d8e50f29c 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3201,6 +3201,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
>          if (changed)
>                  nvme_start_ctrl(&ctrl->ctrl);
> 
> +       set_bit(NVME_CTRL_FABRIC_CONNECTED, &ctrl->ctrl.flags);
> +
>          return 0;       /* Success */
> 
>   out_term_aen_ops:
> 
> 
> So in all cases, it's now set immediately after calling nvme_start_ctrl. 
> Won't be set on 1st call but will be all on all reamaing.

I would rename it as NVME_CTRL_STARTED_ONCE or something. I feel that
every name we choose would be kinda ugly, which is why I opted to
relying on nr_reconnects that the transports maintain.

But again, all of this is orthogonal to the connect parameter question.

>> But this is orthogonal to the user argument, is there a specific
>> reason to why not add an explicit argument requesting that?
>> Because every userspace that will setup a persistent discovery
>> controller will need this event?
> 
> Well my aversion is to remove relationships between user-space and 
> kernel.  It seemed very odd to go to user-space to then set a custom 
> option and this handshaking looked like a lot of work vs just generating 
> a reconnect event on a discovery controller.
> 
> In my mind think we should be doing one of the following:
> a) generate connect and reconnect events generically. If they are 
> handled, great, if not that's fine. Handler is free to filter by device 
> attributes as to what it does

I can understand this approach. A persistent discovery controller that
reconnects, needs to send an indication to userspace universally because
someone always needs to rediscover as a result.

What do others think? Martin?

> I assume there's some issue here that you saw with the original attempt 
> as reconnects may post while connects or reconnects are still being 
> handled.
> 
> b) go down the path of the rediscover option - but the kernel routine 
> should never require it to be a discovery controller. If the option is 
> set, send the rediscover event. Removes events that won't be handled.
> The filtering in the lib to only set the option on devices that the lib 
> cares about can continue.

The fact that the event is called rediscover semantically means that
this is the discovery controller. It would be weird coming on any other
controller, even if userspace happened to ask for it. If at all, I'd
prefer to fail the parameter for controllers that are not discovery
controller, which is strict, but at least not resulting in a weird
event...


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-14 15:31           ` Sagi Grimberg
@ 2022-09-14 17:26             ` James Smart
  2022-09-18 14:12               ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: James Smart @ 2022-09-14 17:26 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke

On 9/14/2022 8:31 AM, Sagi Grimberg wrote:
> The fact that the event is called rediscover semantically means that
> this is the discovery controller. It would be weird coming on any other
> controller, even if userspace happened to ask for it. If at all, I'd
> prefer to fail the parameter for controllers that are not discovery
> controller, which is strict, but at least not resulting in a weird
> event...

yeah... I guess I fell into is "rediscover" because the controller was 
"rediscovered" on the link ?   vs "please rediscover on a discovery 
controller"

either way - based on other people's input is fine with me.

-- james


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected
  2022-09-14 17:26             ` James Smart
@ 2022-09-18 14:12               ` Sagi Grimberg
  0 siblings, 0 replies; 19+ messages in thread
From: Sagi Grimberg @ 2022-09-18 14:12 UTC (permalink / raw)
  To: James Smart, linux-nvme
  Cc: Christoph Hellwig, Keith Busch, Chaitanya Kulkarni,
	Martin Belanger, Hannes Reinecke


>> The fact that the event is called rediscover semantically means that
>> this is the discovery controller. It would be weird coming on any other
>> controller, even if userspace happened to ask for it. If at all, I'd
>> prefer to fail the parameter for controllers that are not discovery
>> controller, which is strict, but at least not resulting in a weird
>> event...
> 
> yeah... I guess I fell into is "rediscover" because the controller was 
> "rediscovered" on the link ?   vs "please rediscover on a discovery 
> controller"
> 
> either way - based on other people's input is fine with me.

Didn't hear from anyone, guess it doesn't really matter.
I'll lose the user parameter, make the event fire for
any discovery controller that reconnects.


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-09-18 14:13 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07 14:24 [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected Sagi Grimberg
2022-09-07 14:24 ` [PATCH rfc 1/1] fabrics: support notify_rediscover connect parameter Sagi Grimberg
2022-09-07 14:24 ` [PATCH libnvme 2/1] nvme: Add generic connect parameter support detection Sagi Grimberg
2022-09-09  6:56   ` Daniel Wagner
2022-09-12 12:32     ` Sagi Grimberg
2022-09-12 14:19       ` Daniel Wagner
2022-09-07 14:24 ` [PATCH libnvme 3/1] fabrics: add notify_rediscover parameter Sagi Grimberg
2022-09-09  6:58   ` Daniel Wagner
2022-09-12 12:33     ` Sagi Grimberg
2022-09-12 14:42       ` Daniel Wagner
2022-09-07 14:24 ` [PATCH nvme-cli 4/1] fabrics: re-read the discovery log page when a discovery controller reconnected Sagi Grimberg
2022-09-10 15:47 ` [PATCH rfc 0/1] Fix missing AENs when discovery controllers are disconnected James Smart
2022-09-12 12:39   ` Sagi Grimberg
2022-09-13  0:06     ` James Smart
2022-09-14 10:29       ` Sagi Grimberg
2022-09-14 15:00         ` James Smart
2022-09-14 15:31           ` Sagi Grimberg
2022-09-14 17:26             ` James Smart
2022-09-18 14:12               ` Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.