* [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery
@ 2018-05-08  0:12 James Smart
  2018-05-08  0:12 ` [PATCH 1/7] nvme: remove unnecessary controller subnqn validation James Smart
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


This patch set modifies the fc transport so that create_ctrl only
creates the OS controller object and no longer connects to the device
on the wire inline within the create_ctrl call. Instead, create_ctrl
immediately schedules the background reconnect thread to perform the
initial association connect. The patchset also contains several other
cleanups found while testing the asynchronous connect.

There are two main reasons why asynchronous controller create is done:
1) It simplifies error handling and retries. The initial controller
   connect attempts can be disrupted by connectivity loss or errors
   just as easily as connections made after the controller is initially
   created. As the code currently stands, special retry logic and
   prohibitions around state changes would be needed to handle errors
   during the initial connect (logic the code today does not have).
   With this patch set, initial connections use the same path as a
   reconnect, and any error handling uses the same paths as if the
   errors occurred post initial connect.
2) As the create_ctrl() call is now very fast, simplistic udev rules
   can be used for auto-connect rather than involving systemd to work
   around long initial connect times, especially if errors in initial
   connect occur.

However, several hurdles in the common infrastructure had to be
addressed in order to make this work. The initial controller creation
used to expect the controller to be fully connected and live on the
wire before returning to the cli.  This gave plenty of time for the
udev event to be generated and serviced to create the corresponding
/dev file.  The cli now has to be prepared for the possibility that it
accesses the /dev file before the event has been serviced.  There is
also a check in the fabrics layer that validates that the controller's
subnqn matches what the connect request asked for. With this patch
set, FC returns before the initial connect is complete, thus the
controller field being checked is not yet set.  There is no reason it
would ever differ, as the fabric Connect requests are built from the
same options the check uses, so this checking of the nqn should be
removed. Additionally, operations such as connect-all may occur while
there is a connectivity disturbance with the discovery controller,
thus the discovery log read may fail. To avoid giving up and failing
to connect, the cli needs to retry the discovery log reads.

Therefore, this patch set is dependent on the following modifications
that have been made to the cli:
  nvme-cli: Wait for device file if not present after successful add_ctrl
     github repo commit id: bb2d87d7f386
  nvme-cli: Add ioctl retry support for "connect-all"
     github repo commit id: d8f949c21e9d
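
As a rough illustration of the first cli change above (a sketch only;
the function name, device naming, and timing are assumptions, not the
actual nvme-cli code), the cli effectively polls briefly for the
controller's character device in case udev has not yet created it:

#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

static int wait_for_ctrl_dev(int instance, int retries)
{
	char path[32];
	struct stat st;

	snprintf(path, sizeof(path), "/dev/nvme%d", instance);
	while (retries-- > 0) {
		if (stat(path, &st) == 0)
			return 0;	/* node exists, safe to open */
		usleep(100 * 1000);	/* let udev service the event */
	}
	return -1;			/* give up after the retry budget */
}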


As the patchset now allows simplistic udev scripting, it also adds a
change to the transport to regenerate udev discovery events in case
the system missed the earlier events (such as in boot scenarios).



James Smart (7):
  nvme: remove unnecessary controller subnqn validation
  nvme_fc: remove setting DNR on exception conditions
  nvme_fc: retry failures to set io queue count
  nvme_fc: remove reinit_request routine
  nvme_fc: change controllers first connect to use reconnect path
  nvme_fc: fix nulling of queue data on reconnect
  nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device

 drivers/nvme/host/fabrics.c |  10 --
 drivers/nvme/host/fc.c      | 271 ++++++++++++++++++++++++++------------------
 2 files changed, 163 insertions(+), 118 deletions(-)

-- 
2.13.1


* [PATCH 1/7] nvme: remove unnecessary controller subnqn validation
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  5:57   ` Hannes Reinecke
  2018-05-08  0:12 ` [PATCH 2/7] nvme_fc: remove setting DNR on exception conditions James Smart
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


After creating the nvme controller, nvmf_create_ctrl() validates
the newly created subsysnqn vs the one specified by the options.

In general, this is an unnecessary check as the Connect message
should implicitly ensure this value matches.

With the change to the FC transport to do an asynchronous connect
for the first association create, the transport will return to
nvmf_create_ctrl() before that first association has been established,
thus the subnqn will not yet be set.

Remove the unnecessary validation.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fabrics.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 7ae732a77fe8..6fdf499d16b4 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -983,16 +983,6 @@ nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
 		goto out_module_put;
 	}
 
-	if (strcmp(ctrl->subsys->subnqn, opts->subsysnqn)) {
-		dev_warn(ctrl->device,
-			"controller returned incorrect NQN: \"%s\".\n",
-			ctrl->subsys->subnqn);
-		module_put(ops->module);
-		up_read(&nvmf_transports_rwsem);
-		nvme_delete_ctrl_sync(ctrl);
-		return ERR_PTR(-EINVAL);
-	}
-
 	module_put(ops->module);
 	up_read(&nvmf_transports_rwsem);
 	return ctrl;
-- 
2.13.1


* [PATCH 2/7] nvme_fc: remove setting DNR on exception conditions
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
  2018-05-08  0:12 ` [PATCH 1/7] nvme: remove unnecessary controller subnqn validation James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  5:58   ` Hannes Reinecke
  2018-05-08  0:12 ` [PATCH 3/7] nvme_fc: retry failures to set io queue count James Smart
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


Current code will set DNR if the controller is deleting or there is
an error during controller init. None of this is necessary.

Remove the code that sets DNR.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fc.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 6cb26bcf6ec0..7e64fe69c945 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1686,16 +1686,6 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 		goto check_error;
 	}
 
-	/*
-	 * Force failures of commands if we're killing the controller
-	 * or have an error on a command used to create an new association
-	 */
-	if (status &&
-	    (blk_queue_dying(rq->q) ||
-	     ctrl->ctrl.state == NVME_CTRL_NEW ||
-	     ctrl->ctrl.state == NVME_CTRL_CONNECTING))
-		status |= cpu_to_le16(NVME_SC_DNR << 1);
-
 	__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
 	nvme_end_request(rq, status, result);
 
-- 
2.13.1


* [PATCH 3/7] nvme_fc: retry failures to set io queue count
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
  2018-05-08  0:12 ` [PATCH 1/7] nvme: remove unnecessary controller subnqn validation James Smart
  2018-05-08  0:12 ` [PATCH 2/7] nvme_fc: remove setting DNR on exception conditions James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  6:00   ` Hannes Reinecke
  2018-05-08  0:12 ` [PATCH 4/7] nvme_fc: remove reinit_request routine James Smart
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


During the creation of a new controller association, it's possible for
errors and link connectivity issues to cause nvme_set_queue_count() to
have its SET_FEATURES command fail with a positive non-zero code. The
routine doesn't treat this as a hard error; instead it sets the io queue
count to zero and returns success.  The transport then configures zero
io queues, leaving the storage controller inoperable, and the message
"...Could not set queue count..." is logged.

Revise the fc transport to detect when it asked for io queues but got
back a result of 0 io queues. In such a case, fail the re-connection
attempt and fall into the retry loop.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fc.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 7e64fe69c945..69299bda7cb2 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2414,16 +2414,17 @@ static int
 nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
 {
 	struct nvmf_ctrl_options *opts = ctrl->ctrl.opts;
-	unsigned int nr_io_queues;
+	unsigned int nr_io_queues, numq;
 	int ret;
 
 	nr_io_queues = min(min(opts->nr_io_queues, num_online_cpus()),
 				ctrl->lport->ops->max_hw_queues);
+	numq = nr_io_queues;
 	ret = nvme_set_queue_count(&ctrl->ctrl, &nr_io_queues);
-	if (ret) {
+	if (ret || (numq && !nr_io_queues)) {
 		dev_info(ctrl->ctrl.device,
 			"set_queue_count failed: %d\n", ret);
-		return ret;
+		return ret ? ret : -ENOTCONN;
 	}
 
 	ctrl->ctrl.queue_count = nr_io_queues + 1;
@@ -2486,16 +2487,17 @@ static int
 nvme_fc_reinit_io_queues(struct nvme_fc_ctrl *ctrl)
 {
 	struct nvmf_ctrl_options *opts = ctrl->ctrl.opts;
-	unsigned int nr_io_queues;
+	unsigned int nr_io_queues, numq;
 	int ret;
 
 	nr_io_queues = min(min(opts->nr_io_queues, num_online_cpus()),
 				ctrl->lport->ops->max_hw_queues);
+	numq = nr_io_queues;
 	ret = nvme_set_queue_count(&ctrl->ctrl, &nr_io_queues);
-	if (ret) {
+	if (ret || (numq && !nr_io_queues)) {
 		dev_info(ctrl->ctrl.device,
 			"set_queue_count failed: %d\n", ret);
-		return ret;
+		return ret ? ret : -ENOTCONN;
 	}
 
 	ctrl->ctrl.queue_count = nr_io_queues + 1;
-- 
2.13.1


* [PATCH 4/7] nvme_fc: remove reinit_request routine
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
                   ` (2 preceding siblings ...)
  2018-05-08  0:12 ` [PATCH 3/7] nvme_fc: retry failures to set io queue count James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  6:01   ` Hannes Reinecke
  2018-05-08  0:12 ` [PATCH 5/7] nvme_fc: change controllers first connect to use reconnect path James Smart
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


The reinit_request routine is not necessary. Remove support for the
op callback.

As all that nvme_reinit_tagset() does is iterate and call the
reinit routine, it too has no purpose. Remove the call.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fc.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 69299bda7cb2..d6842ed50097 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1470,21 +1470,6 @@ nvme_fc_xmt_disconnect_assoc(struct nvme_fc_ctrl *ctrl)
 
 static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg);
 
-static int
-nvme_fc_reinit_request(void *data, struct request *rq)
-{
-	struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(rq);
-	struct nvme_fc_cmd_iu *cmdiu = &op->cmd_iu;
-
-	memset(cmdiu, 0, sizeof(*cmdiu));
-	cmdiu->scsi_id = NVME_CMD_SCSI_ID;
-	cmdiu->fc_id = NVME_CMD_FC_ID;
-	cmdiu->iu_len = cpu_to_be16(sizeof(*cmdiu) / sizeof(u32));
-	memset(&op->rsp_iu, 0, sizeof(op->rsp_iu));
-
-	return 0;
-}
-
 static void
 __nvme_fc_exit_request(struct nvme_fc_ctrl *ctrl,
 		struct nvme_fc_fcp_op *op)
@@ -2507,10 +2492,6 @@ nvme_fc_reinit_io_queues(struct nvme_fc_ctrl *ctrl)
 
 	nvme_fc_init_io_queues(ctrl);
 
-	ret = nvme_reinit_tagset(&ctrl->ctrl, ctrl->ctrl.tagset);
-	if (ret)
-		goto out_free_io_queues;
-
 	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
 	if (ret)
 		goto out_free_io_queues;
@@ -2922,7 +2903,6 @@ static const struct nvme_ctrl_ops nvme_fc_ctrl_ops = {
 	.submit_async_event	= nvme_fc_submit_async_event,
 	.delete_ctrl		= nvme_fc_delete_ctrl,
 	.get_address		= nvmf_get_address,
-	.reinit_request		= nvme_fc_reinit_request,
 };
 
 static void
-- 
2.13.1


* [PATCH 5/7] nvme_fc: change controllers first connect to use reconnect path
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
                   ` (3 preceding siblings ...)
  2018-05-08  0:12 ` [PATCH 4/7] nvme_fc: remove reinit_request routine James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  6:03   ` Hannes Reinecke
  2018-05-08  0:12 ` [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect James Smart
  2018-05-08  0:12 ` [PATCH 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device James Smart
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


Current code follows the framework that has been in the transports
from the beginning where the initial link-side controller connect occurs
as part of "creating the controller". Thus that first connect fully
talks to the controller and obtains values that can then be used
for blk-mq setup, etc. It also means that everything about the
controller is fully known before the "create controller" call returns.

This has several weaknesses:
- The initial create_ctrl call made by the cli will block for a long
  time as wire transactions are performed synchronously. This delay
  becomes longer if errors occur or connectivity is lost and retries
  need to be performed.
- Code-wise, it means there is a separate connect path for the initial
  controller connect even though it performs the same steps as the
  reconnect path.
- And as there are separate paths, there is separate error
  handling and retry logic. It also plays havoc with the NEW state
  (should transition out of it after successful initial connect) vs
  the RESETTING and CONNECTING (reconnect) states that want to be
  transitioned to on error.
- As there are separate paths, recovering from errors and disruptions
  requires separate recovery/retry paths as well and can severely
  convolute the controller state.
- Given the duration of the initial create_ctrl() call, automation
  of dynamic discovery can't use simplistic udev rules. Instead,
  more convoluted use of systemd is required.

This patch reworks the fc transport to use the same connect paths
for the initial connection as it uses for reconnect. This makes a
single path for error recovery and handling. Given that the initial
connect becomes asynchronous on a background work thread, the
create_ctrl() now returns very quickly, allowing the simplified
auto-connect scripting.

This patch:
- Removes the driving of the initial connect and replaces it with
  a state transition to CONNECTING and initiating the reconnect
  thread. A dummy state transition to RESETTING had to be traversed
  as a direct transition of NEW->CONNECTING is not allowed. Given
  that the controller is "new", the RESETTING transition is a simple
  no-op. Once in the reconnecting thread, the normal behaviors of
  ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will
  apply before the controller is torn down.
- Only if the state transitions cannot be traversed or the
  reconnect thread cannot be scheduled will the controller be torn
  down while still in create_ctrl.
- The prior code used the controller state of NEW to indicate
  whether request queues had been initialized or not. For the admin
  queue, the request queue is always created, so there's no need to
  check a state. For IO queues, change to tracking whether a successful
  io request queue create has occurred (e.g. 1st successful connect).
- The initial controller id is initialized to the dynamic controller
  id used in the initial connect message. It will be overwritten by
  the real controller id once the controller is connected on the wire.

Note: this patch is dependent on these changes that have been
added to the nvme-cli utility:
- nvme-cli: Wait for device file if not present after successful add_ctrl
- nvme-cli: Add ioctl retry support for "connect-all"

With this patch, a simplistic udev rule such as one of the following
options below may now be used to support auto-connectivity:

option A:
   ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
        ENV{NVMEFC_HOST_TRADDR}=="*", ENV{NVMEFC_TRADDR}=="*", \
	RUN+="/usr/sbin/nvme connect-all --transport=fc --host-traddr=$env{NVMEFC_HOST_TRADDR} --traddr=$env{NVMEFC_TRADDR}"

option B:
   SUBSYSTEM!="fc", GOTO="nvme_autoconnect_end"
   ACTION!="change", GOTO="nvme_autoconnect_end"
   ENV{FC_EVENT}!="nvmediscovery", GOTO="nvme_autoconnect_end"
   ENV{NVMEFC_HOST_TRADDR}!="?*", GOTO="nvme_autoconnect_end"
   ENV{NVMEFC_TRADDR}!="?*", GOTO="nvme_autoconnect_end"
   RUN+="/usr/sbin/nvme connect-all -t fc --host-traddr=$env{NVMEFC_HOST_TRADDR} -a $env{NVMEFC_TRADDR}"
   LABEL="nvme_autoconnect_end"

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fc.c | 102 ++++++++++++++++++++++---------------------------
 1 file changed, 45 insertions(+), 57 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index d6842ed50097..7b9fb2e5afee 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -142,6 +142,7 @@ struct nvme_fc_ctrl {
 	struct nvme_fc_rport	*rport;
 	u32			cnum;
 
+	bool			ioq_live;
 	bool			assoc_active;
 	u64			association_id;
 
@@ -2452,6 +2453,8 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
 	if (ret)
 		goto out_delete_hw_queues;
 
+	ctrl->ioq_live = true;
+
 	return 0;
 
 out_delete_hw_queues:
@@ -2601,8 +2604,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 	if (ret)
 		goto out_delete_hw_queue;
 
-	if (ctrl->ctrl.state != NVME_CTRL_NEW)
-		blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
+	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
 
 	ret = nvmf_connect_admin_queue(&ctrl->ctrl);
 	if (ret)
@@ -2675,7 +2677,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 	 */
 
 	if (ctrl->ctrl.queue_count > 1) {
-		if (ctrl->ctrl.state == NVME_CTRL_NEW)
+		if (!ctrl->ioq_live)
 			ret = nvme_fc_create_io_queues(ctrl);
 		else
 			ret = nvme_fc_reinit_io_queues(ctrl);
@@ -2762,8 +2764,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 	 * use blk_mq_tagset_busy_itr() and the transport routine to
 	 * terminate the exchanges.
 	 */
-	if (ctrl->ctrl.state != NVME_CTRL_NEW)
-		blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
 	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 
@@ -2919,7 +2920,7 @@ nvme_fc_connect_ctrl_work(struct work_struct *work)
 		nvme_fc_reconnect_or_delete(ctrl, ret);
 	else
 		dev_info(ctrl->ctrl.device,
-			"NVME-FC{%d}: controller reconnect complete\n",
+			"NVME-FC{%d}: controller connect complete\n",
 			ctrl->cnum);
 }
 
@@ -2967,7 +2968,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 {
 	struct nvme_fc_ctrl *ctrl;
 	unsigned long flags;
-	int ret, idx, retry;
+	int ret, idx;
 
 	if (!(rport->remoteport.port_role &
 	    (FC_PORT_ROLE_NVME_DISCOVERY | FC_PORT_ROLE_NVME_TARGET))) {
@@ -2994,11 +2995,13 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	}
 
 	ctrl->ctrl.opts = opts;
+	ctrl->ctrl.nr_reconnects = 0;
 	INIT_LIST_HEAD(&ctrl->ctrl_list);
 	ctrl->lport = lport;
 	ctrl->rport = rport;
 	ctrl->dev = lport->dev;
 	ctrl->cnum = idx;
+	ctrl->ioq_live = false;
 	ctrl->assoc_active = false;
 	init_waitqueue_head(&ctrl->ioabort_wait);
 
@@ -3017,6 +3020,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	ctrl->ctrl.sqsize = opts->queue_size - 1;
 	ctrl->ctrl.kato = opts->kato;
+	ctrl->ctrl.cntlid = 0xffff;
 
 	ret = -ENOMEM;
 	ctrl->queues = kcalloc(ctrl->ctrl.queue_count,
@@ -3066,68 +3070,52 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
 	spin_unlock_irqrestore(&rport->lock, flags);
 
-	/*
-	 * It's possible that transactions used to create the association
-	 * may fail. Examples: CreateAssociation LS or CreateIOConnection
-	 * LS gets dropped/corrupted/fails; or a frame gets dropped or a
-	 * command times out for one of the actions to init the controller
-	 * (Connect, Get/Set_Property, Set_Features, etc). Many of these
-	 * transport errors (frame drop, LS failure) inherently must kill
-	 * the association. The transport is coded so that any command used
-	 * to create the association (prior to a LIVE state transition
-	 * while NEW or CONNECTING) will fail if it completes in error or
-	 * times out.
-	 *
-	 * As such: as the connect request was mostly likely due to a
-	 * udev event that discovered the remote port, meaning there is
-	 * not an admin or script there to restart if the connect
-	 * request fails, retry the initial connection creation up to
-	 * three times before giving up and declaring failure.
-	 */
-	for (retry = 0; retry < 3; retry++) {
-		ret = nvme_fc_create_association(ctrl);
-		if (!ret)
-			break;
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
+	    !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+		dev_err(ctrl->ctrl.device,
+			"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
+		goto fail_ctrl;
 	}
 
-	if (ret) {
-		nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING);
-		cancel_work_sync(&ctrl->ctrl.reset_work);
-		cancel_delayed_work_sync(&ctrl->connect_work);
+	nvme_get_ctrl(&ctrl->ctrl);
 
-		/* couldn't schedule retry - fail out */
+	if (!queue_delayed_work(nvme_wq, &ctrl->connect_work, 0)) {
+		nvme_put_ctrl(&ctrl->ctrl);
 		dev_err(ctrl->ctrl.device,
-			"NVME-FC{%d}: Connect retry failed\n", ctrl->cnum);
+			"NVME-FC{%d}: failed to schedule initial connect\n",
+			ctrl->cnum);
+		goto fail_ctrl;
+	}
 
-		ctrl->ctrl.opts = NULL;
+	dev_info(ctrl->ctrl.device,
+		"NVME-FC{%d}: new ctrl: NQN \"%s\"\n",
+		ctrl->cnum, ctrl->ctrl.opts->subsysnqn);
 
-		/* initiate nvme ctrl ref counting teardown */
-		nvme_uninit_ctrl(&ctrl->ctrl);
+	return &ctrl->ctrl;
 
-		/* Remove core ctrl ref. */
-		nvme_put_ctrl(&ctrl->ctrl);
+fail_ctrl:
+	nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING);
+	cancel_work_sync(&ctrl->ctrl.reset_work);
+	cancel_delayed_work_sync(&ctrl->connect_work);
 
-		/* as we're past the point where we transition to the ref
-		 * counting teardown path, if we return a bad pointer here,
-		 * the calling routine, thinking it's prior to the
-		 * transition, will do an rport put. Since the teardown
-		 * path also does a rport put, we do an extra get here to
-		 * so proper order/teardown happens.
-		 */
-		nvme_fc_rport_get(rport);
+	ctrl->ctrl.opts = NULL;
 
-		if (ret > 0)
-			ret = -EIO;
-		return ERR_PTR(ret);
-	}
+	/* initiate nvme ctrl ref counting teardown */
+	nvme_uninit_ctrl(&ctrl->ctrl);
 
-	nvme_get_ctrl(&ctrl->ctrl);
+	/* Remove core ctrl ref. */
+	nvme_put_ctrl(&ctrl->ctrl);
 
-	dev_info(ctrl->ctrl.device,
-		"NVME-FC{%d}: new ctrl: NQN \"%s\"\n",
-		ctrl->cnum, ctrl->ctrl.opts->subsysnqn);
+	/* as we're past the point where we transition to the ref
+	 * counting teardown path, if we return a bad pointer here,
+	 * the calling routine, thinking it's prior to the
+	 * transition, will do an rport put. Since the teardown
+	 * path also does a rport put, we do an extra get here to
+	 * so proper order/teardown happens.
+	 */
+	nvme_fc_rport_get(rport);
 
-	return &ctrl->ctrl;
+	return ERR_PTR(-EIO);
 
 out_cleanup_admin_q:
 	blk_cleanup_queue(ctrl->ctrl.admin_q);
-- 
2.13.1


* [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
                   ` (4 preceding siblings ...)
  2018-05-08  0:12 ` [PATCH 5/7] nvme_fc: change controllers first connect to use reconnect path James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  6:05   ` Hannes Reinecke
  2018-05-08  0:12 ` [PATCH 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device James Smart
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


The reconnect path is calling the init routines to clear a queue
structure. But the queue structure has state that perhaps needs
to persist as long as the controller is live.

Remove the nvme_fc_init_queue() calls on reconnect.
The nvme_fc_free_queue() calls will clear state bits and reset
any relevant queue state for a new connection.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fc.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 7b9fb2e5afee..3cab280cf5f7 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1879,6 +1879,7 @@ nvme_fc_free_queue(struct nvme_fc_queue *queue)
 	 */
 
 	queue->connection_id = 0;
+	atomic_set(&queue->csn, 1);
 }
 
 static void
@@ -2472,7 +2473,7 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
 }
 
 static int
-nvme_fc_reinit_io_queues(struct nvme_fc_ctrl *ctrl)
+nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 {
 	struct nvmf_ctrl_options *opts = ctrl->ctrl.opts;
 	unsigned int nr_io_queues, numq;
@@ -2493,8 +2494,6 @@ nvme_fc_reinit_io_queues(struct nvme_fc_ctrl *ctrl)
 	if (ctrl->ctrl.queue_count == 1)
 		return 0;
 
-	nvme_fc_init_io_queues(ctrl);
-
 	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
 	if (ret)
 		goto out_free_io_queues;
@@ -2592,8 +2591,6 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 	 * Create the admin queue
 	 */
 
-	nvme_fc_init_queue(ctrl, 0);
-
 	ret = __nvme_fc_create_hw_queue(ctrl, &ctrl->queues[0], 0,
 				NVME_AQ_DEPTH);
 	if (ret)
@@ -2680,7 +2677,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 		if (!ctrl->ioq_live)
 			ret = nvme_fc_create_io_queues(ctrl);
 		else
-			ret = nvme_fc_reinit_io_queues(ctrl);
+			ret = nvme_fc_recreate_io_queues(ctrl);
 		if (ret)
 			goto out_term_aen_ops;
 	}
@@ -3028,6 +3025,8 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	if (!ctrl->queues)
 		goto out_free_ida;
 
+	nvme_fc_init_queue(ctrl, 0);
+
 	memset(&ctrl->admin_tag_set, 0, sizeof(ctrl->admin_tag_set));
 	ctrl->admin_tag_set.ops = &nvme_fc_admin_mq_ops;
 	ctrl->admin_tag_set.queue_depth = NVME_AQ_MQ_TAG_DEPTH;
-- 
2.13.1


* [PATCH 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device
  2018-05-08  0:12 [PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
                   ` (5 preceding siblings ...)
  2018-05-08  0:12 ` [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect James Smart
@ 2018-05-08  0:12 ` James Smart
  2018-05-08  6:06   ` Hannes Reinecke
  6 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08  0:12 UTC (permalink / raw)


The fc transport device should allow for a rediscovery, as userspace
might have lost the events. An example is udev events not handled during
system startup.

This patch adds a sysfs entry 'nvme_discovery' on the fc class to
have it replay all udev discovery events for all local port/remote
port address pairs.
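
For illustration, a minimal userspace trigger would look roughly like
the following (not part of the patch; the sysfs path is assumed from
the "fc" class and "fc_udev_device" device created by this driver, and
the written value is ignored by the store routine):

#include <fcntl.h>
#include <unistd.h>

int nvme_fc_rediscover(void)
{
	int fd = open("/sys/class/fc/fc_udev_device/nvme_discovery", O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, "1", 1);	/* any write replays the discovery events */
	close(fd);
	return n == 1 ? 0 : -1;
}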

Signed-off-by: James Smart <james.smart at broadcom.com>
CC: Hannes Reinecke <hare at suse.com>
Cc: Ewan Milne <emilne at redhat.com>
Cc: Johannes Thumshirn <jthumshirn at suse.com>

---
This is actually v4 - based on Hannes' v3 of the same patch:
[PATCHv3] nvme/fc: add 'discovery' sysfs attribute to fc transport device
http://lists.infradead.org/pipermail/linux-nvme/2017-December/014334.html

However, refcounting in v3 was still messed up when an rport_get() fails.
Modified the patch to add the unwind routine to correct it.

Also renamed the "discovery" attribute to "nvme_discovery" so that there
will not be collisions if/when we move to a shared fc transport and have
scsi there as well.

Note: this patch alone doesn't allow the simple udev script.  The patches
for the fc transport to move the initial connect to use the asynchronous
reconnect path are what allow the simple udev script. This patch simply
allows connectivity to catch up if the udev events weren't handled.
---
 drivers/nvme/host/fc.c | 114 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 105 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 3cab280cf5f7..14678989e51d 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -122,6 +122,7 @@ struct nvme_fc_rport {
 	struct list_head		endp_list; /* for lport->endp_list */
 	struct list_head		ctrl_list;
 	struct list_head		ls_req_list;
+	struct list_head		disc_list;
 	struct device			*dev;	/* physical device for dma */
 	struct nvme_fc_lport		*lport;
 	spinlock_t			lock;
@@ -210,7 +211,6 @@ static DEFINE_IDA(nvme_fc_ctrl_cnt);
  * These items are short-term. They will eventually be moved into
  * a generic FC class. See comments in module init.
  */
-static struct class *fc_class;
 static struct device *fc_udev_device;
 
 
@@ -507,6 +507,7 @@ nvme_fc_free_rport(struct kref *ref)
 	list_del(&rport->endp_list);
 	spin_unlock_irqrestore(&nvme_fc_lock, flags);
 
+	WARN_ON(!list_empty(&rport->disc_list));
 	ida_simple_remove(&lport->endp_cnt, rport->remoteport.port_num);
 
 	kfree(rport);
@@ -694,6 +695,7 @@ nvme_fc_register_remoteport(struct nvme_fc_local_port *localport,
 	INIT_LIST_HEAD(&newrec->endp_list);
 	INIT_LIST_HEAD(&newrec->ctrl_list);
 	INIT_LIST_HEAD(&newrec->ls_req_list);
+	INIT_LIST_HEAD(&newrec->disc_list);
 	kref_init(&newrec->ref);
 	atomic_set(&newrec->act_ctrl_cnt, 0);
 	spin_lock_init(&newrec->lock);
@@ -3255,6 +3257,100 @@ static struct nvmf_transport_ops nvme_fc_transport = {
 	.create_ctrl	= nvme_fc_create_ctrl,
 };
 
+static void nvme_fc_discovery_unwind(struct list_head *lheadp)
+{
+	unsigned long flags;
+	struct nvme_fc_lport *lport;
+	struct nvme_fc_rport *rport, *tmp_rport;
+
+	list_for_each_entry_safe(rport, tmp_rport,
+				 lheadp, disc_list) {
+		spin_lock_irqsave(&nvme_fc_lock, flags);
+		list_del_init(&rport->disc_list);
+		spin_unlock_irqrestore(&nvme_fc_lock, flags);
+		lport = rport->lport;
+		/* signal discovery. Won't hurt if it repeats */
+		nvme_fc_signal_discovery_scan(lport, rport);
+		nvme_fc_rport_put(rport);
+		nvme_fc_lport_put(lport);
+	}
+}
+
+/* Arbitrary successive failures max. With lots of subsystems could be high */
+#define DISCOVERY_MAX_FAIL	20
+
+static ssize_t nvme_fc_nvme_discovery_store(struct device *dev,
+				       struct device_attribute *attr,
+				       const char *buf, size_t count)
+{
+	unsigned long flags;
+	LIST_HEAD(nvme_fc_disc_list);
+	struct nvme_fc_lport *lport;
+	struct nvme_fc_rport *rport, *tmp_rport;
+	int failcnt = 0;
+
+restart:
+	spin_lock_irqsave(&nvme_fc_lock, flags);
+	list_for_each_entry(lport, &nvme_fc_lport_list, port_list) {
+		list_for_each_entry(rport, &lport->endp_list, endp_list) {
+			if (!nvme_fc_lport_get(lport))
+				continue;
+			if (!nvme_fc_rport_get(rport)) {
+				/*
+				 * This is a temporary condition, so upon
+				 * restart this node will be gone from the
+				 * list.
+				 */
+				spin_unlock_irqrestore(&nvme_fc_lock, flags);
+				nvme_fc_lport_put(lport);
+				nvme_fc_discovery_unwind(&nvme_fc_disc_list);
+				if (failcnt++ < DISCOVERY_MAX_FAIL)
+					goto restart;
+				pr_err("nvme_discovery: too many reference "
+				       "failures\n");
+				return 0;
+			}
+			if (list_empty(&rport->disc_list))
+				list_add_tail(&rport->disc_list,
+					      &nvme_fc_disc_list);
+		}
+	}
+	spin_unlock_irqrestore(&nvme_fc_lock, flags);
+
+	list_for_each_entry_safe(rport, tmp_rport,
+				 &nvme_fc_disc_list, disc_list) {
+		spin_lock_irqsave(&nvme_fc_lock, flags);
+		list_del_init(&rport->disc_list);
+		spin_unlock_irqrestore(&nvme_fc_lock, flags);
+		lport = rport->lport;
+		nvme_fc_signal_discovery_scan(lport, rport);
+		nvme_fc_rport_put(rport);
+		nvme_fc_lport_put(lport);
+	}
+	return count;
+}
+static DEVICE_ATTR(nvme_discovery, 0200, NULL, nvme_fc_nvme_discovery_store);
+
+static struct attribute *nvme_fc_attrs[] = {
+	&dev_attr_nvme_discovery.attr,
+	NULL
+};
+
+static struct attribute_group nvme_fc_attr_group = {
+	.attrs = nvme_fc_attrs,
+};
+
+static const struct attribute_group *nvme_fc_attr_groups[] = {
+	&nvme_fc_attr_group,
+	NULL
+};
+
+static struct class fc_class = {
+	.name = "fc",
+	.dev_groups = nvme_fc_attr_groups,
+	.owner = THIS_MODULE,
+};
+
 static int __init nvme_fc_init_module(void)
 {
 	int ret;
@@ -3273,16 +3369,16 @@ static int __init nvme_fc_init_module(void)
 	 * put in place, this code will move to a more generic
 	 * location for the class.
 	 */
-	fc_class = class_create(THIS_MODULE, "fc");
-	if (IS_ERR(fc_class)) {
+	ret = class_register(&fc_class);
+	if (ret) {
 		pr_err("couldn't register class fc\n");
-		return PTR_ERR(fc_class);
+		return ret;
 	}
 
 	/*
 	 * Create a device for the FC-centric udev events
 	 */
-	fc_udev_device = device_create(fc_class, NULL, MKDEV(0, 0), NULL,
+	fc_udev_device = device_create(&fc_class, NULL, MKDEV(0, 0), NULL,
 				"fc_udev_device");
 	if (IS_ERR(fc_udev_device)) {
 		pr_err("couldn't create fc_udev device!\n");
@@ -3297,9 +3393,9 @@ static int __init nvme_fc_init_module(void)
 	return 0;
 
 out_destroy_device:
-	device_destroy(fc_class, MKDEV(0, 0));
+	device_destroy(&fc_class, MKDEV(0, 0));
 out_destroy_class:
-	class_destroy(fc_class);
+	class_unregister(&fc_class);
 	return ret;
 }
 
@@ -3314,8 +3410,8 @@ static void __exit nvme_fc_exit_module(void)
 	ida_destroy(&nvme_fc_local_port_cnt);
 	ida_destroy(&nvme_fc_ctrl_cnt);
 
-	device_destroy(fc_class, MKDEV(0, 0));
-	class_destroy(fc_class);
+	device_destroy(&fc_class, MKDEV(0, 0));
+	class_unregister(&fc_class);
 }
 
 module_init(nvme_fc_init_module);
-- 
2.13.1


* [PATCH 1/7] nvme: remove unnecessary controller subnqn validation
  2018-05-08  0:12 ` [PATCH 1/7] nvme: remove unnecessary controller subnqn validation James Smart
@ 2018-05-08  5:57   ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  5:57 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:08 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> After creating the nvme controller, nvmf_create_ctrl() validates
> the newly created subsysnqn vs the one specified by the options.
> 
> In general, this is an unnecessary check as the Connect message
> should implicitly ensure this value matches.
> 
> With the change to the FC transport to do an asynchronous connect
> for the first association create, the transport will return to
> nvmf_create_ctrl() before that first association has been established,
> thus the subnqn will not yet be set.
> 
> Remove the unnecessary validation.
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> ---
>  drivers/nvme/host/fabrics.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes


* [PATCH 2/7] nvme_fc: remove setting DNR on exception conditions
  2018-05-08  0:12 ` [PATCH 2/7] nvme_fc: remove setting DNR on exception conditions James Smart
@ 2018-05-08  5:58   ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  5:58 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:09 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> Current code will set DNR if the controller is deleting or there is
> an error during controller init. None of this is necessary.
> 
> remove the code that sets DNR
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> ---
>  drivers/nvme/host/fc.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes


* [PATCH 3/7] nvme_fc: retry failures to set io queue count
  2018-05-08  0:12 ` [PATCH 3/7] nvme_fc: retry failures to set io queue count James Smart
@ 2018-05-08  6:00   ` Hannes Reinecke
  2018-05-11 20:19     ` James Smart
  0 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  6:00 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:10 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> During the creation of a new controller association, it's possible for
> errors and link connectivity issues to cause nvme_set_queue_count() to
> have its SET_FEATURES command fail with a positive non-zero code. The
> routine doesn't treat this as a hard error, instead setting the io
> queue count to zero and returning success.  This has the result of the
> transport setting the io queue count to 0, making the storage
> controller inoperable. The message "...Could not set queue count..."
> is seen.
> 
> Revise the fc transport to detect when it asked for io queues but got
> back a result of 0 io queues. In such a case, fail the re-connection
> attempt and fall into the retry loop.
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> ---
>  drivers/nvme/host/fc.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
The usual problem when having _two_ return values.
Can't we have nvme_set_queue_count() return the number of queues or a
negative number on failure?
Then the check would be much simplified.

Cheers,

Hannes


* [PATCH 4/7] nvme_fc: remove reinit_request routine
  2018-05-08  0:12 ` [PATCH 4/7] nvme_fc: remove reinit_request routine James Smart
@ 2018-05-08  6:01   ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  6:01 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:11 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> The reinit_request routine is not necessary. Remove support for the
> op callback.
> 
> As all that nvme_reinit_tagset() does is itterate and call the
> reinit routine, it too has no purpose. Remove the call.
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> ---
>  drivers/nvme/host/fc.c | 20 --------------------
>  1 file changed, 20 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes


* [PATCH 5/7] nvme_fc: change controllers first connect to use reconnect path
  2018-05-08  0:12 ` [PATCH 5/7] nvme_fc: change controllers first connect to use reconnect path James Smart
@ 2018-05-08  6:03   ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  6:03 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:12 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> Current code follows the framework that has been in the transports
> from the beginning where initial link-side controller connect occurs
> as part of "creating the controller". Thus that first connect fully
> talks to the controller and obtains values that can then be used in
> for blk-mq setup, etc. It also means that everything about the
> controller is fully know before the "create controller" call returns.
> 
> This has several weaknesses:
> - The initial create_ctrl call made by the cli will block for a long
>   time as wire transactions are performed synchronously. This delay
>   becomes longer if errors occur or connectivity is lost and retries
>   need to be performed.
> - Code wise, it means there is a separate connect path for initial
>   controller connect vs the (same) steps used in the reconnect path.
> - And as there's separate paths, it means there's separate error
>   handling and retry logic. It also plays havoc with the NEW state
>   (should transition out of it after successful initial connect) vs
>   the RESETTING and CONNECTING (reconnect) states that want to be
>   transitioned to on error.
> - As there's separate paths, to recover from errors and disruptions,
>   it requires separate recovery/retry paths as well and can severely
>   convolute the controller state.
> - Given the duration of the initial create_ctrl() call, automation
>   of dynamic discovery can't use simplistic udev rules. Instead,
>   more convoluted use of systemd is required.
> 
> This patch reworks the fc transport to use the same connect paths
> for the initial connection as it uses for reconnect. This makes a
> single path for error recovery and handling. Given that the initial
> connect becomes asynchronous on a background work thread, the
> create_ctrl() now returns very quickly, allowing the simplified
> auto-connnect scripting.
> 
> This patch:
> - Removes the driving of the initial connect and replaces it with
>   a state transition to CONNECTING and initiating the reconnect
>   thread. A dummy state transition of RESETTING had to be traversed
>   as a direct transtion of NEW->CONNECTING is not allowed. Given
>   that the controller is "new", the RESETTING transition is a simple
>   no-op. Once in the reconnecting thread, the normal behaviors of
>   ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will
>   apply before the controller is torn down.
> - Only if the state transitions couldn't be traversed and the
>   reconnect thread not scheduled, will the controller be torn down
>   while in create_ctrl.
> - The prior code used the controller state of NEW to indicate
>   whether request queues had been initialized or not. For the admin
>   queue, the request queue is always created, so there's no need to
>   check a state. For IO queues, change to tracking whether a
> successful io request queue create has occurred (e.g. 1st successful
> connect).
> - The initial controller id is initialized to the dynamic controller
>   id used in the initial connect message. It will be overwritten by
>   the real controller id once the controller is connected on the wire.
> 
> Note: this patch is dependent on these changes that have been
> added to the nvme-cli utility:
> - nvme-cli: Wait for device file if not present after successful
> add_ctrl
> - nvme-cli: Add ioctl retry support for "connect-all"
> 
> With this patch, a simplistic udev rule such as one of the following
> options below, may now be used to support auto connectivity:
> 
> option A:
>    ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
>         ENV{NVMEFC_HOST_TRADDR}=="*", ENV{NVMEFC_TRADDR}=="*", \
> 	RUN+="/usr/sbin/nvme connect-all --transport=fc --host-traddr=$env{NVMEFC_HOST_TRADDR} --traddr=$env{NVMEFC_TRADDR}"
> 
> option B:
>    SUBSYSTEM!="fc", GOTO="nvme_autoconnect_end"
>    ACTION!="change", GOTO="nvme_autoconnect_end"
>    ENV{FC_EVENT}!="nvmediscovery", GOTO="nvme_autoconnect_end"
>    ENV{NVMEFC_HOST_TRADDR}!="?*", GOTO="nvme_autoconnect_end"
>    ENV{NVMEFC_TRADDR}!="?*", GOTO="nvme_autoconnect_end"
>    RUN+="/usr/sbin/nvme connect-all -t fc --host-traddr=$env{NVMEFC_HOST_TRADDR} -a $env{NVMEFC_TRADDR}"
>    LABEL="nvme_autoconnect_end"
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> ---
>  drivers/nvme/host/fc.c | 102 ++++++++++++++++++++++---------------------------
>  1 file changed, 45 insertions(+), 57 deletions(-)
> 

Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes


* [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect
  2018-05-08  0:12 ` [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect James Smart
@ 2018-05-08  6:05   ` Hannes Reinecke
  2018-05-08 15:12     ` James Smart
  0 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  6:05 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:13 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> The reconnect path is calling the init routines to clear a queue
> structure. But the queue structure has state that perhaps needs
> to persist as long as the controller is live.
> 
> Remove the nvme_fc_init_queue() calls on reconnect.
> The nvme_fc_free_queue() calls will clear state bits and reset
> any relevant queue state for a new connection.
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> ---
>  drivers/nvme/host/fc.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 7b9fb2e5afee..3cab280cf5f7 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -1879,6 +1879,7 @@ nvme_fc_free_queue(struct nvme_fc_queue *queue)
>  	 */
>  
>  	queue->connection_id = 0;
> +	atomic_set(&queue->csn, 1);
>  }
>  
Huh? What is that doing here?
Doesn't seem to be related to the remaining patchset.
Or am I missing something here?

>  static void
> @@ -2472,7 +2473,7 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
>  }
>  
>  static int
> -nvme_fc_reinit_io_queues(struct nvme_fc_ctrl *ctrl)
> +nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
>  {
>  	struct nvmf_ctrl_options *opts = ctrl->ctrl.opts;
>  	unsigned int nr_io_queues, numq;
> @@ -2493,8 +2494,6 @@ nvme_fc_reinit_io_queues(struct nvme_fc_ctrl *ctrl)
>  	if (ctrl->ctrl.queue_count == 1)
>  		return 0;
>  
> -	nvme_fc_init_io_queues(ctrl);
> -
>  	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
>  	if (ret)
>  		goto out_free_io_queues;
> @@ -2592,8 +2591,6 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
>  	 * Create the admin queue
>  	 */
>  
> -	nvme_fc_init_queue(ctrl, 0);
> -
>  	ret = __nvme_fc_create_hw_queue(ctrl, &ctrl->queues[0], 0,
>  				NVME_AQ_DEPTH);
>  	if (ret)
> @@ -2680,7 +2677,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
>  		if (!ctrl->ioq_live)
>  			ret = nvme_fc_create_io_queues(ctrl);
>  		else
> -			ret = nvme_fc_reinit_io_queues(ctrl);
> +			ret = nvme_fc_recreate_io_queues(ctrl);
>  		if (ret)
>  			goto out_term_aen_ops;
>  	}
> @@ -3028,6 +3025,8 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
>  	if (!ctrl->queues)
>  		goto out_free_ida;
>  
> +	nvme_fc_init_queue(ctrl, 0);
> +
>  	memset(&ctrl->admin_tag_set, 0, sizeof(ctrl->admin_tag_set));
>  	ctrl->admin_tag_set.ops = &nvme_fc_admin_mq_ops;
>  	ctrl->admin_tag_set.queue_depth = NVME_AQ_MQ_TAG_DEPTH;

Cheers,

Hannes


* [PATCH 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device
  2018-05-08  0:12 ` [PATCH 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device James Smart
@ 2018-05-08  6:06   ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08  6:06 UTC (permalink / raw)


On Mon,  7 May 2018 17:12:14 -0700
"James Smart" <jsmart2021@gmail.com> wrote:

> The fc transport device should allow for a rediscovery, as userspace
> might have lost the events. Example is udev events not handled during
> system startup.
> 
> This patch add a sysfs entry 'nvme_discovery' on the fc class to
> have it replay all udev discovery events for all local port/remote
> port address pairs.
> 
> Signed-off-by: James Smart <james.smart at broadcom.com>
> CC: Hannes Reinecke <hare at suse.com>
> Cc: Ewan Milne <emilne at redhat.com>
> Cc: Johannes Thumshirn <jthumshirn at suse.com>
> 
> ---
> This is actually v4 - based on Hannes' v3 of the same patch:
> [PATCHv3] nvme/fc: add 'discovery' sysfs attribute to fc transport
> device
> http://lists.infradead.org/pipermail/linux-nvme/2017-December/014334.html
> 
> However, refcounting in v3 was still messed up when an rport_get()
> fails. Modified the patch to add the unwind routine to correct it.
> 
> Also renamed the "discovery" attribute to "nvme_discovery" so that
> there will not be collisions if/when we move to a shared fc transport
> and have scsi there as well.
> 
> Note: this patch alone doesn't allow the simple udev script.  The
> patches for the fc transport to move initial connect to use the
> asychronous reconnect path is what allows the simple udev script.
> This patch simply allows to catch up with connectivity if the udev
> events weren't handled. ---
>  drivers/nvme/host/fc.c | 114 +++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 105 insertions(+), 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes


* [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect
  2018-05-08  6:05   ` Hannes Reinecke
@ 2018-05-08 15:12     ` James Smart
  2018-05-08 15:28       ` Hannes Reinecke
  0 siblings, 1 reply; 18+ messages in thread
From: James Smart @ 2018-05-08 15:12 UTC (permalink / raw)



On 5/7/2018 11:05 PM, Hannes Reinecke wrote:
> On Mon,  7 May 2018 17:12:13 -0700
> "James Smart" <jsmart2021@gmail.com> wrote:
>
>> @@ -1879,6 +1879,7 @@ nvme_fc_free_queue(struct nvme_fc_queue *queue)
>>   	 */
>>   
>>   	queue->connection_id = 0;
>> +	atomic_set(&queue->csn, 1);
>>   }
>>   
> Huh? What is that doing here?
> Doesn't seem to be related to the remaining patchset.
> Or am I missing something here?

The init_queue calls would normally "init" the structure and set the csn 
back to 1, but they also cleared too much other state. Thus they 
were removed. The "free" routine remains, so this CSN setting is a 
"reinit'ing" of the structure - just like zeroing the connection id.

-- james


* [PATCH 6/7] nvme_fc: fix nulling of queue data on reconnect
  2018-05-08 15:12     ` James Smart
@ 2018-05-08 15:28       ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2018-05-08 15:28 UTC (permalink / raw)


On 05/08/2018 05:12 PM, James Smart wrote:
> 
> On 5/7/2018 11:05 PM, Hannes Reinecke wrote:
>> On Mon,  7 May 2018 17:12:13 -0700
>> "James Smart" <jsmart2021@gmail.com> wrote:
>>
>>> @@ -1879,6 +1879,7 @@ nvme_fc_free_queue(struct nvme_fc_queue *queue)
>>>  	 */
>>>  	queue->connection_id = 0;
>>> +	atomic_set(&queue->csn, 1);
>>>  }
>> Huh? What is that doing here?
>> Doesn't seem to be related to the remaining patchset.
>> Or am I missing something here?
> 
> The init_queue calls would normally "init" the structure and set the csn 
> back to 1, but it also cleared too much other state as well. Thus they 
> were removed. The "free" routine remains, so this CSN setting is a 
> "reinit'ing" of the structure - just like zeroing the connection id.
> 
Thanks for the clarification.

Reviewed-by: Hannes Reinecke <hare at suse.com>

Cheers,

Hannes


* [PATCH 3/7] nvme_fc: retry failures to set io queue count
  2018-05-08  6:00   ` Hannes Reinecke
@ 2018-05-11 20:19     ` James Smart
  0 siblings, 0 replies; 18+ messages in thread
From: James Smart @ 2018-05-11 20:19 UTC (permalink / raw)


On 5/7/2018 11:00 PM, Hannes Reinecke wrote:
> On Mon,  7 May 2018 17:12:10 -0700
> "James Smart" <jsmart2021@gmail.com> wrote:
> 
>> During the creation of a new controller association, it's possible for
>> errors and link connectivity issues to cause nvme_set_queue_count() to
>> have its SET_FEATURES command fail with a positive non-zero code. The
>> routine doesn't treat this as a hard error, instead setting the io
>> queue count to zero and returning success.  This has the result of the
>> transport setting the io queue count to 0, making the storage
>> controller inoperable. The message "...Could not set queue count..."
>> is seen.
>>
>> Revise the fc transport to detect when it asked for io queues but got
>> back a result of 0 io queues. In such a case, fail the re-connection
>> attempt and fall into the retry loop.
>>
>> Signed-off-by: James Smart <james.smart at broadcom.com>
>> ---
>>   drivers/nvme/host/fc.c | 14 ++++++++------
>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>
> The usual problem when having _two_ return values.
> Can't have nvme_set_queue_count() return the number of queues or a
> negative number on failure?
> Then the check would be much simplified.
> 
> Cheers,
> 
> Hannes
> 

The routine that nvme_set_queue_count() uses to perform the SET_FEATURES 
command to set the queue count returns either a negative error (linux 
error code), 0 for success, or a positive error (nvme command status 
value).  In the positive-error case, nvme_set_queue_count() converts the 
error to logging a message, setting the io count to zero, and returning 
success. In the negative-error case, nvme_set_queue_count() returns the 
negative error. On success it sets the io count and returns zero.

In looking through history, it appears the desire to not fault the 
controller connect if the SET_FEATURES command failed with an nvme 
status was to allow the (degraded) controller to come up and at least 
offer the adminq for maintenance.

Obviously, my patch is reverting that "maintenance" path. This is a 
little more complex than desired as it's not clear whether the nvme 
status was from a real device completion or one that was stamped by the 
transport as it recovered from a transport error or connectivity error 
(NVME_SC_ABORT_REQ or NVME_SC_INTERNAL). The former is a case where you 
would want to continue; with the latter you want to fail. It appears the 
other transports stamp only NVME_SC_ABORT_REQ.

I'm going to repost with nvme_set_queue_count() returning failure 
(positive nvme result) if one of those two codes is returned. Any other 
status will keep the existing behavior.
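
In code terms, the current behavior plus the planned tweak is roughly
the following (a paraphrased sketch, not the verbatim core routine):

static int nvme_set_queue_count_sketch(struct nvme_ctrl *ctrl, int *count)
{
	u32 q_count = (*count - 1) | ((*count - 1) << 16);
	u32 result = 0;
	int status;

	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count,
				   NULL, 0, &result);
	if (status < 0) {
		/* negative linux error: the connect attempt fails */
		return status;
	} else if (status > 0) {
		/* planned change: fail on the transport-stamped statuses */
		if (status == NVME_SC_ABORT_REQ ||
		    status == NVME_SC_INTERNAL)
			return status;
		/* otherwise keep the degraded "adminq only" behavior */
		dev_err(ctrl->device,
			"Could not set queue count (%d)\n", status);
		*count = 0;
	} else {
		/* success: report the number of io queues granted */
		int granted = min(result & 0xffff, result >> 16) + 1;

		*count = min(*count, granted);
	}
	return 0;
}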

-- james

