* [PATCH v3 0/3] Fix host side state machine
@ 2018-01-31 16:31 Max Gurtovoy
  2018-01-31 16:31 ` [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING Max Gurtovoy
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Max Gurtovoy @ 2018-01-31 16:31 UTC (permalink / raw)


Hi all,
this series is rebased on top of nvme-4.16.
Note that one part is still missing from that tree (but I tested it on
my own "stable" mixed kernel together with this patch):
"nvme-rdma: fix concurrent reset and reconnect" from Sagi.

The first motivation for this series was fixing an RDMA initiator crash that
occurs when the initial connect fails and error recovery is started during
initial connection establishment.
This patchset also renames NVME_CTRL_RECONNECTING to NVME_CTRL_CONNECTING, as
this state doesn't represent only a reconnection flow but is also used for the
initialization process.
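
To summarize, after all three patches the NVME_CTRL_CONNECTING arm of
nvme_change_ctrl_state() accepts only the following old states (a condensed
sketch of the net result of the diffs below, not a separate patch):

	case NVME_CTRL_CONNECTING:
		switch (old_state) {
		case NVME_CTRL_NEW:		/* initial connect (patch #2) */
		case NVME_CTRL_RESETTING:	/* reset / error recovery */
			changed = true;
			/* FALLTHRU */
		default:
			break;
		}
		break;

The LIVE --> CONNECTING transition is removed by patch #3; every other old
state is rejected.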

These patches were tested on the RDMA transport.
It would be appreciated if someone could run this series on real FC HW (to test
patch #3).

Changes from V2 (James will implement the FC support and resubmit the deleted patch):
 - removed FC support
 - removed the "nvme: delete NVME_CTRL_NEW --> NVME_CTRL_LIVE transition"

Changes from V1:
 - Added FC support

Max Gurtovoy (3):
  nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
  nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition

 drivers/nvme/host/core.c    | 12 ++++++------
 drivers/nvme/host/fabrics.h |  9 +++++----
 drivers/nvme/host/fc.c      | 14 +++++++-------
 drivers/nvme/host/nvme.h    |  2 +-
 drivers/nvme/host/pci.c     |  8 ++++----
 drivers/nvme/host/rdma.c    |  7 +++++--
 6 files changed, 28 insertions(+), 24 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-01-31 16:31 [PATCH v3 0/3] Fix host side state machine Max Gurtovoy
@ 2018-01-31 16:31 ` Max Gurtovoy
  2018-01-31 17:10   ` James Smart
  2018-02-08 15:28   ` Sagi Grimberg
  2018-01-31 16:31 ` [PATCH 2/3] nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process Max Gurtovoy
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: Max Gurtovoy @ 2018-01-31 16:31 UTC (permalink / raw)


In the PCI transport, this state is used to mark the initialization
process. It should be used in the other transports as well.

Signed-off-by: Max Gurtovoy <maxg at mellanox.com>
---
 drivers/nvme/host/core.c    | 10 +++++-----
 drivers/nvme/host/fabrics.h |  9 +++++----
 drivers/nvme/host/fc.c      | 14 +++++++-------
 drivers/nvme/host/nvme.h    |  2 +-
 drivers/nvme/host/pci.c     |  8 ++++----
 drivers/nvme/host/rdma.c    |  4 ++--
 6 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b3af8e9..c1a9706 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -265,7 +265,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 	switch (new_state) {
 	case NVME_CTRL_ADMIN_ONLY:
 		switch (old_state) {
-		case NVME_CTRL_RECONNECTING:
+		case NVME_CTRL_CONNECTING:
 			changed = true;
 			/* FALLTHRU */
 		default:
@@ -276,7 +276,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		switch (old_state) {
 		case NVME_CTRL_NEW:
 		case NVME_CTRL_RESETTING:
-		case NVME_CTRL_RECONNECTING:
+		case NVME_CTRL_CONNECTING:
 			changed = true;
 			/* FALLTHRU */
 		default:
@@ -294,7 +294,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 			break;
 		}
 		break;
-	case NVME_CTRL_RECONNECTING:
+	case NVME_CTRL_CONNECTING:
 		switch (old_state) {
 		case NVME_CTRL_LIVE:
 		case NVME_CTRL_RESETTING:
@@ -309,7 +309,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		case NVME_CTRL_LIVE:
 		case NVME_CTRL_ADMIN_ONLY:
 		case NVME_CTRL_RESETTING:
-		case NVME_CTRL_RECONNECTING:
+		case NVME_CTRL_CONNECTING:
 			changed = true;
 			/* FALLTHRU */
 		default:
@@ -2682,7 +2682,7 @@ static ssize_t nvme_sysfs_show_state(struct device *dev,
 		[NVME_CTRL_LIVE]	= "live",
 		[NVME_CTRL_ADMIN_ONLY]	= "only-admin",
 		[NVME_CTRL_RESETTING]	= "resetting",
-		[NVME_CTRL_RECONNECTING]= "reconnecting",
+		[NVME_CTRL_CONNECTING]	= "connecting",
 		[NVME_CTRL_DELETING]	= "deleting",
 		[NVME_CTRL_DEAD]	= "dead",
 	};
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index 25b19f7..a3145d9 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -171,13 +171,14 @@ static inline blk_status_t nvmf_check_init_req(struct nvme_ctrl *ctrl,
 	    cmd->common.opcode != nvme_fabrics_command ||
 	    cmd->fabrics.fctype != nvme_fabrics_type_connect) {
 		/*
-		 * Reconnecting state means transport disruption, which can take
-		 * a long time and even might fail permanently, fail fast to
-		 * give upper layers a chance to failover.
+		 * Connecting state means transport disruption or initial
+		 * establishment, which can take a long time and even might
+		 * fail permanently, fail fast to give upper layers a chance
+		 * to failover.
 		 * Deleting state means that the ctrl will never accept commands
 		 * again, fail it permanently.
 		 */
-		if (ctrl->state == NVME_CTRL_RECONNECTING ||
+		if (ctrl->state == NVME_CTRL_CONNECTING ||
 		    ctrl->state == NVME_CTRL_DELETING) {
 			nvme_req(rq)->status = NVME_SC_ABORT_REQ;
 			return BLK_STS_IOERR;
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index b76ba46..b679971 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -534,7 +534,7 @@ static void __nvme_fc_delete_hw_queue(struct nvme_fc_ctrl *,
 {
 	switch (ctrl->ctrl.state) {
 	case NVME_CTRL_NEW:
-	case NVME_CTRL_RECONNECTING:
+	case NVME_CTRL_CONNECTING:
 		/*
 		 * As all reconnects were suppressed, schedule a
 		 * connect.
@@ -779,7 +779,7 @@ static void __nvme_fc_delete_hw_queue(struct nvme_fc_ctrl *,
 		}
 		break;
 
-	case NVME_CTRL_RECONNECTING:
+	case NVME_CTRL_CONNECTING:
 		/*
 		 * The association has already been terminated and the
 		 * controller is attempting reconnects.  No need to do anything
@@ -1724,7 +1724,7 @@ enum {
 	if (status &&
 	    (blk_queue_dying(rq->q) ||
 	     ctrl->ctrl.state == NVME_CTRL_NEW ||
-	     ctrl->ctrl.state == NVME_CTRL_RECONNECTING))
+	     ctrl->ctrl.state == NVME_CTRL_CONNECTING))
 		status |= cpu_to_le16(NVME_SC_DNR << 1);
 
 	if (__nvme_fc_fcpop_chk_teardowns(ctrl, op))
@@ -2951,7 +2951,7 @@ static inline blk_status_t nvme_fc_is_ready(struct nvme_fc_queue *queue,
 	unsigned long recon_delay = ctrl->ctrl.opts->reconnect_delay * HZ;
 	bool recon = true;
 
-	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING)
+	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING)
 		return;
 
 	if (portptr->port_state == FC_OBJSTATE_ONLINE)
@@ -2999,10 +2999,10 @@ static inline blk_status_t nvme_fc_is_ready(struct nvme_fc_queue *queue,
 	/* will block will waiting for io to terminate */
 	nvme_fc_delete_association(ctrl);
 
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING)) {
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
 		dev_err(ctrl->ctrl.device,
 			"NVME-FC{%d}: error_recovery: Couldn't change state "
-			"to RECONNECTING\n", ctrl->cnum);
+			"to CONNECTING\n", ctrl->cnum);
 		return;
 	}
 
@@ -3203,7 +3203,7 @@ static inline blk_status_t nvme_fc_is_ready(struct nvme_fc_queue *queue,
 	 * transport errors (frame drop, LS failure) inherently must kill
 	 * the association. The transport is coded so that any command used
 	 * to create the association (prior to a LIVE state transition
-	 * while NEW or RECONNECTING) will fail if it completes in error or
+	 * while NEW or CONNECTING) will fail if it completes in error or
 	 * times out.
 	 *
 	 * As such: as the connect request was mostly likely due to a
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 8e7fc1b..91529c9 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -123,7 +123,7 @@ enum nvme_ctrl_state {
 	NVME_CTRL_LIVE,
 	NVME_CTRL_ADMIN_ONLY,    /* Only admin queue live */
 	NVME_CTRL_RESETTING,
-	NVME_CTRL_RECONNECTING,
+	NVME_CTRL_CONNECTING,
 	NVME_CTRL_DELETING,
 	NVME_CTRL_DEAD,
 };
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 0bc6a9e..8060ee9 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1143,7 +1143,7 @@ static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
 	/* If there is a reset/reinit ongoing, we shouldn't reset again. */
 	switch (dev->ctrl.state) {
 	case NVME_CTRL_RESETTING:
-	case NVME_CTRL_RECONNECTING:
+	case NVME_CTRL_CONNECTING:
 		return false;
 	default:
 		break;
@@ -2290,12 +2290,12 @@ static void nvme_reset_work(struct work_struct *work)
 		nvme_dev_disable(dev, false);
 
 	/*
-	 * Introduce RECONNECTING state from nvme-fc/rdma transports to mark the
+	 * Introduce CONNECTING state from nvme-fc/rdma transports to mark the
 	 * initializing procedure here.
 	 */
-	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RECONNECTING)) {
+	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
 		dev_warn(dev->ctrl.device,
-			"failed to mark controller RECONNECTING\n");
+			"failed to mark controller CONNECTING\n");
 		goto out;
 	}
 
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6c2fdfa..219c6a0 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -887,7 +887,7 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
 static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
 {
 	/* If we are resetting/deleting then do nothing */
-	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING) {
+	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
 		WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
 			ctrl->ctrl.state == NVME_CTRL_LIVE);
 		return;
@@ -978,7 +978,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 
 static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
 {
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING))
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
 		return;
 
 	queue_work(nvme_wq, &ctrl->err_work);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/3] nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
  2018-01-31 16:31 [PATCH v3 0/3] Fix host side state machine Max Gurtovoy
  2018-01-31 16:31 ` [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING Max Gurtovoy
@ 2018-01-31 16:31 ` Max Gurtovoy
  2018-01-31 17:10   ` James Smart
  2018-01-31 16:31 ` [PATCH 3/3] nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition Max Gurtovoy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Max Gurtovoy @ 2018-01-31 16:31 UTC (permalink / raw)


In order to avoid concurrent error recovery during the initialization
process (allowed by the NVME_CTRL_NEW --> NVME_CTRL_RESETTING transition),
we must mark the ctrl as CONNECTING before the initial connection
establishment.

Signed-off-by: Max Gurtovoy <maxg at mellanox.com>
---
 drivers/nvme/host/core.c | 1 +
 drivers/nvme/host/rdma.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c1a9706..1d7e304 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -296,6 +296,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		break;
 	case NVME_CTRL_CONNECTING:
 		switch (old_state) {
+		case NVME_CTRL_NEW:
 		case NVME_CTRL_LIVE:
 		case NVME_CTRL_RESETTING:
 			changed = true;
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 219c6a0..d86aee7 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1930,6 +1930,9 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	if (!ctrl->queues)
 		goto out_uninit_ctrl;
 
+	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING);
+	WARN_ON_ONCE(!changed);
+
 	ret = nvme_rdma_configure_admin_queue(ctrl, true);
 	if (ret)
 		goto out_kfree_queues;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/3] nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition
  2018-01-31 16:31 [PATCH v3 0/3] Fix host side state machine Max Gurtovoy
  2018-01-31 16:31 ` [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING Max Gurtovoy
  2018-01-31 16:31 ` [PATCH 2/3] nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process Max Gurtovoy
@ 2018-01-31 16:31 ` Max Gurtovoy
  2018-01-31 17:11   ` James Smart
  2018-02-05 14:46 ` [PATCH v3 0/3] Fix host side state machine Sagi Grimberg
  2018-02-08 16:01 ` Keith Busch
  4 siblings, 1 reply; 19+ messages in thread
From: Max Gurtovoy @ 2018-01-31 16:31 UTC (permalink / raw)


There is no logical reason to move from the LIVE state to the CONNECTING
state. For initial connection establishment, the transition should be
NVME_CTRL_NEW --> NVME_CTRL_CONNECTING --> NVME_CTRL_LIVE.
For error recovery or reset, the transition should be
NVME_CTRL_LIVE --> NVME_CTRL_RESETTING --> NVME_CTRL_CONNECTING -->
NVME_CTRL_LIVE.

Signed-off-by: Max Gurtovoy <maxg at mellanox.com>
---
 drivers/nvme/host/core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1d7e304..c2a3700 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -297,7 +297,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 	case NVME_CTRL_CONNECTING:
 		switch (old_state) {
 		case NVME_CTRL_NEW:
-		case NVME_CTRL_LIVE:
 		case NVME_CTRL_RESETTING:
 			changed = true;
 			/* FALLTHRU */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-01-31 16:31 ` [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING Max Gurtovoy
@ 2018-01-31 17:10   ` James Smart
  2018-02-08 15:28   ` Sagi Grimberg
  1 sibling, 0 replies; 19+ messages in thread
From: James Smart @ 2018-01-31 17:10 UTC (permalink / raw)


On 1/31/2018 8:31 AM, Max Gurtovoy wrote:
> In the PCI transport, this state is used to mark the initialization
> process. It should be used in the other transports as well.
>
> Signed-off-by: Max Gurtovoy <maxg at mellanox.com>
> ---

Reviewed-by: James Smart <james.smart at broadcom.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/3] nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
  2018-01-31 16:31 ` [PATCH 2/3] nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process Max Gurtovoy
@ 2018-01-31 17:10   ` James Smart
  0 siblings, 0 replies; 19+ messages in thread
From: James Smart @ 2018-01-31 17:10 UTC (permalink / raw)


On 1/31/2018 8:31 AM, Max Gurtovoy wrote:
> In order to avoid concurrent error recovery during the initialization
> process (allowed by the NVME_CTRL_NEW --> NVME_CTRL_RESETTING transition),
> we must mark the ctrl as CONNECTING before the initial connection
> establishment.
>
> Signed-off-by: Max Gurtovoy <maxg at mellanox.com>


Reviewed-by: James Smart <james.smart at broadcom.com>

I'd prefer to have Sagi sign off on the rdma part as well.

-- james

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 3/3] nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition
  2018-01-31 16:31 ` [PATCH 3/3] nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition Max Gurtovoy
@ 2018-01-31 17:11   ` James Smart
  0 siblings, 0 replies; 19+ messages in thread
From: James Smart @ 2018-01-31 17:11 UTC (permalink / raw)


On 1/31/2018 8:31 AM, Max Gurtovoy wrote:
> There is no logical reason to move from the LIVE state to the CONNECTING
> state. For initial connection establishment, the transition should be
> NVME_CTRL_NEW --> NVME_CTRL_CONNECTING --> NVME_CTRL_LIVE.
> For error recovery or reset, the transition should be
> NVME_CTRL_LIVE --> NVME_CTRL_RESETTING --> NVME_CTRL_CONNECTING -->
> NVME_CTRL_LIVE.
>
> Signed-off-by: Max Gurtovoy <maxg at mellanox.com>


Reviewed-by: James Smart <james.smart at broadcom.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 0/3] Fix host side state machine
  2018-01-31 16:31 [PATCH v3 0/3] Fix host side state machine Max Gurtovoy
                   ` (2 preceding siblings ...)
  2018-01-31 16:31 ` [PATCH 3/3] nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition Max Gurtovoy
@ 2018-02-05 14:46 ` Sagi Grimberg
  2018-02-06 15:01   ` Max Gurtovoy
  2018-02-08 16:01 ` Keith Busch
  4 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2018-02-05 14:46 UTC (permalink / raw)



> Hi all,
> this series is rebased on top of nvme-4.16.
> Note that one part is still missing from that tree (but I tested it on
> my own "stable" mixed kernel together with this patch):
> "nvme-rdma: fix concurrent reset and reconnect" from Sagi.
> 
> The first motivation for this series was fixing an RDMA initiator crash that
> occurs when the initial connect fails and error recovery is started during
> initial connection establishment.
> This patchset also renames NVME_CTRL_RECONNECTING to NVME_CTRL_CONNECTING, as
> this state doesn't represent only a reconnection flow but is also used for the
> initialization process.
> 
> These patches were tested on the RDMA transport.
> It would be appreciated if someone could run this series on real FC HW (to test
> patch #3).

The series looks fine to me.

I've picked it up, but I prefer to hear from Keith/Christoph before
adding it to the next pr round.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 0/3] Fix host side state machine
  2018-02-05 14:46 ` [PATCH v3 0/3] Fix host side state machine Sagi Grimberg
@ 2018-02-06 15:01   ` Max Gurtovoy
  0 siblings, 0 replies; 19+ messages in thread
From: Max Gurtovoy @ 2018-02-06 15:01 UTC (permalink / raw)




On 2/5/2018 4:46 PM, Sagi Grimberg wrote:
> 
>> Hi all,
>> this series is rebased on top of nvme-4.16.
>> Note that one part is still missing from that tree (but I tested it on
>> my own "stable" mixed kernel together with this patch):
>> "nvme-rdma: fix concurrent reset and reconnect" from Sagi.
>>
>> The first motivation for this series was fixing an RDMA initiator crash
>> that occurs when the initial connect fails and error recovery is started
>> during initial connection establishment.
>> This patchset also renames NVME_CTRL_RECONNECTING to NVME_CTRL_CONNECTING,
>> as this state doesn't represent only a reconnection flow but is also used
>> for the initialization process.
>>
>> These patches were tested on the RDMA transport.
>> It would be appreciated if someone could run this series on real FC HW
>> (to test patch #3).
> 
> The series looks fine to me.
> 
> I've picked it up, but I prefer to hear from Keith/Christoph before
> adding it to the next pr round.

Me too :)

Afterwards I'll prepare a fix for 4.15-stable.

-Max.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-01-31 16:31 ` [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING Max Gurtovoy
  2018-01-31 17:10   ` James Smart
@ 2018-02-08 15:28   ` Sagi Grimberg
  2018-02-08 16:19     ` Sagi Grimberg
  1 sibling, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2018-02-08 15:28 UTC (permalink / raw)



> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 6c2fdfa..219c6a0 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -887,7 +887,7 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
>   static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
>   {
>   	/* If we are resetting/deleting then do nothing */
> -	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING) {
> +	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
>   		WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
>   			ctrl->ctrl.state == NVME_CTRL_LIVE);
>   		return;
> @@ -978,7 +978,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>   
>   static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
>   {
> -	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING))
> +	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
>   		return;
>   
>   	queue_work(nvme_wq, &ctrl->err_work);
> 
What is this based on? It does not apply on top of nvme-4.16.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 0/3] Fix host side state machine
  2018-01-31 16:31 [PATCH v3 0/3] Fix host side state machine Max Gurtovoy
                   ` (3 preceding siblings ...)
  2018-02-05 14:46 ` [PATCH v3 0/3] Fix host side state machine Sagi Grimberg
@ 2018-02-08 16:01 ` Keith Busch
  4 siblings, 0 replies; 19+ messages in thread
From: Keith Busch @ 2018-02-08 16:01 UTC (permalink / raw)


On Wed, Jan 31, 2018 at 06:31:23PM +0200, Max Gurtovoy wrote:
> Hi all,
> this series is rebased on top of nvme-4.16.
> Note that one part is still missing from that tree (but I tested it on
> my own "stable" mixed kernel together with this patch):
> "nvme-rdma: fix concurrent reset and reconnect" from Sagi.
> 
> The first motivation for this series was fixing an RDMA initiator crash that
> occurs when the initial connect fails and error recovery is started during
> initial connection establishment.
> This patchset also renames NVME_CTRL_RECONNECTING to NVME_CTRL_CONNECTING, as
> this state doesn't represent only a reconnection flow but is also used for the
> initialization process.
> 
> These patches were tested on the RDMA transport.
> It would be appreciated if someone could run this series on real FC HW (to test
> patch #3).

Looks good to me.

Reviewed-by: Keith Busch <keith.busch at intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-08 15:28   ` Sagi Grimberg
@ 2018-02-08 16:19     ` Sagi Grimberg
  2018-02-13 10:19       ` Max Gurtovoy
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2018-02-08 16:19 UTC (permalink / raw)




On 02/08/2018 05:28 PM, Sagi Grimberg wrote:
> 
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index 6c2fdfa..219c6a0 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -887,7 +887,7 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
>>   static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
>>   {
>>   	/* If we are resetting/deleting then do nothing */
>> -	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING) {
>> +	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
>>   		WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
>>   			ctrl->ctrl.state == NVME_CTRL_LIVE);
>>   		return;
>> @@ -978,7 +978,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>>   static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
>>   {
>> -	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING))
>> +	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
>>   		return;
>>   	queue_work(nvme_wq, &ctrl->err_work);
>>
> What is this based on? It does not apply on top of nvme-4.16.

Max, I fixed it up locally in nvme-4.16-rc; please verify that it's
correct and passes your local tests.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-08 16:19     ` Sagi Grimberg
@ 2018-02-13 10:19       ` Max Gurtovoy
  2018-02-13 10:46         ` Sagi Grimberg
  0 siblings, 1 reply; 19+ messages in thread
From: Max Gurtovoy @ 2018-02-13 10:19 UTC (permalink / raw)




On 2/8/2018 6:19 PM, Sagi Grimberg wrote:
> 
> 
> On 02/08/2018 05:28 PM, Sagi Grimberg wrote:
>>
>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>> index 6c2fdfa..219c6a0 100644
>>> --- a/drivers/nvme/host/rdma.c
>>> +++ b/drivers/nvme/host/rdma.c
>>> @@ -887,7 +887,7 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
>>>   static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
>>>   {
>>>   	/* If we are resetting/deleting then do nothing */
>>> -	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING) {
>>> +	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
>>>   		WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
>>>   			ctrl->ctrl.state == NVME_CTRL_LIVE);
>>>   		return;
>>> @@ -978,7 +978,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>>>   static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
>>>   {
>>> -	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING))
>>> +	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
>>>   		return;
>>>   	queue_work(nvme_wq, &ctrl->err_work);
>>>
>> What is this based on? It does not apply on top of nvme-4.16.
> 
> Max, I fixed it up locally in nvme-4.16-rc; please verify that it's
> correct and passes your local tests.

Sagi,
I guess you applied it after rebasing; I mentioned that there was
still one commit missing from nvme-4.16 (your reset fix).
I'll run tests on nvme-4.16-rc and update soon.

-Max.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-13 10:19       ` Max Gurtovoy
@ 2018-02-13 10:46         ` Sagi Grimberg
  2018-02-13 18:01           ` Max Gurtovoy
  2018-02-13 23:55           ` Max Gurtovoy
  0 siblings, 2 replies; 19+ messages in thread
From: Sagi Grimberg @ 2018-02-13 10:46 UTC (permalink / raw)


>> Max, I fixed it up locally in nvme-4.16-rc; please verify that it's
>> correct and passes your local tests.
> 
> Sagi,
> I guess you applied it after rebasing; I mentioned that there was
> still one commit missing from nvme-4.16 (your reset fix).

Which one?

> I'll run tests on nvme-4.16-rc and update soon.

Thanks

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-13 10:46         ` Sagi Grimberg
@ 2018-02-13 18:01           ` Max Gurtovoy
  2018-02-13 23:55           ` Max Gurtovoy
  1 sibling, 0 replies; 19+ messages in thread
From: Max Gurtovoy @ 2018-02-13 18:01 UTC (permalink / raw)




On 2/13/2018 12:46 PM, Sagi Grimberg wrote:
>>> Max, I fixed it up locally in nvme-4.16-rc; please verify that it's
>>> correct and passes your local tests.
>>
>> Sagi,
>> I guess you applied it after rebasing; I mentioned that there was
>> still one commit missing from nvme-4.16 (your reset fix).
> 
> Which one?

I guess this one caused the conflict:
d5bf4b7f4 nvme-rdma: fix concurrent reset and reconnect
> 
>> I'll run tests on nvme-4.16-rc and update soon.
> 
> Thanks


During port toggling with traffic (using dm-multipath) I see some warnings
during ib_destroy_qp that say there are still mrs_used, and therefore also
a warning in ib_dealloc_pd that says the refcount on the pd is not 0.

I'll debug it tomorrow hopefully and update.

-Max.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-13 10:46         ` Sagi Grimberg
  2018-02-13 18:01           ` Max Gurtovoy
@ 2018-02-13 23:55           ` Max Gurtovoy
  2018-02-14 13:40             ` Sagi Grimberg
  1 sibling, 1 reply; 19+ messages in thread
From: Max Gurtovoy @ 2018-02-13 23:55 UTC (permalink / raw)




On 2/13/2018 12:46 PM, Sagi Grimberg wrote:
>>> Max, I fixed it up locally in nvme-4.16-rc; please verify that it's
>>> correct and passes your local tests.
>>
>> Sagi,
>> I guess you applied it after rebasing; I mentioned that there was
>> still one commit missing from nvme-4.16 (your reset fix).
> 
> Which one?

I guess this one caused the conflict:
d5bf4b7f4 nvme-rdma: fix concurrent reset and reconnect
> 
>> I'll run tests on nvme-4.16-rc and update soon.
> 
> Thanks


During port toggling with traffic (using dm-multipath) I see some
warnings during ib_destroy_qp that say there are still mrs_used, and
therefore also one in ib_dealloc_pd that says the refcount on the pd is not 0.

I'll debug it tomorrow hopefully and update.

-Max.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-13 23:55           ` Max Gurtovoy
@ 2018-02-14 13:40             ` Sagi Grimberg
  2018-02-14 14:20               ` Max Gurtovoy
  0 siblings, 1 reply; 19+ messages in thread
From: Sagi Grimberg @ 2018-02-14 13:40 UTC (permalink / raw)



> During port toggling with traffic (using dm-multipath) I see some
> warnings during ib_destroy_qp that say there are still mrs_used, and
> therefore also one in ib_dealloc_pd that says the refcount on the pd is not 0.
> 
> I'll debug it tomorrow hopefully and update.

Is this a regression that happened due to your patch set?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-14 13:40             ` Sagi Grimberg
@ 2018-02-14 14:20               ` Max Gurtovoy
  2018-02-15 18:09                 ` Max Gurtovoy
  0 siblings, 1 reply; 19+ messages in thread
From: Max Gurtovoy @ 2018-02-14 14:20 UTC (permalink / raw)




On 2/14/2018 3:40 PM, Sagi Grimberg wrote:
> 
>> During port toggling with traffic (using dm-multipath) I see some
>> warnings during ib_destroy_qp that say there are still mrs_used, and
>> therefore also one in ib_dealloc_pd that says the refcount on the pd is not 0.
>>
>> I'll debug it tomorrow hopefully and update.
> 
> Is this a regression that happened due to your patch set?

I don't think so. Without my patches we crash.
I see that we have a timeout on the admin_q, and then I/O errors:


[Wed Feb 14 14:10:59 2018] nvme nvme0: I/O 0 QID 0 timeout, reset controller
[Wed Feb 14 14:10:59 2018] nvme nvme0: failed nvme_keep_alive_end_io 
error=10
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 704258460
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 388820158
[Wed Feb 14 14:10:59 2018] ib_mr_pool_destroy: destroyed 121 mrs, 
mrs_used 6 for qp 000000008182fc6f
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 489120554
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 399385206
[Wed Feb 14 14:10:59 2018] device-mapper: multipath: Failing path 259:0.
[Wed Feb 14 14:10:59 2018] WARNING: CPU: 9 PID: 12333 at 
drivers/infiniband/core//verbs.c:1524 ib_destroy_qp+0x159/0x170 [ib_core]
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 269330912
[Wed Feb 14 14:10:59 2018] Modules linked in:
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 211936734
[Wed Feb 14 14:10:59 2018]  nvme_rdma(OE)
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 383446442
[Wed Feb 14 14:10:59 2018]  nvme_fabrics(OE) nvme_core(OE)
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
sector 160594228


For some reason not all commands complete before we destroy the QP (we
use dm-multipath here).

In iser (where we also saw that the pool still had registered regions) we
created an all_list and we free the MRs from there...
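
Roughly, the idea is something like this (just an illustrative sketch; the
struct and function names below are not the actual iser or nvme-rdma code):

struct mr_desc {
	struct ib_mr		*mr;
	struct list_head	free_entry;	/* on free_list while idle */
	struct list_head	all_entry;	/* always on all_list */
};

struct mr_pool {
	struct list_head	free_list;
	struct list_head	all_list;
};

static void mr_pool_destroy(struct mr_pool *pool)
{
	struct mr_desc *d, *tmp;

	/*
	 * Walk all_list rather than free_list, so descriptors that are
	 * still accounted as in use are released as well, instead of
	 * tripping the warnings in ib_destroy_qp()/ib_dealloc_pd().
	 */
	list_for_each_entry_safe(d, tmp, &pool->all_list, all_entry) {
		list_del(&d->all_entry);
		ib_dereg_mr(d->mr);
		kfree(d);
	}
}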


-Max.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
  2018-02-14 14:20               ` Max Gurtovoy
@ 2018-02-15 18:09                 ` Max Gurtovoy
  0 siblings, 0 replies; 19+ messages in thread
From: Max Gurtovoy @ 2018-02-15 18:09 UTC (permalink / raw)




On 2/14/2018 4:20 PM, Max Gurtovoy wrote:
> 
> 
> On 2/14/2018 3:40 PM, Sagi Grimberg wrote:
>>
>>> During port toggling with traffic (using dm-multipath) I see some
>>> warnings during ib_destroy_qp that say there are still mrs_used, and
>>> therefore also one in ib_dealloc_pd that says the refcount on the pd is not 0.
>>>
>>> I'll debug it tomorrow hopefully and update.
>>
>> Is this a regression that happened due to your patch set?
> 
> I don't think so. Without my patches we crash.
> I see that we have a timeout on admin_q, and then I/O error:
> 
> 
> [Wed Feb 14 14:10:59 2018] nvme nvme0: I/O 0 QID 0 timeout, reset 
> controller
> [Wed Feb 14 14:10:59 2018] nvme nvme0: failed nvme_keep_alive_end_io 
> error=10
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 704258460
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 388820158
> [Wed Feb 14 14:10:59 2018] ib_mr_pool_destroy: destroyed 121 mrs, 
> mrs_used 6 for qp 000000008182fc6f
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 489120554
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 399385206
> [Wed Feb 14 14:10:59 2018] device-mapper: multipath: Failing path 259:0.
> [Wed Feb 14 14:10:59 2018] WARNING: CPU: 9 PID: 12333 at 
> drivers/infiniband/core//verbs.c:1524 ib_destroy_qp+0x159/0x170 [ib_core]
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 269330912
> [Wed Feb 14 14:10:59 2018] Modules linked in:
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 211936734
> [Wed Feb 14 14:10:59 2018]  nvme_rdma(OE)
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 383446442
> [Wed Feb 14 14:10:59 2018]  nvme_fabrics(OE) nvme_core(OE)
> [Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, 
> sector 160594228
> 
> 
> For some reason not all commands complete before we destroy the QP (we
> use dm-multipath here).

What I did is freeze the queues, and it seems to help. But now the
failover takes ~cmd_timeout seconds:


@@ -931,6 +938,8 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
         nvme_stop_keep_alive(&ctrl->ctrl);

         if (ctrl->ctrl.queue_count > 1) {
+               nvme_start_freeze(&ctrl->ctrl);
+               nvme_wait_freeze(&ctrl->ctrl);
                 nvme_stop_queues(&ctrl->ctrl);
                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
                                         nvme_cancel_request, &ctrl->ctrl);
@@ -948,6 +957,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
          */
         blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
         nvme_start_queues(&ctrl->ctrl);
+       nvme_unfreeze(&ctrl->ctrl);

         if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
                 /* state change failure should never happen */


I borrowed it from the nvme pci driver to make sure we won't receive any
new requests (even timed-out requests).
Maybe Keith can take a look at that?

I don't think the above is the right solution, but we must make sure
we don't get any requests during reconnection or after freeing the QP.
We might end up freeing/allocating the tagset on each reconnection, or at
least using blk_mq_queue_reinit.

I'll try it too.


> 
> In iser (where we also saw that the pool still had registered regions) we
> created an all_list and we free the MRs from there...
> 
> 
> -Max.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-02-15 18:09 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-31 16:31 [PATCH v3 0/3] Fix host side state machine Max Gurtovoy
2018-01-31 16:31 ` [PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING Max Gurtovoy
2018-01-31 17:10   ` James Smart
2018-02-08 15:28   ` Sagi Grimberg
2018-02-08 16:19     ` Sagi Grimberg
2018-02-13 10:19       ` Max Gurtovoy
2018-02-13 10:46         ` Sagi Grimberg
2018-02-13 18:01           ` Max Gurtovoy
2018-02-13 23:55           ` Max Gurtovoy
2018-02-14 13:40             ` Sagi Grimberg
2018-02-14 14:20               ` Max Gurtovoy
2018-02-15 18:09                 ` Max Gurtovoy
2018-01-31 16:31 ` [PATCH 2/3] nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process Max Gurtovoy
2018-01-31 17:10   ` James Smart
2018-01-31 16:31 ` [PATCH 3/3] nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition Max Gurtovoy
2018-01-31 17:11   ` James Smart
2018-02-05 14:46 ` [PATCH v3 0/3] Fix host side state machine Sagi Grimberg
2018-02-06 15:01   ` Max Gurtovoy
2018-02-08 16:01 ` Keith Busch
