* [PATCH 0/3] Introduce fabrics controller loss timeout
@ 2017-03-18 22:42 Sagi Grimberg
  2017-03-18 22:42 ` [PATCH 1/3] nvme-rdma: get rid of local reconnect_delay Sagi Grimberg
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Sagi Grimberg @ 2017-03-18 22:42 UTC (permalink / raw)


When a host realizes that its controller session is
damaged, it schedules periodic reconnects. In case the
controller is gone and will never return, we need a stop
condition to give up on this controller and simply remove it.

We allow the user to configure a suitable ctrl_loss_tmo and
set a reasonable default of 10 minutes.

A complementary nvme-cli exposure will follow.
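
For context, fabrics options are parsed from the comma-separated string that
nvme-cli writes to /dev/nvme-fabrics, so once the CLI grows a matching flag the
new option can be exercised along these lines (an illustrative sketch only;
transport address, port and NQN below are placeholders):

  # give up on the controller after ~5 minutes of reconnect attempts
  echo -n "transport=rdma,traddr=192.168.1.1,trsvcid=4420,nqn=testnqn,ctrl_loss_tmo=300" > /dev/nvme-fabrics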

Sagi Grimberg (3):
  nvme-rdma: get rid of local reconnect_delay
  nvme-fabrics: Allow ctrl loss timeout configuration
  nvme-rdma: Support ctrl_loss_tmo

 drivers/nvme/host/fabrics.c | 28 ++++++++++++++++++++++++++++
 drivers/nvme/host/fabrics.h | 10 ++++++++++
 drivers/nvme/host/rdma.c    | 43 ++++++++++++++++++++++++++++---------------
 3 files changed, 66 insertions(+), 15 deletions(-)

-- 
2.7.4


* [PATCH 1/3] nvme-rdma: get rid of local reconnect_delay
  2017-03-18 22:42 [PATCH 0/3] Introduce fabrics controller loss timeout Sagi Grimberg
@ 2017-03-18 22:42 ` Sagi Grimberg
  2017-03-27  9:50   ` Christoph Hellwig
  2017-03-18 22:42 ` [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration Sagi Grimberg
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2017-03-18 22:42 UTC (permalink / raw)


We already have it in opts.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index e1db1736823f..33f18636ea99 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -118,7 +118,6 @@ struct nvme_rdma_ctrl {
 
 	struct nvme_rdma_qe	async_event_sqe;
 
-	int			reconnect_delay;
 	struct delayed_work	reconnect_work;
 
 	struct list_head	list;
@@ -782,7 +781,7 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 		dev_info(ctrl->ctrl.device,
 			"Failed reconnect attempt, requeueing...\n");
 		queue_delayed_work(nvme_rdma_wq, &ctrl->reconnect_work,
-					ctrl->reconnect_delay * HZ);
+				ctrl->ctrl.opts->reconnect_delay * HZ);
 	}
 }
 
@@ -811,10 +810,10 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 				nvme_cancel_request, &ctrl->ctrl);
 
 	dev_info(ctrl->ctrl.device, "reconnecting in %d seconds\n",
-		ctrl->reconnect_delay);
+		ctrl->ctrl.opts->reconnect_delay);
 
 	queue_delayed_work(nvme_rdma_wq, &ctrl->reconnect_work,
-				ctrl->reconnect_delay * HZ);
+				ctrl->ctrl.opts->reconnect_delay * HZ);
 }
 
 static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
@@ -1918,7 +1917,6 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	if (ret)
 		goto out_free_ctrl;
 
-	ctrl->reconnect_delay = opts->reconnect_delay;
 	INIT_DELAYED_WORK(&ctrl->reconnect_work,
 			nvme_rdma_reconnect_ctrl_work);
 	INIT_WORK(&ctrl->err_work, nvme_rdma_error_recovery_work);
-- 
2.7.4


* [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration
  2017-03-18 22:42 [PATCH 0/3] Introduce fabrics controller loss timeout Sagi Grimberg
  2017-03-18 22:42 ` [PATCH 1/3] nvme-rdma: get rid of local reconnect_delay Sagi Grimberg
@ 2017-03-18 22:42 ` Sagi Grimberg
  2017-03-27  9:50   ` Christoph Hellwig
  2017-04-17 22:29   ` James Smart
  2017-03-18 22:42 ` [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo Sagi Grimberg
  2017-03-27  0:41 ` [PATCH 0/3] Introduce fabrics controller loss timeout Yi Zhang
  3 siblings, 2 replies; 13+ messages in thread
From: Sagi Grimberg @ 2017-03-18 22:42 UTC (permalink / raw)


When a host senses that its controller session is damaged,
it tries to re-establish it periodically (reconnecting every
reconnect_delay seconds). It may very well be that the
controller is gone and never coming back; in this case the
host will try to reconnect forever.

Add a ctrl_loss_tmo to bound the overall reconnect effort for
a given controller (defaulting to a reasonable 10 minutes).
The timeout is not a schedule of its own; it is translated
into a number of reconnect attempts by dividing it by
reconnect_delay. This is useful to prevent racing flows of
removal and reconnect, and it doesn't really matter if we
remove slightly sooner than the user requested.
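
To make the translation concrete, here is the arithmetic as a small standalone
sketch (DIV_ROUND_UP is the kernel's ceiling-division macro from
include/linux/kernel.h; the values used are just the defaults from this series):

	/* ceiling division, as defined in include/linux/kernel.h */
	#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

	int ctrl_loss_tmo = 600;	/* NVMF_DEF_CTRL_LOSS_TMO, in seconds */
	int reconnect_delay = 10;	/* NVMF_DEF_RECONNECT_DELAY, in seconds */
	int max_reconnects = DIV_ROUND_UP(ctrl_loss_tmo, reconnect_delay); /* 60 */

With the defaults this comes out to 60 attempts, which is consistent with the
"Failed reconnect attempt 60" seen in the test log later in this thread.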

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/fabrics.c | 28 ++++++++++++++++++++++++++++
 drivers/nvme/host/fabrics.h | 10 ++++++++++
 2 files changed, 38 insertions(+)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 5b7386f69f4d..990e6fb32a63 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -471,6 +471,16 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 }
 EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
 
+bool nvmf_should_reconnect(struct nvme_ctrl *ctrl)
+{
+	if (ctrl->opts->max_reconnects == -1 ||
+	    ctrl->opts->nr_reconnects < ctrl->opts->max_reconnects)
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(nvmf_should_reconnect);
+
 /**
  * nvmf_register_transport() - NVMe Fabrics Library registration function.
  * @ops:	Transport ops instance to be registered to the
@@ -533,6 +543,7 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_QUEUE_SIZE,		"queue_size=%d"		},
 	{ NVMF_OPT_NR_IO_QUEUES,	"nr_io_queues=%d"	},
 	{ NVMF_OPT_RECONNECT_DELAY,	"reconnect_delay=%d"	},
+	{ NVMF_OPT_CTRL_LOSS_TMO,	"ctrl_loss_tmo=%d"	},
 	{ NVMF_OPT_KATO,		"keep_alive_tmo=%d"	},
 	{ NVMF_OPT_HOSTNQN,		"hostnqn=%s"		},
 	{ NVMF_OPT_HOST_TRADDR,		"host_traddr=%s"	},
@@ -546,6 +557,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	char *options, *o, *p;
 	int token, ret = 0;
 	size_t nqnlen  = 0;
+	int ctrl_loss_tmo = NVMF_DEF_CTRL_LOSS_TMO;
 
 	/* Set defaults */
 	opts->queue_size = NVMF_DEF_QUEUE_SIZE;
@@ -655,6 +667,16 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			}
 			opts->kato = token;
 			break;
+		case NVMF_OPT_CTRL_LOSS_TMO:
+			if (match_int(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+
+			if (token < 0)
+				pr_warn("ctrl_loss_tmo < 0 will reconnect forever\n");
+			ctrl_loss_tmo = token;
+			break;
 		case NVMF_OPT_HOSTNQN:
 			if (opts->host) {
 				pr_err("hostnqn already user-assigned: %s\n",
@@ -710,6 +732,12 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 		}
 	}
 
+	if (ctrl_loss_tmo < 0)
+		opts->max_reconnects = -1;
+	else
+		opts->max_reconnects = DIV_ROUND_UP(ctrl_loss_tmo,
+						opts->reconnect_delay);
+
 	if (!opts->host) {
 		kref_get(&nvmf_default_host->ref);
 		opts->host = nvmf_default_host;
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index 156018182ce4..f5a9c1fb186f 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -21,6 +21,8 @@
 #define NVMF_MAX_QUEUE_SIZE	1024
 #define NVMF_DEF_QUEUE_SIZE	128
 #define NVMF_DEF_RECONNECT_DELAY	10
+/* default to 600 seconds of reconnect attempts before giving up */
+#define NVMF_DEF_CTRL_LOSS_TMO		600
 
 /*
  * Define a host as seen by the target.  We allocate one at boot, but also
@@ -53,6 +55,7 @@ enum {
 	NVMF_OPT_HOSTNQN	= 1 << 8,
 	NVMF_OPT_RECONNECT_DELAY = 1 << 9,
 	NVMF_OPT_HOST_TRADDR	= 1 << 10,
+	NVMF_OPT_CTRL_LOSS_TMO	= 1 << 11,
 };
 
 /**
@@ -77,6 +80,10 @@ enum {
  * @discovery_nqn: indicates if the subsysnqn is the well-known discovery NQN.
  * @kato:	Keep-alive timeout.
  * @host:	Virtual NVMe host, contains the NQN and Host ID.
+ * @nr_reconnects: number of reconnect attempts since the last ctrl failure
+ * @max_reconnects: maximum number of allowed reconnect attempts before removing
+ *              the controller; (-1) means reconnect forever, zero means remove
+ *              immediately.
  */
 struct nvmf_ctrl_options {
 	unsigned		mask;
@@ -91,6 +98,8 @@ struct nvmf_ctrl_options {
 	bool			discovery_nqn;
 	unsigned int		kato;
 	struct nvmf_host	*host;
+	int			nr_reconnects;
+	int			max_reconnects;
 };
 
 /*
@@ -133,5 +142,6 @@ void nvmf_unregister_transport(struct nvmf_transport_ops *ops);
 void nvmf_free_options(struct nvmf_ctrl_options *opts);
 const char *nvmf_get_subsysnqn(struct nvme_ctrl *ctrl);
 int nvmf_get_address(struct nvme_ctrl *ctrl, char *buf, int size);
+bool nvmf_should_reconnect(struct nvme_ctrl *ctrl);
 
 #endif /* _NVME_FABRICS_H */
-- 
2.7.4


* [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo
  2017-03-18 22:42 [PATCH 0/3] Introduce fabrics controller loss timeout Sagi Grimberg
  2017-03-18 22:42 ` [PATCH 1/3] nvme-rdma: get rid of local reconnect_delay Sagi Grimberg
  2017-03-18 22:42 ` [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration Sagi Grimberg
@ 2017-03-18 22:42 ` Sagi Grimberg
  2017-03-27  9:50   ` Christoph Hellwig
  2017-04-25  0:46   ` James Smart
  2017-03-27  0:41 ` [PATCH 0/3] Introduce fabrics controller loss timeout Yi Zhang
  3 siblings, 2 replies; 13+ messages in thread
From: Sagi Grimberg @ 2017-03-18 22:42 UTC (permalink / raw)


Before scheduling a reconnect attempt, check
nr_reconnects against max_reconnects; if not
exhausted (or if max_reconnects is -1, meaning
reconnect forever), schedule a reconnect attempt,
otherwise schedule ctrl removal.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 33f18636ea99..71d1e1a6b928 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -711,6 +711,26 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
 	kfree(ctrl);
 }
 
+static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
+{
+	/* If we are resetting/deleting then do nothing */
+	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING) {
+		WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
+			ctrl->ctrl.state == NVME_CTRL_LIVE);
+		return;
+	}
+
+	if (nvmf_should_reconnect(&ctrl->ctrl)) {
+		dev_info(ctrl->ctrl.device, "Reconnecting in %d seconds...\n",
+			ctrl->ctrl.opts->reconnect_delay);
+		queue_delayed_work(nvme_rdma_wq, &ctrl->reconnect_work,
+				ctrl->ctrl.opts->reconnect_delay * HZ);
+	} else {
+		dev_info(ctrl->ctrl.device, "Removing controller...\n");
+		queue_work(nvme_rdma_wq, &ctrl->delete_work);
+	}
+}
+
 static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 {
 	struct nvme_rdma_ctrl *ctrl = container_of(to_delayed_work(work),
@@ -718,6 +738,8 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 	bool changed;
 	int ret;
 
+	++ctrl->ctrl.opts->nr_reconnects;
+
 	if (ctrl->queue_count > 1) {
 		nvme_rdma_free_io_queues(ctrl);
 
@@ -762,6 +784,7 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 
 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
 	WARN_ON_ONCE(!changed);
+	ctrl->ctrl.opts->nr_reconnects = 0;
 
 	if (ctrl->queue_count > 1) {
 		nvme_start_queues(&ctrl->ctrl);
@@ -776,13 +799,9 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 stop_admin_q:
 	blk_mq_stop_hw_queues(ctrl->ctrl.admin_q);
 requeue:
-	/* Make sure we are not resetting/deleting */
-	if (ctrl->ctrl.state == NVME_CTRL_RECONNECTING) {
-		dev_info(ctrl->ctrl.device,
-			"Failed reconnect attempt, requeueing...\n");
-		queue_delayed_work(nvme_rdma_wq, &ctrl->reconnect_work,
-				ctrl->ctrl.opts->reconnect_delay * HZ);
-	}
+	dev_info(ctrl->ctrl.device, "Failed reconnect attempt %d\n",
+			ctrl->ctrl.opts->nr_reconnects);
+	nvme_rdma_reconnect_or_remove(ctrl);
 }
 
 static void nvme_rdma_error_recovery_work(struct work_struct *work)
@@ -809,11 +828,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
 				nvme_cancel_request, &ctrl->ctrl);
 
-	dev_info(ctrl->ctrl.device, "reconnecting in %d seconds\n",
-		ctrl->ctrl.opts->reconnect_delay);
-
-	queue_delayed_work(nvme_rdma_wq, &ctrl->reconnect_work,
-				ctrl->ctrl.opts->reconnect_delay * HZ);
+	nvme_rdma_reconnect_or_remove(ctrl);
 }
 
 static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
@@ -2011,7 +2026,7 @@ static struct nvmf_transport_ops nvme_rdma_transport = {
 	.name		= "rdma",
 	.required_opts	= NVMF_OPT_TRADDR,
 	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY |
-			  NVMF_OPT_HOST_TRADDR,
+			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO,
 	.create_ctrl	= nvme_rdma_create_ctrl,
 };
 
-- 
2.7.4


* [PATCH 0/3] Introduce fabrics controller loss timeout
  2017-03-18 22:42 [PATCH 0/3] Introduce fabrics controller loss timeout Sagi Grimberg
                   ` (2 preceding siblings ...)
  2017-03-18 22:42 ` [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo Sagi Grimberg
@ 2017-03-27  0:41 ` Yi Zhang
  2017-03-28 11:37   ` Sagi Grimberg
  3 siblings, 1 reply; 13+ messages in thread
From: Yi Zhang @ 2017-03-27  0:41 UTC (permalink / raw)


Hello Sagi
With these three patches, the reconnecting stopped after 60 attempts.

I started another test that runs fio on nvme0n1[1] on the client before executing "nvmetcli clear" on the target side.
After that, I found another issue: the fio jobs cannot be stopped even with "Ctrl + C", and the device node also cannot be released[2].
Here is the kernel log[3].
Let me know if you need more info, thanks.
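
For reference, the reproduction amounts to this sequence (a sketch; the fio job
is the one quoted in [1] below, and the device name is from this setup):

	# client: start heavy I/O on the fabrics namespace
	fio <job as in [1]> &
	# target: tear the subsystem down while I/O is in flight
	nvmetcli clear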

[1]
fio -filename=/dev/nvme0n1 -iodepth=1 -thread -rw=randwrite -ioengine=psync -bssplit=5k/10:9k/10:13k/10:17k/10:21k/10:25k/10:29k/10:33k/10:37k/10:41k/10 -bs_unaligned -runtime=1200 -size=-group_reporting -name=mytest -numjobs=60

[2]
# lsblk 
NAME                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sr0                    11:0    1  1024M  0 rom  
sda                     8:0    0 279.4G  0 disk 
├─sda2                  8:2    0 278.4G  0 part 
│ ├─rhelp_rdma04-swap 253:1    0  15.8G  0 lvm  [SWAP]
│ ├─rhelp_rdma04-home 253:2    0 212.6G  0 lvm  /home
│ └─rhelp_rdma04-root 253:0    0    50G  0 lvm  /
└─sda1                  8:1    0     1G  0 part /boot
nvme0n1               259:0    0   250G  0 disk 

[3]
[  356.812399] nvme nvme0: Reconnecting in 10 seconds...
[  366.965161] nvme nvme0: Connect rejected: status 8 (invalid service ID).
[  367.002048] nvme nvme0: rdma_resolve_addr wait failed (-104).
[  367.029926] nvme nvme0: Failed reconnect attempt 21
[  367.051905] nvme nvme0: Reconnecting in 10 seconds...
[  371.444001] INFO: task kworker/u130:1:155 blocked for more than 120 seconds.
[  371.480773]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  371.505608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  371.540918] kworker/u130:1  D    0   155      2 0x00000000
[  371.565584] Workqueue: writeback wb_workfn (flush-259:0)
[  371.590031] Call Trace:
[  371.600981]  __schedule+0x289/0x8f0
[  371.616644]  schedule+0x36/0x80
[  371.630693]  io_schedule+0x16/0x40
[  371.645565]  blk_mq_get_tag+0x16c/0x280
[  371.662929]  ? remove_wait_queue+0x60/0x60
[  371.680942]  __blk_mq_alloc_request+0x1b/0xe0
[  371.700508]  blk_mq_sched_get_request+0x1a0/0x240
[  371.721616]  blk_mq_make_request+0x113/0x620
[  371.741215]  generic_make_request+0x110/0x2c0
[  371.760755]  submit_bio+0x75/0x150
[  371.776138]  submit_bh_wbc+0x141/0x180
[  371.793106]  __block_write_full_page+0x13d/0x3b0
[  371.814573]  ? I_BDEV+0x20/0x20
[  371.828657]  ? I_BDEV+0x20/0x20
[  371.842717]  block_write_full_page+0xe5/0x110
[  371.862312]  blkdev_writepage+0x18/0x20
[  371.879727]  __writepage+0x13/0x40
[  371.894593]  write_cache_pages+0x26f/0x510
[  371.913039]  ? select_idle_sibling+0x29/0x3d0
[  371.932593]  ? compound_head+0x20/0x20
[  371.949404]  generic_writepages+0x51/0x80
[  371.967972]  blkdev_writepages+0x2f/0x40
[  371.989381]  do_writepages+0x1e/0x30
[  372.007479]  __writeback_single_inode+0x45/0x330
[  372.028326]  writeback_sb_inodes+0x280/0x570
[  372.047594]  __writeback_inodes_wb+0x8c/0xc0
[  372.066852]  wb_writeback+0x276/0x310
[  372.083247]  wb_workfn+0x19c/0x3b0
[  372.098577]  process_one_work+0x165/0x410
[  372.116679]  worker_thread+0x137/0x4c0
[  372.133644]  kthread+0x101/0x140
[  372.148257]  ? rescuer_thread+0x3b0/0x3b0
[  372.166253]  ? kthread_park+0x90/0x90
[  372.182689]  ret_from_fork+0x2c/0x40
[  372.198802] INFO: task systemd-udevd:788 blocked for more than 120 seconds.
[  372.230377]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  372.253129] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  372.288576] systemd-udevd   D    0   788      1 0x00000002
[  372.313244] Call Trace:
[  372.324208]  __schedule+0x289/0x8f0
[  372.339835]  schedule+0x36/0x80
[  372.354198]  io_schedule+0x16/0x40
[  372.369040]  blk_mq_get_tag+0x16c/0x280
[  372.385867]  ? remove_wait_queue+0x60/0x60
[  372.404276]  __blk_mq_alloc_request+0x1b/0xe0
[  372.423849]  blk_mq_sched_get_request+0x1a0/0x240
[  372.444945]  blk_mq_make_request+0x113/0x620
[  372.464123]  generic_make_request+0x110/0x2c0
[  372.484885]  submit_bio+0x75/0x150
[  372.502586]  submit_bh_wbc+0x141/0x180
[  372.521625]  __block_write_full_page+0x13d/0x3b0
[  372.542552]  ? I_BDEV+0x20/0x20
[  372.556646]  ? I_BDEV+0x20/0x20
[  372.570750]  block_write_full_page+0xe5/0x110
[  372.590507]  blkdev_writepage+0x18/0x20
[  372.608514]  __writepage+0x13/0x40
[  372.623729]  write_cache_pages+0x26f/0x510
[  372.642116]  ? compound_head+0x20/0x20
[  372.659046]  generic_writepages+0x51/0x80
[  372.677447]  blkdev_writepages+0x2f/0x40
[  372.695072]  do_writepages+0x1e/0x30
[  372.711155]  __filemap_fdatawrite_range+0xc6/0x100
[  372.732778]  filemap_write_and_wait+0x3d/0x80
[  372.752330]  __sync_blockdev+0x1f/0x40
[  372.769151]  fsync_bdev+0x44/0x50
[  372.784048]  invalidate_partition+0x24/0x50
[  372.802835]  rescan_partitions+0x52/0x3a0
[  372.821426]  ? selinux_capable+0x20/0x30
[  372.839444]  ? security_capable+0x48/0x60
[  372.857427]  __blkdev_reread_part+0x64/0x70
[  372.876214]  blkdev_reread_part+0x23/0x40
[  372.894178]  blkdev_ioctl+0x46c/0x900
[  372.910650]  block_ioctl+0x41/0x50
[  372.925899]  do_vfs_ioctl+0xa7/0x5e0
[  372.941931]  SyS_ioctl+0x79/0x90
[  372.956410]  ? SyS_flock+0x12c/0x1c0
[  372.972407]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[  372.995057] RIP: 0033:0x7f2604a22507
[  373.013328] RSP: 002b:00007ffe3be8f228 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  373.049088] RAX: ffffffffffffffda RBX: 000056342ff88de0 RCX: 00007f2604a22507
[  373.081210] RDX: 0000000000000000 RSI: 000000000000125f RDI: 000000000000000c
[  373.113650] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007f2605dbb8c0
[  373.145759] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  373.178107] R13: 00007ffe3be8b1d8 R14: 0000000000000008 R15: 0000000000010300
[  373.210167] INFO: task fio:3324 blocked for more than 120 seconds.
[  373.237948]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  373.260671] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  373.295605] fio             D    0  3324   3252 0x00000080
[  373.320234] Call Trace:
[  373.331152]  __schedule+0x289/0x8f0
[  373.346824]  schedule+0x36/0x80
[  373.360958]  schedule_preempt_disabled+0xe/0x10
[  373.381274]  __mutex_lock.isra.8+0x266/0x500
[  373.400423]  __mutex_lock_slowpath+0x13/0x20
[  373.419588]  mutex_lock+0x2f/0x40
[  373.434441]  blkdev_put+0x20/0x120
[  373.449748]  blkdev_close+0x25/0x30
[  373.466217]  __fput+0xe7/0x210
[  373.480691]  ____fput+0xe/0x10
[  373.495002]  task_work_run+0x83/0xb0
[  373.512914]  exit_to_usermode_loop+0x59/0x85
[  373.534017]  do_syscall_64+0x165/0x180
[  373.552724]  entry_SYSCALL64_slow_path+0x25/0x25
[  373.575867] RIP: 0033:0x2b89425194fd
[  373.591921] RSP: 002b:00002b895b083c40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  373.626630] RAX: 0000000000000000 RBX: 00002b89431806d0 RCX: 00002b89425194fd
[  373.658765] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 000000000000000f
[  373.690820] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000cfc
[  373.722842] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  373.755214] R13: 00002b894b403000 R14: 0000000000000000 R15: 00002b894b4104c0
[  373.787447] INFO: task fio:3325 blocked for more than 120 seconds.
[  373.815230]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  373.838263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  373.874051] fio             D    0  3325   3252 0x00000080
[  373.898802] Call Trace:
[  373.909778]  __schedule+0x289/0x8f0
[  373.925415]  schedule+0x36/0x80
[  373.939503]  schedule_preempt_disabled+0xe/0x10
[  373.959802]  __mutex_lock.isra.8+0x266/0x500
[  373.979022]  __mutex_lock_slowpath+0x13/0x20
[  373.998230]  mutex_lock+0x2f/0x40
[  374.013611]  blkdev_put+0x20/0x120
[  374.031725]  blkdev_close+0x25/0x30
[  374.050176]  __fput+0xe7/0x210
[  374.064775]  ____fput+0xe/0x10
[  374.078489]  task_work_run+0x83/0xb0
[  374.094580]  exit_to_usermode_loop+0x59/0x85
[  374.113768]  do_syscall_64+0x165/0x180
[  374.130553]  entry_SYSCALL64_slow_path+0x25/0x25
[  374.151303] RIP: 0033:0x2b89425194fd
[  374.167387] RSP: 002b:00002b895ae82c40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  374.201599] RAX: 0000000000000000 RBX: 00002b8943180890 RCX: 00002b89425194fd
[  374.233708] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 0000000000000037
[  374.265519] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000cfd
[  374.297649] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  374.329729] R13: 00002b894b410c00 R14: 0000000000000000 R15: 00002b894b41e0c0
[  374.361865] INFO: task fio:3327 blocked for more than 120 seconds.
[  374.389636]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  374.412347] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  374.447748] fio             D    0  3327   3252 0x00000080
[  374.472345] Call Trace:
[  374.483370]  __schedule+0x289/0x8f0
[  374.499146]  schedule+0x36/0x80
[  374.513203]  schedule_preempt_disabled+0xe/0x10
[  374.534772]  __mutex_lock.isra.8+0x266/0x500
[  374.556953]  __mutex_lock_slowpath+0x13/0x20
[  374.577119]  mutex_lock+0x2f/0x40
[  374.591993]  blkdev_put+0x20/0x120
[  374.607965]  blkdev_close+0x25/0x30
[  374.623585]  __fput+0xe7/0x210
[  374.637293]  ____fput+0xe/0x10
[  374.650976]  task_work_run+0x83/0xb0
[  374.667176]  exit_to_usermode_loop+0x59/0x85
[  374.686332]  do_syscall_64+0x165/0x180
[  374.703150]  entry_SYSCALL64_slow_path+0x25/0x25
[  374.723902] RIP: 0033:0x2b89425194fd
[  374.740073] RSP: 002b:00002b895aa80c40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  374.774171] RAX: 0000000000000000 RBX: 00002b8943180c10 RCX: 00002b89425194fd
[  374.806303] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 000000000000000a
[  374.838350] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000cff
[  374.871310] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  374.903759] R13: 00002b894b42c400 R14: 0000000000000000 R15: 00002b894b4398c0
[  374.935769] INFO: task fio:3328 blocked for more than 120 seconds.
[  374.963535]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  374.986330] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  375.021111] fio             D    0  3328   3252 0x00000080
[  375.047092] Call Trace:
[  375.059404]  __schedule+0x289/0x8f0
[  375.076919]  schedule+0x36/0x80
[  375.091092]  schedule_preempt_disabled+0xe/0x10
[  375.111569]  __mutex_lock.isra.8+0x266/0x500
[  375.130372]  __mutex_lock_slowpath+0x13/0x20
[  375.149605]  mutex_lock+0x2f/0x40
[  375.164517]  blkdev_put+0x20/0x120
[  375.179741]  blkdev_close+0x25/0x30
[  375.195456]  __fput+0xe7/0x210
[  375.209262]  ____fput+0xe/0x10
[  375.222946]  task_work_run+0x83/0xb0
[  375.239113]  exit_to_usermode_loop+0x59/0x85
[  375.258416]  do_syscall_64+0x165/0x180
[  375.275285]  entry_SYSCALL64_slow_path+0x25/0x25
[  375.296039] RIP: 0033:0x2b89425194fd
[  375.312085] RSP: 002b:00002b895a87fc40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  375.346101] RAX: 0000000000000000 RBX: 00002b8943180dd0 RCX: 00002b89425194fd
[  375.378382] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 0000000000000033
[  375.411225] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000d00
[  375.443626] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  375.475713] R13: 00002b894b43a000 R14: 0000000000000000 R15: 00002b894b4474c0
[  375.507788] INFO: task fio:3329 blocked for more than 120 seconds.
[  375.535718]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  375.560678] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  375.600320] fio             D    0  3329   3252 0x00000080
[  375.626374] Call Trace:
[  375.638002]  __schedule+0x289/0x8f0
[  375.654503]  schedule+0x36/0x80
[  375.669362]  schedule_preempt_disabled+0xe/0x10
[  375.690733]  __mutex_lock.isra.8+0x266/0x500
[  375.710360]  __mutex_lock_slowpath+0x13/0x20
[  375.730588]  mutex_lock+0x2f/0x40
[  375.745960]  blkdev_put+0x20/0x120
[  375.761654]  blkdev_close+0x25/0x30
[  375.777527]  __fput+0xe7/0x210
[  375.791235]  ____fput+0xe/0x10
[  375.804915]  task_work_run+0x83/0xb0
[  375.820962]  exit_to_usermode_loop+0x59/0x85
[  375.840572]  do_syscall_64+0x165/0x180
[  375.857423]  entry_SYSCALL64_slow_path+0x25/0x25
[  375.877716] RIP: 0033:0x2b89425194fd
[  375.894374] RSP: 002b:00002b895a67ec40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  375.928733] RAX: 0000000000000000 RBX: 00002b8943180f90 RCX: 00002b89425194fd
[  375.960830] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 0000000000000012
[  375.992567] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000d01
[  376.024255] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  376.057209] R13: 00002b894b447c00 R14: 0000000000000000 R15: 00002b894b4550c0
[  376.094684] INFO: task fio:3330 blocked for more than 120 seconds.
[  376.122962]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  376.145629] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  376.180921] fio             D    0  3330   3252 0x00000080
[  376.205618] Call Trace:
[  376.216588]  __schedule+0x289/0x8f0
[  376.232522]  schedule+0x36/0x80
[  376.246584]  schedule_preempt_disabled+0xe/0x10
[  376.266981]  __mutex_lock.isra.8+0x266/0x500
[  376.286200]  __mutex_lock_slowpath+0x13/0x20
[  376.305350]  mutex_lock+0x2f/0x40
[  376.320234]  blkdev_put+0x20/0x120
[  376.335129]  blkdev_close+0x25/0x30
[  376.350811]  __fput+0xe7/0x210
[  376.364524]  ____fput+0xe/0x10
[  376.378272]  task_work_run+0x83/0xb0
[  376.394276]  exit_to_usermode_loop+0x59/0x85
[  376.413504]  do_syscall_64+0x165/0x180
[  376.430381]  entry_SYSCALL64_slow_path+0x25/0x25
[  376.451181] RIP: 0033:0x2b89425194fd
[  376.467187] RSP: 002b:00002b895a47dc40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  376.501281] RAX: 0000000000000000 RBX: 00002b8943181150 RCX: 00002b89425194fd
[  376.533460] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 000000000000001b
[  376.565546] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000d02
[  376.602073] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  376.634662] R13: 00002b894b455800 R14: 0000000000000000 R15: 00002b894b462cc0
[  376.666879] INFO: task fio:3331 blocked for more than 120 seconds.
[  376.694623]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  376.717318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  376.752846] fio             D    0  3331   3252 0x00000080
[  376.777775] Call Trace:
[  376.788747]  __schedule+0x289/0x8f0
[  376.804426]  schedule+0x36/0x80
[  376.818548]  schedule_preempt_disabled+0xe/0x10
[  376.838867]  __mutex_lock.isra.8+0x266/0x500
[  376.858245]  __mutex_lock_slowpath+0x13/0x20
[  376.877437]  mutex_lock+0x2f/0x40
[  376.892312]  blkdev_put+0x20/0x120
[  376.907705]  blkdev_close+0x25/0x30
[  376.924015]  __fput+0xe7/0x210
[  376.937845]  ____fput+0xe/0x10
[  376.951535]  task_work_run+0x83/0xb0
[  376.967630]  exit_to_usermode_loop+0x59/0x85
[  376.986804]  do_syscall_64+0x165/0x180
[  377.003710]  entry_SYSCALL64_slow_path+0x25/0x25
[  377.024454] RIP: 0033:0x2b89425194fd
[  377.040191] RSP: 002b:00002b895a27cc40 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  377.074447] RAX: 0000000000000000 RBX: 00002b8943181310 RCX: 00002b89425194fd
[  377.110910] RDX: 00002b89415c8000 RSI: 0000000000000080 RDI: 0000000000000004
[  377.143293] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000d03
[  377.175001] R10: 6e493d726f727265 R11: 0000000000000293 R12: 0000000000000000
[  377.205372] nvme nvme0: Connect rejected: status 8 (invalid service ID).
[  377.205394] nvme nvme0: rdma_resolve_addr wait failed (-104).
[  377.206229] nvme nvme0: Failed reconnect attempt 22
[  377.206231] nvme nvme0: Reconnecting in 10 seconds...
[  377.308015] R13: 00002b894b463400 R14: 0000000000000000 R15: 00002b894b4708c0
[  377.340061] INFO: task fio:3332 blocked for more than 120 seconds.
[  377.368235]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
[  377.390954] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  377.426117] fio             D    0  3332   3252 0x00000080
[  377.450821] Call Trace:
[  377.461740]  __schedule+0x289/0x8f0
[  377.477483]  ? bit_wait+0x50/0x50
[  377.492389]  schedule+0x36/0x80
[  377.506526]  io_schedule+0x16/0x40
[  377.521756]  bit_wait_io+0x11/0x50
[  377.537329]  __wait_on_bit+0x64/0x90
[  377.553385]  ? bit_wait+0x50/0x50
[  377.568312]  out_of_line_wait_on_bit+0x81/0xb0
[  377.588802]  ? autoremove_wake_function+0x60/0x60
[  377.614016]  __block_write_begin_int+0x3cf/0x6c0
[  377.637191]  ? I_BDEV+0x20/0x20
[  377.651456]  ? I_BDEV+0x20/0x20
[  377.665628]  block_write_begin+0x49/0x90
[  377.683410]  blkdev_write_begin+0x23/0x30
[  377.701436]  generic_perform_write+0xca/0x1c0
[  377.720995]  ? file_update_time+0x5e/0x110
[  377.740096]  __generic_file_write_iter+0x19b/0x1e0
[  377.762660]  blkdev_write_iter+0x8a/0x100
[  377.781780]  ? __inode_security_revalidate+0x4f/0x60
[  377.805212]  __vfs_write+0xe3/0x160
[  377.821172]  vfs_write+0xb2/0x1b0
[  377.836228]  ? syscall_trace_enter+0x1d0/0x2b0
[  377.856432]  SyS_pwrite64+0x87/0xb0
[  377.872541]  do_syscall_64+0x67/0x180
[  377.888976]  entry_SYSCALL64_slow_path+0x25/0x25
[  377.909777] RIP: 0033:0x2b8942519d63
[  377.925799] RSP: 002b:00002b895a07bc00 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[  377.960704] RAX: ffffffffffffffda RBX: 00002b899000ad40 RCX: 00002b8942519d63
[  377.992782] RDX: 0000000000000400 RSI: 00002b8990002920 RDI: 0000000000000031
[  378.024525] RBP: 00002b894b471000 R08: 0000000000000000 R09: 0000000000000000
[  378.056661] R10: 00000000c6946000 R11: 0000000000000293 R12: 00002b894b471008
[  378.088923] R13: 0000000000000400 R14: 00002b899000ad68 R15: 00002b899000ad50
[  387.445743] nvme nvme0: Connect rejected: status 8 (invalid service ID).
[  387.481444] nvme nvme0: rdma_resolve_addr wait failed (-104).
[  387.509486] nvme nvme0: Failed reconnect attempt 23
[  387.531502] nvme nvme0: Reconnecting in 10 seconds...
[  397.686098] nvme nvme0: Connect rejected: status 8 (invalid service ID).
[  397.719849] nvme nvme0: rdma_resolve_addr wait failed (-104).
[  397.749892] nvme nvme0: Failed reconnect attempt 24
--snip--
[  756.182567] nvme nvme0: Reconnecting in 10 seconds...
[  766.336578] nvme nvme0: Connect rejected: status 8 (invalid service ID).
[  766.371583] nvme nvme0: rdma_resolve_addr wait failed (-104).
[  766.400827] nvme nvme0: Failed reconnect attempt 60
[  766.423690] nvme nvme0: Removing controller...

Best Regards,
  Yi Zhang


----- Original Message -----
From: "Sagi Grimberg" <sagi@grimberg.me>
To: linux-nvme@lists.infradead.org
Cc: "Christoph Hellwig" <hch@lst.de>, "Yi Zhang" <yizhan@redhat.com>
Sent: Sunday, March 19, 2017 6:42:18 AM
Subject: [PATCH 0/3] Introduce fabrics controller loss timeout

When a host realizes that its controller session is
damaged, it schedules periodic reconnects. In case the
controller is gone and will never return, we need a stop
condition to give up on this controller and simply remove it.

We allow the user to configure a suitable ctrl_loss_tmo and
set a reasonable default of 10 minutes.

A complementary nvme-cli exposure will follow.

Sagi Grimberg (3):
  nvme-rdma: get rid of local reconnect_delay
  nvme-fabrics: Allow ctrl loss timeout configuration
  nvme-rdma: Support ctrl_loss_tmo

 drivers/nvme/host/fabrics.c | 28 ++++++++++++++++++++++++++++
 drivers/nvme/host/fabrics.h | 10 ++++++++++
 drivers/nvme/host/rdma.c    | 43 ++++++++++++++++++++++++++++---------------
 3 files changed, 66 insertions(+), 15 deletions(-)

-- 
2.7.4


* [PATCH 1/3] nvme-rdma: get rid of local reconnect_delay
  2017-03-18 22:42 ` [PATCH 1/3] nvme-rdma: get rid of local reconnect_delay Sagi Grimberg
@ 2017-03-27  9:50   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2017-03-27  9:50 UTC (permalink / raw)


Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration
  2017-03-18 22:42 ` [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration Sagi Grimberg
@ 2017-03-27  9:50   ` Christoph Hellwig
  2017-04-17 22:29   ` James Smart
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2017-03-27  9:50 UTC (permalink / raw)


Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo
  2017-03-18 22:42 ` [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo Sagi Grimberg
@ 2017-03-27  9:50   ` Christoph Hellwig
  2017-04-25  0:46   ` James Smart
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2017-03-27  9:50 UTC (permalink / raw)


Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* [PATCH 0/3] Introduce fabrics controller loss timeout
  2017-03-27  0:41 ` [PATCH 0/3] Introduce fabrics controller loss timeout Yi Zhang
@ 2017-03-28 11:37   ` Sagi Grimberg
  0 siblings, 0 replies; 13+ messages in thread
From: Sagi Grimberg @ 2017-03-28 11:37 UTC (permalink / raw)



> Hello Sagi
> With these three patches, the reconnecting stopped after 60 attempts.

Progress...

> I started another test that runs fio on nvme0n1[1] on the client before executing "nvmetcli clear" on the target side.
> After that, I found another issue: the fio jobs cannot be stopped even with "Ctrl + C", and the device node also cannot be released[2].
> Here is the kernel log[3].

Thanks for the new test case ;)

> [3]
> [  356.812399] nvme nvme0: Reconnecting in 10 seconds...
> [  366.965161] nvme nvme0: Connect rejected: status 8 (invalid service ID).
> [  367.002048] nvme nvme0: rdma_resolve_addr wait failed (-104).
> [  367.029926] nvme nvme0: Failed reconnect attempt 21
> [  367.051905] nvme nvme0: Reconnecting in 10 seconds...
> [  371.444001] INFO: task kworker/u130:1:155 blocked for more than 120 seconds.
> [  371.480773]       Not tainted 4.11.0-rc3.ctrl_tmo+ #1
> [  371.505608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  371.540918] kworker/u130:1  D    0   155      2 0x00000000
> [  371.565584] Workqueue: writeback wb_workfn (flush-259:0)
> [  371.590031] Call Trace:
> [  371.600981]  __schedule+0x289/0x8f0
> [  371.616644]  schedule+0x36/0x80
> [  371.630693]  io_schedule+0x16/0x40
> [  371.645565]  blk_mq_get_tag+0x16c/0x280
> [  371.662929]  ? remove_wait_queue+0x60/0x60
> [  371.680942]  __blk_mq_alloc_request+0x1b/0xe0
> [  371.700508]  blk_mq_sched_get_request+0x1a0/0x240
> [  371.721616]  blk_mq_make_request+0x113/0x620
> [  371.741215]  generic_make_request+0x110/0x2c0
> [  371.760755]  submit_bio+0x75/0x150

Looks like we have I/O waiting for a tag, but the
controller teardown couldn't interrupt and fail it...

In this specific case, it's a writeback; udevd is also
stuck in the same location below...

I'm thinking we might need something similar to Keith's
nvme_start_freeze/nvme_wait_freeze/nvme_unfreeze calls
for fabrics too... :/
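
The shape of that idea, as a rough and entirely hypothetical sketch (the freeze
helpers exist in drivers/nvme/host/core.c; where exactly they would belong in
the rdma recovery path is the open question):

	/* hypothetical placement in the fabrics error recovery path */
	nvme_start_freeze(&ctrl->ctrl);	/* block new requests from entering */
	nvme_stop_queues(&ctrl->ctrl);	/* quiesce the hw queues */
	blk_mq_tagset_busy_iter(&ctrl->tag_set,
			nvme_cancel_request, &ctrl->ctrl); /* fail inflight I/O */
	/* ... reconnect or remove the controller ... */
	nvme_unfreeze(&ctrl->ctrl);	/* let frozen submitters proceed again */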


* [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration
  2017-03-18 22:42 ` [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration Sagi Grimberg
  2017-03-27  9:50   ` Christoph Hellwig
@ 2017-04-17 22:29   ` James Smart
  2017-04-20 10:20     ` Sagi Grimberg
  1 sibling, 1 reply; 13+ messages in thread
From: James Smart @ 2017-04-17 22:29 UTC (permalink / raw)


On 3/18/2017 3:42 PM, Sagi Grimberg wrote:
> + * @nr_reconnects: number of reconnect attempts since the last ctrl failure
> + * @max_reconnects: maximum number of allowed reconnect attempts before removing
> + *              the controller; (-1) means reconnect forever, zero means remove
> + *              immediately.
>    */
>   struct nvmf_ctrl_options {
>   	unsigned		mask;
> @@ -91,6 +98,8 @@ struct nvmf_ctrl_options {
>   	bool			discovery_nqn;
>   	unsigned int		kato;
>   	struct nvmf_host	*host;
> +	int			nr_reconnects;
> +	int			max_reconnects;
>   };
>   
>   /*
> @@ -133,5 +142,6 @@ void nvmf_unregister_transport(struct nvmf_transport_ops *ops);
>   void nvmf_free_options(struct nvmf_ctrl_options *opts);
>   const char *nvmf_get_subsysnqn(struct nvme_ctrl *ctrl);
>   int nvmf_get_address(struct nvme_ctrl *ctrl, char *buf, int size);
> +bool nvmf_should_reconnect(struct nvme_ctrl *ctrl);

I know this patch has been pulled in - but I think it very odd that we 
added a field (nr_reconnects) to the opts structure that is not a 
connect option but is instead a dynamically-changing transport 
variable.  As the change introduced a common transport variable beyond 
start options, the patch should have formally added a generic transport 
structure to the ctrl structure.

-- james


* [PATCH 2/3] nvme-fabrics: Allow ctrl loss timeout configuration
  2017-04-17 22:29   ` James Smart
@ 2017-04-20 10:20     ` Sagi Grimberg
  0 siblings, 0 replies; 13+ messages in thread
From: Sagi Grimberg @ 2017-04-20 10:20 UTC (permalink / raw)



> I know this patch has been pulled in - but I think it very odd that we
> added a field (nr_reconnects) to the opts structure that is not a
> connect option but is instead a dynamically-changing transport
> variable.  As the change introduced a common transport variable beyond
> start options, the patch should have formally added a generic transport
> structure to the ctrl structure.

You're correct, I can move it to sit in nvme_ctrl.
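
Something like this minimal sketch (hypothetical here, just to illustrate
moving the counter out of the connect options):

	/* drivers/nvme/host/nvme.h */
	struct nvme_ctrl {
		...
		int	nr_reconnects;	/* dynamic reconnect counter, not a connect option */
		...
	};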


* [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo
  2017-03-18 22:42 ` [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo Sagi Grimberg
  2017-03-27  9:50   ` Christoph Hellwig
@ 2017-04-25  0:46   ` James Smart
  2017-05-03  8:05     ` Sagi Grimberg
  1 sibling, 1 reply; 13+ messages in thread
From: James Smart @ 2017-04-25  0:46 UTC (permalink / raw)


On 3/18/2017 3:42 PM, Sagi Grimberg wrote:
> Before scheduling a reconnect attempt, check
> nr_reconnects against max_reconnects; if not
> exhausted (or if max_reconnects is -1, meaning
> reconnect forever), schedule a reconnect attempt,
> otherwise schedule ctrl removal.
>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/rdma.c | 41 ++++++++++++++++++++++++++++-------------
>   1 file changed, 28 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 33f18636ea99..71d1e1a6b928 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -711,6 +711,26 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
>   	kfree(ctrl);
>   }
>   
> +static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
> +{
> +	/* If we are resetting/deleting then do nothing */
> +	if (ctrl->ctrl.state != NVME_CTRL_RECONNECTING) {
> +		WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
> +			ctrl->ctrl.state == NVME_CTRL_LIVE);
> +		return;
> +	}
> +
> +	if (nvmf_should_reconnect(&ctrl->ctrl)) {
> +		dev_info(ctrl->ctrl.device, "Reconnecting in %d seconds...\n",
> +			ctrl->ctrl.opts->reconnect_delay);
> +		queue_delayed_work(nvme_rdma_wq, &ctrl->reconnect_work,
> +				ctrl->ctrl.opts->reconnect_delay * HZ);
> +	} else {
> +		dev_info(ctrl->ctrl.device, "Removing controller...\n");
Shouldn't there be a:
         if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
                 return;

right here?

> +		queue_work(nvme_rdma_wq, &ctrl->delete_work);
> +	}
> +}
> +


-- james


* [PATCH 3/3] nvme-rdma: Support ctrl_loss_tmo
  2017-04-25  0:46   ` James Smart
@ 2017-05-03  8:05     ` Sagi Grimberg
  0 siblings, 0 replies; 13+ messages in thread
From: Sagi Grimberg @ 2017-05-03  8:05 UTC (permalink / raw)



> Shouldn't there be a:
>         if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
>                 return;
>
> right here ?

Correct. I'll send a fix.
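
The corrected else branch would then look roughly like this (a sketch of the
promised fix, not the committed patch):

	} else {
		dev_info(ctrl->ctrl.device, "Removing controller...\n");
		if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
			return;
		queue_work(nvme_rdma_wq, &ctrl->delete_work);
	}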


