All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anton Eidelman <anton.eidelman@gmail.com>
To: linux-nvme@lists.infradead.org, hch@lst.de, kbusch@kernel.org,
	sagi@grimberg.me, axboe@fb.com
Cc: Anton Eidelman <anton@lightbitslabs.com>
Subject: [PATCH] nvme/mpath: fix hang when disk goes live over reconnect
Date: Sat, 18 Sep 2021 15:57:29 -0600	[thread overview]
Message-ID: <20210918215729.388968-1-anton@lightbitslabs.com> (raw)

nvme_mpath_init_identify() invoked from nvme_init_identify()
fetches a fresh ANA log from the ctrl.
This is essential to have an up to date path states
for both existing namespaces and for those scan_work
may discover once the ctrl is up.

This happens in the following cases:
1) A new ctrl is being connected.
2) An existing ctrl is successfully reconnected.
3) An existing ctrl is being reset.

While in (1) ctrl->namespaces is empty, (2 & 3) may have namespaces,
and nvme_read_ana_log() may call nvme_update_ns_ana_state().

This result in a hang when the ANA state of an existing namespace
changes and makes the disk live: nvme_mpath_set_live() issues
IO to the namespace through the ctrl, which does NOT have IO queues yet.
See sample hang below.

Solution:
- move nvme_read_ana_log() call from nvme_mpath_init_identify()
into a separate function, nvme_mpath_update().
- call this function in nvme_start_ctrl() - here the ctrl is ready.

Downside: in case nvme_read_ana_log() fails, we do not find this out
  in nvme_mpath_init_identify().

Alernative solution:
  nvme_get_log() in nvme_mpath_init_identify(), but don't parse.
  Then nvme_start_ctrl() can just queue_work(nvme_wq, &ctrl->ana_work),
  along with scan_work.
  The ctrl->ana_log_buf contents is valid for scan_work,
  and ana_work will re-fetch and parse it.
  Downside: fetching an ANA log from the ctrl twice.

Sample failure:
    nvme nvme0: starting error recovery
    nvme nvme0: Reconnecting in 10 seconds...
    block nvme0n6: no usable path - requeuing I/O
    INFO: task kworker/u8:3:312 blocked for more than 122 seconds.
          Tainted: G            E     5.14.5-1.el7.elrepo.x86_64 #1
    Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
    Call Trace:
     __schedule+0x2a2/0x7e0
     schedule+0x4e/0xb0
     io_schedule+0x16/0x40
     wait_on_page_bit_common+0x15c/0x3e0
     do_read_cache_page+0x1e0/0x410
     read_cache_page+0x12/0x20
     read_part_sector+0x46/0x100
     read_lba+0x121/0x240
     efi_partition+0x1d2/0x6a0
     bdev_disk_changed.part.0+0x1df/0x430
     bdev_disk_changed+0x18/0x20
     blkdev_get_whole+0x77/0xe0
     blkdev_get_by_dev+0xd2/0x3a0
     __device_add_disk+0x1ed/0x310
     device_add_disk+0x13/0x20
     nvme_mpath_set_live+0x138/0x1b0 [nvme_core]
     nvme_update_ns_ana_state+0x2b/0x30 [nvme_core]
     nvme_update_ana_state+0xca/0xe0 [nvme_core]
     nvme_parse_ana_log+0xac/0x170 [nvme_core]
     nvme_read_ana_log+0x7d/0xe0 [nvme_core]
     nvme_mpath_init_identify+0x105/0x150 [nvme_core]
     nvme_init_identify+0x2df/0x4d0 [nvme_core]
     nvme_init_ctrl_finish+0x8d/0x3b0 [nvme_core]
     nvme_tcp_setup_ctrl+0x337/0x390 [nvme_tcp]
     nvme_tcp_reconnect_ctrl_work+0x24/0x40 [nvme_tcp]
     process_one_work+0x1bd/0x360
     worker_thread+0x50/0x3d0

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
---
 drivers/nvme/host/core.c      | 1 +
 drivers/nvme/host/multipath.c | 9 ++++++---
 drivers/nvme/host/nvme.h      | 4 ++++
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 97f8211cf92c..884ca1f6f80b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4325,6 +4325,7 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 	nvme_enable_aen(ctrl);
 
 	if (ctrl->queue_count > 1) {
+		nvme_mpath_update(ctrl);
 		nvme_queue_scan(ctrl);
 		nvme_start_queues(ctrl);
 	}
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index e8ccdd398f78..8bd133f66636 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -850,9 +850,7 @@ int nvme_mpath_init_identify(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 			return -ENOMEM;
 	}
 	ctrl->ana_log_size = ana_log_size;
-	error = nvme_read_ana_log(ctrl);
-	if (error)
-		goto out_uninit;
+
 	return 0;
 
 out_uninit:
@@ -860,6 +858,11 @@ int nvme_mpath_init_identify(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 	return error;
 }
 
+void nvme_mpath_update(struct nvme_ctrl *ctrl)
+{
+	nvme_read_ana_log(ctrl);
+}
+
 void nvme_mpath_uninit(struct nvme_ctrl *ctrl)
 {
 	kfree(ctrl->ana_log_buf);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9871c0c9374c..1560ea7adc6b 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -746,6 +746,7 @@ void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id);
 void nvme_mpath_remove_disk(struct nvme_ns_head *head);
 int nvme_mpath_init_identify(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id);
 void nvme_mpath_init_ctrl(struct nvme_ctrl *ctrl);
+void nvme_mpath_update(struct nvme_ctrl *ctrl);
 void nvme_mpath_uninit(struct nvme_ctrl *ctrl);
 void nvme_mpath_stop(struct nvme_ctrl *ctrl);
 bool nvme_mpath_clear_current_path(struct nvme_ns *ns);
@@ -820,6 +821,9 @@ static inline int nvme_mpath_init_identify(struct nvme_ctrl *ctrl,
 "Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.\n");
 	return 0;
 }
+static void nvme_mpath_update(struct nvme_ctrl *ctrl)
+{
+}
 static inline void nvme_mpath_uninit(struct nvme_ctrl *ctrl)
 {
 }
-- 
2.25.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

             reply	other threads:[~2021-09-18 21:58 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-18 21:57 Anton Eidelman [this message]
2021-09-19 10:19 ` [PATCH] nvme/mpath: fix hang when disk goes live over reconnect Sagi Grimberg
2021-09-19 23:05   ` Anton Eidelman
2021-09-20  7:55     ` Sagi Grimberg
2021-09-20  6:35 ` Christoph Hellwig
2021-09-20 14:53   ` Anton Eidelman
2021-09-21  7:15     ` Christoph Hellwig
2021-10-04 16:46       ` Anton Eidelman
2021-10-04 16:57         ` Christoph Hellwig
2021-10-05 12:38           ` Sagi Grimberg
     [not found]             ` <368499da-c117-e8b7-7b1b-46894e1e0b48@grimberg.me>
2021-10-19 15:13               ` Anton Eidelman
2022-03-23  3:55 ` [PATCH v2 0/1] " Anton Eidelman
2022-03-23  3:55   ` [PATCH v2 1/1] " Anton Eidelman
2022-03-23  9:23     ` Sagi Grimberg
2022-03-23 14:45     ` [PATCH v3 0/1] " Anton Eidelman
2022-03-23 14:45       ` [PATCH v3 1/1] " Anton Eidelman
2022-03-23 14:55         ` Sagi Grimberg
2022-03-23 15:22         ` Christoph Hellwig
2022-03-24 19:05           ` [PATCH v4 0/1] " Anton Eidelman
2022-03-24 19:05             ` [PATCH v4 1/1] " Anton Eidelman
2022-03-24 21:06               ` Sagi Grimberg
2022-03-25  6:36               ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210918215729.388968-1-anton@lightbitslabs.com \
    --to=anton.eidelman@gmail.com \
    --cc=anton@lightbitslabs.com \
    --cc=axboe@fb.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.