All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sagi Grimberg <sagi@grimberg.me>
To: Anton Eidelman <anton.eidelman@gmail.com>,
	linux-nvme@lists.infradead.org, hch@lst.de, kbusch@kernel.org,
	axboe@fb.com
Cc: Anton Eidelman <anton@lightbitslabs.com>
Subject: Re: [PATCH] nvme/mpath: fix hang when disk goes live over reconnect
Date: Sun, 19 Sep 2021 13:19:39 +0300	[thread overview]
Message-ID: <b6dd0d15-041e-209e-953c-1b991c0738c9@grimberg.me> (raw)
In-Reply-To: <20210918215729.388968-1-anton@lightbitslabs.com>



On 9/19/21 12:57 AM, Anton Eidelman wrote:
> nvme_mpath_init_identify() invoked from nvme_init_identify()
> fetches a fresh ANA log from the ctrl.
> This is essential to have an up to date path states
> for both existing namespaces and for those scan_work
> may discover once the ctrl is up.
> 
> This happens in the following cases:
> 1) A new ctrl is being connected.
> 2) An existing ctrl is successfully reconnected.
> 3) An existing ctrl is being reset.
> 
> While in (1) ctrl->namespaces is empty, (2 & 3) may have namespaces,
> and nvme_read_ana_log() may call nvme_update_ns_ana_state().
> 
> This result in a hang when the ANA state of an existing namespace
> changes and makes the disk live: nvme_mpath_set_live() issues
> IO to the namespace through the ctrl, which does NOT have IO queues yet.
> See sample hang below.
> 
> Solution:
> - move nvme_read_ana_log() call from nvme_mpath_init_identify()
> into a separate function, nvme_mpath_update().
> - call this function in nvme_start_ctrl() - here the ctrl is ready.
> 
> Downside: in case nvme_read_ana_log() fails, we do not find this out
>    in nvme_mpath_init_identify().
> 
> Alernative solution:
>    nvme_get_log() in nvme_mpath_init_identify(), but don't parse.
>    Then nvme_start_ctrl() can just queue_work(nvme_wq, &ctrl->ana_work),
>    along with scan_work.
>    The ctrl->ana_log_buf contents is valid for scan_work,
>    and ana_work will re-fetch and parse it.
>    Downside: fetching an ANA log from the ctrl twice.
> 
> Sample failure:
>      nvme nvme0: starting error recovery
>      nvme nvme0: Reconnecting in 10 seconds...
>      block nvme0n6: no usable path - requeuing I/O
>      INFO: task kworker/u8:3:312 blocked for more than 122 seconds.
>            Tainted: G            E     5.14.5-1.el7.elrepo.x86_64 #1
>      Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
>      Call Trace:
>       __schedule+0x2a2/0x7e0
>       schedule+0x4e/0xb0
>       io_schedule+0x16/0x40
>       wait_on_page_bit_common+0x15c/0x3e0
>       do_read_cache_page+0x1e0/0x410
>       read_cache_page+0x12/0x20
>       read_part_sector+0x46/0x100
>       read_lba+0x121/0x240
>       efi_partition+0x1d2/0x6a0
>       bdev_disk_changed.part.0+0x1df/0x430
>       bdev_disk_changed+0x18/0x20
>       blkdev_get_whole+0x77/0xe0
>       blkdev_get_by_dev+0xd2/0x3a0
>       __device_add_disk+0x1ed/0x310
>       device_add_disk+0x13/0x20
>       nvme_mpath_set_live+0x138/0x1b0 [nvme_core]
>       nvme_update_ns_ana_state+0x2b/0x30 [nvme_core]
>       nvme_update_ana_state+0xca/0xe0 [nvme_core]
>       nvme_parse_ana_log+0xac/0x170 [nvme_core]
>       nvme_read_ana_log+0x7d/0xe0 [nvme_core]
>       nvme_mpath_init_identify+0x105/0x150 [nvme_core]
>       nvme_init_identify+0x2df/0x4d0 [nvme_core]
>       nvme_init_ctrl_finish+0x8d/0x3b0 [nvme_core]
>       nvme_tcp_setup_ctrl+0x337/0x390 [nvme_tcp]
>       nvme_tcp_reconnect_ctrl_work+0x24/0x40 [nvme_tcp]
>       process_one_work+0x1bd/0x360
>       worker_thread+0x50/0x3d0
> 
> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
> ---
>   drivers/nvme/host/core.c      | 1 +
>   drivers/nvme/host/multipath.c | 9 ++++++---
>   drivers/nvme/host/nvme.h      | 4 ++++
>   3 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 97f8211cf92c..884ca1f6f80b 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4325,6 +4325,7 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
>   	nvme_enable_aen(ctrl);
>   
>   	if (ctrl->queue_count > 1) {
> +		nvme_mpath_update(ctrl);

Does this ordering now becomes subtle as it needs to run before we queue 
the ns scan?

>   		nvme_queue_scan(ctrl);
>   		nvme_start_queues(ctrl);
>   	}

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply	other threads:[~2021-09-19 10:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-18 21:57 [PATCH] nvme/mpath: fix hang when disk goes live over reconnect Anton Eidelman
2021-09-19 10:19 ` Sagi Grimberg [this message]
2021-09-19 23:05   ` Anton Eidelman
2021-09-20  7:55     ` Sagi Grimberg
2021-09-20  6:35 ` Christoph Hellwig
2021-09-20 14:53   ` Anton Eidelman
2021-09-21  7:15     ` Christoph Hellwig
2021-10-04 16:46       ` Anton Eidelman
2021-10-04 16:57         ` Christoph Hellwig
2021-10-05 12:38           ` Sagi Grimberg
     [not found]             ` <368499da-c117-e8b7-7b1b-46894e1e0b48@grimberg.me>
2021-10-19 15:13               ` Anton Eidelman
2022-03-23  3:55 ` [PATCH v2 0/1] " Anton Eidelman
2022-03-23  3:55   ` [PATCH v2 1/1] " Anton Eidelman
2022-03-23  9:23     ` Sagi Grimberg
2022-03-23 14:45     ` [PATCH v3 0/1] " Anton Eidelman
2022-03-23 14:45       ` [PATCH v3 1/1] " Anton Eidelman
2022-03-23 14:55         ` Sagi Grimberg
2022-03-23 15:22         ` Christoph Hellwig
2022-03-24 19:05           ` [PATCH v4 0/1] " Anton Eidelman
2022-03-24 19:05             ` [PATCH v4 1/1] " Anton Eidelman
2022-03-24 21:06               ` Sagi Grimberg
2022-03-25  6:36               ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b6dd0d15-041e-209e-953c-1b991c0738c9@grimberg.me \
    --to=sagi@grimberg.me \
    --cc=anton.eidelman@gmail.com \
    --cc=anton@lightbitslabs.com \
    --cc=axboe@fb.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.