linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Martin Belanger <nitram_67@hotmail.com>
To: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: nvme-fabrics: shutdown 1-minute deadlock
Date: Mon, 20 Sep 2021 15:09:37 +0000	[thread overview]
Message-ID: <BL0PR13MB429131886435C5C7092275A39CA09@BL0PR13MB4291.namprd13.prod.outlook.com> (raw)

Hello NVMe community,

I ran into a 1-minute deadlock trying to disconnect from a remote discovery controller (over tcp) while the controller is unreachable. The remote controller could be unreachable because the network is down or the controller simply crashed unexpectedly. The cause is irrelevant. Suffice it to say that the kernel does not yet know that the controller is unreachable and we're trying to disconnect from the controller. The problem is that during a disconnect we try to write commands to the remote controller and the default timeout for a write operation is 1 minute (admin_timeout). 

For example, in nvme_shutdown_ctrl() we call reg_write32() (which really invokes nvmf_reg_write32) to set the NVME_CC_SHN_NORMAL bit, and this operation will block waiting for a response that will never come. Interestingly, in the same function we also call reg_read32() to read the CSTS register, but this time we specify a 5-sec timeout (i.e. ctrl->shutdown_timeout). In other words, there is an inconsistency between the write and the read timeouts in that one function. 

Similarly, nvme_disable_ctrl() calls reg_write32() (i.e. nvmf_reg_write32) to clear NVME_CC_ENABLE and NVME_CC_SHN_MASK, and once again this will block for 1 minute if the controller in unreachable. 

I would like to propose that the prototype for reg_write32() be changed to allow for the caller to specify a timeout as follows:

int (*reg_write32)(struct nvme_ctrl *ctrl, u32 off, u32 val, unsigned timeout);

This timeout will simply be passed to nvmf_reg_write32() and in turn to __nvme_submit_sync_cmd().

When invoking reg_write32(), one would set timeout to 0 (zero) to indicate that the default 1-minute timeout shall be used. Otherwise, a non-0 timeout would overwrite the default. It's only in functions nvme_shutdown_ctrl() and nvme_disable_ctrl() that we would specify a timeout shorter than 1 minute. For example, we could use ctrl->shutdown_timeout as the value for timeout.

I would like to hear your thoughts before I submit a patch. Maybe there's a better or easier way to work around this.

Regards,
Martin Belanger
Engineering Technologist, Dell Inc.
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

             reply	other threads:[~2021-09-20 15:10 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-20 15:09 Martin Belanger [this message]
2021-09-20 17:14 ` nvme-fabrics: shutdown 1-minute deadlock Martin Belanger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BL0PR13MB429131886435C5C7092275A39CA09@BL0PR13MB4291.namprd13.prod.outlook.com \
    --to=nitram_67@hotmail.com \
    --cc=linux-nvme@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).