From: James Smart <james.smart@broadcom.com>
To: Victor Gladkov <Victor.Gladkov@kioxia.com>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Cc: Sagi Grimberg <sagi@grimberg.me>,
"Ewan D. Milne" <emilne@redhat.com>,
Hannes Reinecke <hare@suse.de>
Subject: Re: [PATCH v7] nvme-fabrics: reject I/O to offline device
Date: Thu, 13 Aug 2020 08:00:25 -0700
Message-ID: <d79e8917-6c45-5120-5e5c-602750fc5013@broadcom.com>
In-Reply-To: <168ed5d66eae49ac8b445478b1bb362d@kioxia.com>
On 7/8/2020 8:07 AM, Victor Gladkov wrote:
> Commands get stuck while the host NVMe controller (TCP or RDMA) is in
> the reconnect state. The NVMe controller enters the reconnect state
> when it loses its connection with the target. It tries to reconnect
> every 10 seconds (default) until reconnection succeeds or the
> reconnect timeout is reached. The default reconnect timeout is
> 10 minutes.
>
> To fix this long delay due to the default timeout, we introduce a new
> session parameter, "fast_io_fail_tmo". The timeout is measured in
> seconds from the start of the controller reconnect; any command
> issued beyond that timeout is rejected. The new parameter value may
> be passed during 'connect'.
> The default value of 0 means no timeout (similar to current behavior).
>
> We add a new controller flag, NVME_CTRL_FAILFAST_EXPIRED, and a
> corresponding delayed work item that updates it.
>
> When the controller enters the CONNECTING state, we schedule the
> delayed work based on the failfast timeout value. On a transition
> out of CONNECTING, we cancel the delayed work item and ensure
> failfast_expired is false. If the delayed work item expires, it sets
> the NVME_CTRL_FAILFAST_EXPIRED flag.
>
> We also update the nvmf_fail_nonready_command() and
> nvme_available_path() functions to check the
> NVME_CTRL_FAILFAST_EXPIRED controller flag.
>
> Signed-off-by: Victor Gladkov <victor.gladkov at kioxia.com>
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni at wdc.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
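
For readers following the thread, the state handling in the quoted description can be modeled in a few lines. This is only a toy sketch in Python, not the patch's C code: the class ToyCtrl, its methods, and the "dispatch"/"requeue"/"reject" results are hypothetical names made up for illustration, and time is passed in explicitly instead of using a real delayed work item. It assumes, per the description, that a fast_io_fail_tmo of 0 means no timeout.

```python
from dataclasses import dataclass
from typing import Optional

LIVE = "live"
CONNECTING = "connecting"


@dataclass
class ToyCtrl:
    """Hypothetical model of the failfast flag; not the kernel structure."""
    fast_io_fail_tmo: int = 0          # seconds; 0 = no timeout (per the description)
    state: str = LIVE
    failfast_expired: bool = False
    deadline: Optional[int] = None     # stands in for the pending delayed work

    def change_state(self, new_state: str, now: int) -> None:
        if new_state == self.state:
            return
        self.state = new_state
        if new_state == CONNECTING:
            # Entering CONNECTING: schedule the "delayed work" if a timeout is set.
            if self.fast_io_fail_tmo > 0:
                self.deadline = now + self.fast_io_fail_tmo
        else:
            # Leaving CONNECTING: cancel the work and clear the flag.
            self.deadline = None
            self.failfast_expired = False

    def tick(self, now: int) -> None:
        # The "delayed work" fires once its deadline has passed.
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.failfast_expired = True

    def submit(self, now: int) -> str:
        self.tick(now)
        if self.state == LIVE:
            return "dispatch"
        # Not ready: reject only once the failfast timer has expired.
        return "reject" if self.failfast_expired else "requeue"
```

With fast_io_fail_tmo=5 and a connection loss at t=10, a command submitted at t=12 is held for retry and one submitted at t=15 is rejected; with the default of 0 the model never rejects, which matches the "similar to current behavior" note.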

Victor,

when we talked about what fast fail does, we discussed how it is
different from SCSI's notion of fast fail, which exists in support of
mpio. nvme didn't need SCSI's fast_fail behavior, as the fabrics
check-ready routines spot the mpath attributes immediately. So this
patch set is for "normal" io that would normally enter retries. You
replied with a very important distinction about this patch that isn't
described by the above.

Please add this paragraph to the patch description:

From email from Victor on 12/3/2019 at 2:04 am US PST:

  Applications expect commands to complete with success or error
  within a certain timeout (30 seconds by default). The NVMe host
  enforces that timeout while it is connected; nevertheless, during
  reconnection the timeout is not enforced, and commands may get stuck
  for a long period or even forever.
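
To put numbers on that gap, here is a rough back-of-the-envelope sketch in Python using only the defaults quoted in this thread (10 second reconnect interval, 10 minute reconnect timeout, 30 second application timeout). The constant and function names are hypothetical, and the attempt count is an approximation rather than the exact value the driver computes.

```python
RECONNECT_DELAY_S = 10            # "every 10 seconds (default)"
RECONNECT_TIMEOUT_S = 10 * 60     # "default reconnect time out is 10 minutes"
APP_IO_TIMEOUT_S = 30             # "30 seconds by default"


def reconnect_attempts(timeout_s: int, delay_s: int) -> int:
    """Approximate number of reconnect attempts before giving up."""
    return timeout_s // delay_s


def overshoot_factor(stuck_s: int, expected_s: int) -> float:
    """How many times longer a command may be stuck than the application expects."""
    return stuck_s / expected_s


attempts = reconnect_attempts(RECONNECT_TIMEOUT_S, RECONNECT_DELAY_S)
factor = overshoot_factor(RECONNECT_TIMEOUT_S, APP_IO_TIMEOUT_S)
```

Under these defaults a command can wait through roughly 60 reconnect attempts, about 20 times longer than the 30 seconds an application expects.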
-- james