From: Sagi Grimberg <sagi@grimberg.me>
To: Chao Leng <lengchao@huawei.com>, Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@kernel.org>,
linux-nvme@lists.infradead.org,
Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
Subject: Re: [PATCH 0/3 rfc] Fix nvme-tcp and nvme-rdma controller reset hangs
Date: Thu, 18 Mar 2021 11:46:08 -0700
Message-ID: <7b7d5223-ddaf-eb88-f112-02834f8c8f93@grimberg.me>
In-Reply-To: <55142c25-9a70-08a0-d46a-cad21da59d19@huawei.com>
>>>>>>> Will it work if nvme mpath used request NOWAIT flag for its
>>>>>>> submit_bio() call, and add the bio to the requeue_list if
>>>>>>> blk_queue_enter() fails? I think that looks like another way to
>>>>>>> resolve the deadlock, but we need the block layer to return a
>>>>>>> failed status to the original caller.
>>>>
>>>> Yes, I think BLK_MQ_REQ_NOWAIT makes total sense here. dm-mpath also
>>>> uses it for its request allocation for similar reasons.
>>>>
>>>>>>
>>>>>> But who would kick the requeue list? and that would make
>>>>>> near-tag-exhaust performance stink...
>>>>
>>>> The multipath code would have to kick the list. We could also try to
>>>> split into two flags, one that affects blk_queue_enter and one that
>>>> affects the tag allocation.
>>>>
>>>>> moving nvme_start_freeze from nvme_rdma_teardown_io_queues to
>>>>> nvme_rdma_configure_io_queues can fix it.
>>>>> It can also avoid I/O hang long time if reconnection failed.
>>>>
>>>> Can you explain how we'd still ensure that no new commands get queued
>>>> during teardown using that scheme?
>>> 1. tear down will cancel all inflight requests, and then multipath
>>> will clear the path.
>>> 2. and then we may freeze the controller.
>>> 3. nvme_ns_head_submit_bio can not find the reconnection controller
>>> as valid path, so it is safe.
>>
>> In non-mpath (which unfortunately is a valid use-case), there is no
>> failover, and we cannot freeze the queue after we stopped (and/or
>> started) the queues because then fail_non_ready_command() constantly
>> returns BLK_STS_RESOURCE (just causing a re-submission over and over
>> again) and the freeze will never complete (the commands are still
>> inflight from the queue->q_usage_counter perspective).
> If the request sets the REQ_FAILFAST_xxx flags, it will hang for a long
> time if reconnection fails.
> This is not expected.
> Also, if the controller is not live and the controller is frozen,
> fast_io_fail_tmo will not work.
> This is also not expected.

No arguments that the queue needs to unfreeze asap for mpath, that
is exactly what the patch does. The only unnatural part is the
non-mpath case, where if we unfreeze the queue before we reconnect,
I/Os will fail, a case where we should also respect fast_io_fail_tmo.

The main issue here is that there are two behaviors that we
should maintain, based on whether it's mpath or non-mpath...

> So I think freezing the controller when reconnecting is not a good idea.

As said, for mpath it's for sure not, but for non-mpath that matches
the expected behavior.

> It's really not good behavior to try again and again. This is at least
> better than a request hanging for a long time.

I am not sure I understand how that is even supposed to work TBH.

>> So I think we should still start queue freeze before we quiesce
>> the queues.
> We should unquiesce and unfreeze the queues when reconnecting, otherwise
> fast_io_fail_tmo will not work.
>>
>> I still don't see how the mpath NOWAIT suggestion works either...
> mpath will queue the request to another live path or requeue the
> request (if there is no usable path), so it will not wait.

Placing the request on the requeue_list is fine, but the question is
when to kick the requeue_work; nothing guarantees that an alternate path
exists or will exist within a sane period. So constantly requeue+kick
sounds like a really bad practice to me.
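
For illustration only, the non-mpath freeze concern quoted earlier in this
message (a freeze started after the queues are stopped never completes,
because bounced commands still hold their usage reference) can be sketched
as a toy userspace model. This is not kernel code: struct toy_queue,
toy_enter(), toy_dispatch() and toy_freeze_wait() are hypothetical names,
and the model assumes only that a request keeps its reference on the usage
counter until it actually completes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; NOT the kernel's struct request_queue. */
struct toy_queue {
	int usage;      /* models q_usage_counter: requests that entered */
	bool frozen;    /* a freeze has started: no new entries admitted */
	bool ctrl_live; /* controller is connected and ready */
};

enum toy_status { TOY_OK, TOY_RESOURCE };

/* Models queue entry: refused once a freeze has started. */
static bool toy_enter(struct toy_queue *q)
{
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}

/*
 * Models dispatch of one pending request. While the controller is not
 * ready the request is bounced (like BLK_STS_RESOURCE) and keeps its
 * usage reference; only a real completion drops it.
 */
static enum toy_status toy_dispatch(struct toy_queue *q)
{
	if (!q->ctrl_live)
		return TOY_RESOURCE;
	assert(q->usage > 0);
	q->usage--;
	return TOY_OK;
}

/*
 * Start a freeze and retry dispatch up to max_retries times.
 * Returns true only if the usage count drained to zero.
 */
static bool toy_freeze_wait(struct toy_queue *q, int max_retries)
{
	q->frozen = true;
	for (int i = 0; i < max_retries && q->usage > 0; i++)
		toy_dispatch(q);
	return q->usage == 0;
}
```

In this model, a request admitted before the freeze keeps usage at 1 for
as long as ctrl_live is false, so toy_freeze_wait() reports failure no
matter how many retries it is given; once ctrl_live is set, the same call
drains immediately.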