From: Chao Leng <lengchao@huawei.com>
To: Christoph Hellwig <hch@lst.de>
Cc: kbusch@kernel.org, axboe@fb.com, sagi@grimberg.me,
linux-nvme@lists.infradead.org, John.Meneghini@netapp.com
Subject: Re: [PATCH] nvme-core: fix io interrupt when work with dm-multipah
Date: Thu, 30 Jul 2020 09:49:48 +0800
Message-ID: <43e5dee8-1a91-4d8b-fdb5-91f9679ddeb3@huawei.com>
In-Reply-To: <20200729055903.GC31113@lst.de>
On 2020/7/29 13:59, Christoph Hellwig wrote:
> On Wed, Jul 29, 2020 at 10:54:29AM +0800, Chao Leng wrote:
>>
>>
>> On 2020/7/28 19:19, Christoph Hellwig wrote:
>>> On Mon, Jul 27, 2020 at 01:58:18PM +0800, Chao Leng wrote:
>>>> The NVM Express 1.4 specification defines:
>>>> Command Interrupted: Command processing was interrupted and the
>>>> controller is unable to successfully complete the command. The host
>>>> should retry the command. If this status code is returned, then
>>>> the controller shall clear the Do Not Retry bit to ‘0’ in the Status
>>>> field of the CQE (refer to Figure 124). The controller shall not return
>>>> this status code unless the host has set the Advanced Command Retry
>>>> Enable (ACRE) field to 1h in the Host Behavior Support feature(refer to
>>>> section 5.21.1.22).
>>>>
>>>> According to this definition, NVME_SC_CMD_INTERRUPTED must be retried.
>>>> NVME_SC_CMD_INTERRUPTED should not be translated to BLK_STS_TARGET,
>>>> because dm-multipath then returns the error to the application. So if
>>>> the target returns NVME_SC_CMD_INTERRUPTED, the I/O is interrupted.
>>>> NVME_SC_CMD_INTERRUPTED should instead fall through to the default
>>>> BLK_STS_IOERR, so that dm-multipath fails over to another path and
>>>> retries the I/O.
>>>
>>> IOERR still seems wrong, though.
>>> .
>>
>> BLK_STS_TARGET means the target has a critical error, while
>> NVME_SC_CMD_INTERRUPTED only means the target needs the I/O retried,
>> so translating NVME_SC_CMD_INTERRUPTED to BLK_STS_TARGET is not
>> suitable. Translating it to BLK_STS_IOERR may not be suitable either;
>> we should translate NVME_SC_CMD_INTERRUPTED to BLK_STS_AGAIN.
>> We can do it like this:
>
> BLK_STS_AGAIN is a bad choice as we use it for calls that block when
> the callers asked for non-blocking submission. I'm really not sure
> we want to change anything here - the error definition clearly states
> it is not a failure but a request to retry later.
> .
>
BLK_STS_AGAIN is not a good choice, but neither is BLK_STS_TARGET.
I found the patch that translates NVME_SC_CMD_INTERRUPTED to
BLK_STS_TARGET:

commit 35038bffa87da282010b91108cadd13238bb5bbd
    nvme: Translate more status codes to blk_status_t

    Decode interrupted command and not ready namespace nvme status codes to
    BLK_STS_TARGET. These are not generic IO errors and should use a non-path
    specific error so that it can use the non-failover retry path.

    Reported-by: John Meneghini <John.Meneghini@netapp.com>

In the old code we had to translate NVME_SC_CMD_INTERRUPTED to
BLK_STS_TARGET: nvme multipath checked blk_path_error(), and we wanted
the I/O retried on the current path. BLK_STS_TARGET was not a good
choice, but the old code offered no better one. Now the translation is
no longer needed, because nvme multipath has been improved and no longer
depends on blk_path_error().
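
To make the old dependency on blk_path_error() concrete, here is a minimal
user-space sketch of that check as I understand it. All sketch_ names and
the enum values are stand-ins invented for illustration, not the kernel's
definitions, and the list of statuses treated as non-path errors is my
reading of blk_path_error(), so please check it against the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for a few blk_status_t values; NOT the kernel's encoding. */
enum sketch_blk_status {
	SKETCH_BLK_STS_NOTSUPP,
	SKETCH_BLK_STS_NOSPC,
	SKETCH_BLK_STS_TARGET,
	SKETCH_BLK_STS_NEXUS,
	SKETCH_BLK_STS_MEDIUM,
	SKETCH_BLK_STS_PROTECTION,
	SKETCH_BLK_STS_AGAIN,
	SKETCH_BLK_STS_IOERR,
};

/*
 * Models the intent of blk_path_error(): the listed statuses are taken
 * to be problems with the target itself, so trying another path would
 * not help. Anything else could be a path failure.
 */
bool sketch_path_error(enum sketch_blk_status sts)
{
	switch (sts) {
	case SKETCH_BLK_STS_NOTSUPP:
	case SKETCH_BLK_STS_NOSPC:
	case SKETCH_BLK_STS_TARGET:
	case SKETCH_BLK_STS_NEXUS:
	case SKETCH_BLK_STS_MEDIUM:
	case SKETCH_BLK_STS_PROTECTION:
		return false;
	default:
		return true;
	}
}
```

With this model, BLK_STS_TARGET is not a path error, which is why the old
code used it to keep the retry on the current path, and also why
dm-multipath does not fail over when it sees it.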
The description of BLK_STS_TARGET is:
[BLK_STS_TARGET] = { -EREMOTEIO, "critical target" }
so BLK_STS_TARGET can easily be mistaken for a fatal error on the storage.
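
For reference, a small user-space sketch of how such a table entry turns
into the errno and the name that the upper layers see. Only the
BLK_STS_TARGET entry is taken from the line above; the sketch_ names, the
other entries and the EREMOTEIO fallback define are assumptions made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, fallback for other platforms */
#endif

/* Stand-ins for blk_status_t values; NOT the kernel's encoding. */
enum sketch_blk_status {
	SKETCH_BLK_STS_OK,
	SKETCH_BLK_STS_TARGET,
	SKETCH_BLK_STS_IOERR,
	SKETCH_BLK_STS_MAX,
};

struct sketch_blk_error {
	int errno_val;		/* negative errno handed to callers */
	const char *name;	/* text used in error messages */
};

static const struct sketch_blk_error sketch_blk_errors[SKETCH_BLK_STS_MAX] = {
	[SKETCH_BLK_STS_OK]	= { 0, "" },
	[SKETCH_BLK_STS_TARGET]	= { -EREMOTEIO, "critical target" },
	[SKETCH_BLK_STS_IOERR]	= { -EIO, "I/O" },	/* assumed entry */
};

/* Unknown statuses fall back to -EIO in this sketch. */
int sketch_status_to_errno(enum sketch_blk_status sts)
{
	if ((size_t)sts >= SKETCH_BLK_STS_MAX)
		return -EIO;
	return sketch_blk_errors[sts].errno_val;
}

/* Unknown statuses get a placeholder name in this sketch. */
const char *sketch_status_name(enum sketch_blk_status sts)
{
	if ((size_t)sts >= SKETCH_BLK_STS_MAX)
		return "unknown";
	return sketch_blk_errors[sts].name;
}
```

A caller that only logs the name would report "critical target" for a
command the controller merely asked to have retried.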
I have checked all the BLK_STS_xx error codes, and none of them is a
good fit for NVME_SC_CMD_INTERRUPTED right now. In the absence of a
suitable error code, keeping the default is the better choice. So I
strongly suggest deleting the translation of NVME_SC_CMD_INTERRUPTED to
BLK_STS_TARGET.
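
The effect of the suggestion can be sketched in user space as below. This
is not the nvme-core code: the sketch_ names, the enum values and the
keep_target_mapping switch are hypothetical and exist only to compare the
two behaviours. It assumes, as discussed earlier in this thread, that an
untranslated status falls through to BLK_STS_IOERR.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for NVMe status codes; NOT the real values. */
enum sketch_nvme_status {
	SKETCH_NVME_SC_SUCCESS,
	SKETCH_NVME_SC_CMD_INTERRUPTED,
	SKETCH_NVME_SC_NS_NOT_READY,
	SKETCH_NVME_SC_OTHER,
};

/* Stand-ins for blk_status_t values; NOT the kernel's encoding. */
enum sketch_blk_status {
	SKETCH_BLK_STS_OK,
	SKETCH_BLK_STS_TARGET,
	SKETCH_BLK_STS_IOERR,
};

/*
 * keep_target_mapping == true models the behaviour after commit
 * 35038bffa87d; false models the suggestion to drop the translation
 * for NVME_SC_CMD_INTERRUPTED and use the default.
 */
enum sketch_blk_status sketch_translate(enum sketch_nvme_status sc,
					bool keep_target_mapping)
{
	switch (sc) {
	case SKETCH_NVME_SC_SUCCESS:
		return SKETCH_BLK_STS_OK;
	case SKETCH_NVME_SC_NS_NOT_READY:
		/* Left as in the quoted commit; not part of the suggestion. */
		return SKETCH_BLK_STS_TARGET;
	case SKETCH_NVME_SC_CMD_INTERRUPTED:
		if (keep_target_mapping)
			return SKETCH_BLK_STS_TARGET;
		return SKETCH_BLK_STS_IOERR;
	default:
		return SKETCH_BLK_STS_IOERR;
	}
}
```

With keep_target_mapping set to false, an interrupted command reaches
dm-multipath as BLK_STS_IOERR, which it can treat as a path error and
retry on another path.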
John, what do you think?