linux-block.vger.kernel.org archive mirror
From: Sagi Grimberg <sagi@grimberg.me>
To: Chao Leng <lengchao@huawei.com>, linux-nvme@lists.infradead.org
Cc: kbusch@kernel.org, axboe@fb.com, hch@lst.de,
	linux-block@vger.kernel.org, axboe@kernel.dk
Subject: Re: [PATCH v2 4/6] nvme-rdma: avoid IO error and repeated request completion
Date: Fri, 15 Jan 2021 17:18:37 -0800
Message-ID: <4ff22d33-12fa-1f70-3606-54821f314c45@grimberg.me>
In-Reply-To: <695b6839-5333-c342-2189-d7aaeba797a7@huawei.com>


>>>>> When queuing a request fails, the blk_status_t is returned directly
>>>>> to blk-mq. If the blk_status_t is not BLK_STS_RESOURCE,
>>>>> BLK_STS_DEV_RESOURCE or BLK_STS_ZONE_RESOURCE, blk-mq calls
>>>>> blk_mq_end_request to complete the request with BLK_STS_IOERR.
>>>>> In two scenarios the request should be retried and may succeed.
>>>>> First, with nvme multipath, the request may be retried
>>>>> successfully on another path, because the error is probably related to
>>>>> the path. Second, without multipath software, the request may
>>>>> be retried successfully after error recovery.
>>>>> If the request is completed with BLK_STS_IOERR in
>>>>> blk_mq_dispatch_rq_list, the state of the request may already have been
>>>>> changed to MQ_RQ_IN_FLIGHT. If the request is freed asynchronously,
>>>>> such as in nvme_submit_user_cmd, then in an extreme scenario the
>>>>> request will be freed repeatedly during teardown.
>>>>> If a non-resource error occurs in queue_rq, we should directly call
>>>>> nvme_complete_rq to complete the request and set the state of the
>>>>> request to MQ_RQ_COMPLETE. nvme_complete_rq will decide whether to
>>>>> retry, fail over or end the request.
>>>>>
>>>>> Signed-off-by: Chao Leng <lengchao@huawei.com>
>>>>> ---
>>>>>   drivers/nvme/host/rdma.c | 2 +-
>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>>>> index df9f6f4549f1..4a89bf44ecdc 100644
>>>>> --- a/drivers/nvme/host/rdma.c
>>>>> +++ b/drivers/nvme/host/rdma.c
>>>>> @@ -2093,7 +2093,7 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>>>   unmap_qe:
>>>>>       ib_dma_unmap_single(dev, req->sqe.dma, sizeof(struct nvme_command),
>>>>>                   DMA_TO_DEVICE);
>>>>> -    return ret;
>>>>> +    return nvme_try_complete_failed_req(rq, ret);
>>>>
>>>> I don't understand this. There are errors that may not be related to
>>>> anything that is pathing related (sw bug, memory leak, mapping error,
>>>> etc, etc) why should we return this one-shot error?
>>> Although failover retry is not required, if we return the error to
>>> blk-mq, a low-probability crash may happen, because blk-mq does not set
>>> the state of the request to MQ_RQ_COMPLETE before completing the request,
>>> so the request may be freed asynchronously, such as in nvme_submit_user_cmd.
>>> If this races with error recovery, a double completion of the request
>>> may happen.
>>
>> Then fix that, don't work around it.
> I'm not trying to work around it. The purpose of this is to solve
> the problem of nvme native multipathing at the same time.

Please explain how this is an nvme-multipath issue?

>>
>>>
>>> So we can not return the error to blk-mq if the blk_status_t is not
>>> BLK_STS_RESOURCE, BLK_STS_DEV_RESOURCE, BLK_STS_ZONE_RESOURCE.
>>
>> This is not something we should be handling in nvme. block drivers
>> should be able to fail queue_rq, and this all should live in the
>> block layer.
> Of course, it is also an idea to repair the block drivers directly.
> However, block layer is unaware of nvme native multipathing,

Nor should it be

> which will cause the request to return an error, which should be avoided.

Not sure I understand..
requests should failover for path related errors,
what queue_rq errors are expected to be failed over from your
perspective?

> The scenario: two HBAs are used for nvme native multipath, and then one
> HBA faults,

What is the specific error the driver sees?

> the blk_status_t of queue_rq is BLK_STS_IOERR, and blk-mq will call
> blk_mq_end_request to complete the request, which bypasses nvme native
> multipath. We expect the request to fail over to the normal HBA, but the
> request is directly completed with BLK_STS_IOERR.
> The two scenarios can be fixed by directly completing the request in
> queue_rq.

Well, this one-shot of always returning 0 and completing the command
with a HOST_PATH error is certainly not a good approach IMO
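
To make the two positions in this exchange concrete, the following is a small user-space toy model in C, not kernel code. It assumes the dispatch behaviour described in the patch description: a resource-type status from queue_rq leads to a requeue, and any other error leads to the request being ended with an I/O error. It also assumes that the proposed helper passes resource errors through unchanged, which the quoted diff does not show. Every toy_* name is hypothetical and exists only for this illustration; the enum values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the blk_status_t values named in the thread. */
enum toy_status {
	TOY_STS_OK,
	TOY_STS_RESOURCE,
	TOY_STS_DEV_RESOURCE,
	TOY_STS_ZONE_RESOURCE,
	TOY_STS_IOERR,
};

/* What happens to a request after queue_rq returns, in this model. */
enum toy_outcome {
	TOY_DISPATCHED,   /* driver accepted the request */
	TOY_REQUEUED,     /* block layer holds the request and retries later */
	TOY_ENDED_IOERR,  /* block layer ends the request with an I/O error */
};

static bool toy_is_resource_error(enum toy_status s)
{
	return s == TOY_STS_RESOURCE ||
	       s == TOY_STS_DEV_RESOURCE ||
	       s == TOY_STS_ZONE_RESOURCE;
}

/* Model of the dispatch decision as described in the patch description. */
static enum toy_outcome toy_dispatch(enum toy_status queue_rq_ret)
{
	if (queue_rq_ret == TOY_STS_OK)
		return TOY_DISPATCHED;
	if (toy_is_resource_error(queue_rq_ret))
		return TOY_REQUEUED;
	/* Non-resource error: the driver's own completion path never runs. */
	return TOY_ENDED_IOERR;
}

/*
 * Model of the approach under review: the driver completes a failed
 * request itself and reports success to the block layer.
 * Assumption: resource errors are passed through unchanged.
 */
static enum toy_status toy_try_complete_failed_req(enum toy_status ret,
						   bool *completed_by_driver)
{
	assert(completed_by_driver != NULL);

	if (ret == TOY_STS_OK || toy_is_resource_error(ret)) {
		*completed_by_driver = false;
		return ret;
	}
	/* Retry, failover or failure would be decided in the driver here. */
	*completed_by_driver = true;
	return TOY_STS_OK;
}
```

In this model, toy_dispatch(TOY_STS_IOERR) yields TOY_ENDED_IOERR, whereas toy_dispatch(toy_try_complete_failed_req(TOY_STS_IOERR, &done)) yields TOY_DISPATCHED with done set to true. That second result is the "always return 0" behaviour objected to above: the block layer can no longer see that queue_rq failed.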

Thread overview: 21+ messages
2021-01-07  3:31 [PATCH v2 0/6] avoid repeated request completion and IO error Chao Leng
2021-01-07  3:31 ` [PATCH v2 1/6] blk-mq: introduce blk_mq_set_request_complete Chao Leng
2021-01-14  0:17   ` Sagi Grimberg
2021-01-14  6:50     ` Chao Leng
2021-01-07  3:31 ` [PATCH v2 2/6] nvme-core: introduce complete failed request Chao Leng
2021-01-07  3:31 ` [PATCH v2 3/6] nvme-fabrics: avoid repeated request completion for nvmf_fail_nonready_command Chao Leng
2021-01-07  3:31 ` [PATCH v2 4/6] nvme-rdma: avoid IO error and repeated request completion Chao Leng
2021-01-14  0:19   ` Sagi Grimberg
2021-01-14  6:55     ` Chao Leng
2021-01-14 21:25       ` Sagi Grimberg
2021-01-15  2:53         ` Chao Leng
2021-01-16  1:18           ` Sagi Grimberg [this message]
2021-01-18  3:22             ` Chao Leng
2021-01-18 17:49               ` Christoph Hellwig
2021-01-19  1:50                 ` Chao Leng
2021-01-20 21:35               ` Sagi Grimberg
2021-01-21  1:34                 ` Chao Leng
2021-01-07  3:31 ` [PATCH v2 5/6] nvme-tcp: " Chao Leng
2021-01-07  3:31 ` [PATCH v2 6/6] nvme-fc: " Chao Leng
2021-01-14  0:15 ` [PATCH v2 0/6] avoid repeated request completion and IO error Sagi Grimberg
2021-01-14  6:50   ` Chao Leng
