From: Sagi Grimberg <sagi@grimberg.me>
To: Chao Leng <lengchao@huawei.com>, linux-nvme@lists.infradead.org
Cc: kbusch@kernel.org, axboe@fb.com, hch@lst.de,
	linux-block@vger.kernel.org, axboe@kernel.dk
Subject: Re: [PATCH v2 4/6] nvme-rdma: avoid IO error and repeated request completion
Date: Wed, 13 Jan 2021 16:19:47 -0800
Message-ID: <07e41b4f-914a-11e8-5638-e2d6408feb3f@grimberg.me>
In-Reply-To: <20210107033149.15701-5-lengchao@huawei.com>


> When queuing a request fails, the blk_status_t is returned directly
> to blk-mq. If the blk_status_t is not BLK_STS_RESOURCE,
> BLK_STS_DEV_RESOURCE or BLK_STS_ZONE_RESOURCE, blk-mq calls
> blk_mq_end_request to complete the request with BLK_STS_IOERR.
> In two scenarios the request should be retried and may succeed.
> First, with nvme multipath, the request may be retried successfully
> on another path, because the error is probably related to the path.
> Second, without multipath software, the request may be retried
> successfully after error recovery.
> If the request is completed with BLK_STS_IOERR in
> blk_mq_dispatch_rq_list, the state of the request may be changed to
> MQ_RQ_IN_FLIGHT. If the request is freed asynchronously, such as in
> nvme_submit_user_cmd, then in an extreme scenario the request will be
> freed again during teardown.
> If a non-resource error occurs in queue_rq, the driver should call
> nvme_complete_rq directly to complete the request and set the state of
> the request to MQ_RQ_COMPLETE. nvme_complete_rq will decide whether to
> retry, fail over or end the request.
> 
> Signed-off-by: Chao Leng <lengchao@huawei.com>
> ---
>   drivers/nvme/host/rdma.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index df9f6f4549f1..4a89bf44ecdc 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -2093,7 +2093,7 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
>   unmap_qe:
>   	ib_dma_unmap_single(dev, req->sqe.dma, sizeof(struct nvme_command),
>   			    DMA_TO_DEVICE);
> -	return ret;
> +	return nvme_try_complete_failed_req(rq, ret);

I don't understand this. Some errors have nothing to do with pathing
(sw bug, memory leak, mapping error, etc.), so why should we return
this one-shot error?
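
To make the two behaviours under discussion concrete, here is a minimal
user-space sketch. It is not kernel code: every mock_* name is a
hypothetical stand-in, and the body of nvme_try_complete_failed_req is
not shown in this message, so its shape here is only an assumption drawn
from the commit description quoted above.

```c
/* User-space model only. Every name is a hypothetical stand-in for the
 * kernel symbol it mimics; none of this is the kernel's real code. */
#include <stdbool.h>

enum mock_status {		/* stands in for blk_status_t values */
	MOCK_STS_OK,
	MOCK_STS_RESOURCE,
	MOCK_STS_DEV_RESOURCE,
	MOCK_STS_ZONE_RESOURCE,
	MOCK_STS_IOERR,
};

enum mock_outcome {		/* what finally happens to the request */
	MOCK_IN_FLIGHT,
	MOCK_REQUEUED,
	MOCK_ENDED_IOERR,
	MOCK_RETRIED,
	MOCK_FAILED_OVER,
};

struct mock_request {
	bool multipath;		/* request sits on a multipath device */
	int retries_left;
	enum mock_outcome outcome;
};

bool mock_is_resource_error(enum mock_status s)
{
	return s == MOCK_STS_RESOURCE || s == MOCK_STS_DEV_RESOURCE ||
	       s == MOCK_STS_ZONE_RESOURCE;
}

/* Dispatch-side handling as the commit message describes it: resource
 * errors requeue, any other error ends the request with an I/O error.
 * On OK the driver owns the request, so the outcome is left alone. */
void mock_dispatch(struct mock_request *rq, enum mock_status ret)
{
	if (ret == MOCK_STS_OK)
		return;
	if (mock_is_resource_error(ret))
		rq->outcome = MOCK_REQUEUED;
	else
		rq->outcome = MOCK_ENDED_IOERR;
}

/* The role the commit message assigns to nvme_complete_rq: decide
 * between fail over, retry and ending the request. */
void mock_complete_rq(struct mock_request *rq)
{
	if (rq->multipath) {
		rq->outcome = MOCK_FAILED_OVER;
	} else if (rq->retries_left > 0) {
		rq->retries_left--;
		rq->outcome = MOCK_RETRIED;
	} else {
		rq->outcome = MOCK_ENDED_IOERR;
	}
}

/* Assumed shape of the proposed helper: pass resource errors through,
 * complete everything else in the driver and report OK to the caller. */
enum mock_status mock_try_complete_failed_req(struct mock_request *rq,
					      enum mock_status ret)
{
	if (mock_is_resource_error(ret))
		return ret;
	mock_complete_rq(rq);
	return MOCK_STS_OK;
}
```

Feeding MOCK_STS_IOERR straight into mock_dispatch ends the request,
while routing it through mock_try_complete_failed_req first lets the
multipath request fail over. Note that the model treats every
non-resource error the same way regardless of its cause, which is the
behaviour being questioned above.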
