linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>, Hannes Reinecke <hare@suse.de>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	dm-devel@redhat.com, Ewan Milne <emilne@redhat.com>,
	Keith Busch <kbusch@kernel.org>,
	"Meneghini, John" <John.Meneghini@netapp.com>
Subject: Re: nvme: explicitly use normal NVMe error handling when appropriate
Date: Tue, 11 Aug 2020 00:20:00 -0400	[thread overview]
Message-ID: <20200811042000.GA22692@redhat.com> (raw)
In-Reply-To: <7f99724a-a1eb-6bec-f8ae-f9a4601b0487@huawei.com>

On Mon, Aug 10 2020 at 11:32pm -0400,
Chao Leng <lengchao@huawei.com> wrote:

> 
> 
> On 2020/8/11 1:22, Mike Snitzer wrote:
> >On Mon, Aug 10 2020 at 10:36am -0400,
> >Mike Snitzer <snitzer@redhat.com> wrote:
> >
> >>On Fri, Aug 07 2020 at  7:35pm -0400,
> >>Sagi Grimberg <sagi@grimberg.me> wrote:
> >>
> >>>
> >>>>>Hey Mike,
> >...
> >>>>I think NVMe can easily fix this by having an earlier stage of checking,
> >>>>e.g. nvme_local_retry_req(), that shortcircuits ever getting to
> >>>>higher-level multipathing consideration (be it native NVMe or DM
> >>>>multipathing) for cases like NVME_SC_CMD_INTERRUPTED.
> >>>>To be clear: the "default" case of nvme_failover_req() that returns
> >>>>false to fallback to NVMe's "local" normal NVMe error handling -- that
> >>>>can stay.. but a more explicit handling of cases like
> >>>>NVME_SC_CMD_INTERRUPTED should be added to a nvme_local_retry_req()
> >>>>check that happens before nvme_failover_req() in nvme_complete_rq().
> >>>
> >>>I don't necessarily agree with having a dedicated nvme_local_retry_req().
> >>>a request that isn't failed over, goes to local error handling (retry or
> >>>not). I actually think that just adding the condition to
> >>>nvme_complete_req and having nvme_failover_req reject it would work.
> >>>
> >>>Keith?
> >>
> >>I think that is basically what I'm thinking too.
> >
> >From: Mike Snitzer <snitzer@redhat.com>
> >Subject: nvme: explicitly use normal NVMe error handling when appropriate
> >
> >Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
> >status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
> >handling by changing multipathing's nvme_failover_req() to short-circuit
> >path failover and then fallback to NVMe's normal error handling (which
> >takes care of NVME_SC_CMD_INTERRUPTED).
> >
> >This detour through native NVMe multipathing code is unwelcome because
> >it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independent
> >of any multipathing concerns.
> >
> >Introduce nvme_status_needs_local_error_handling() to prioritize
> >non-failover retry, when appropriate, in terms of normal NVMe error
> >handling.  nvme_status_needs_local_error_handling() will naturely evolve
> >to include handling of any other errors that normal error handling must
> >be used for.
> >
> >nvme_failover_req()'s ability to fallback to normal NVMe error handling
> >has been preserved because it may be useful for future NVME_SC that
> >nvme_status_needs_local_error_handling() hasn't yet been trained for.
> >
> >Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> >---
> >  drivers/nvme/host/core.c | 16 ++++++++++++++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> >index 88cff309d8e4..be749b690af7 100644
> >--- a/drivers/nvme/host/core.c
> >+++ b/drivers/nvme/host/core.c
> >@@ -252,6 +252,16 @@ static inline bool nvme_req_needs_retry(struct request *req)
> >  	return true;
> >  }
> >+static inline bool nvme_status_needs_local_error_handling(u16 status)
> >+{
> >+	switch (status & 0x7ff) {
> >+	case NVME_SC_CMD_INTERRUPTED:
> >+		return true;
> >+	default:
> >+		return false;
> >+	}
> >+}
> >+
> >  static void nvme_retry_req(struct request *req)
> >  {
> >  	struct nvme_ns *ns = req->q->queuedata;
> >@@ -270,7 +280,8 @@ static void nvme_retry_req(struct request *req)
> >  void nvme_complete_rq(struct request *req)
> >  {
> >-	blk_status_t status = nvme_error_status(nvme_req(req)->status);
> >+	u16 nvme_status = nvme_req(req)->status;
> >+	blk_status_t status = nvme_error_status(nvme_status);
> >  	trace_nvme_complete_rq(req);
> >@@ -280,7 +291,8 @@ void nvme_complete_rq(struct request *req)
> >  		nvme_req(req)->ctrl->comp_seen = true;
> >  	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
> >-		if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
> >+		if (!nvme_status_needs_local_error_handling(nvme_status) &&
> >+		    (req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
>
> This looks no affect. if work with nvme multipath, now is already retry local.

Not if NVMe is built without multipathing configured.

> If work with dm-multipath, still return error.

Yes, I'm aware.  Use of REQ_FAILFAST_TRANSPORT isn't something that is
needed for NVMe, so why are you proposing hacks in NVMe to deal with it?

> >  			return;
> >  		if (!blk_queue_dying(req->q)) {
> >
> 
> Suggest:
> REQ_FAILFAST_TRANSPORT may be designed for scsi, because scsi protocol
> do not difine the local retry mechanism. SCSI implements a fuzzy local
> retry mechanism, so need the REQ_FAILFAST_TRANSPORT for multipath
> software, multipath software retry according error code is expected.
> nvme is different with scsi about this. It define local retry mechanism
> and path error code, so nvme should not care REQ_FAILFAST_TRANSPORT.

Exactly.  Except by "nvme should not care REQ_FAILFAST_TRANSPORT." your
patch says you mean "nvme shouldn't disallow retry if
REQ_FAILFAST_TRANSPORT is it".  I'm saying: don't try to get such
changes into NVMe.

In general, aspects of your patch may have merit but overall it is doing
too much.

Mike


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply	other threads:[~2020-08-11  4:20 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-27  5:58 [PATCH] nvme-core: fix io interrupt when work with dm-multipah Chao Leng
2020-07-28 11:19 ` Christoph Hellwig
2020-07-29  2:54   ` Chao Leng
2020-07-29  5:59     ` Christoph Hellwig
2020-07-30  1:49       ` Chao Leng
2020-08-05  6:40         ` Chao Leng
2020-08-05 15:29           ` Keith Busch
2020-08-06  5:52             ` Chao Leng
2020-08-06 14:26               ` Keith Busch
2020-08-06 15:59                 ` Meneghini, John
2020-08-06 16:17                   ` Meneghini, John
2020-08-06 18:40                     ` Mike Snitzer
2020-08-06 19:19                       ` [PATCH] nvme: restore use of blk_path_error() in nvme_complete_rq() Mike Snitzer
2020-08-06 22:42                         ` Meneghini, John
2020-08-07  0:07                           ` Mike Snitzer
2020-08-07  1:21                             ` Sagi Grimberg
2020-08-07  4:50                               ` Mike Snitzer
2020-08-07 23:35                                 ` Sagi Grimberg
2020-08-08 21:08                                   ` Meneghini, John
2020-08-08 21:11                                     ` Meneghini, John
2020-08-10 14:48                                       ` Mike Snitzer
2020-08-11 12:54                                         ` Meneghini, John
2020-08-10  8:10                                     ` Chao Leng
2020-08-11 12:36                                       ` Meneghini, John
2020-08-12  7:51                                         ` Chao Leng
2020-08-10 14:36                                   ` Mike Snitzer
2020-08-10 17:22                                     ` [PATCH] nvme: explicitly use normal NVMe error handling when appropriate Mike Snitzer
2020-08-11  3:32                                       ` Chao Leng
2020-08-11  4:20                                         ` Mike Snitzer [this message]
2020-08-11  6:17                                           ` Chao Leng
2020-08-11 14:12                                             ` Mike Snitzer
2020-08-13 14:48                                       ` [RESEND PATCH] " Mike Snitzer
2020-08-13 15:29                                         ` Meneghini, John
2020-08-13 15:43                                           ` Mike Snitzer
2020-08-13 15:59                                             ` Meneghini, John
2020-08-13 15:36                                         ` Christoph Hellwig
2020-08-13 17:47                                           ` Mike Snitzer
2020-08-13 18:43                                             ` Christoph Hellwig
2020-08-13 19:03                                               ` Mike Snitzer
2020-08-14  4:26                                               ` Meneghini, John
2020-08-14  6:53                                               ` Sagi Grimberg
2020-08-14  6:55                                                 ` Christoph Hellwig
2020-08-14  7:02                                                   ` Sagi Grimberg
2020-08-14  3:23                                         ` Meneghini, John
2020-08-07  0:44                         ` [PATCH] nvme: restore use of blk_path_error() in nvme_complete_rq() Sagi Grimberg
2020-08-10 12:43                         ` Christoph Hellwig
2020-08-10 15:06                           ` Mike Snitzer
2020-08-11  3:45                           ` [PATCH] " Chao Leng
2020-08-07  0:03                   ` [PATCH] nvme-core: fix io interrupt when work with dm-multipah Sagi Grimberg
2020-08-07  2:28                     ` Chao Leng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200811042000.GA22692@redhat.com \
    --to=snitzer@redhat.com \
    --cc=John.Meneghini@netapp.com \
    --cc=dm-devel@redhat.com \
    --cc=emilne@redhat.com \
    --cc=hare@suse.de \
    --cc=hch@infradead.org \
    --cc=kbusch@kernel.org \
    --cc=lengchao@huawei.com \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).