From: Hannes Reinecke <hare@suse.de>
To: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>,
Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org,
John Meneghini <john.meneghini@netapp.com>
Subject: Re: [PATCH] nvme-multipath: do not reset controller on unknown status
Date: Thu, 13 Feb 2020 07:53:34 +0100 [thread overview]
Message-ID: <88fffa65-7c1d-8b14-e4d2-e4b5e2eb00a8@suse.de> (raw)
In-Reply-To: <20200212175317.GA5813@lst.de>
On 2/12/20 6:53 PM, Christoph Hellwig wrote:
> On Wed, Feb 12, 2020 at 02:41:40PM +0100, Hannes Reinecke wrote:
>> We're seeing occasional controller resets during straight I/O,
>> but only when multipath is active.
>> The problem here is that nvme-multipath will reset the controller
>> on every unknown status, which really is odd behaviour, seeing
>> that the host already received a perfectly good status; it's just
>> that it's not smart enough to understand it.
>> And resetting wouldn't help at all; the error status will continue
>> to be received.
>> So we should rather pass up any unknown error to the generic
>> routines and let them deal with this situation.
>
> What unknown status do you see?
>
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> Cc: John Meneghini <john.meneghini@netapp.com>
>> ---
>> drivers/nvme/host/core.c | 4 ++--
>> drivers/nvme/host/multipath.c | 18 ++++++++++--------
>> drivers/nvme/host/nvme.h | 2 +-
>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 5dc32b72e7fa..edb081781ae7 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -293,8 +293,8 @@ void nvme_complete_rq(struct request *req)
>> if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
>> if ((req->cmd_flags & REQ_NVME_MPATH) &&
>> blk_path_error(status)) {
>> - nvme_failover_req(req);
>> - return;
>> + if (nvme_failover_req(req))
>> + return;
>
> This reads weird, why not:
>
> if ((req->cmd_flags & REQ_NVME_MPATH) &&
> blk_path_error(status) && nvme_failover_req(req))
> return;
>
> ?
>
> But see below, with your updated checks (assuming this makes sense
> as we need more explanation) we don't even need the blk_path_error
> call.
>
>> -void nvme_failover_req(struct request *req)
>> +bool nvme_failover_req(struct request *req)
>> {
>> struct nvme_ns *ns = req->q->queuedata;
>> u16 status = nvme_req(req)->status;
>> unsigned long flags;
>> + bool handled = false;
>>
>> spin_lock_irqsave(&ns->head->requeue_lock, flags);
>> blk_steal_bios(&ns->head->requeue_list, req);
>> spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
>> - blk_mq_end_request(req, 0);
>
> You can't just steal the bios without completing the request.
>
Ah. True.
>>
>> switch (status & 0x7ff) {
>> case NVME_SC_ANA_TRANSITION:
>> @@ -88,11 +88,13 @@ void nvme_failover_req(struct request *req)
>> * mark the path as pending and kick off a re-read of the ANA
>> * log page ASAP.
>> */
>> + blk_mq_end_request(req, 0);
>> nvme_mpath_clear_current_path(ns);
>> if (ns->ctrl->ana_log_buf) {
>> set_bit(NVME_NS_ANA_PENDING, &ns->flags);
>> queue_work(nvme_wq, &ns->ctrl->ana_work);
>> }
>> + handled = true;
>> break;
>> case NVME_SC_HOST_PATH_ERROR:
>> case NVME_SC_HOST_ABORTED_CMD:
>> @@ -100,18 +102,18 @@ void nvme_failover_req(struct request *req)
>> * Temporary transport disruption in talking to the controller.
>> * Try to send on a new path.
>> */
>> + blk_mq_end_request(req, 0);
>> nvme_mpath_clear_current_path(ns);
>> + handled = true;
>> break;
>> default:
>> - /*
>> - * Reset the controller for any non-ANA error as we don't know
>> - * what caused the error.
>> - */
>> - nvme_reset_ctrl(ns->ctrl);
>> + /* Delegate to common error handling */
>> break;
>> }
>>
>> - kblockd_schedule_work(&ns->head->requeue_work);
>> + if (handled)
>> + kblockd_schedule_work(&ns->head->requeue_work);
>> + return handled;
>
> I think you can defer the blk_mq_end_request to here as well. Then
> just return false from the default case and you don't need the handled
> variable.
>
> So in the end this should be something like this:
>
[ .. ]
Maybe without the btrfs bits :-)
I'll give it a spin.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
Thread overview: 9+ messages
2020-02-12 13:41 [PATCH] nvme-multipath: do not reset controller on unknown status Hannes Reinecke
2020-02-12 17:53 ` Christoph Hellwig
2020-02-12 19:33 ` Sagi Grimberg
2020-02-13 7:02 ` Hannes Reinecke
2020-02-13 7:19 ` Sagi Grimberg
2020-02-13 17:02 ` Keith Busch
2020-02-14 14:22 ` Meneghini, John
2020-02-19 15:03 ` Christoph Hellwig
2020-02-13 6:53 ` Hannes Reinecke [this message]