Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
From: Hannes Reinecke <hare@suse.de>
To: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme@lists.infradead.org,
	John Managhini <john.meneghini@netapp.com>
Subject: Re: [PATCH] nvme-multipath: do not reset controller on unknown status
Date: Thu, 13 Feb 2020 07:53:34 +0100
Message-ID: <88fffa65-7c1d-8b14-e4d2-e4b5e2eb00a8@suse.de> (raw)
In-Reply-To: <20200212175317.GA5813@lst.de>

On 2/12/20 6:53 PM, Christoph Hellwig wrote:
> On Wed, Feb 12, 2020 at 02:41:40PM +0100, Hannes Reinecke wrote:
>> We're seeing occasional controller resets during straight I/O,
>> but only when multipath is active.
>> The problem here is the nvme-multipath will reset the controller
>> on every unknown status, which really is an odd behaviour, seeing
>> that the host already received a perfectly good status; it's just
>> that it's not smart enough to understand it.
>> And resetting wouldn't help at all; the error status will continue
>> to be received.
>> So we should rather pass up any unknown error to the generic
>> routines and let them deal with this situation.
> 
> What unknown status do you see?
> 
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> Cc: John Managhini <john.meneghini@netapp.com>
>> ---
>>  drivers/nvme/host/core.c      |  4 ++--
>>  drivers/nvme/host/multipath.c | 18 ++++++++++--------
>>  drivers/nvme/host/nvme.h      |  2 +-
>>  3 files changed, 13 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 5dc32b72e7fa..edb081781ae7 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -293,8 +293,8 @@ void nvme_complete_rq(struct request *req)
>>  	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
>>  		if ((req->cmd_flags & REQ_NVME_MPATH) &&
>>  		    blk_path_error(status)) {
>> -			nvme_failover_req(req);
>> -			return;
>> +			if (nvme_failover_req(req))
>> +				return;
> 
> This reads weird, why not:
> 
> 		if ((req->cmd_flags & REQ_NVME_MPATH) &&
> 		    blk_path_error(status) && nvme_failover_req(req))
> 			return;
> 
> ?
> 
> But see below, with your updated checks (assuming this makes sense
> as we need more explanation) we don't even need the blk_path_error
> call.
> 
>> -void nvme_failover_req(struct request *req)
>> +bool nvme_failover_req(struct request *req)
>>  {
>>  	struct nvme_ns *ns = req->q->queuedata;
>>  	u16 status = nvme_req(req)->status;
>>  	unsigned long flags;
>> +	bool handled = false;
>>  
>>  	spin_lock_irqsave(&ns->head->requeue_lock, flags);
>>  	blk_steal_bios(&ns->head->requeue_list, req);
>>  	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
>> -	blk_mq_end_request(req, 0);
> 
> You can't just steal the bios without completing the request.
> 
Ah. True.

>>  
>>  	switch (status & 0x7ff) {
>>  	case NVME_SC_ANA_TRANSITION:
>> @@ -88,11 +88,13 @@ void nvme_failover_req(struct request *req)
>>  		 * mark the the path as pending and kick of a re-read of the ANA
>>  		 * log page ASAP.
>>  		 */
>> +		blk_mq_end_request(req, 0);
>>  		nvme_mpath_clear_current_path(ns);
>>  		if (ns->ctrl->ana_log_buf) {
>>  			set_bit(NVME_NS_ANA_PENDING, &ns->flags);
>>  			queue_work(nvme_wq, &ns->ctrl->ana_work);
>>  		}
>> +		handled = true;
>>  		break;
>>  	case NVME_SC_HOST_PATH_ERROR:
>>  	case NVME_SC_HOST_ABORTED_CMD:
>> @@ -100,18 +102,18 @@ void nvme_failover_req(struct request *req)
>>  		 * Temporary transport disruption in talking to the controller.
>>  		 * Try to send on a new path.
>>  		 */
>> +		blk_mq_end_request(req, 0);
>>  		nvme_mpath_clear_current_path(ns);
>> +		handled = true;
>>  		break;
>>  	default:
>> -		/*
>> -		 * Reset the controller for any non-ANA error as we don't know
>> -		 * what caused the error.
>> -		 */
>> -		nvme_reset_ctrl(ns->ctrl);
>> +		/* Delegate to common error handling */
>>  		break;
>>  	}
>>  
>> -	kblockd_schedule_work(&ns->head->requeue_work);
>> +	if (handled)
>> +		kblockd_schedule_work(&ns->head->requeue_work);
>> +	return handled;
> 
> I think you can defer the blk_mq_end_request to here as well.  Then
> just return false from the default case and you don't need the handled
> variable.
> 
> So in the end this should be something like this:
> 
[ .. ]
Maybe without the btrfs bits :-)

I'll give it a spin.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

      parent reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-12 13:41 Hannes Reinecke
2020-02-12 17:53 ` Christoph Hellwig
2020-02-12 19:33   ` Sagi Grimberg
2020-02-13  7:02     ` Hannes Reinecke
2020-02-13  7:19       ` Sagi Grimberg
2020-02-13 17:02       ` Keith Busch
2020-02-14 14:22         ` Meneghini, John
2020-02-19 15:03           ` Christoph Hellwig
2020-02-13  6:53   ` Hannes Reinecke [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=88fffa65-7c1d-8b14-e4d2-e4b5e2eb00a8@suse.de \
    --to=hare@suse.de \
    --cc=hch@lst.de \
    --cc=john.meneghini@netapp.com \
    --cc=keith.busch@intel.com \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git