All of lore.kernel.org
 help / color / mirror / Atom feed
From: james.smart@broadcom.com (James Smart)
Subject: [PATCH v2 2/7] nvme: revise nvme_set_queue_count to return error on some nvme status codes
Date: Mon, 14 May 2018 08:08:12 -0700	[thread overview]
Message-ID: <9a65c024-1a33-ff97-f5fe-38f147eb082c@broadcom.com> (raw)
In-Reply-To: <20180512133453.GB14317@infradead.org>

On 5/12/2018 6:34 AM, Christoph Hellwig wrote:
> On Fri, May 11, 2018@05:50:23PM -0700, James Smart wrote:
>> The status codes that transports may generate are NVME_SC_ABORT_REQ and
>> NVME_SC_INTERNAL. In those cases, return a positive error value with the
>> value being the status code.
> No, your transport should not come up with fake nvme errors, sorry.
>
> And we need this code to ignore errors to bring up various
> PCIe controller in degraded conditions.

It's not making up fake errors, per say. All transports, when they 
terminate io for an association that fails, kick them back with at least 
the aborted status. FC does have the internal value as well for 
transport-specific errors.? What is happening here - the controller 
initialization starts, but then connectivity is lost before it can 
complete. In this case, it occurred right as the SET_PROPERTY for io 
queues was outstanding.?? I wish the core code wasn't so expectant on 
the happy path, as even if we get past this error, it's very chatty on 
errors and can scare an admin even when the issues are fully recoverable.

Looking closer at the issue, there are two concerns:
a) the setting of the io queue count to zero persists in future 
connection attempts;
b) if the transport intercepts and kills the io, and the controller 
continues to come live in the os, is there any way the transport won't 
kill the association and retry ??? because otherwise, it is stuck in a 
"bogus" degraded mode.

for (a): the transport recalculates the io queue count on each new 
association attempt and ignores previous values. Except that it is used 
to size the hw_queues on the io request queue when it's initially 
created.? I assume the blk_mq_update_nr_hw_queues() call is sufficient 
to resize it on the next association.

for (b): I can't come up with a scenario where the transport would not 
be taking the association down due to the error that generated the 
failure on the command.

As such, I guess I can pull this patch and drop it. It's not necessary.

-- james

  reply	other threads:[~2018-05-14 15:08 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-12  0:50 [PATCH v2 0/7] nvme_fc: asynchronous controller create and simple discovery James Smart
2018-05-12  0:50 ` [PATCH v2 1/7] nvme: remove unnecessary controller subnqn validation James Smart
2018-05-12 13:36   ` Christoph Hellwig
2018-05-25  9:11   ` Christoph Hellwig
2018-05-12  0:50 ` [PATCH v2 2/7] nvme: revise nvme_set_queue_count to return error on some nvme status codes James Smart
2018-05-12 13:34   ` Christoph Hellwig
2018-05-14 15:08     ` James Smart [this message]
2018-05-12  0:50 ` [PATCH v2 3/7] nvme_fc: remove setting DNR on exception conditions James Smart
2018-05-25  9:11   ` Christoph Hellwig
2018-05-12  0:50 ` [PATCH v2 4/7] nvme_fc: remove reinit_request routine James Smart
2018-05-12 13:35   ` Christoph Hellwig
2018-05-12  0:50 ` [PATCH v2 5/7] nvme_fc: change controllers first connect to use reconnect path James Smart
2018-05-12  0:50 ` [PATCH v2 6/7] nvme_fc: fix nulling of queue data on reconnect James Smart
2018-05-12  0:50 ` [PATCH v2 7/7] nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device James Smart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a65c024-1a33-ff97-f5fe-38f147eb082c@broadcom.com \
    --to=james.smart@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.