From: Sagi Grimberg <sagi@grimberg.me>
To: James Smart <james.smart@broadcom.com>,
Chao Leng <lengchao@huawei.com>,
linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH] nvme-fabrics: allow to queue requests for live queues
Date: Tue, 28 Jul 2020 16:39:26 -0700 [thread overview]
Message-ID: <8c96aa2a-89e3-b8e5-8506-f489aa9a69f1@grimberg.me> (raw)
In-Reply-To: <520f2317-533d-b546-dd35-80ae5f707aa7@broadcom.com>
>>>> Can you please give an example? NVME_REQ_USERCMD should not be any
>>>> different from any other type of I/O.
>>> ...
>>
>> The problem I have with this code is that checking for NVME_REQ_USERCMD
>> is a big hammer that you land on what can be normal I/O
>> (e.g. nvme read/write/write-zeros), and we must not rely on this
>> indication to prevent what you are describing.
>>
>> It maybe (just maybe) OK to check NVME_REQ_USERCMD only for the admin
>> queue, but we are really circling over the fact that we cannot reliably
>> send admin connect before to go off first, because we have to unquiesce
>> it before we issue the admin connect, and there might be a stray command
>> going into execution before we submit the admin connect.
>
> agree.
>
> Didn't you suggest something separate to isolate the connect
> commands/init commands ? Can we use a different tag set for
> "initialization" commands? We can use different queue_rq routines with
> different check ready checks. I don't like the overhead, but sure makes
> the logic more straightforward.
I think I've added this now that I'm looking back:
e7832cb48a65 ("nvme: make fabrics command run on a separate request queue")
But it still doesn't solve the user commands you are referring to.
>>>>> As for the blk_rq_is_passthrough check - I guess I can see it being
>>>>> based on the queue state, and the check looks ok (we should never
>>>>> see !blk_rq_is_passthrough on the admin q).
>>>>> But...
>>>>> - I don't know why it was that important to change it. On the
>>>>> connecting path, all you're doing is letting io start flowing
>>>>> before all the queues have been created. Did you really need to
>>>>> start that much sooner ?
>>>>
>>>> The issue is that controller in RESETTING state will have requests that
>>>> are being issued, and if we don't let it pass through, it will hang
>>>> around forever being requeued preventing queue freeze to complete.
>>>
>
> Isn't this the real problem: we require the request to complete in
> order to get off the request queue in order to freeze. But to complete -
> it has to finish with some status. This is ok if running multipath, as
> mpath queues and retries - but what mpath is not running ? The issuer of
> the io will get the error ?
We already fail noretry errors when we cancel (or abort) anything that
was submitted before, so there is nothing new here. non-mpath
does not set FAILFAST so it gets requeued, user commands do, so they
are failed, this is not different than before.
> Should we always be putting even non-mp devices under mp, thus mp
> catches the errors rather than the blk queue, and can have a short delay
> while the controller reconnects ? At least normal io can be handled
> this way and we'd have to figure out the ioctl path. But it should
> still allow the core io, such as scanning, to get the error and fail
> out, to be picked up after the reconnect.
non-mpath still gets requeued because it doesn't set any sort of
FAILFAST flag.
or
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
next prev parent reply other threads:[~2020-07-28 23:39 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-28 5:35 [PATCH] nvme-fabrics: allow to queue requests for live queues Sagi Grimberg
2020-07-28 6:44 ` Chao Leng
2020-07-28 6:49 ` Sagi Grimberg
2020-07-28 17:11 ` James Smart
2020-07-28 17:50 ` Sagi Grimberg
2020-07-28 20:11 ` James Smart
2020-07-28 20:38 ` Sagi Grimberg
2020-07-28 22:47 ` James Smart
2020-07-28 23:39 ` Sagi Grimberg [this message]
2020-07-28 10:50 ` Christoph Hellwig
2020-07-28 16:50 ` James Smart
2020-07-29 5:45 ` Christoph Hellwig
2020-07-29 5:53 ` Sagi Grimberg
2020-07-29 6:05 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8c96aa2a-89e3-b8e5-8506-f489aa9a69f1@grimberg.me \
--to=sagi@grimberg.me \
--cc=hch@lst.de \
--cc=james.smart@broadcom.com \
--cc=kbusch@kernel.org \
--cc=lengchao@huawei.com \
--cc=linux-nvme@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).