From: Sagi Grimberg <sagi@grimberg.me>
To: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@kernel.org>,
	James Smart <james.smart@broadcom.com>,
	linux-nvme@lists.infradead.org
Subject: Re: [PATCH v2 1/8] nvme-fabrics: allow to queue requests for live queues
Date: Fri, 14 Aug 2020 00:08:52 -0700
Message-ID: <27f60468-269a-34d8-9e51-920106c3a139@grimberg.me>
In-Reply-To: <20200814064414.GA1719@lst.de>


>> Right now we are failing requests based on the controller
>> state (which is checked inline in nvmf_check_ready) however
>> we should definitely accept requests if the queue is live.
>>
>> When entering controller reset, we transition the controller
>> into NVME_CTRL_RESETTING, and then return BLK_STS_RESOURCE for
>> non-mpath requests (have blk_noretry_request set).
>>
>> This is also the case for NVME_REQ_USER for the wrong reason.
>> There shouldn't be any reason for us to reject this I/O in a
>> controller reset. We do want to prevent passthru commands on
>> the admin queue because we need the controller to fully initialize
>> first before we let user passthru admin commands to be issued.
>>
>> In a non-mpath setup, this means that the requests will simply
>> be requeued over and over forever not allowing the q_usage_counter
>> to drop its final reference, causing controller reset to hang
>> if running concurrently with heavy I/O.
> 
> Which will still happen with the admin queue user passthrough
> commands with this patch, so I don't think it actually solves anything,
> it just reduces the exposure a bit.

The original version of the patch removed that as well, but James
indicated that it's still needed because we have no way to make sure
the admin (re)connect will be the first request when we unquiesce.

So I kept that one around and will fix it later, and yes, this
is a niche corner case compared to user I/O.

>> While we are at it, remove the redundant NVME_CTRL_NEW case, which
>> should never see any I/O as it must first transition to
>> NVME_CTRL_CONNECTING.
> 
> That probably should be a separate patch.

OK.

>> -		if (nvme_is_fabrics(req->cmd) &&
>> +		if (blk_rq_is_passthrough(rq) && nvme_is_fabrics(req->cmd) &&
> 
> And this (make sure we don't access garbage in ->cmd for non-passthrough)
> should probably be a separate fix as well.

No, the passthrough check used to be in the upper condition and this
->cmd reference relied on it, so if I separate this part there is no
justification for the change.
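
To make the intended behavior concrete, here is a rough userspace sketch of
the readiness check as discussed in this thread. It is not the kernel code:
every type, field and function name below (sketch_ctrl_state, sketch_request,
sketch_check_ready, ...) is a hypothetical stand-in, and the early return for
a live controller is my assumption about how the inline check behaves.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
enum sketch_ctrl_state {
	SKETCH_CTRL_LIVE,
	SKETCH_CTRL_RESETTING,
	SKETCH_CTRL_CONNECTING,
};

struct sketch_cmd {
	bool is_fabrics;	/* models nvme_is_fabrics(req->cmd) */
	bool is_connect;	/* models a fabrics connect command */
};

struct sketch_request {
	bool passthrough;	/* models blk_rq_is_passthrough(rq) */
	bool on_admin_queue;	/* request was issued on the admin queue */
	bool user_cmd;		/* models a user passthru request */
	const struct sketch_cmd *cmd;	/* only meaningful if passthrough */
};

static bool sketch_check_ready(enum sketch_ctrl_state state,
			       const struct sketch_request *rq,
			       bool queue_live)
{
	/* Assumed inline fast path: a live controller accepts everything. */
	if (state == SKETCH_CTRL_LIVE)
		return true;

	/*
	 * User passthru on the admin queue is still rejected until the
	 * controller is live, since nothing guarantees that the admin
	 * (re)connect goes out first after unquiesce.
	 */
	if (rq->on_admin_queue && rq->user_cmd)
		return false;

	/*
	 * While connecting, let the connect command through so it can
	 * bring the queue live. ->cmd is only looked at for passthrough
	 * requests, so non-passthrough requests never touch it.
	 */
	if (state == SKETCH_CTRL_CONNECTING &&
	    rq->passthrough && rq->cmd->is_fabrics && rq->cmd->is_connect)
		return true;

	/* Otherwise the decision follows the queue, not the controller. */
	return queue_live;
}
```

With this shape, regular I/O on a live queue is accepted during
SKETCH_CTRL_RESETTING instead of being requeued forever, while the admin
queue user passthru case keeps being rejected as discussed above.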
