From: Chao Leng <lengchao@huawei.com>
To: Sagi Grimberg <sagi@grimberg.me>, Christoph Hellwig <hch@lst.de>,
	Ming Lei <ming.lei@redhat.com>
Cc: <linux-nvme@lists.infradead.org>, <linux-block@vger.kernel.org>,
	<kbusch@kernel.org>, <axboe@kernel.dk>
Subject: Re: [PATCH 0/3] improve nvme quiesce time for large amount of namespaces
Date: Wed, 12 Oct 2022 16:43:02 +0800
Message-ID: <f9fce880-4714-3cdb-dfd1-f1d77d033d7a@huawei.com>
In-Reply-To: <5fc61f6c-3d3e-ce0e-a090-aa5bcdb7721c@grimberg.me>

Adding Ming Lei.

On 2022/10/12 14:37, Sagi Grimberg wrote:
> 
>>> On Sun, Jul 31, 2022 at 01:23:36PM +0300, Sagi Grimberg wrote:
>>>> But maybe we can avoid that, and because we allocate
>>>> the connect_q ourselves, and fully know that it should
>>>> not be a part of the tagset quiesce, perhaps we can introduce
>>>> a new interface like:
>>>> -- 
>>>> static inline int nvme_ctrl_init_connect_q(struct nvme_ctrl *ctrl)
>>>> {
>>>>     ctrl->connect_q = blk_mq_init_queue_self_quiesce(ctrl->tagset);
>>>>     if (IS_ERR(ctrl->connect_q))
>>>>         return PTR_ERR(ctrl->connect_q);
>>>>     return 0;
>>>> }
>>>> -- 
>>>>
>>>> And then blk_mq_quiesce_tagset can simply look into a per request-queue
>>>> self_quiesce flag and skip as needed.
>>>
>>> I'd just make that a queue flag set after allocation to keep the
>>> interface simple, but otherwise this seems like the right thing
>>> to do.
>> The code currently uses NVME_NS_STOPPED to avoid unpaired stop/start.
>> If we use blk_mq_quiesce_tagset, that mechanism will no longer work.
>> I reviewed the code, and only PCI cannot guarantee that stop/start are paired.
>> So one option is to use blk_mq_quiesce_tagset only on fabrics, not on PCI.
>> Do you think that's acceptable?
>> If so, I will try to send a patch set.
> 
> I don't think that this is acceptable. But I don't understand how
> NVME_NS_STOPPED would change anything in the behavior of tagset-wide
> quiesce?
If we use blk_mq_quiesce_tagset, it will quiesce all queues of all namespaces,
but it cannot set NVME_NS_STOPPED on each namespace. The NVME_NS_STOPPED
mechanism will therefore be bypassed.
NVMe-pci has a very complicated quiesce/unquiesce usage pattern, and
quiesce/unquiesce may be called unpaired.
That will cause a regression. There may be bugs in a scenario like this:
A thread: quiesces the queue.
B thread: quiesces the queue.
A thread: ends without unquiescing the queue.
B thread: unquiesces the queue, then does something that requires the queue to be unquiesced.
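
To make the scenario above concrete, here is a minimal single-threaded sketch in C. It assumes quiesce calls on a queue are counted, so the queue only restarts once every quiesce has been matched by an unquiesce. `toy_queue` and the `toy_*` helpers are hypothetical names for illustration only; they are not kernel interfaces and do not model real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; not the kernel's struct request_queue. */
struct toy_queue {
    int quiesce_depth;  /* number of quiesce calls not yet matched by an unquiesce */
};

/* Each quiesce call adds one level of nesting. */
static void toy_quiesce(struct toy_queue *q)
{
    q->quiesce_depth++;
}

/* Each unquiesce call removes one level; the queue runs again only at depth 0. */
static void toy_unquiesce(struct toy_queue *q)
{
    if (q->quiesce_depth > 0)
        q->quiesce_depth--;
}

static bool toy_is_quiesced(const struct toy_queue *q)
{
    return q->quiesce_depth > 0;
}

/*
 * Replays the A/B sequence in order. Returns true if the queue is still
 * quiesced at the point where B needs it to be running.
 */
static bool toy_unpaired_scenario(void)
{
    struct toy_queue q = { 0 };

    toy_quiesce(&q);            /* A thread: quiesce */
    toy_quiesce(&q);            /* B thread: quiesce */
    assert(q.quiesce_depth == 2);
    /* A thread ends here without calling toy_unquiesce(). */
    toy_unquiesce(&q);          /* B thread: unquiesce */

    return toy_is_quiesced(&q);
}
```

Under that assumption, `toy_unpaired_scenario()` reports that the queue is still quiesced after B's unquiesce, because A's quiesce was never undone.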

Of course, I don't think guaranteeing paired access through NVME_NS_STOPPED is a
good choice either, because it can still lead to an unexpected unquiesce that
starts the queue too early.
But since the code already relies on it, a regression is unacceptable.
An example of the too-early unquiesce:
A thread: quiesces the queue.
B thread: wants to quiesce the queue but does nothing because NVME_NS_STOPPED is already set.
A thread: unquiesces the queue.
Now the queue is unquiesced too early for B thread.
B thread: does something that requires the queue to be quiesced.
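
The too-early unquiesce can be sketched the same way. This is a single-threaded toy model in C, assuming stop/start are guarded by a single test-and-set style flag standing in for NVME_NS_STOPPED. `toy_ns`, `toy_ns_stop` and `toy_ns_start` are hypothetical names, not the driver's actual functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a namespace; not the driver's struct nvme_ns. */
struct toy_ns {
    bool stopped;   /* stands in for the NVME_NS_STOPPED bit */
    bool quiesced;  /* whether the namespace queue is currently quiesced */
};

/* Quiesce only if the flag was clear. Returns true if this call did the quiesce. */
static bool toy_ns_stop(struct toy_ns *ns)
{
    if (ns->stopped)
        return false;       /* already stopped by someone else: do nothing */
    ns->stopped = true;
    ns->quiesced = true;
    return true;
}

/* Unquiesce only if the flag was set, then clear it. */
static void toy_ns_start(struct toy_ns *ns)
{
    if (!ns->stopped)
        return;
    ns->stopped = false;
    ns->quiesced = false;
}

/*
 * Replays the A/B sequence in order. Returns the queue state at the point
 * where B still needs the queue to be quiesced.
 */
static bool toy_early_unquiesce_scenario(void)
{
    struct toy_ns ns = { false, false };
    bool a_quiesced, b_quiesced;

    a_quiesced = toy_ns_stop(&ns);  /* A thread: quiesce */
    b_quiesced = toy_ns_stop(&ns);  /* B thread: no-op, flag already set */
    assert(a_quiesced && !b_quiesced);
    toy_ns_start(&ns);              /* A thread: unquiesce */

    return ns.quiesced;             /* B thread expects true here */
}
```

In this model `toy_early_unquiesce_scenario()` returns false: A's unquiesce restarts the queue while B still depends on it being quiesced, since the flag records only that someone stopped the queue, not how many callers need it stopped.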

The patch that introduced NVME_NS_STOPPED:
https://lore.kernel.org/all/20211014081710.1871747-5-ming.lei@redhat.com/
