* [PATCH v2 0/3] reduce quiesce time for lots of name spaces
@ 2020-08-07  9:05 Chao Leng
  2020-08-07 13:49 ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread

From: Chao Leng @ 2020-08-07  9:05 UTC (permalink / raw)
To: linux-nvme, linux-block; +Cc: kbusch, axboe, hch, lengchao, sagi

nvme_stop_queues quiesces the queues of all name spaces, currently one
by one. If there are lots of name spaces, the synchronous wait takes a
long time (more than 10 s), so multipath cannot fail over and retry
quickly, and I/O pauses for a long time. This is not expected.
To reduce the quiesce time, we introduce an async mechanism for syncing
SRCUs and quiescing queues.

Changes from v1:
- add support for blocking queues.
- introduce an async mechanism for syncing SRCU.
- introduce an async mechanism for quiescing queues.

Chao Leng (3):
  rcu: introduce async mechanism for sync SRCU
  blk-mq: introduce async mechanism for quiesce queue
  nvme-core: reduce io pause time when fail over

 block/blk-mq.c                | 28 ++++++++++++++++++++++++++++
 drivers/nvme/host/core.c      |  7 ++++++-
 include/linux/blk-mq.h        |  2 ++
 include/linux/rcupdate_wait.h |  6 ++++++
 include/linux/srcu.h          |  1 +
 include/linux/srcutiny.h      |  2 ++
 include/linux/srcutree.h      |  2 ++
 kernel/rcu/srcutiny.c         | 15 +++++++++++++++
 kernel/rcu/srcutree.c         | 26 ++++++++++++++++++++++++++
 kernel/rcu/update.c           | 11 +++++++++++
 10 files changed, 99 insertions(+), 1 deletion(-)

-- 
2.16.4

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-07  9:05 [PATCH v2 0/3] reduce quiesce time for lots of name spaces Chao Leng
@ 2020-08-07 13:49 ` Ming Lei
  2020-08-07 16:37   ` Sagi Grimberg
  2020-08-10  2:17   ` Chao Leng
  0 siblings, 2 replies; 7+ messages in thread

From: Ming Lei @ 2020-08-07 13:49 UTC (permalink / raw)
To: Chao Leng; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch

On Fri, Aug 07, 2020 at 05:05:59PM +0800, Chao Leng wrote:
> nvme_stop_queues quiesces the queues of all name spaces, currently one
> by one. If there are lots of name spaces, the synchronous wait takes a
> long time (more than 10 s), so multipath cannot fail over and retry
> quickly, and I/O pauses for a long time. This is not expected.
> To reduce the quiesce time, we introduce an async mechanism for syncing
> SRCUs and quiescing queues.

Frankly speaking, I prefer to replace SRCU with percpu_refcount:

- percpu_refcount has a much smaller memory footprint than SRCU, so we
  can simply move the percpu_refcount into the request_queue instead of
  adding more bytes to each hctx, as this patch does.

- percpu_ref_get()/percpu_ref_put() isn't slower than
  srcu_read_lock()/srcu_read_unlock().

- with percpu_refcount, we can remove 'srcu_idx' from
  hctx_lock()/hctx_unlock().

Thanks,
Ming
* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-07 13:49 ` Ming Lei
@ 2020-08-07 16:37   ` Sagi Grimberg
  2020-08-10  2:17   ` Chao Leng
  1 sibling, 0 replies; 7+ messages in thread

From: Sagi Grimberg @ 2020-08-07 16:37 UTC (permalink / raw)
To: Ming Lei, Chao Leng; +Cc: linux-block, kbusch, axboe, linux-nvme, hch

>> nvme_stop_queues quiesces the queues of all name spaces, currently one
>> by one. If there are lots of name spaces, the synchronous wait takes a
>> long time (more than 10 s), so multipath cannot fail over and retry
>> quickly, and I/O pauses for a long time. This is not expected.
>> To reduce the quiesce time, we introduce an async mechanism for
>> syncing SRCUs and quiescing queues.
>
> Frankly speaking, I prefer to replace SRCU with percpu_refcount:
>
> - percpu_refcount has a much smaller memory footprint than SRCU, so we
>   can simply move the percpu_refcount into the request_queue instead
>   of adding more bytes to each hctx, as this patch does.
>
> - percpu_ref_get()/percpu_ref_put() isn't slower than
>   srcu_read_lock()/srcu_read_unlock().
>
> - with percpu_refcount, we can remove 'srcu_idx' from
>   hctx_lock()/hctx_unlock().

Ming, I don't have an issue with your preference, but I don't want to
tie it to what Chao is attempting to fix. So we should either move
forward with an SRCU approach, or you should provide evidence for your
claims if you are replacing the fast-path synchronization mechanism, as
this warrants a very good justification.
* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-07 13:49 ` Ming Lei
  2020-08-07 16:37   ` Sagi Grimberg
@ 2020-08-10  2:17   ` Chao Leng
  2020-08-10  3:15     ` Ming Lei
  1 sibling, 1 reply; 7+ messages in thread

From: Chao Leng @ 2020-08-10  2:17 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch

On 2020/8/7 21:49, Ming Lei wrote:
> On Fri, Aug 07, 2020 at 05:05:59PM +0800, Chao Leng wrote:
>> nvme_stop_queues quiesces the queues of all name spaces, currently
>> one by one. If there are lots of name spaces, the synchronous wait
>> takes a long time (more than 10 s), so multipath cannot fail over and
>> retry quickly, and I/O pauses for a long time. This is not expected.
>> To reduce the quiesce time, we introduce an async mechanism for
>> syncing SRCUs and quiescing queues.
>
> Frankly speaking, I prefer to replace SRCU with percpu_refcount:
>
> - percpu_refcount has a much smaller memory footprint than SRCU, so we
>   can simply move the percpu_refcount into the request_queue instead
>   of adding more bytes to each hctx, as this patch does.
>
> - percpu_ref_get()/percpu_ref_put() isn't slower than
>   srcu_read_lock()/srcu_read_unlock().
>
> - with percpu_refcount, we can remove 'srcu_idx' from
>   hctx_lock()/hctx_unlock().

I/O pauses for a long time on failover; this is a serious problem and we
need to fix it as soon as possible. SRCU is only used for blocking
queues; non-blocking queues need 0 extra bytes, so the extra bytes (just
24) are not a waste.

As for using percpu_ref to replace SRCU, I suggest discussing that
separately. Can you show the patch? That will make it easier to discuss.
* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-10  2:17 ` Chao Leng
@ 2020-08-10  3:15   ` Ming Lei
  2020-08-11  3:13     ` Chao Leng
  0 siblings, 1 reply; 7+ messages in thread

From: Ming Lei @ 2020-08-10  3:15 UTC (permalink / raw)
To: Chao Leng; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch

On Mon, Aug 10, 2020 at 10:17:04AM +0800, Chao Leng wrote:
> On 2020/8/7 21:49, Ming Lei wrote:
>> Frankly speaking, I prefer to replace SRCU with percpu_refcount:
>>
>> - percpu_refcount has a much smaller memory footprint than SRCU, so
>>   we can simply move the percpu_refcount into the request_queue
>>   instead of adding more bytes to each hctx, as this patch does.
>>
>> - percpu_ref_get()/percpu_ref_put() isn't slower than
>>   srcu_read_lock()/srcu_read_unlock().
>>
>> - with percpu_refcount, we can remove 'srcu_idx' from
>>   hctx_lock()/hctx_unlock().
>
> I/O pauses for a long time on failover; this is a serious problem and
> we need to fix it as soon as possible. SRCU is only used for blocking
> queues,

The issue has been there for a long time, since SRCU was adopted; I am
not sure it is something urgent.

> non-blocking queues need 0 extra bytes, so the extra bytes (just 24)
> are not a waste.
>
> As for using percpu_ref to replace SRCU, I suggest discussing that
> separately. Can you show the patch? That will make it easier to
> discuss.

https://lore.kernel.org/linux-block/20200728134938.1505467-1-ming.lei@redhat.com/

Thanks,
Ming
* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-10  3:15 ` Ming Lei
@ 2020-08-11  3:13   ` Chao Leng
  2020-08-11 20:53     ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread

From: Chao Leng @ 2020-08-11  3:13 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch

On 2020/8/10 11:15, Ming Lei wrote:
>> As for using percpu_ref to replace SRCU, I suggest discussing that
>> separately. Can you show the patch? That will make it easier to
>> discuss.
> https://lore.kernel.org/linux-block/20200728134938.1505467-1-ming.lei@redhat.com/

Directly replacing SRCU with percpu_ref is not safe. It may make the
quiesce wait extra time; in an extreme scenario it may wait forever.
* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-11  3:13 ` Chao Leng
@ 2020-08-11 20:53   ` Sagi Grimberg
  0 siblings, 0 replies; 7+ messages in thread

From: Sagi Grimberg @ 2020-08-11 20:53 UTC (permalink / raw)
To: Chao Leng, Ming Lei; +Cc: linux-block, kbusch, axboe, linux-nvme, hch

>>> As for using percpu_ref to replace SRCU, I suggest discussing that
>>> separately. Can you show the patch? That will make it easier to
>>> discuss.
>> https://lore.kernel.org/linux-block/20200728134938.1505467-1-ming.lei@redhat.com/
>
> Directly replacing SRCU with percpu_ref is not safe. It may make the
> quiesce wait extra time; in an extreme scenario it may wait forever.

I agree with Ming that this is not necessarily pressing, as the issue
has existed in nvme for a long time. I don't object to changing the
locking mechanism, but Ming needs to provide evidence that the change
degrades neither reliability nor performance.

Personally, I'd like to see this specific issue solved first and the
infrastructure change done afterwards.
end of thread, other threads: [~2020-08-11 20:53 UTC | newest]

Thread overview: 7+ messages
2020-08-07  9:05 [PATCH v2 0/3] reduce quiesce time for lots of name spaces Chao Leng
2020-08-07 13:49 ` Ming Lei
2020-08-07 16:37   ` Sagi Grimberg
2020-08-10  2:17   ` Chao Leng
2020-08-10  3:15     ` Ming Lei
2020-08-11  3:13       ` Chao Leng
2020-08-11 20:53         ` Sagi Grimberg