linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] reduce quiesce time for lots of name spaces
@ 2020-08-07  9:05 Chao Leng
  2020-08-07 13:49 ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread
From: Chao Leng @ 2020-08-07  9:05 UTC (permalink / raw)
  To: linux-nvme, linux-block; +Cc: kbusch, axboe, hch, lengchao, sagi

nvme_stop_queues quiesce queues for all name spaces, now quiesce one by
one, if there is lots of name spaces, sync wait long time(more than 10s).
Multipath can not fail over to retry quickly, cause io pause long time.
This is not expected.
To reduce quiesce time, we introduce async mechanism for sync SRCUs
and quiesce queue.

Changes from v1:
- add support for blocking queue.
- introduce async mechanism for sync SRCU
- introduce async mechanism for quiesce queue

Chao Leng (3):
  rcu: introduce async mechanism for sync SRCU
  blk-mq: introduce async mechanism for quiesce queue
  nvme-core: reduce io pause time when fail over

 block/blk-mq.c                | 28 ++++++++++++++++++++++++++++
 drivers/nvme/host/core.c      |  7 ++++++-
 include/linux/blk-mq.h        |  2 ++
 include/linux/rcupdate_wait.h |  6 ++++++
 include/linux/srcu.h          |  1 +
 include/linux/srcutiny.h      |  2 ++
 include/linux/srcutree.h      |  2 ++
 kernel/rcu/srcutiny.c         | 15 +++++++++++++++
 kernel/rcu/srcutree.c         | 26 ++++++++++++++++++++++++++
 kernel/rcu/update.c           | 11 +++++++++++
 10 files changed, 99 insertions(+), 1 deletion(-)

-- 
2.16.4


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-07  9:05 [PATCH v2 0/3] reduce quiesce time for lots of name spaces Chao Leng
@ 2020-08-07 13:49 ` Ming Lei
  2020-08-07 16:37   ` Sagi Grimberg
  2020-08-10  2:17   ` Chao Leng
  0 siblings, 2 replies; 7+ messages in thread
From: Ming Lei @ 2020-08-07 13:49 UTC (permalink / raw)
  To: Chao Leng; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch

On Fri, Aug 07, 2020 at 05:05:59PM +0800, Chao Leng wrote:
> nvme_stop_queues quiesce queues for all name spaces, now quiesce one by
> one, if there is lots of name spaces, sync wait long time(more than 10s).
> Multipath can not fail over to retry quickly, cause io pause long time.
> This is not expected.
> To reduce quiesce time, we introduce async mechanism for sync SRCUs
> and quiesce queue.
> 

Frankly speaking, I prefer to replace SRCU with percpu_refcount:

- percpu_refcount has much less memory footprint than SRCU, so we can simply
move percpu_refcount into request_queue, instead of adding more bytes
into each hctx by this patch

- percpu_ref_get()/percpu_ref_put() isn't slower than srcu_read_lock()/srcu_read_unlock().

- with percpu_refcount, we can remove 'srcu_idx' from hctx_lock/hctx_unlock()

Thanks,
Ming


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-07 13:49 ` Ming Lei
@ 2020-08-07 16:37   ` Sagi Grimberg
  2020-08-10  2:17   ` Chao Leng
  1 sibling, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2020-08-07 16:37 UTC (permalink / raw)
  To: Ming Lei, Chao Leng; +Cc: linux-block, kbusch, axboe, linux-nvme, hch


>> nvme_stop_queues quiesce queues for all name spaces, now quiesce one by
>> one, if there is lots of name spaces, sync wait long time(more than 10s).
>> Multipath can not fail over to retry quickly, cause io pause long time.
>> This is not expected.
>> To reduce quiesce time, we introduce async mechanism for sync SRCUs
>> and quiesce queue.
>>
> 
> Frankly speaking, I prefer to replace SRCU with percpu_refcount:
> 
> - percpu_refcount has much less memory footprint than SRCU, so we can simply
> move percpu_refcount into request_queue, instead of adding more bytes
> into each hctx by this patch
> 
> - percpu_ref_get()/percpu_ref_put() isn't slower than srcu_read_lock()/srcu_read_unlock().
> 
> - with percpu_refcount, we can remove 'srcu_idx' from hctx_lock/hctx_unlock()

Ming, I don't have an issue with your preference, but I don't want to
tie it to what Chao is attempting to fix.

So we should either move forward with an srcu approach, or you please
provide evidence to your claims if you are replacing the fast-path
synchronization mechanism, as this warrants a very good justification.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-07 13:49 ` Ming Lei
  2020-08-07 16:37   ` Sagi Grimberg
@ 2020-08-10  2:17   ` Chao Leng
  2020-08-10  3:15     ` Ming Lei
  1 sibling, 1 reply; 7+ messages in thread
From: Chao Leng @ 2020-08-10  2:17 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch



On 2020/8/7 21:49, Ming Lei wrote:
> On Fri, Aug 07, 2020 at 05:05:59PM +0800, Chao Leng wrote:
>> nvme_stop_queues quiesce queues for all name spaces, now quiesce one by
>> one, if there is lots of name spaces, sync wait long time(more than 10s).
>> Multipath can not fail over to retry quickly, cause io pause long time.
>> This is not expected.
>> To reduce quiesce time, we introduce async mechanism for sync SRCUs
>> and quiesce queue.
>>
> 
> Frankly speaking, I prefer to replace SRCU with percpu_refcount:
> 
> - percpu_refcount has much less memory footprint than SRCU, so we can simply
> move percpu_refcount into request_queue, instead of adding more bytes
> into each hctx by this patch
> 
> - percpu_ref_get()/percpu_ref_put() isn't slower than srcu_read_lock()/srcu_read_unlock().
> 
> - with percpu_refcount, we can remove 'srcu_idx' from hctx_lock/hctx_unlock()
IO pause long time if fail over, this is a serios problem. we need fix
it as soon as possible. SRCU is just used for blocking queue,
non blocking queue need 0 bytes. So more bytes(just 24 bytes) is not
waste.

About using per_cpu to replace SRCU, I suggest separate discussion.
Can you show the patch? This will make it easier to discuss.
> 
> Thanks,
> Ming
> 
> .
> 

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-10  2:17   ` Chao Leng
@ 2020-08-10  3:15     ` Ming Lei
  2020-08-11  3:13       ` Chao Leng
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2020-08-10  3:15 UTC (permalink / raw)
  To: Chao Leng; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch

On Mon, Aug 10, 2020 at 10:17:04AM +0800, Chao Leng wrote:
> 
> 
> On 2020/8/7 21:49, Ming Lei wrote:
> > On Fri, Aug 07, 2020 at 05:05:59PM +0800, Chao Leng wrote:
> > > nvme_stop_queues quiesce queues for all name spaces, now quiesce one by
> > > one, if there is lots of name spaces, sync wait long time(more than 10s).
> > > Multipath can not fail over to retry quickly, cause io pause long time.
> > > This is not expected.
> > > To reduce quiesce time, we introduce async mechanism for sync SRCUs
> > > and quiesce queue.
> > > 
> > 
> > Frankly speaking, I prefer to replace SRCU with percpu_refcount:
> > 
> > - percpu_refcount has much less memory footprint than SRCU, so we can simply
> > move percpu_refcount into request_queue, instead of adding more bytes
> > into each hctx by this patch
> > 
> > - percpu_ref_get()/percpu_ref_put() isn't slower than srcu_read_lock()/srcu_read_unlock().
> > 
> > - with percpu_refcount, we can remove 'srcu_idx' from hctx_lock/hctx_unlock()
> IO pause long time if fail over, this is a serios problem. we need fix
> it as soon as possible. SRCU is just used for blocking queue,

The issue has been long time since SRCU is taken, not sure if it is
something urgent.

> non blocking queue need 0 bytes. So more bytes(just 24 bytes) is not
> waste.
> 
> About using per_cpu to replace SRCU, I suggest separate discussion.
> Can you show the patch? This will make it easier to discuss.

https://lore.kernel.org/linux-block/20200728134938.1505467-1-ming.lei@redhat.com/


Thanks,
Ming


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-10  3:15     ` Ming Lei
@ 2020-08-11  3:13       ` Chao Leng
  2020-08-11 20:53         ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Chao Leng @ 2020-08-11  3:13 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, sagi, linux-nvme, linux-block, kbusch, hch



On 2020/8/10 11:15, Ming Lei wrote:
>> About using per_cpu to replace SRCU, I suggest separate discussion.
>> Can you show the patch? This will make it easier to discuss.
> https://lore.kernel.org/linux-block/20200728134938.1505467-1-ming.lei@redhat.com/
Directly replace SRCU with percpu_ref, is not safe.
This may make quiesce queue wait extra time, in extreme scenario,
maybe wait for ever.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/3] reduce quiesce time for lots of name spaces
  2020-08-11  3:13       ` Chao Leng
@ 2020-08-11 20:53         ` Sagi Grimberg
  0 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2020-08-11 20:53 UTC (permalink / raw)
  To: Chao Leng, Ming Lei; +Cc: linux-block, kbusch, axboe, linux-nvme, hch


>>> About using per_cpu to replace SRCU, I suggest separate discussion.
>>> Can you show the patch? This will make it easier to discuss.
>> https://lore.kernel.org/linux-block/20200728134938.1505467-1-ming.lei@redhat.com/ 
>>
> Directly replace SRCU with percpu_ref, is not safe.
> This may make quiesce queue wait extra time, in extreme scenario,
> maybe wait for ever.

Agree with Ming that this is not necessarily pressing as this issue
existed in nvme for a long time.

I don't resist to change the locking mechanism, but Ming needs to
provide evidence that this change does not degrade reliability nor
performance. Personally I'd like to see this specific issue solved
first and then do a an infrastructure change.


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-08-11 20:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-07  9:05 [PATCH v2 0/3] reduce quiesce time for lots of name spaces Chao Leng
2020-08-07 13:49 ` Ming Lei
2020-08-07 16:37   ` Sagi Grimberg
2020-08-10  2:17   ` Chao Leng
2020-08-10  3:15     ` Ming Lei
2020-08-11  3:13       ` Chao Leng
2020-08-11 20:53         ` Sagi Grimberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).