Subject: Re: [PATCH v3 1/2] blk-mq: add async quiesce interface
From: Chao Leng
To: Ming Lei, Sagi Grimberg
Cc: Christoph Hellwig, Keith Busch, Jens Axboe
Date: Mon, 27 Jul 2020 11:33:43 +0800
In-Reply-To: <20200727020803.GC1129253@T590>

On 2020/7/27 10:08, Ming Lei wrote:
>> It is at the end and contains exactly what is needed to synchronize. Not
>> sure what you mean by reuse hctx->srcu?
>
> The sync is simply a single global synchronize_rcu(), so why bother to add
> an extra >=40 bytes for each hctx?
>
> You already reuse hctx->srcu, but I see no reason to add an extra
> rcu_synchronize to each hctx just to simulate one single synchronize_rcu().

To batch the SRCU synchronizations, the extra bytes have to live somewhere;
splitting blocking and non-blocking queues into two sets of hctxs is
probably not a good alternative. That leaves two choices: put the struct
rcu_synchronize in the hctx, or in the srcu_struct. Putting it in the
srcu_struct has one weakness: the bytes are wasted for srcu_struct users
that never batch synchronizations. Even so, I still think it is better for
SRCU to provide the batch synchronization interface. We can then check
ctrl->tagset->flags to offer the same interface for both blocking and
non-blocking queues.
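For comparison, here is a rough sketch of the first choice, the per-hctx
rcu_synchronize that Ming is objecting to. This is only an illustration
reconstructed from the discussion (the rcu_sync field and the function
name are assumed here, not taken from the actual patch):

	/* Assumes struct blk_mq_hw_ctx gains an extra field:
	 *	struct rcu_synchronize rcu_sync;
	 */
	void blk_mq_quiesce_queue_async_per_hctx(struct request_queue *q)
	{
		struct blk_mq_hw_ctx *hctx;
		unsigned int i;

		blk_mq_quiesce_queue_nowait(q);

		queue_for_each_hw_ctx(q, hctx, i) {
			if (!(hctx->flags & BLK_MQ_F_BLOCKING))
				continue;
			init_rcu_head(&hctx->rcu_sync.head);
			init_completion(&hctx->rcu_sync.completion);
			/* wakeme_after_rcu() completes rcu_sync.completion. */
			call_srcu(hctx->srcu, &hctx->rcu_sync.head,
				  wakeme_after_rcu);
		}
	}

Those per-hctx rcu_sync fields are the ">=40 bytes for each hctx" above.
The patch below takes the second choice and moves that state into the
srcu_struct instead, starting with TINY_SRCU.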
The code for TINY_SRCU:

---
 block/blk-mq.c           | 29 +++++++++++++++++++++++++++++
 drivers/nvme/host/core.c |  9 ++++++++-
 include/linux/blk-mq.h   |  2 ++
 include/linux/srcu.h     |  2 ++
 include/linux/srcutiny.h |  1 +
 kernel/rcu/srcutiny.c    | 16 ++++++++++++++++
 6 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4e0d173beaa3..3117fc3082ff 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -235,6 +235,35 @@ void blk_mq_quiesce_queue(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue);
 
+void blk_mq_quiesce_queue_async(struct request_queue *q)
+{
+	struct blk_mq_hw_ctx *hctx;
+	unsigned int i;
+
+	blk_mq_quiesce_queue_nowait(q);
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		if (hctx->flags & BLK_MQ_F_BLOCKING)
+			synchronize_srcu_async(hctx->srcu);
+}
+EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_async);
+
+void blk_mq_quiesce_queue_async_wait(struct request_queue *q)
+{
+	struct blk_mq_hw_ctx *hctx;
+	unsigned int i;
+
+	if (q == NULL) {
+		synchronize_rcu();
+		return;
+	}
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		if (hctx->flags & BLK_MQ_F_BLOCKING)
+			synchronize_srcu_async_wait(hctx->srcu);
+}
+EXPORT_SYMBOL_GPL(blk_mq_quiesce_queue_async_wait);
+
 /*
  * blk_mq_unquiesce_queue() - counterpart of blk_mq_quiesce_queue()
  * @q: request queue.
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a3b1157561f5..f13aa447ab64 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4322,7 +4322,14 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
 
 	down_read(&ctrl->namespaces_rwsem);
 	list_for_each_entry(ns, &ctrl->namespaces, list)
-		blk_mq_quiesce_queue(ns->queue);
+		blk_mq_quiesce_queue_async(ns->queue);
+
+	if (ctrl->tagset->flags & BLK_MQ_F_BLOCKING) {
+		list_for_each_entry(ns, &ctrl->namespaces, list)
+			blk_mq_quiesce_queue_async_wait(ns->queue);
+	} else {
+		blk_mq_quiesce_queue_async_wait(NULL);
+	}
 	up_read(&ctrl->namespaces_rwsem);
 }
 EXPORT_SYMBOL_GPL(nvme_stop_queues);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index d6fcae17da5a..092470c63558 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -515,6 +515,8 @@ void blk_mq_start_hw_queues(struct request_queue *q);
 void blk_mq_start_stopped_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_start_stopped_hw_queues(struct request_queue *q, bool async);
 void blk_mq_quiesce_queue(struct request_queue *q);
+void blk_mq_quiesce_queue_async(struct request_queue *q);
+void blk_mq_quiesce_queue_async_wait(struct request_queue *q);
 void blk_mq_unquiesce_queue(struct request_queue *q);
 void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs);
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index e432cc92c73d..7e006e51ccf9 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -60,6 +60,8 @@ void cleanup_srcu_struct(struct srcu_struct *ssp);
 int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
 void synchronize_srcu(struct srcu_struct *ssp);
+void synchronize_srcu_async(struct srcu_struct *ssp);
+void synchronize_srcu_async_wait(struct srcu_struct *ssp);
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 5a5a1941ca15..3d7d871bef61 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -23,6 +23,7 @@ struct srcu_struct {
 	struct rcu_head *srcu_cb_head;	/* Pending callbacks: Head. */
 	struct rcu_head **srcu_cb_tail;	/* Pending callbacks: Tail. */
 	struct work_struct srcu_work;	/* For driving grace periods. */
+	struct rcu_synchronize rcu_sync;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
 #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index 6208c1dae5c9..6e1468175a45 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -190,6 +190,22 @@ void synchronize_srcu(struct srcu_struct *ssp)
 }
 EXPORT_SYMBOL_GPL(synchronize_srcu);
 
+void synchronize_srcu_async(struct srcu_struct *ssp)
+{
+	init_rcu_head(&ssp->rcu_sync.head);
+	init_completion(&ssp->rcu_sync.completion);
+	/* wakeme_after_rcu() completes ssp->rcu_sync.completion. */
+	call_srcu(ssp, &ssp->rcu_sync.head, wakeme_after_rcu);
+}
+EXPORT_SYMBOL_GPL(synchronize_srcu_async);
+
+void synchronize_srcu_async_wait(struct srcu_struct *ssp)
+{
+	wait_for_completion(&ssp->rcu_sync.completion);
+	destroy_rcu_head(&ssp->rcu_sync.head);
+}
+EXPORT_SYMBOL_GPL(synchronize_srcu_async_wait);
+
 /* Lockdep diagnostics. */
 void __init rcu_scheduler_starting(void)
 {
-- 
2.16.4
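
For completeness, the intended calling pattern for the batch interface,
as a hypothetical example (example_stop_queues() and its arguments are
made up for illustration; nvme_stop_queues() above is the real user):

	/* Quiesce many queues while paying for only one grace period:
	 * queue all the SRCU callbacks first, then wait for them.
	 * 'blocking' mirrors the ctrl->tagset->flags & BLK_MQ_F_BLOCKING
	 * check in nvme_stop_queues().
	 */
	static void example_stop_queues(struct request_queue **qs, int nr,
					bool blocking)
	{
		int i;

		for (i = 0; i < nr; i++)
			blk_mq_quiesce_queue_async(qs[i]);

		if (blocking) {
			for (i = 0; i < nr; i++)
				blk_mq_quiesce_queue_async_wait(qs[i]);
		} else {
			/* One synchronize_rcu() covers all non-blocking queues. */
			blk_mq_quiesce_queue_async_wait(NULL);
		}
	}

Note that because the rcu_synchronize lives in the srcu_struct, at most
one synchronize_srcu_async() can be outstanding per srcu_struct at a
time; that is fine here, since each hctx owns its own srcu.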