From: Waiman Long <longman@redhat.com>
To: Tejun Heo <tj@kernel.org>, Jens Axboe <axboe@kernel.dk>
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, Ming Lei <ming.lei@redhat.com>,
	Waiman Long <longman@redhat.com>
Subject: [PATCH v5 4/4] blk-cgroup: Document the design of new lockless iostat_cpu list
Date: Thu,  2 Jun 2022 14:54:01 -0400
Message-ID: <20220602185401.162937-1-longman@redhat.com>
In-Reply-To: <20220602133543.128088-2-longman@redhat.com>

A set of percpu lockless lists per block cgroup (blkcg) is added to
track the set of recently updated iostat_cpu structures. Add a comment
in the code to document the design of this new set of lockless lists.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 block/blk-cgroup.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8af97f3b2fc9..f8f27551c16a 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -60,6 +60,21 @@ static struct workqueue_struct *blkcg_punt_bio_wq;
 #define BLKG_DESTROY_BATCH_SIZE  64
 
 /*
+ * Lockless lists for tracking IO stats update
+ *
+ * New IO stats are stored in the percpu iostat_cpu within blkcg_gq (blkg).
+ * There are multiple blkg's (one for each block device) attached to each
+ * blkcg. The rstat code keeps track of which cpu has IO stats updated,
+ * but it doesn't know which blkg has the updated stats. If there are many
+ * block devices in a system, the cost of iterating all the blkg's to flush
+ * out the IO stats can be high. To reduce such overhead, a set of percpu
+ * lockless lists (lhead) per blkcg are used to track the set of recently
+ * updated iostat_cpu's since the last flush. An iostat_cpu will be put
+ * onto the lockless list on the update side [blk_cgroup_bio_start()] if
+ * not there yet and then removed when being flushed [blkcg_rstat_flush()].
+ * A reference to the blkg is taken when its iostat_cpu is put onto the
+ * list and dropped when it is flushed, to protect against blkg removal.
+ *
  * lnode.next of the last entry in a lockless list is NULL. To enable us to
  * use lnode.next as a boolean flag to indicate its presence in a lockless
  * list, we have to make it non-NULL for all. This is done by using a
-- 
2.31.1
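
For readers following the design described in the comment above, here is a
minimal userspace sketch of the idea: each entry is queued at most once
between flushes, and a sentinel terminator lets a non-NULL next pointer
double as an "already on a list" flag. This is an illustration only, not the
kernel code. The names struct lnode, struct lhead, struct iostat,
stat_update() and stat_flush() are hypothetical stand-ins, C11 atomics
replace the kernel's llist primitives, and the blkg reference counting and
the synchronization of the counters themselves are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's llist_node / llist_head. */
struct lnode { struct lnode *next; };
struct lhead { _Atomic(struct lnode *) first; };

/* Hypothetical stand-in for one percpu iostat_cpu entry. */
struct iostat {
	struct lnode lnode;	/* first member, so a node pointer casts back */
	unsigned long pending;	/* stats accumulated since the last flush */
	unsigned long total;	/* stats already folded in by a flush */
};

/* Every list ends in this sentinel, so next == NULL means "not queued". */
static struct lnode sentinel;

static void lhead_init(struct lhead *h)
{
	atomic_init(&h->first, &sentinel);
}

/* Update side: account the IO and queue the entry if it is not queued yet. */
static void stat_update(struct lhead *h, struct iostat *s, unsigned long bytes)
{
	struct lnode *first;

	s->pending += bytes;
	if (s->lnode.next)	/* non-NULL next: already on the list */
		return;

	/* Lockless push onto the front of the list. */
	first = atomic_load(&h->first);
	do {
		s->lnode.next = first;
	} while (!atomic_compare_exchange_weak(&h->first, &first, &s->lnode));
}

/* Flush side: detach the whole list at once, then fold each entry. */
static int stat_flush(struct lhead *h)
{
	struct lnode *n = atomic_exchange(&h->first, &sentinel);
	int flushed = 0;

	while (n != &sentinel) {
		struct lnode *next = n->next;
		struct iostat *s = (struct iostat *)n;

		assert(next != NULL);	/* queued entries never have NULL next */
		n->next = NULL;		/* entry may be queued again from here on */
		s->total += s->pending;
		s->pending = 0;
		flushed++;
		n = next;
	}
	return flushed;
}
```

With this shape, the flush side visits only the entries that were updated
since the previous flush, which is the saving the comment describes when a
blkcg has many blkg's. Updating the same entry twice before a flush queues it
once.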



