* [PATCHSET block/for-4.2/writeback] blkcg: blkcg_policy methods cleanup
@ 2015-06-24  2:44 ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team

This patchset contains assorted cleanups for blkcg_policy methods.

* alloc/free are added.  exit is dropped.

* blk-throttle's async percpu allocation is replaced with direct
  allocation.

* all methods now take blkcg_policy_data instead of blkcg_gq.

Nothing too controversial.

Jens, as we're at the beginning of the 4.2 merge window, I'll ping /
repost this patchset along with other pending patchsets once -rc1
drops.

This patchset contains the following seven patches.

 0001-blkcg-remove-unnecessary-request_list-blkg-NULL-test.patch
 0002-blkcg-use-blkg_free-in-blkcg_init_queue-failure-path.patch
 0003-blkcg-make-blkcg_activate_policy-allow-NULL-pd_init_.patch
 0004-blkcg-replace-blkcg_policy-pd_size-with-pd_alloc-fre.patch
 0005-blk-throttle-remove-asynchrnous-percpu-stats-allocat.patch
 0006-blk-throttle-clean-up-blkg_policy_data-alloc-init-ex.patch
 0007-blkcg-make-blkcg_policy-methods-take-a-pointer-to-bl.patch

0001-0003 are misc cleanups.  0004-0006 add alloc/free methods and
remove blk-throttle's async percpu allocation mechanism.  0007 makes
all methods take blkcg_policy_data.
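
For illustration (hypothetical "foo" policy, not part of the series),
a policy's method table ends up shaped roughly like the following
after the whole series; see patch 7 for the exact method signatures.

	static struct blkcg_policy blkcg_policy_foo = {
		.cftypes		= foo_files,

		/* all methods now allocate and operate on blkg_policy_data */
		.pd_alloc_fn		= foo_pd_alloc,
		.pd_init_fn		= foo_pd_init,
		.pd_free_fn		= foo_pd_free,
	};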

This patchset is also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-blkcg-methods-cleanup

and is on top of

[1] block/for-4.2/writeback
[2] [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs

diffstat follows.  Thanks.

 block/blk-cgroup.c         |   53 +++++--------
 block/blk-throttle.c       |  173 +++++++++++++--------------------------------
 block/cfq-iosched.c        |   37 +++++++--
 include/linux/blk-cgroup.h |   31 +++-----
 4 files changed, 116 insertions(+), 178 deletions(-)

--
tejun

[1] 5857cd637bc0 ("bdi: fix wrong error return value in cgwb_create()").
[2] http://lkml.kernel.org/g/1433753973-23684-1-git-send-email-tj@kernel.org


* [PATCH 1/7] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl()
@ 2015-06-24  2:44   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

Since ec13b1d6f0a0 ("blkcg: always create the blkcg_gq for the root
blkcg"), a request_list always has a blkg associated with it.  Drop
the now-unnecessary rl->blkg NULL test from blk_put_rl().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 include/linux/blk-cgroup.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 07a32b8..50d1009 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -378,8 +378,7 @@ static inline struct request_list *blk_get_rl(struct request_queue *q,
  */
 static inline void blk_put_rl(struct request_list *rl)
 {
-	/* root_rl may not have blkg set */
-	if (rl->blkg && rl->blkg->blkcg != &blkcg_root)
+	if (rl->blkg->blkcg != &blkcg_root)
 		blkg_put(rl->blkg);
 }
 
-- 
2.4.3



* [PATCH 2/7] blkcg: use blkg_free() in blkcg_init_queue() failure path
@ 2015-06-24  2:44   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

When blkcg_init_queue() fails midway after creating a new blkg, it
performs kfree() directly; however, this doesn't free the policy data
areas.  Make it use blkg_free() instead.  In turn, blkg_free() is
updated to handle the root request_list special case.

While this fixes a possible memory leak, it sits on an unlikely
failure branch of an already cold path, and the amount leaked per
occurrence is minuscule.  I don't think it needs to be tagged for
-stable.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 1fddbbd..898f560 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -57,7 +57,8 @@ static void blkg_free(struct blkcg_gq *blkg)
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
 		kfree(blkg->pd[i]);
 
-	blk_exit_rl(&blkg->rl);
+	if (blkg->blkcg != &blkcg_root)
+		blk_exit_rl(&blkg->rl);
 	kfree(blkg);
 }
 
@@ -886,7 +887,7 @@ int blkcg_init_queue(struct request_queue *q)
 		radix_tree_preload_end();
 
 	if (IS_ERR(blkg)) {
-		kfree(new_blkg);
+		blkg_free(new_blkg);
 		return PTR_ERR(blkg);
 	}
 
-- 
2.4.3



* [PATCH 3/7] blkcg: make blkcg_activate_policy() allow NULL ->pd_init_fn
@ 2015-06-24  2:44   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

blkg_create() allows a NULL ->pd_init_fn() but blkcg_activate_policy()
doesn't.  As both in-kernel policies implement ->pd_init_fn, this
currently doesn't break anything.  Update blkcg_activate_policy() so
that its behavior is consistent with blkg_create().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 898f560..617a586 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1045,7 +1045,8 @@ int blkcg_activate_policy(struct request_queue *q,
 		blkg->pd[pol->plid] = pd;
 		pd->blkg = blkg;
 		pd->plid = pol->plid;
-		pol->pd_init_fn(blkg);
+		if (pol->pd_init_fn)
+			pol->pd_init_fn(blkg);
 
 		spin_unlock(&blkg->blkcg->lock);
 	}
-- 
2.4.3



* [PATCH 4/7] blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methods
@ 2015-06-24  2:44   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

A blkg (blkcg_gq) represents the relationship between a cgroup and a
request_queue.  Each active policy has a pd (blkg_policy_data) on each
blkg.  The pds are allocated by blkcg core, and each policy can
request extra space at the end by setting blkcg_policy->pd_size larger
than the size of pd.

This is a bit unusual but was done mostly to simplify error handling,
and all the existing use cases could be served this way; however, the
scheme is becoming too restrictive now that percpu memory can be
allocated without blocking.

This patch introduces two new mandatory blkcg_policy methods -
pd_alloc_fn() and pd_free_fn() - which are used to allocate and
release the pd for a given policy.  As pd allocation is now done on
the policy side, a policy can simply allocate a larger area which
embeds pd at the beginning, as sketched below.  This change makes
->pd_size pointless.  Removed.
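
As an illustrative sketch only (hypothetical "foo" policy, not part
of this patch), the embedding pattern looks like:

	struct foo_grp {
		/* must be first so a pd pointer maps back to foo_grp */
		struct blkg_policy_data	pd;
		/* policy-private fields follow */
		u64			nr_dispatched;
	};

	static struct blkg_policy_data *foo_pd_alloc(gfp_t gfp, int node)
	{
		struct foo_grp *fg = kzalloc_node(sizeof(*fg), gfp, node);

		return fg ? &fg->pd : NULL;
	}

	static void foo_pd_free(struct blkg_policy_data *pd)
	{
		kfree(container_of(pd, struct foo_grp, pd));
	}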

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c         | 18 +++++++++---------
 block/blk-throttle.c       | 13 ++++++++++++-
 block/cfq-iosched.c        | 13 ++++++++++++-
 include/linux/blk-cgroup.h | 18 +++++++++---------
 4 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 617a586..574971a 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -55,7 +55,8 @@ static void blkg_free(struct blkcg_gq *blkg)
 		return;
 
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
-		kfree(blkg->pd[i]);
+		if (blkg->pd[i])
+			blkcg_policy[i]->pd_free_fn(blkg->pd[i]);
 
 	if (blkg->blkcg != &blkcg_root)
 		blk_exit_rl(&blkg->rl);
@@ -101,7 +102,7 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q,
 			continue;
 
 		/* alloc per-policy data and attach it to blkg */
-		pd = kzalloc_node(pol->pd_size, gfp_mask, q->node);
+		pd = pol->pd_alloc_fn(gfp_mask, q->node);
 		if (!pd)
 			goto err_free;
 
@@ -1016,7 +1017,7 @@ int blkcg_activate_policy(struct request_queue *q,
 	spin_unlock_irq(q->queue_lock);
 
 	while (cnt--) {
-		pd = kzalloc_node(pol->pd_size, GFP_KERNEL, q->node);
+		pd = pol->pd_alloc_fn(GFP_KERNEL, q->node);
 		if (!pd) {
 			ret = -ENOMEM;
 			goto out_free;
@@ -1058,7 +1059,7 @@ int blkcg_activate_policy(struct request_queue *q,
 out_free:
 	blk_queue_bypass_end(q);
 	list_for_each_entry_safe(pd, n, &pds, alloc_node)
-		kfree(pd);
+		pol->pd_free_fn(pd);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blkcg_activate_policy);
@@ -1093,8 +1094,10 @@ void blkcg_deactivate_policy(struct request_queue *q,
 		if (pol->pd_exit_fn)
 			pol->pd_exit_fn(blkg);
 
-		kfree(blkg->pd[pol->plid]);
-		blkg->pd[pol->plid] = NULL;
+		if (blkg->pd[pol->plid]) {
+			pol->pd_free_fn(blkg->pd[pol->plid]);
+			blkg->pd[pol->plid] = NULL;
+		}
 
 		spin_unlock(&blkg->blkcg->lock);
 	}
@@ -1115,9 +1118,6 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 {
 	int i, ret;
 
-	if (WARN_ON(pol->pd_size < sizeof(struct blkg_policy_data)))
-		return -EINVAL;
-
 	mutex_lock(&blkcg_pol_mutex);
 
 	/* find an empty slot */
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index b231935..f1dd691 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -403,6 +403,11 @@ static void throtl_service_queue_exit(struct throtl_service_queue *sq)
 	del_timer_sync(&sq->pending_timer);
 }
 
+static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
+{
+	return kzalloc_node(sizeof(struct throtl_grp), gfp, node);
+}
+
 static void throtl_pd_init(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
@@ -493,6 +498,11 @@ static void throtl_pd_exit(struct blkcg_gq *blkg)
 	throtl_service_queue_exit(&tg->service_queue);
 }
 
+static void throtl_pd_free(struct blkg_policy_data *pd)
+{
+	kfree(pd);
+}
+
 static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
@@ -1468,12 +1478,13 @@ static void throtl_shutdown_wq(struct request_queue *q)
 }
 
 static struct blkcg_policy blkcg_policy_throtl = {
-	.pd_size		= sizeof(struct throtl_grp),
 	.cftypes		= throtl_files,
 
+	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
 	.pd_exit_fn		= throtl_pd_exit,
+	.pd_free_fn		= throtl_pd_free,
 	.pd_reset_stats_fn	= throtl_pd_reset_stats,
 };
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index c09ef15..6b7bf94 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1543,6 +1543,11 @@ static void cfqg_stats_init(struct cfqg_stats *stats)
 #endif
 }
 
+static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
+{
+	return kzalloc_node(sizeof(struct cfq_group), gfp, node);
+}
+
 static void cfq_pd_init(struct blkcg_gq *blkg)
 {
 	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
@@ -1578,6 +1583,11 @@ static void cfq_pd_offline(struct blkcg_gq *blkg)
 	cfqg_stats_xfer_dead(cfqg);
 }
 
+static void cfq_pd_free(struct blkg_policy_data *pd)
+{
+	kfree(pd);
+}
+
 /* offset delta from cfqg->stats to cfqg->dead_stats */
 static const int dead_stats_off_delta = offsetof(struct cfq_group, dead_stats) -
 					offsetof(struct cfq_group, stats);
@@ -4552,11 +4562,12 @@ static struct elevator_type iosched_cfq = {
 
 #ifdef CONFIG_CFQ_GROUP_IOSCHED
 static struct blkcg_policy blkcg_policy_cfq = {
-	.pd_size		= sizeof(struct cfq_group),
 	.cftypes		= cfq_blkcg_files,
 
+	.pd_alloc_fn		= cfq_pd_alloc,
 	.pd_init_fn		= cfq_pd_init,
 	.pd_offline_fn		= cfq_pd_offline,
+	.pd_free_fn		= cfq_pd_free,
 	.pd_reset_stats_fn	= cfq_pd_reset_stats,
 };
 #endif
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 50d1009..63f6340 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -74,13 +74,11 @@ struct blkg_rwstat {
  * request_queue (q).  This is used by blkcg policies which need to track
  * information per blkcg - q pair.
  *
- * There can be multiple active blkcg policies and each has its private
- * data on each blkg, the size of which is determined by
- * blkcg_policy->pd_size.  blkcg core allocates and frees such areas
- * together with blkg and invokes pd_init/exit_fn() methods.
- *
- * Such private data must embed struct blkg_policy_data (pd) at the
- * beginning and pd_size can't be smaller than pd.
+ * There can be multiple active blkcg policies and each blkg:policy pair is
+ * represented by a blkg_policy_data which is allocated and freed by each
+ * policy's pd_alloc/free_fn() methods.  A policy can allocate a private
+ * data area by allocating a larger data structure which embeds
+ * blkg_policy_data at the beginning.
  */
 struct blkg_policy_data {
 	/* the blkg and policy id this per-policy data belongs to */
@@ -122,24 +120,26 @@ struct blkcg_gq {
 	struct rcu_head			rcu_head;
 };
 
+typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_online_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_offline_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_exit_pd_fn)(struct blkcg_gq *blkg);
+typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkcg_gq *blkg);
 
 struct blkcg_policy {
 	int				plid;
-	/* policy specific private data size */
-	size_t				pd_size;
 	/* cgroup files for the policy */
 	struct cftype			*cftypes;
 
 	/* operations */
+	blkcg_pol_alloc_pd_fn		*pd_alloc_fn;
 	blkcg_pol_init_pd_fn		*pd_init_fn;
 	blkcg_pol_online_pd_fn		*pd_online_fn;
 	blkcg_pol_offline_pd_fn		*pd_offline_fn;
 	blkcg_pol_exit_pd_fn		*pd_exit_fn;
+	blkcg_pol_free_pd_fn		*pd_free_fn;
 	blkcg_pol_reset_pd_stats_fn	*pd_reset_stats_fn;
 };
 
-- 
2.4.3



* [PATCH 5/7] blk-throttle: remove asynchronous percpu stats allocation mechanism
@ 2015-06-24  2:44 ` Tejun Heo
  2015-06-25 19:10   ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

Because the percpu allocator couldn't do non-blocking allocations,
blk-throttle was forced to implement an ad-hoc asynchronous allocation
mechanism for its percpu stats for cases where blkg's (blkcg_gq's) are
allocated from an IO path without a sleepable context.

Now that the percpu allocator can handle gfp_mask and blkg_policy_data
alloc / free are handled by policy methods, the ad-hoc asynchronous
allocation mechanism can be replaced with direct allocation from
throtl_pd_alloc().  Rip it out.

This ensures that an active throtl_grp always has a valid non-NULL
->stats_cpu.  Remove the checks on it.
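
For reference, the enabling API is alloc_percpu_gfp(), which accepts a
gfp_mask and can therefore be called from an IO path; a minimal sketch
mirroring the throtl_pd_alloc() hunk below:

	/* gfp may be a non-blocking mask such as GFP_NOWAIT on an IO path */
	tg->stats_cpu = alloc_percpu_gfp(struct tg_stats_cpu, gfp);
	if (!tg->stats_cpu) {
		kfree(tg);
		return NULL;
	}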

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-throttle.c | 112 ++++++++++++---------------------------------------
 1 file changed, 25 insertions(+), 87 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index f1dd691..3c86976 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -144,9 +144,6 @@ struct throtl_grp {
 
 	/* Per cpu stats pointer */
 	struct tg_stats_cpu __percpu *stats_cpu;
-
-	/* List of tgs waiting for per cpu stats memory to be allocated */
-	struct list_head stats_alloc_node;
 };
 
 struct throtl_data
@@ -168,13 +165,6 @@ struct throtl_data
 	struct work_struct dispatch_work;
 };
 
-/* list and work item to allocate percpu group stats */
-static DEFINE_SPINLOCK(tg_stats_alloc_lock);
-static LIST_HEAD(tg_stats_alloc_list);
-
-static void tg_stats_alloc_fn(struct work_struct *);
-static DECLARE_DELAYED_WORK(tg_stats_alloc_work, tg_stats_alloc_fn);
-
 static void throtl_pending_timer_fn(unsigned long arg);
 
 static inline struct throtl_grp *pd_to_tg(struct blkg_policy_data *pd)
@@ -256,53 +246,6 @@ static struct throtl_data *sq_to_td(struct throtl_service_queue *sq)
 	}								\
 } while (0)
 
-static void tg_stats_init(struct tg_stats_cpu *tg_stats)
-{
-	blkg_rwstat_init(&tg_stats->service_bytes);
-	blkg_rwstat_init(&tg_stats->serviced);
-}
-
-/*
- * Worker for allocating per cpu stat for tgs. This is scheduled on the
- * system_wq once there are some groups on the alloc_list waiting for
- * allocation.
- */
-static void tg_stats_alloc_fn(struct work_struct *work)
-{
-	static struct tg_stats_cpu *stats_cpu;	/* this fn is non-reentrant */
-	struct delayed_work *dwork = to_delayed_work(work);
-	bool empty = false;
-
-alloc_stats:
-	if (!stats_cpu) {
-		int cpu;
-
-		stats_cpu = alloc_percpu(struct tg_stats_cpu);
-		if (!stats_cpu) {
-			/* allocation failed, try again after some time */
-			schedule_delayed_work(dwork, msecs_to_jiffies(10));
-			return;
-		}
-		for_each_possible_cpu(cpu)
-			tg_stats_init(per_cpu_ptr(stats_cpu, cpu));
-	}
-
-	spin_lock_irq(&tg_stats_alloc_lock);
-
-	if (!list_empty(&tg_stats_alloc_list)) {
-		struct throtl_grp *tg = list_first_entry(&tg_stats_alloc_list,
-							 struct throtl_grp,
-							 stats_alloc_node);
-		swap(tg->stats_cpu, stats_cpu);
-		list_del_init(&tg->stats_alloc_node);
-	}
-
-	empty = list_empty(&tg_stats_alloc_list);
-	spin_unlock_irq(&tg_stats_alloc_lock);
-	if (!empty)
-		goto alloc_stats;
-}
-
 static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
 {
 	INIT_LIST_HEAD(&qn->node);
@@ -405,7 +348,27 @@ static void throtl_service_queue_exit(struct throtl_service_queue *sq)
 
 static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 {
-	return kzalloc_node(sizeof(struct throtl_grp), gfp, node);
+	struct throtl_grp *tg;
+	int cpu;
+
+	tg = kzalloc_node(sizeof(*tg), gfp, node);
+	if (!tg)
+		return NULL;
+
+	tg->stats_cpu = alloc_percpu_gfp(struct tg_stats_cpu, gfp);
+	if (!tg->stats_cpu) {
+		kfree(tg);
+		return NULL;
+	}
+
+	for_each_possible_cpu(cpu) {
+		struct tg_stats_cpu *stats_cpu = per_cpu_ptr(tg->stats_cpu, cpu);
+
+		blkg_rwstat_init(&stats_cpu->service_bytes);
+		blkg_rwstat_init(&stats_cpu->serviced);
+	}
+
+	return &tg->pd;
 }
 
 static void throtl_pd_init(struct blkcg_gq *blkg)
@@ -413,7 +376,6 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	struct throtl_data *td = blkg->q->td;
 	struct throtl_service_queue *parent_sq;
-	unsigned long flags;
 	int rw;
 
 	/*
@@ -448,16 +410,6 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 	tg->bps[WRITE] = -1;
 	tg->iops[READ] = -1;
 	tg->iops[WRITE] = -1;
-
-	/*
-	 * Ugh... We need to perform per-cpu allocation for tg->stats_cpu
-	 * but percpu allocator can't be called from IO path.  Queue tg on
-	 * tg_stats_alloc_list and allocate from work item.
-	 */
-	spin_lock_irqsave(&tg_stats_alloc_lock, flags);
-	list_add(&tg->stats_alloc_node, &tg_stats_alloc_list);
-	schedule_delayed_work(&tg_stats_alloc_work, 0);
-	spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
 }
 
 /*
@@ -487,20 +439,16 @@ static void throtl_pd_online(struct blkcg_gq *blkg)
 static void throtl_pd_exit(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
-	unsigned long flags;
-
-	spin_lock_irqsave(&tg_stats_alloc_lock, flags);
-	list_del_init(&tg->stats_alloc_node);
-	spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
-
-	free_percpu(tg->stats_cpu);
 
 	throtl_service_queue_exit(&tg->service_queue);
 }
 
 static void throtl_pd_free(struct blkg_policy_data *pd)
 {
-	kfree(pd);
+	struct throtl_grp *tg = pd_to_tg(pd);
+
+	free_percpu(tg->stats_cpu);
+	kfree(tg);
 }
 
 static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
@@ -508,9 +456,6 @@ static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	int cpu;
 
-	if (tg->stats_cpu == NULL)
-		return;
-
 	for_each_possible_cpu(cpu) {
 		struct tg_stats_cpu *sc = per_cpu_ptr(tg->stats_cpu, cpu);
 
@@ -973,10 +918,6 @@ static void throtl_update_dispatch_stats(struct blkcg_gq *blkg, u64 bytes,
 	struct tg_stats_cpu *stats_cpu;
 	unsigned long flags;
 
-	/* If per cpu stats are not allocated yet, don't do any accounting. */
-	if (tg->stats_cpu == NULL)
-		return;
-
 	/*
 	 * Disabling interrupts to provide mutual exclusion between two
 	 * writes on same cpu. It probably is not needed for 64bit. Not
@@ -1302,9 +1243,6 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
 	struct blkg_rwstat rwstat = { }, tmp;
 	int i, cpu;
 
-	if (tg->stats_cpu == NULL)
-		return 0;
-
 	for_each_possible_cpu(cpu) {
 		struct tg_stats_cpu *sc = per_cpu_ptr(tg->stats_cpu, cpu);
 
-- 
2.4.3



* [PATCH 6/7] blk-throttle: clean up blkg_policy_data alloc/init/exit/free methods
@ 2015-06-24  2:44   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

With the recent addition of the alloc and free methods, things became
messier.  This patch reorganizes them as follows.

* ->pd_alloc_fn()

  Responsible for allocation and static initializations - the ones
  which can be done independently of where the pd might be attached.

* ->pd_init_fn()

  Initializations which require knowledge of where the pd is
  attached.

* ->pd_free_fn()

  The counterpart of ->pd_alloc_fn().  Static de-init and freeing.

This leaves ->pd_exit_fn() without any users.  Removed.

While at it, collapse the one-liner function throtl_pd_exit(), which
has only one user, into its caller.
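
To make the division concrete, a skeletal policy might look like this
(the foo_* names and helpers are made up for illustration; signatures
are as of this patch, before the pd conversion in the next one):

	static struct blkg_policy_data *foo_pd_alloc(gfp_t gfp, int node)
	{
		struct foo_grp *fg = kzalloc_node(sizeof(*fg), gfp, node);

		if (!fg)
			return NULL;
		foo_init_defaults(fg);	/* placement-independent init */
		return &fg->pd;
	}

	static void foo_pd_init(struct blkcg_gq *blkg)
	{
		/* init which depends on where the pd got attached */
		blkg_to_foo(blkg)->parent = foo_parent_of(blkg);
	}

	static void foo_pd_free(struct blkg_policy_data *pd)
	{
		/* static de-init, then freeing */
		kfree(container_of(pd, struct foo_grp, pd));
	}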

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c         | 11 ---------
 block/blk-throttle.c       | 57 ++++++++++++++++------------------------------
 block/cfq-iosched.c        | 15 ++++++++----
 include/linux/blk-cgroup.h |  2 --
 4 files changed, 31 insertions(+), 54 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 574971a..9d434a9 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -389,15 +389,6 @@ static void blkg_destroy_all(struct request_queue *q)
 void __blkg_release_rcu(struct rcu_head *rcu_head)
 {
 	struct blkcg_gq *blkg = container_of(rcu_head, struct blkcg_gq, rcu_head);
-	int i;
-
-	/* tell policies that this one is being freed */
-	for (i = 0; i < BLKCG_MAX_POLS; i++) {
-		struct blkcg_policy *pol = blkcg_policy[i];
-
-		if (blkg->pd[i] && pol->pd_exit_fn)
-			pol->pd_exit_fn(blkg);
-	}
 
 	/* release the blkcg and parent blkg refs this blkg has been holding */
 	css_put(&blkg->blkcg->css);
@@ -1091,8 +1082,6 @@ void blkcg_deactivate_policy(struct request_queue *q,
 
 		if (pol->pd_offline_fn)
 			pol->pd_offline_fn(blkg);
-		if (pol->pd_exit_fn)
-			pol->pd_exit_fn(blkg);
 
 		if (blkg->pd[pol->plid]) {
 			pol->pd_free_fn(blkg->pd[pol->plid]);
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 3c86976..c3a235b 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -330,26 +330,19 @@ static struct bio *throtl_pop_queued(struct list_head *queued,
 }
 
 /* init a service_queue, assumes the caller zeroed it */
-static void throtl_service_queue_init(struct throtl_service_queue *sq,
-				      struct throtl_service_queue *parent_sq)
+static void throtl_service_queue_init(struct throtl_service_queue *sq)
 {
 	INIT_LIST_HEAD(&sq->queued[0]);
 	INIT_LIST_HEAD(&sq->queued[1]);
 	sq->pending_tree = RB_ROOT;
-	sq->parent_sq = parent_sq;
 	setup_timer(&sq->pending_timer, throtl_pending_timer_fn,
 		    (unsigned long)sq);
 }
 
-static void throtl_service_queue_exit(struct throtl_service_queue *sq)
-{
-	del_timer_sync(&sq->pending_timer);
-}
-
 static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 {
 	struct throtl_grp *tg;
-	int cpu;
+	int rw, cpu;
 
 	tg = kzalloc_node(sizeof(*tg), gfp, node);
 	if (!tg)
@@ -361,6 +354,19 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 		return NULL;
 	}
 
+	throtl_service_queue_init(&tg->service_queue);
+
+	for (rw = READ; rw <= WRITE; rw++) {
+		throtl_qnode_init(&tg->qnode_on_self[rw], tg);
+		throtl_qnode_init(&tg->qnode_on_parent[rw], tg);
+	}
+
+	RB_CLEAR_NODE(&tg->rb_node);
+	tg->bps[READ] = -1;
+	tg->bps[WRITE] = -1;
+	tg->iops[READ] = -1;
+	tg->iops[WRITE] = -1;
+
 	for_each_possible_cpu(cpu) {
 		struct tg_stats_cpu *stats_cpu = per_cpu_ptr(tg->stats_cpu, cpu);
 
@@ -375,8 +381,7 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	struct throtl_data *td = blkg->q->td;
-	struct throtl_service_queue *parent_sq;
-	int rw;
+	struct throtl_service_queue *sq = &tg->service_queue;
 
 	/*
 	 * If on the default hierarchy, we switch to properly hierarchical
@@ -391,25 +396,10 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 	 * Limits of a group don't interact with limits of other groups
 	 * regardless of the position of the group in the hierarchy.
 	 */
-	parent_sq = &td->service_queue;
-
+	sq->parent_sq = &td->service_queue;
 	if (cgroup_on_dfl(blkg->blkcg->css.cgroup) && blkg->parent)
-		parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
-
-	throtl_service_queue_init(&tg->service_queue, parent_sq);
-
-	for (rw = READ; rw <= WRITE; rw++) {
-		throtl_qnode_init(&tg->qnode_on_self[rw], tg);
-		throtl_qnode_init(&tg->qnode_on_parent[rw], tg);
-	}
-
-	RB_CLEAR_NODE(&tg->rb_node);
+		sq->parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
 	tg->td = td;
-
-	tg->bps[READ] = -1;
-	tg->bps[WRITE] = -1;
-	tg->iops[READ] = -1;
-	tg->iops[WRITE] = -1;
 }
 
 /*
@@ -436,17 +426,11 @@ static void throtl_pd_online(struct blkcg_gq *blkg)
 	tg_update_has_rules(blkg_to_tg(blkg));
 }
 
-static void throtl_pd_exit(struct blkcg_gq *blkg)
-{
-	struct throtl_grp *tg = blkg_to_tg(blkg);
-
-	throtl_service_queue_exit(&tg->service_queue);
-}
-
 static void throtl_pd_free(struct blkg_policy_data *pd)
 {
 	struct throtl_grp *tg = pd_to_tg(pd);
 
+	del_timer_sync(&tg->service_queue.pending_timer);
 	free_percpu(tg->stats_cpu);
 	kfree(tg);
 }
@@ -1421,7 +1405,6 @@ static struct blkcg_policy blkcg_policy_throtl = {
 	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
-	.pd_exit_fn		= throtl_pd_exit,
 	.pd_free_fn		= throtl_pd_free,
 	.pd_reset_stats_fn	= throtl_pd_reset_stats,
 };
@@ -1616,7 +1599,7 @@ int blk_throtl_init(struct request_queue *q)
 		return -ENOMEM;
 
 	INIT_WORK(&td->dispatch_work, blk_throtl_dispatch_work_fn);
-	throtl_service_queue_init(&td->service_queue, NULL);
+	throtl_service_queue_init(&td->service_queue);
 
 	q->td = td;
 	td->queue = q;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 6b7bf94..d281ea6 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1545,18 +1545,25 @@ static void cfqg_stats_init(struct cfqg_stats *stats)
 
 static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
 {
-	return kzalloc_node(sizeof(struct cfq_group), gfp, node);
+	struct cfq_group *cfqg;
+
+	cfqg = kzalloc_node(sizeof(*cfqg), gfp, node);
+	if (!cfqg)
+		return NULL;
+
+	cfq_init_cfqg_base(cfqg);
+	cfqg_stats_init(&cfqg->stats);
+	cfqg_stats_init(&cfqg->dead_stats);
+
+	return &cfqg->pd;
 }
 
 static void cfq_pd_init(struct blkcg_gq *blkg)
 {
 	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
 
-	cfq_init_cfqg_base(cfqg);
 	cfqg->weight = blkg->blkcg->cfq_weight;
 	cfqg->leaf_weight = blkg->blkcg->cfq_leaf_weight;
-	cfqg_stats_init(&cfqg->stats);
-	cfqg_stats_init(&cfqg->dead_stats);
 }
 
 static void cfq_pd_offline(struct blkcg_gq *blkg)
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 63f6340..9603ae1 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -124,7 +124,6 @@ typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_online_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_offline_pd_fn)(struct blkcg_gq *blkg);
-typedef void (blkcg_pol_exit_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkcg_gq *blkg);
 
@@ -138,7 +137,6 @@ struct blkcg_policy {
 	blkcg_pol_init_pd_fn		*pd_init_fn;
 	blkcg_pol_online_pd_fn		*pd_online_fn;
 	blkcg_pol_offline_pd_fn		*pd_offline_fn;
-	blkcg_pol_exit_pd_fn		*pd_exit_fn;
 	blkcg_pol_free_pd_fn		*pd_free_fn;
 	blkcg_pol_reset_pd_stats_fn	*pd_reset_stats_fn;
 };
-- 
2.4.3



* [PATCH 7/7] blkcg: make blkcg_policy methods take a pointer to blkcg_policy_data
@ 2015-06-24  2:44   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-06-24  2:44 UTC (permalink / raw)
  To: axboe
  Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, kernel-team, Tejun Heo

The newly added ->pd_alloc_fn() and ->pd_free_fn() deal with pd
(blkg_policy_data) while the older methods take blkg (blkcg_gq).  As
blkg doesn't make sense for ->pd_alloc_fn(), as a pd can always be
mapped back to its blkg after allocation, and as these are
policy-specific methods, it makes sense to converge on pd.

This patch makes all methods deal with pd instead of blkg.  Most
conversions are trivial.  In blk-cgroup.c, a couple of method
invocation sites now test whether pd exists instead of the policy
state, for consistency.  This shouldn't cause any behavioral
differences.
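
For illustration, mapping a pd back is trivial because every pd
records the blkg it's attached to (hypothetical "foo" helpers; compare
pd_to_tg() / tg_to_blkg() in the blk-throttle hunk below):

	static inline struct foo_grp *pd_to_foo(struct blkg_policy_data *pd)
	{
		return pd ? container_of(pd, struct foo_grp, pd) : NULL;
	}

	static inline struct blkcg_gq *foo_to_blkg(struct foo_grp *fg)
	{
		return fg->pd.blkg;	/* set by blkcg core when attached */
	}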

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c         | 18 ++++++++----------
 block/blk-throttle.c       | 13 +++++++------
 block/cfq-iosched.c        | 17 +++++++++--------
 include/linux/blk-cgroup.h |  8 ++++----
 4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 9d434a9..15d9628b 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -229,7 +229,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
 		struct blkcg_policy *pol = blkcg_policy[i];
 
 		if (blkg->pd[i] && pol->pd_init_fn)
-			pol->pd_init_fn(blkg);
+			pol->pd_init_fn(blkg->pd[i]);
 	}
 
 	/* insert */
@@ -243,7 +243,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
 			struct blkcg_policy *pol = blkcg_policy[i];
 
 			if (blkg->pd[i] && pol->pd_online_fn)
-				pol->pd_online_fn(blkg);
+				pol->pd_online_fn(blkg->pd[i]);
 		}
 	}
 	blkg->online = true;
@@ -334,7 +334,7 @@ static void blkg_destroy(struct blkcg_gq *blkg)
 		struct blkcg_policy *pol = blkcg_policy[i];
 
 		if (blkg->pd[i] && pol->pd_offline_fn)
-			pol->pd_offline_fn(blkg);
+			pol->pd_offline_fn(blkg->pd[i]);
 	}
 	blkg->online = false;
 
@@ -468,9 +468,8 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
 		for (i = 0; i < BLKCG_MAX_POLS; i++) {
 			struct blkcg_policy *pol = blkcg_policy[i];
 
-			if (blkcg_policy_enabled(blkg->q, pol) &&
-			    pol->pd_reset_stats_fn)
-				pol->pd_reset_stats_fn(blkg);
+			if (blkg->pd[i] && pol->pd_reset_stats_fn)
+				pol->pd_reset_stats_fn(blkg->pd[i]);
 		}
 	}
 
@@ -1038,7 +1037,7 @@ int blkcg_activate_policy(struct request_queue *q,
 		pd->blkg = blkg;
 		pd->plid = pol->plid;
 		if (pol->pd_init_fn)
-			pol->pd_init_fn(blkg);
+			pol->pd_init_fn(pd);
 
 		spin_unlock(&blkg->blkcg->lock);
 	}
@@ -1080,10 +1079,9 @@ void blkcg_deactivate_policy(struct request_queue *q,
 		/* grab blkcg lock too while removing @pd from @blkg */
 		spin_lock(&blkg->blkcg->lock);
 
-		if (pol->pd_offline_fn)
-			pol->pd_offline_fn(blkg);
-
 		if (blkg->pd[pol->plid]) {
+			if (pol->pd_offline_fn)
+				pol->pd_offline_fn(blkg->pd[pol->plid]);
 			pol->pd_free_fn(blkg->pd[pol->plid]);
 			blkg->pd[pol->plid] = NULL;
 		}
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index c3a235b..c2c7547 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -377,9 +377,10 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 	return &tg->pd;
 }
 
-static void throtl_pd_init(struct blkcg_gq *blkg)
+static void throtl_pd_init(struct blkg_policy_data *pd)
 {
-	struct throtl_grp *tg = blkg_to_tg(blkg);
+	struct throtl_grp *tg = pd_to_tg(pd);
+	struct blkcg_gq *blkg = tg_to_blkg(tg);
 	struct throtl_data *td = blkg->q->td;
 	struct throtl_service_queue *sq = &tg->service_queue;
 
@@ -417,13 +418,13 @@ static void tg_update_has_rules(struct throtl_grp *tg)
 				    (tg->bps[rw] != -1 || tg->iops[rw] != -1);
 }
 
-static void throtl_pd_online(struct blkcg_gq *blkg)
+static void throtl_pd_online(struct blkg_policy_data *pd)
 {
 	/*
 	 * We don't want new groups to escape the limits of its ancestors.
 	 * Update has_rules[] after a new group is brought online.
 	 */
-	tg_update_has_rules(blkg_to_tg(blkg));
+	tg_update_has_rules(pd_to_tg(pd));
 }
 
 static void throtl_pd_free(struct blkg_policy_data *pd)
@@ -435,9 +436,9 @@ static void throtl_pd_free(struct blkg_policy_data *pd)
 	kfree(tg);
 }
 
-static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
+static void throtl_pd_reset_stats(struct blkg_policy_data *pd)
 {
-	struct throtl_grp *tg = blkg_to_tg(blkg);
+	struct throtl_grp *tg = pd_to_tg(pd);
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index d281ea6..d2bc961 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1558,17 +1558,18 @@ static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
 	return &cfqg->pd;
 }
 
-static void cfq_pd_init(struct blkcg_gq *blkg)
+static void cfq_pd_init(struct blkg_policy_data *pd)
 {
-	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
+	struct cfq_group *cfqg = pd_to_cfqg(pd);
+	struct blkcg *blkcg = cfqg_to_blkg(cfqg)->blkcg;
 
-	cfqg->weight = blkg->blkcg->cfq_weight;
-	cfqg->leaf_weight = blkg->blkcg->cfq_leaf_weight;
+	cfqg->weight = blkcg->cfq_weight;
+	cfqg->leaf_weight = blkcg->cfq_leaf_weight;
 }
 
-static void cfq_pd_offline(struct blkcg_gq *blkg)
+static void cfq_pd_offline(struct blkg_policy_data *pd)
 {
-	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
+	struct cfq_group *cfqg = pd_to_cfqg(pd);
 	int i;
 
 	for (i = 0; i < IOPRIO_BE_NR; i++) {
@@ -1621,9 +1622,9 @@ static struct blkg_rwstat cfqg_rwstat_pd_recursive_sum(struct blkg_policy_data *
 	return a;
 }
 
-static void cfq_pd_reset_stats(struct blkcg_gq *blkg)
+static void cfq_pd_reset_stats(struct blkg_policy_data *pd)
 {
-	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
+	struct cfq_group *cfqg = pd_to_cfqg(pd);
 
 	cfqg_stats_reset(&cfqg->stats);
 	cfqg_stats_reset(&cfqg->dead_stats);
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 9603ae1..cafc54a 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -121,11 +121,11 @@ struct blkcg_gq {
 };
 
 typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
-typedef void (blkcg_pol_init_pd_fn)(struct blkcg_gq *blkg);
-typedef void (blkcg_pol_online_pd_fn)(struct blkcg_gq *blkg);
-typedef void (blkcg_pol_offline_pd_fn)(struct blkcg_gq *blkg);
+typedef void (blkcg_pol_init_pd_fn)(struct blkg_policy_data *pd);
+typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd);
+typedef void (blkcg_pol_offline_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
-typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkcg_gq *blkg);
+typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkg_policy_data *pd);
 
 struct blkcg_policy {
 	int				plid;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread
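
To see the shape of the converted interface in one place: every method now
takes a blkg_policy_data, maps it to the policy's own type with
container_of(), and reaches queue or cgroup state through pd->blkg only when
it actually needs it.  Below is a minimal sketch of a hypothetical policy
written against the new signatures; example_grp, pd_to_eg() and the stub
bodies are illustrative placeholders, not code from the patch.

/* sketch only; assumes <linux/blk-cgroup.h> with this series applied */
struct example_grp {
	struct blkg_policy_data pd;	/* embedded, recovered via container_of() */
	unsigned int weight;
};

static inline struct example_grp *pd_to_eg(struct blkg_policy_data *pd)
{
	return pd ? container_of(pd, struct example_grp, pd) : NULL;
}

static struct blkg_policy_data *example_pd_alloc(gfp_t gfp, int node)
{
	struct example_grp *eg = kzalloc_node(sizeof(*eg), gfp, node);

	return eg ? &eg->pd : NULL;
}

static void example_pd_init(struct blkg_policy_data *pd)
{
	struct example_grp *eg = pd_to_eg(pd);

	/* pd->blkg and pd->plid are set before ->pd_init_fn() runs */
	eg->weight = 100;	/* hypothetical default */
}

static void example_pd_free(struct blkg_policy_data *pd)
{
	kfree(pd_to_eg(pd));
}

Registering such a policy is unchanged by this patch: the methods are still
wired up through struct blkcg_policy; only their argument type has moved
from blkcg_gq to blkg_policy_data.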


* Re: [PATCH 5/7] blk-throttle: remove asynchronous percpu stats allocation mechanism
@ 2015-06-25 19:10     ` Vivek Goyal
  0 siblings, 0 replies; 19+ messages in thread
From: Vivek Goyal @ 2015-06-25 19:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, linux-kernel, cgroups, avanzini.arianna, kernel-team

On Tue, Jun 23, 2015 at 10:44:11PM -0400, Tejun Heo wrote:
> Because the percpu allocator couldn't do non-blocking allocations,
> blk-throttle was forced to implement an ad-hoc asynchronous allocation
> mechanism for its percpu stats for cases where blkg's (blkcg_gq's) are
> allocated from an IO path without a sleepable context.
> 
> Now that the percpu allocator can handle gfp_mask and blkg_policy_data
> alloc / free are handled by policy methods, the ad-hoc asynchronous
> allocation mechanism can be replaced with direct allocation in
> throtl_pd_alloc().  Rip it out.
> 
> This ensures that an active throtl_grp always has valid non-NULL
> ->stats_cpu.  Remove checks on it.

This is a nice cleanup. Allocating stats memory with the help of
a worker thread was twisted.

Thanks
Vivek

> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> ---
>  block/blk-throttle.c | 112 ++++++++++++---------------------------------------
>  1 file changed, 25 insertions(+), 87 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index f1dd691..3c86976 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -144,9 +144,6 @@ struct throtl_grp {
>  
>  	/* Per cpu stats pointer */
>  	struct tg_stats_cpu __percpu *stats_cpu;
> -
> -	/* List of tgs waiting for per cpu stats memory to be allocated */
> -	struct list_head stats_alloc_node;
>  };
>  
>  struct throtl_data
> @@ -168,13 +165,6 @@ struct throtl_data
>  	struct work_struct dispatch_work;
>  };
>  
> -/* list and work item to allocate percpu group stats */
> -static DEFINE_SPINLOCK(tg_stats_alloc_lock);
> -static LIST_HEAD(tg_stats_alloc_list);
> -
> -static void tg_stats_alloc_fn(struct work_struct *);
> -static DECLARE_DELAYED_WORK(tg_stats_alloc_work, tg_stats_alloc_fn);
> -
>  static void throtl_pending_timer_fn(unsigned long arg);
>  
>  static inline struct throtl_grp *pd_to_tg(struct blkg_policy_data *pd)
> @@ -256,53 +246,6 @@ static struct throtl_data *sq_to_td(struct throtl_service_queue *sq)
>  	}								\
>  } while (0)
>  
> -static void tg_stats_init(struct tg_stats_cpu *tg_stats)
> -{
> -	blkg_rwstat_init(&tg_stats->service_bytes);
> -	blkg_rwstat_init(&tg_stats->serviced);
> -}
> -
> -/*
> - * Worker for allocating per cpu stat for tgs. This is scheduled on the
> - * system_wq once there are some groups on the alloc_list waiting for
> - * allocation.
> - */
> -static void tg_stats_alloc_fn(struct work_struct *work)
> -{
> -	static struct tg_stats_cpu *stats_cpu;	/* this fn is non-reentrant */
> -	struct delayed_work *dwork = to_delayed_work(work);
> -	bool empty = false;
> -
> -alloc_stats:
> -	if (!stats_cpu) {
> -		int cpu;
> -
> -		stats_cpu = alloc_percpu(struct tg_stats_cpu);
> -		if (!stats_cpu) {
> -			/* allocation failed, try again after some time */
> -			schedule_delayed_work(dwork, msecs_to_jiffies(10));
> -			return;
> -		}
> -		for_each_possible_cpu(cpu)
> -			tg_stats_init(per_cpu_ptr(stats_cpu, cpu));
> -	}
> -
> -	spin_lock_irq(&tg_stats_alloc_lock);
> -
> -	if (!list_empty(&tg_stats_alloc_list)) {
> -		struct throtl_grp *tg = list_first_entry(&tg_stats_alloc_list,
> -							 struct throtl_grp,
> -							 stats_alloc_node);
> -		swap(tg->stats_cpu, stats_cpu);
> -		list_del_init(&tg->stats_alloc_node);
> -	}
> -
> -	empty = list_empty(&tg_stats_alloc_list);
> -	spin_unlock_irq(&tg_stats_alloc_lock);
> -	if (!empty)
> -		goto alloc_stats;
> -}
> -
>  static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
>  {
>  	INIT_LIST_HEAD(&qn->node);
> @@ -405,7 +348,27 @@ static void throtl_service_queue_exit(struct throtl_service_queue *sq)
>  
>  static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
>  {
> -	return kzalloc_node(sizeof(struct throtl_grp), gfp, node);
> +	struct throtl_grp *tg;
> +	int cpu;
> +
> +	tg = kzalloc_node(sizeof(*tg), gfp, node);
> +	if (!tg)
> +		return NULL;
> +
> +	tg->stats_cpu = alloc_percpu_gfp(struct tg_stats_cpu, gfp);
> +	if (!tg->stats_cpu) {
> +		kfree(tg);
> +		return NULL;
> +	}
> +
> +	for_each_possible_cpu(cpu) {
> +		struct tg_stats_cpu *stats_cpu = per_cpu_ptr(tg->stats_cpu, cpu);
> +
> +		blkg_rwstat_init(&stats_cpu->service_bytes);
> +		blkg_rwstat_init(&stats_cpu->serviced);
> +	}
> +
> +	return &tg->pd;
>  }
>  
>  static void throtl_pd_init(struct blkcg_gq *blkg)
> @@ -413,7 +376,6 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
>  	struct throtl_grp *tg = blkg_to_tg(blkg);
>  	struct throtl_data *td = blkg->q->td;
>  	struct throtl_service_queue *parent_sq;
> -	unsigned long flags;
>  	int rw;
>  
>  	/*
> @@ -448,16 +410,6 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
>  	tg->bps[WRITE] = -1;
>  	tg->iops[READ] = -1;
>  	tg->iops[WRITE] = -1;
> -
> -	/*
> -	 * Ugh... We need to perform per-cpu allocation for tg->stats_cpu
> -	 * but percpu allocator can't be called from IO path.  Queue tg on
> -	 * tg_stats_alloc_list and allocate from work item.
> -	 */
> -	spin_lock_irqsave(&tg_stats_alloc_lock, flags);
> -	list_add(&tg->stats_alloc_node, &tg_stats_alloc_list);
> -	schedule_delayed_work(&tg_stats_alloc_work, 0);
> -	spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
>  }
>  
>  /*
> @@ -487,20 +439,16 @@ static void throtl_pd_online(struct blkcg_gq *blkg)
>  static void throtl_pd_exit(struct blkcg_gq *blkg)
>  {
>  	struct throtl_grp *tg = blkg_to_tg(blkg);
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&tg_stats_alloc_lock, flags);
> -	list_del_init(&tg->stats_alloc_node);
> -	spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
> -
> -	free_percpu(tg->stats_cpu);
>  
>  	throtl_service_queue_exit(&tg->service_queue);
>  }
>  
>  static void throtl_pd_free(struct blkg_policy_data *pd)
>  {
> -	kfree(pd);
> +	struct throtl_grp *tg = pd_to_tg(pd);
> +
> +	free_percpu(tg->stats_cpu);
> +	kfree(tg);
>  }
>  
>  static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
> @@ -508,9 +456,6 @@ static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
>  	struct throtl_grp *tg = blkg_to_tg(blkg);
>  	int cpu;
>  
> -	if (tg->stats_cpu == NULL)
> -		return;
> -
>  	for_each_possible_cpu(cpu) {
>  		struct tg_stats_cpu *sc = per_cpu_ptr(tg->stats_cpu, cpu);
>  
> @@ -973,10 +918,6 @@ static void throtl_update_dispatch_stats(struct blkcg_gq *blkg, u64 bytes,
>  	struct tg_stats_cpu *stats_cpu;
>  	unsigned long flags;
>  
> -	/* If per cpu stats are not allocated yet, don't do any accounting. */
> -	if (tg->stats_cpu == NULL)
> -		return;
> -
>  	/*
>  	 * Disabling interrupts to provide mutual exclusion between two
>  	 * writes on same cpu. It probably is not needed for 64bit. Not
> @@ -1302,9 +1243,6 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
>  	struct blkg_rwstat rwstat = { }, tmp;
>  	int i, cpu;
>  
> -	if (tg->stats_cpu == NULL)
> -		return 0;
> -
>  	for_each_possible_cpu(cpu) {
>  		struct tg_stats_cpu *sc = per_cpu_ptr(tg->stats_cpu, cpu);
>  
> -- 
> 2.4.3

^ permalink raw reply	[flat|nested] 19+ messages in thread
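
The enabling change behind the ack above is that the percpu allocator now
takes a gfp_mask, so per-cpu stats can be allocated inline even when
->pd_alloc_fn() is called from an IO path.  A condensed sketch of the
pattern throtl_pd_alloc() uses in the quoted patch; my_stats and its field
are placeholder names, not kernel symbols:

struct my_stats {
	u64 bytes;
};

static struct my_stats __percpu *my_stats_alloc(gfp_t gfp)
{
	struct my_stats __percpu *stats;
	int cpu;

	/* IO-path callers can pass GFP_NOWAIT; sleepable callers GFP_KERNEL.
	 * This is what makes the deferred-worker scheme unnecessary. */
	stats = alloc_percpu_gfp(struct my_stats, gfp);
	if (!stats)
		return NULL;

	/* per-CPU init, mirroring the blkg_rwstat_init() loop above */
	for_each_possible_cpu(cpu)
		per_cpu_ptr(stats, cpu)->bytes = 0;

	return stats;
}

On failure the caller simply unwinds and frees, exactly as throtl_pd_alloc()
does with kfree(tg); there is no retry state left to manage.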


* [PATCH 1/7] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl()
@ 2015-07-07 15:51   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-07-07 15:51 UTC (permalink / raw)
  To: axboe
  Cc: jack, linux-kernel, cgroups, kernel-team, vgoyal,
	avanzini.arianna, Tejun Heo

Since ec13b1d6f0a0 ("blkcg: always create the blkcg_gq for the root
blkcg"), a request_list always has a blkg associated with it.  Drop
the unnecessary rl->blkg NULL test from blk_put_rl().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 include/linux/blk-cgroup.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 58cfab8..2e41f8c 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -399,8 +399,7 @@ static inline struct request_list *blk_get_rl(struct request_queue *q,
  */
 static inline void blk_put_rl(struct request_list *rl)
 {
-	/* root_rl may not have blkg set */
-	if (rl->blkg && rl->blkg->blkcg != &blkcg_root)
+	if (rl->blkg->blkcg != &blkcg_root)
 		blkg_put(rl->blkg);
 }
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread
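
In usage terms, blk_put_rl() is the release half of a get/put pair with
blk_get_rl(), and the patch leans on the new invariant that both halves
always see a non-NULL rl->blkg.  A hypothetical call site, with locking and
request setup elided (example_submit is not a real kernel function):

static void example_submit(struct request_queue *q, struct bio *bio)
{
	struct request_list *rl;

	rl = blk_get_rl(q, bio);	/* request_list with ->blkg always set */

	/* ... allocate and dispatch a request against rl ... */

	blk_put_rl(rl);	/* drops the blkg ref unless it is the root blkcg's */
}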


end of thread, other threads:[~2015-07-07 15:52 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-24  2:44 [PATCHSET block/for-4.2/writeback] blkcg: blkcg_policy methods cleanup Tejun Heo
2015-06-24  2:44 ` Tejun Heo
2015-06-24  2:44 ` [PATCH 1/7] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() Tejun Heo
2015-06-24  2:44   ` Tejun Heo
2015-06-24  2:44 ` [PATCH 2/7] blkcg: use blkg_free() in blkcg_init_queue() failure path Tejun Heo
2015-06-24  2:44   ` Tejun Heo
2015-06-24  2:44 ` [PATCH 3/7] blkcg: make blkcg_activate_policy() allow NULL ->pd_init_fn Tejun Heo
2015-06-24  2:44   ` Tejun Heo
2015-06-24  2:44 ` [PATCH 4/7] blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methods Tejun Heo
2015-06-24  2:44   ` Tejun Heo
2015-06-24  2:44 ` [PATCH 5/7] blk-throttle: remove asynchronous percpu stats allocation mechanism Tejun Heo
2015-06-25 19:10   ` Vivek Goyal
2015-06-25 19:10     ` Vivek Goyal
2015-06-24  2:44 ` [PATCH 6/7] blk-throttle: clean up blkg_policy_data alloc/init/exit/free methods Tejun Heo
2015-06-24  2:44   ` Tejun Heo
2015-06-24  2:44 ` [PATCH 7/7] blkcg: make blkcg_policy methods take a pointer to blkcg_policy_data Tejun Heo
2015-06-24  2:44   ` Tejun Heo
2015-07-07 15:51 [PATCHSET v2 block/for-4.3] blkcg: blkcg_policy methods cleanup Tejun Heo
2015-07-07 15:51 ` [PATCH 1/7] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() Tejun Heo
2015-07-07 15:51   ` Tejun Heo
