linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup
@ 2015-07-11 18:00 Tejun Heo
  2015-07-11 18:00 ` [PATCH 01/11] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() Tejun Heo
                   ` (10 more replies)
  0 siblings, 11 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna

This is v3 of blkcg_policy methods cleanup patchset.  Changes from the
last take [L] are

* Rebased on top of block/for-linus.

* 0003-blkcg-remove-unnecessary-blkcg_root-handling-from-cs.patch and
  0004-blkcg-restructure-blkg_policy_data-allocation-in-blk.patch
  added.  These are follow-up cleanups for the blkcg_policy_data
  handling fixes which went into block/for-linus.

* 0010-blkcg-cosmetic-updates-about-blkcg_policy_data.patch and
  0011-blkcg-replace-blkcg_policy-cpd_size-with-cpd_alloc-f.patch
  added so that blkcg_policy_data handling is consistent with
  blkg_policy_data handling.

This patchset contains assorted cleanups for blkcg_policy methods and
blk[c]g_policy_data handling.

* alloc/free added for blkg_policy_data.  exit dropped.

* alloc/free added for blkcg_policy_data.

* blk-throttle's async percpu allocation is replaced with direct
  allocation.

* all methods now take blk[c]g_policy_data instead of blkcg_gq or
  blkcg.

This patchset contains the following 11 patches.

 0001-blkcg-remove-unnecessary-request_list-blkg-NULL-test.patch
 0002-blkcg-use-blkg_free-in-blkcg_init_queue-failure-path.patch
 0003-blkcg-remove-unnecessary-blkcg_root-handling-from-cs.patch
 0004-blkcg-restructure-blkg_policy_data-allocation-in-blk.patch
 0005-blkcg-make-blkcg_activate_policy-allow-NULL-pd_init_.patch
 0006-blkcg-replace-blkcg_policy-pd_size-with-pd_alloc-fre.patch
 0007-blk-throttle-remove-asynchrnous-percpu-stats-allocat.patch
 0008-blk-throttle-clean-up-blkg_policy_data-alloc-init-ex.patch
 0009-blkcg-make-blkcg_policy-methods-take-a-pointer-to-bl.patch
 0010-blkcg-cosmetic-updates-about-blkcg_policy_data.patch
 0011-blkcg-replace-blkcg_policy-cpd_size-with-cpd_alloc-f.patch

0001-0005 are misc cleanups.  0006-0008 add alloc/free methods and
remove blk-throttle's async percpu allocation mechanism.  0009 makes
all methods take blkcg_policy_data.  0010-0011 apply similar cleanups
to blkcg_policy_data handling.

This patchset is also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-blkcg-methods-cleanup

and is on top of

  block/for-linus 06b285bd1125 ("blkcg: fix blkcg_policy_data allocation bug")
+ [1] [PATCHSET block/for-4.3] writeback: cgroup writeback updates
+ [2] [PATCHSET v2 block/for-4.3] block, cgroup: make cfq charge async IOs to the appropriate blkcgs

diffstat follows, thanks.

 block/blk-cgroup.c         |  171 +++++++++++++++++++-------------------------
 block/blk-throttle.c       |  173 +++++++++++++--------------------------------
 block/cfq-iosched.c        |   68 +++++++++++++----
 include/linux/blk-cgroup.h |   65 ++++++++--------
 4 files changed, 214 insertions(+), 263 deletions(-)

--
tejun

[L] http://lkml.kernel.org/g/1436284293-4666-1-git-send-email-tj@kernel.org
[1] http://lkml.kernel.org/g/1436281823-1947-1-git-send-email-tj@kernel.org
[2] http://lkml.kernel.org/g/1436283361-3889-1-git-send-email-tj@kernel.org

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 01/11] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl()
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 02/11] blkcg: use blkg_free() in blkcg_init_queue() failure path Tejun Heo
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

Since ec13b1d6f0a0 ("blkcg: always create the blkcg_gq for the root
blkcg"), a request_list always has its blkg associated.  Drop
unnecessary rl->blkg NULL test from blk_put_rl().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 include/linux/blk-cgroup.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 1b62d76..9711fc2 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -394,8 +394,7 @@ static inline struct request_list *blk_get_rl(struct request_queue *q,
  */
 static inline void blk_put_rl(struct request_list *rl)
 {
-	/* root_rl may not have blkg set */
-	if (rl->blkg && rl->blkg->blkcg != &blkcg_root)
+	if (rl->blkg->blkcg != &blkcg_root)
 		blkg_put(rl->blkg);
 }
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 02/11] blkcg: use blkg_free() in blkcg_init_queue() failure path
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
  2015-07-11 18:00 ` [PATCH 01/11] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 03/11] blkcg: remove unnecessary blkcg_root handling from css_alloc/free paths Tejun Heo
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

When blkcg_init_queue() fails midway after creating a new blkg, it
performs kfree() directly; however, this doesn't free the policy data
areas.  Make it use blkg_free() instead.  In turn, blkg_free() is
updated to handle root request_list special case.

While this fixes a possible memory leak, it's on an unlikely failure
path of an already cold path and the size leaked per occurrence is
miniscule too.  I don't think it needs to be tagged for -stable.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fbb0b65..64cc48f 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -70,7 +70,8 @@ static void blkg_free(struct blkcg_gq *blkg)
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
 		kfree(blkg->pd[i]);
 
-	blk_exit_rl(&blkg->rl);
+	if (blkg->blkcg != &blkcg_root)
+		blk_exit_rl(&blkg->rl);
 	kfree(blkg);
 }
 
@@ -934,7 +935,7 @@ int blkcg_init_queue(struct request_queue *q)
 		radix_tree_preload_end();
 
 	if (IS_ERR(blkg)) {
-		kfree(new_blkg);
+		blkg_free(new_blkg);
 		return PTR_ERR(blkg);
 	}
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 03/11] blkcg: remove unnecessary blkcg_root handling from css_alloc/free paths
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
  2015-07-11 18:00 ` [PATCH 01/11] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() Tejun Heo
  2015-07-11 18:00 ` [PATCH 02/11] blkcg: use blkg_free() in blkcg_init_queue() failure path Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 04/11] blkcg: restructure blkg_policy_data allocation in blkcg_activate_policy() Tejun Heo
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

blkcg_css_alloc() bypasses policy data allocation and blkcg_css_free()
bypasses policy data and blkcg freeing for blkcg_root.  There's no
reason to to treat policy data any differently for blkcg_root.  If the
root css gets allocated after policies are registered, policy
registration path will add policy data; otherwise, the alloc path
will.  The free path isn't never invoked for root csses.

This patch removes the unnecessary special handling of blkcg_root from
css_alloc/free paths.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 64cc48f..2a493ce 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -819,18 +819,15 @@ static void blkcg_css_offline(struct cgroup_subsys_state *css)
 static void blkcg_css_free(struct cgroup_subsys_state *css)
 {
 	struct blkcg *blkcg = css_to_blkcg(css);
+	int i;
 
 	mutex_lock(&blkcg_pol_mutex);
 	list_del(&blkcg->all_blkcgs_node);
 	mutex_unlock(&blkcg_pol_mutex);
 
-	if (blkcg != &blkcg_root) {
-		int i;
-
-		for (i = 0; i < BLKCG_MAX_POLS; i++)
-			kfree(blkcg->pd[i]);
-		kfree(blkcg);
-	}
+	for (i = 0; i < BLKCG_MAX_POLS; i++)
+		kfree(blkcg->pd[i]);
+	kfree(blkcg);
 }
 
 static struct cgroup_subsys_state *
@@ -844,13 +841,12 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 
 	if (!parent_css) {
 		blkcg = &blkcg_root;
-		goto done;
-	}
-
-	blkcg = kzalloc(sizeof(*blkcg), GFP_KERNEL);
-	if (!blkcg) {
-		ret = ERR_PTR(-ENOMEM);
-		goto free_blkcg;
+	} else {
+		blkcg = kzalloc(sizeof(*blkcg), GFP_KERNEL);
+		if (!blkcg) {
+			ret = ERR_PTR(-ENOMEM);
+			goto free_blkcg;
+		}
 	}
 
 	for (i = 0; i < BLKCG_MAX_POLS ; i++) {
@@ -877,7 +873,6 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 		pol->cpd_init_fn(blkcg);
 	}
 
-done:
 	spin_lock_init(&blkcg->lock);
 	INIT_RADIX_TREE(&blkcg->blkg_tree, GFP_NOWAIT);
 	INIT_HLIST_HEAD(&blkcg->blkg_list);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 04/11] blkcg: restructure blkg_policy_data allocation in blkcg_activate_policy()
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (2 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 03/11] blkcg: remove unnecessary blkcg_root handling from css_alloc/free paths Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 05/11] blkcg: make blkcg_activate_policy() allow NULL ->pd_init_fn Tejun Heo
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

When a policy gets activated, it needs to allocate and install its
policy data on all existing blkg's (blkcg_gq's).  Because blkg
iteration is protected by a spinlock, it currently counts the total
number of blkg's in the system, allocates the matching number of
policy data on a list and installs them during a single iteration.

This can be simplified by using speculative GFP_NOWAIT allocations
while iterating and falling back to a preallocated policy data on
failure.  If the preallocated one has already been consumed, it
releases the lock, preallocate with GFP_KERNEL and then restarts the
iteration.  This can be a bit more expensive than before but policy
activation is a very cold path and shouldn't matter.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-cgroup.c         | 55 ++++++++++++++++++----------------------------
 include/linux/blk-cgroup.h |  3 ---
 2 files changed, 21 insertions(+), 37 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 2a493ce..5dbbacd 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1043,65 +1043,52 @@ EXPORT_SYMBOL_GPL(blkio_cgrp_subsys);
 int blkcg_activate_policy(struct request_queue *q,
 			  const struct blkcg_policy *pol)
 {
-	LIST_HEAD(pds);
+	struct blkg_policy_data *pd_prealloc = NULL;
 	struct blkcg_gq *blkg;
-	struct blkg_policy_data *pd, *nd;
-	int cnt = 0, ret;
+	int ret;
 
 	if (blkcg_policy_enabled(q, pol))
 		return 0;
 
-	/* count and allocate policy_data for all existing blkgs */
 	blk_queue_bypass_start(q);
-	spin_lock_irq(q->queue_lock);
-	list_for_each_entry(blkg, &q->blkg_list, q_node)
-		cnt++;
-	spin_unlock_irq(q->queue_lock);
-
-	/* allocate per-blkg policy data for all existing blkgs */
-	while (cnt--) {
-		pd = kzalloc_node(pol->pd_size, GFP_KERNEL, q->node);
-		if (!pd) {
+pd_prealloc:
+	if (!pd_prealloc) {
+		pd_prealloc = kzalloc_node(pol->pd_size, GFP_KERNEL, q->node);
+		if (!pd_prealloc) {
 			ret = -ENOMEM;
-			goto out_free;
+			goto out_bypass_end;
 		}
-		list_add_tail(&pd->alloc_node, &pds);
 	}
 
-	/*
-	 * Install the allocated pds and cpds. With @q bypassing, no new blkg
-	 * should have been created while the queue lock was dropped.
-	 */
 	spin_lock_irq(q->queue_lock);
 
 	list_for_each_entry(blkg, &q->blkg_list, q_node) {
-		if (WARN_ON(list_empty(&pds))) {
-			/* umm... this shouldn't happen, just abort */
-			ret = -ENOMEM;
-			goto out_unlock;
-		}
-		pd = list_first_entry(&pds, struct blkg_policy_data, alloc_node);
-		list_del_init(&pd->alloc_node);
+		struct blkg_policy_data *pd;
 
-		/* grab blkcg lock too while installing @pd on @blkg */
-		spin_lock(&blkg->blkcg->lock);
+		if (blkg->pd[pol->plid])
+			continue;
+
+		pd = kzalloc_node(pol->pd_size, GFP_NOWAIT, q->node);
+		if (!pd)
+			swap(pd, pd_prealloc);
+		if (!pd) {
+			spin_unlock_irq(q->queue_lock);
+			goto pd_prealloc;
+		}
 
 		blkg->pd[pol->plid] = pd;
 		pd->blkg = blkg;
 		pd->plid = pol->plid;
 		pol->pd_init_fn(blkg);
-
-		spin_unlock(&blkg->blkcg->lock);
 	}
 
 	__set_bit(pol->plid, q->blkcg_pols);
 	ret = 0;
-out_unlock:
+
 	spin_unlock_irq(q->queue_lock);
-out_free:
+out_bypass_end:
 	blk_queue_bypass_end(q);
-	list_for_each_entry_safe(pd, nd, &pds, alloc_node)
-		kfree(pd);
+	kfree(pd_prealloc);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blkcg_activate_policy);
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 9711fc2..db82288 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -80,9 +80,6 @@ struct blkg_policy_data {
 	/* the blkg and policy id this per-policy data belongs to */
 	struct blkcg_gq			*blkg;
 	int				plid;
-
-	/* used during policy activation */
-	struct list_head		alloc_node;
 };
 
 /*
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 05/11] blkcg: make blkcg_activate_policy() allow NULL ->pd_init_fn
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (3 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 04/11] blkcg: restructure blkg_policy_data allocation in blkcg_activate_policy() Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 06/11] blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methods Tejun Heo
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

blkg_create() allows NULL ->pd_init_fn() but blkcg_activate_policy()
doesn't.  As both in-kernel policies implement ->pd_init_fn, it
currently doesn't break anything.  Update blkcg_activate_policy() so
that its behavior is consistent with blkg_create().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 5dbbacd..b558705 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1079,7 +1079,8 @@ int blkcg_activate_policy(struct request_queue *q,
 		blkg->pd[pol->plid] = pd;
 		pd->blkg = blkg;
 		pd->plid = pol->plid;
-		pol->pd_init_fn(blkg);
+		if (pol->pd_init_fn)
+			pol->pd_init_fn(blkg);
 	}
 
 	__set_bit(pol->plid, q->blkcg_pols);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 06/11] blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methods
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (4 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 05/11] blkcg: make blkcg_activate_policy() allow NULL ->pd_init_fn Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 07/11] blk-throttle: remove asynchrnous percpu stats allocation mechanism Tejun Heo
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

A blkg (blkcg_gq) represents the relationship between a cgroup and
request_queue.  Each active policy has a pd (blkg_policy_data) on each
blkg.  The pd's were allocated by blkcg core and each policy could
request to allocate extra space at the end by setting
blkcg_policy->pd_size larger than the size of pd.

This is a bit unusual but was done this way mostly to simplify error
handling and all the existing use cases could be handled this way;
however, this is becoming too restrictive now that percpu memory can
be allocated without blocking.

This introduces two new mandatory blkcg_policy methods - pd_alloc_fn()
and pd_free_fn() - which are used to allocate and release pd for a
given policy.  As pd allocation is now done from policy side, it can
simply allocate a larger area which embeds pd at the beginning.  This
change makes ->pd_size pointless.  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c         | 21 +++++++++++----------
 block/blk-throttle.c       | 13 ++++++++++++-
 block/cfq-iosched.c        | 13 ++++++++++++-
 include/linux/blk-cgroup.h | 18 +++++++++---------
 4 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index b558705..9d83623 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -68,7 +68,8 @@ static void blkg_free(struct blkcg_gq *blkg)
 		return;
 
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
-		kfree(blkg->pd[i]);
+		if (blkg->pd[i])
+			blkcg_policy[i]->pd_free_fn(blkg->pd[i]);
 
 	if (blkg->blkcg != &blkcg_root)
 		blk_exit_rl(&blkg->rl);
@@ -114,7 +115,7 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q,
 			continue;
 
 		/* alloc per-policy data and attach it to blkg */
-		pd = kzalloc_node(pol->pd_size, gfp_mask, q->node);
+		pd = pol->pd_alloc_fn(gfp_mask, q->node);
 		if (!pd)
 			goto err_free;
 
@@ -1053,7 +1054,7 @@ int blkcg_activate_policy(struct request_queue *q,
 	blk_queue_bypass_start(q);
 pd_prealloc:
 	if (!pd_prealloc) {
-		pd_prealloc = kzalloc_node(pol->pd_size, GFP_KERNEL, q->node);
+		pd_prealloc = pol->pd_alloc_fn(GFP_KERNEL, q->node);
 		if (!pd_prealloc) {
 			ret = -ENOMEM;
 			goto out_bypass_end;
@@ -1068,7 +1069,7 @@ int blkcg_activate_policy(struct request_queue *q,
 		if (blkg->pd[pol->plid])
 			continue;
 
-		pd = kzalloc_node(pol->pd_size, GFP_NOWAIT, q->node);
+		pd = pol->pd_alloc_fn(GFP_NOWAIT, q->node);
 		if (!pd)
 			swap(pd, pd_prealloc);
 		if (!pd) {
@@ -1089,7 +1090,8 @@ int blkcg_activate_policy(struct request_queue *q,
 	spin_unlock_irq(q->queue_lock);
 out_bypass_end:
 	blk_queue_bypass_end(q);
-	kfree(pd_prealloc);
+	if (pd_prealloc)
+		pol->pd_free_fn(pd_prealloc);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blkcg_activate_policy);
@@ -1124,8 +1126,10 @@ void blkcg_deactivate_policy(struct request_queue *q,
 		if (pol->pd_exit_fn)
 			pol->pd_exit_fn(blkg);
 
-		kfree(blkg->pd[pol->plid]);
-		blkg->pd[pol->plid] = NULL;
+		if (blkg->pd[pol->plid]) {
+			pol->pd_free_fn(blkg->pd[pol->plid]);
+			blkg->pd[pol->plid] = NULL;
+		}
 
 		spin_unlock(&blkg->blkcg->lock);
 	}
@@ -1147,9 +1151,6 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 	struct blkcg *blkcg;
 	int i, ret;
 
-	if (WARN_ON(pol->pd_size < sizeof(struct blkg_policy_data)))
-		return -EINVAL;
-
 	mutex_lock(&blkcg_pol_register_mutex);
 	mutex_lock(&blkcg_pol_mutex);
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index b231935..f1dd691 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -403,6 +403,11 @@ static void throtl_service_queue_exit(struct throtl_service_queue *sq)
 	del_timer_sync(&sq->pending_timer);
 }
 
+static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
+{
+	return kzalloc_node(sizeof(struct throtl_grp), gfp, node);
+}
+
 static void throtl_pd_init(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
@@ -493,6 +498,11 @@ static void throtl_pd_exit(struct blkcg_gq *blkg)
 	throtl_service_queue_exit(&tg->service_queue);
 }
 
+static void throtl_pd_free(struct blkg_policy_data *pd)
+{
+	kfree(pd);
+}
+
 static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
@@ -1468,12 +1478,13 @@ static void throtl_shutdown_wq(struct request_queue *q)
 }
 
 static struct blkcg_policy blkcg_policy_throtl = {
-	.pd_size		= sizeof(struct throtl_grp),
 	.cftypes		= throtl_files,
 
+	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
 	.pd_exit_fn		= throtl_pd_exit,
+	.pd_free_fn		= throtl_pd_free,
 	.pd_reset_stats_fn	= throtl_pd_reset_stats,
 };
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 9c9ec7c..69ce288 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1582,6 +1582,11 @@ static void cfq_cpd_init(const struct blkcg *blkcg)
 	}
 }
 
+static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
+{
+	return kzalloc_node(sizeof(struct cfq_group), gfp, node);
+}
+
 static void cfq_pd_init(struct blkcg_gq *blkg)
 {
 	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
@@ -1618,6 +1623,11 @@ static void cfq_pd_offline(struct blkcg_gq *blkg)
 	cfqg_stats_xfer_dead(cfqg);
 }
 
+static void cfq_pd_free(struct blkg_policy_data *pd)
+{
+	return kfree(pd);
+}
+
 /* offset delta from cfqg->stats to cfqg->dead_stats */
 static const int dead_stats_off_delta = offsetof(struct cfq_group, dead_stats) -
 					offsetof(struct cfq_group, stats);
@@ -4633,13 +4643,14 @@ static struct elevator_type iosched_cfq = {
 
 #ifdef CONFIG_CFQ_GROUP_IOSCHED
 static struct blkcg_policy blkcg_policy_cfq = {
-	.pd_size		= sizeof(struct cfq_group),
 	.cpd_size		= sizeof(struct cfq_group_data),
 	.cftypes		= cfq_blkcg_files,
 
 	.cpd_init_fn		= cfq_cpd_init,
+	.pd_alloc_fn		= cfq_pd_alloc,
 	.pd_init_fn		= cfq_pd_init,
 	.pd_offline_fn		= cfq_pd_offline,
+	.pd_free_fn		= cfq_pd_free,
 	.pd_reset_stats_fn	= cfq_pd_reset_stats,
 };
 #endif
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index db82288..bd173ea 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -68,13 +68,11 @@ struct blkg_rwstat {
  * request_queue (q).  This is used by blkcg policies which need to track
  * information per blkcg - q pair.
  *
- * There can be multiple active blkcg policies and each has its private
- * data on each blkg, the size of which is determined by
- * blkcg_policy->pd_size.  blkcg core allocates and frees such areas
- * together with blkg and invokes pd_init/exit_fn() methods.
- *
- * Such private data must embed struct blkg_policy_data (pd) at the
- * beginning and pd_size can't be smaller than pd.
+ * There can be multiple active blkcg policies and each blkg:policy pair is
+ * represented by a blkg_policy_data which is allocated and freed by each
+ * policy's pd_alloc/free_fn() methods.  A policy can allocate private data
+ * area by allocating larger data structure which embeds blkg_policy_data
+ * at the beginning.
  */
 struct blkg_policy_data {
 	/* the blkg and policy id this per-policy data belongs to */
@@ -126,16 +124,16 @@ struct blkcg_gq {
 };
 
 typedef void (blkcg_pol_init_cpd_fn)(const struct blkcg *blkcg);
+typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_online_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_offline_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_exit_pd_fn)(struct blkcg_gq *blkg);
+typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkcg_gq *blkg);
 
 struct blkcg_policy {
 	int				plid;
-	/* policy specific private data size */
-	size_t				pd_size;
 	/* policy specific per-blkcg data size */
 	size_t				cpd_size;
 	/* cgroup files for the policy */
@@ -143,10 +141,12 @@ struct blkcg_policy {
 
 	/* operations */
 	blkcg_pol_init_cpd_fn		*cpd_init_fn;
+	blkcg_pol_alloc_pd_fn		*pd_alloc_fn;
 	blkcg_pol_init_pd_fn		*pd_init_fn;
 	blkcg_pol_online_pd_fn		*pd_online_fn;
 	blkcg_pol_offline_pd_fn		*pd_offline_fn;
 	blkcg_pol_exit_pd_fn		*pd_exit_fn;
+	blkcg_pol_free_pd_fn		*pd_free_fn;
 	blkcg_pol_reset_pd_stats_fn	*pd_reset_stats_fn;
 };
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 07/11] blk-throttle: remove asynchrnous percpu stats allocation mechanism
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (5 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 06/11] blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methods Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 08/11] blk-throttle: clean up blkg_policy_data alloc/init/exit/free methods Tejun Heo
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

Because percpu allocator couldn't do non-blocking allocations,
blk-throttle was forced to implement an ad-hoc asynchronous allocation
mechanism for its percpu stats for cases where blkg's (blkcg_gq's) are
allocated from an IO path without sleepable context.

Now that percpu allocator can handle gfp_mask and blkg_policy_data
alloc / free are handled by policy methods, the ad-hoc asynchronous
allocation mechanism can be replaced with direct allocation from
tg_stats_alloc_fn().  Rit it out.

This ensures that an active throtl_grp always has valid non-NULL
->stats_cpu.  Remove checks on it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-throttle.c | 112 ++++++++++++---------------------------------------
 1 file changed, 25 insertions(+), 87 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index f1dd691..3c86976 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -144,9 +144,6 @@ struct throtl_grp {
 
 	/* Per cpu stats pointer */
 	struct tg_stats_cpu __percpu *stats_cpu;
-
-	/* List of tgs waiting for per cpu stats memory to be allocated */
-	struct list_head stats_alloc_node;
 };
 
 struct throtl_data
@@ -168,13 +165,6 @@ struct throtl_data
 	struct work_struct dispatch_work;
 };
 
-/* list and work item to allocate percpu group stats */
-static DEFINE_SPINLOCK(tg_stats_alloc_lock);
-static LIST_HEAD(tg_stats_alloc_list);
-
-static void tg_stats_alloc_fn(struct work_struct *);
-static DECLARE_DELAYED_WORK(tg_stats_alloc_work, tg_stats_alloc_fn);
-
 static void throtl_pending_timer_fn(unsigned long arg);
 
 static inline struct throtl_grp *pd_to_tg(struct blkg_policy_data *pd)
@@ -256,53 +246,6 @@ static struct throtl_data *sq_to_td(struct throtl_service_queue *sq)
 	}								\
 } while (0)
 
-static void tg_stats_init(struct tg_stats_cpu *tg_stats)
-{
-	blkg_rwstat_init(&tg_stats->service_bytes);
-	blkg_rwstat_init(&tg_stats->serviced);
-}
-
-/*
- * Worker for allocating per cpu stat for tgs. This is scheduled on the
- * system_wq once there are some groups on the alloc_list waiting for
- * allocation.
- */
-static void tg_stats_alloc_fn(struct work_struct *work)
-{
-	static struct tg_stats_cpu *stats_cpu;	/* this fn is non-reentrant */
-	struct delayed_work *dwork = to_delayed_work(work);
-	bool empty = false;
-
-alloc_stats:
-	if (!stats_cpu) {
-		int cpu;
-
-		stats_cpu = alloc_percpu(struct tg_stats_cpu);
-		if (!stats_cpu) {
-			/* allocation failed, try again after some time */
-			schedule_delayed_work(dwork, msecs_to_jiffies(10));
-			return;
-		}
-		for_each_possible_cpu(cpu)
-			tg_stats_init(per_cpu_ptr(stats_cpu, cpu));
-	}
-
-	spin_lock_irq(&tg_stats_alloc_lock);
-
-	if (!list_empty(&tg_stats_alloc_list)) {
-		struct throtl_grp *tg = list_first_entry(&tg_stats_alloc_list,
-							 struct throtl_grp,
-							 stats_alloc_node);
-		swap(tg->stats_cpu, stats_cpu);
-		list_del_init(&tg->stats_alloc_node);
-	}
-
-	empty = list_empty(&tg_stats_alloc_list);
-	spin_unlock_irq(&tg_stats_alloc_lock);
-	if (!empty)
-		goto alloc_stats;
-}
-
 static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
 {
 	INIT_LIST_HEAD(&qn->node);
@@ -405,7 +348,27 @@ static void throtl_service_queue_exit(struct throtl_service_queue *sq)
 
 static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 {
-	return kzalloc_node(sizeof(struct throtl_grp), gfp, node);
+	struct throtl_grp *tg;
+	int cpu;
+
+	tg = kzalloc_node(sizeof(*tg), gfp, node);
+	if (!tg)
+		return NULL;
+
+	tg->stats_cpu = alloc_percpu_gfp(struct tg_stats_cpu, gfp);
+	if (!tg->stats_cpu) {
+		kfree(tg);
+		return NULL;
+	}
+
+	for_each_possible_cpu(cpu) {
+		struct tg_stats_cpu *stats_cpu = per_cpu_ptr(tg->stats_cpu, cpu);
+
+		blkg_rwstat_init(&stats_cpu->service_bytes);
+		blkg_rwstat_init(&stats_cpu->serviced);
+	}
+
+	return &tg->pd;
 }
 
 static void throtl_pd_init(struct blkcg_gq *blkg)
@@ -413,7 +376,6 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	struct throtl_data *td = blkg->q->td;
 	struct throtl_service_queue *parent_sq;
-	unsigned long flags;
 	int rw;
 
 	/*
@@ -448,16 +410,6 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 	tg->bps[WRITE] = -1;
 	tg->iops[READ] = -1;
 	tg->iops[WRITE] = -1;
-
-	/*
-	 * Ugh... We need to perform per-cpu allocation for tg->stats_cpu
-	 * but percpu allocator can't be called from IO path.  Queue tg on
-	 * tg_stats_alloc_list and allocate from work item.
-	 */
-	spin_lock_irqsave(&tg_stats_alloc_lock, flags);
-	list_add(&tg->stats_alloc_node, &tg_stats_alloc_list);
-	schedule_delayed_work(&tg_stats_alloc_work, 0);
-	spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
 }
 
 /*
@@ -487,20 +439,16 @@ static void throtl_pd_online(struct blkcg_gq *blkg)
 static void throtl_pd_exit(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
-	unsigned long flags;
-
-	spin_lock_irqsave(&tg_stats_alloc_lock, flags);
-	list_del_init(&tg->stats_alloc_node);
-	spin_unlock_irqrestore(&tg_stats_alloc_lock, flags);
-
-	free_percpu(tg->stats_cpu);
 
 	throtl_service_queue_exit(&tg->service_queue);
 }
 
 static void throtl_pd_free(struct blkg_policy_data *pd)
 {
-	kfree(pd);
+	struct throtl_grp *tg = pd_to_tg(pd);
+
+	free_percpu(tg->stats_cpu);
+	kfree(tg);
 }
 
 static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
@@ -508,9 +456,6 @@ static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	int cpu;
 
-	if (tg->stats_cpu == NULL)
-		return;
-
 	for_each_possible_cpu(cpu) {
 		struct tg_stats_cpu *sc = per_cpu_ptr(tg->stats_cpu, cpu);
 
@@ -973,10 +918,6 @@ static void throtl_update_dispatch_stats(struct blkcg_gq *blkg, u64 bytes,
 	struct tg_stats_cpu *stats_cpu;
 	unsigned long flags;
 
-	/* If per cpu stats are not allocated yet, don't do any accounting. */
-	if (tg->stats_cpu == NULL)
-		return;
-
 	/*
 	 * Disabling interrupts to provide mutual exclusion between two
 	 * writes on same cpu. It probably is not needed for 64bit. Not
@@ -1302,9 +1243,6 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
 	struct blkg_rwstat rwstat = { }, tmp;
 	int i, cpu;
 
-	if (tg->stats_cpu == NULL)
-		return 0;
-
 	for_each_possible_cpu(cpu) {
 		struct tg_stats_cpu *sc = per_cpu_ptr(tg->stats_cpu, cpu);
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 08/11] blk-throttle: clean up blkg_policy_data alloc/init/exit/free methods
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (6 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 07/11] blk-throttle: remove asynchrnous percpu stats allocation mechanism Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 09/11] blkcg: make blkcg_policy methods take a pointer to blkcg_policy_data Tejun Heo
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

With the recent addition of alloc and free methods, things became
messier.  This patch reorganizes them according to the followings.

* ->pd_alloc_fn()

  Responsible for allocation and static initializations - the ones
  which can be done independent of where the pd might be attached.

* ->pd_init_fn()

  Initializations which require the knowledge of where the pd is
  attached.

* ->pd_free_fn()

  The counter part of pd_alloc_fn().  Static de-init and freeing.

This leaves ->pd_exit_fn() without any users.  Removed.

While at it, collapse an one liner function throtl_pd_exit(), which
has only one user, into its user.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c         | 11 ---------
 block/blk-throttle.c       | 57 ++++++++++++++++------------------------------
 block/cfq-iosched.c        | 15 ++++++++----
 include/linux/blk-cgroup.h |  2 --
 4 files changed, 31 insertions(+), 54 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 9d83623..e509bc8 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -402,15 +402,6 @@ static void blkg_destroy_all(struct request_queue *q)
 void __blkg_release_rcu(struct rcu_head *rcu_head)
 {
 	struct blkcg_gq *blkg = container_of(rcu_head, struct blkcg_gq, rcu_head);
-	int i;
-
-	/* tell policies that this one is being freed */
-	for (i = 0; i < BLKCG_MAX_POLS; i++) {
-		struct blkcg_policy *pol = blkcg_policy[i];
-
-		if (blkg->pd[i] && pol->pd_exit_fn)
-			pol->pd_exit_fn(blkg);
-	}
 
 	/* release the blkcg and parent blkg refs this blkg has been holding */
 	css_put(&blkg->blkcg->css);
@@ -1123,8 +1114,6 @@ void blkcg_deactivate_policy(struct request_queue *q,
 
 		if (pol->pd_offline_fn)
 			pol->pd_offline_fn(blkg);
-		if (pol->pd_exit_fn)
-			pol->pd_exit_fn(blkg);
 
 		if (blkg->pd[pol->plid]) {
 			pol->pd_free_fn(blkg->pd[pol->plid]);
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 3c86976..c3a235b 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -330,26 +330,19 @@ static struct bio *throtl_pop_queued(struct list_head *queued,
 }
 
 /* init a service_queue, assumes the caller zeroed it */
-static void throtl_service_queue_init(struct throtl_service_queue *sq,
-				      struct throtl_service_queue *parent_sq)
+static void throtl_service_queue_init(struct throtl_service_queue *sq)
 {
 	INIT_LIST_HEAD(&sq->queued[0]);
 	INIT_LIST_HEAD(&sq->queued[1]);
 	sq->pending_tree = RB_ROOT;
-	sq->parent_sq = parent_sq;
 	setup_timer(&sq->pending_timer, throtl_pending_timer_fn,
 		    (unsigned long)sq);
 }
 
-static void throtl_service_queue_exit(struct throtl_service_queue *sq)
-{
-	del_timer_sync(&sq->pending_timer);
-}
-
 static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 {
 	struct throtl_grp *tg;
-	int cpu;
+	int rw, cpu;
 
 	tg = kzalloc_node(sizeof(*tg), gfp, node);
 	if (!tg)
@@ -361,6 +354,19 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 		return NULL;
 	}
 
+	throtl_service_queue_init(&tg->service_queue);
+
+	for (rw = READ; rw <= WRITE; rw++) {
+		throtl_qnode_init(&tg->qnode_on_self[rw], tg);
+		throtl_qnode_init(&tg->qnode_on_parent[rw], tg);
+	}
+
+	RB_CLEAR_NODE(&tg->rb_node);
+	tg->bps[READ] = -1;
+	tg->bps[WRITE] = -1;
+	tg->iops[READ] = -1;
+	tg->iops[WRITE] = -1;
+
 	for_each_possible_cpu(cpu) {
 		struct tg_stats_cpu *stats_cpu = per_cpu_ptr(tg->stats_cpu, cpu);
 
@@ -375,8 +381,7 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	struct throtl_data *td = blkg->q->td;
-	struct throtl_service_queue *parent_sq;
-	int rw;
+	struct throtl_service_queue *sq = &tg->service_queue;
 
 	/*
 	 * If on the default hierarchy, we switch to properly hierarchical
@@ -391,25 +396,10 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
 	 * Limits of a group don't interact with limits of other groups
 	 * regardless of the position of the group in the hierarchy.
 	 */
-	parent_sq = &td->service_queue;
-
+	sq->parent_sq = &td->service_queue;
 	if (cgroup_on_dfl(blkg->blkcg->css.cgroup) && blkg->parent)
-		parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
-
-	throtl_service_queue_init(&tg->service_queue, parent_sq);
-
-	for (rw = READ; rw <= WRITE; rw++) {
-		throtl_qnode_init(&tg->qnode_on_self[rw], tg);
-		throtl_qnode_init(&tg->qnode_on_parent[rw], tg);
-	}
-
-	RB_CLEAR_NODE(&tg->rb_node);
+		sq->parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
 	tg->td = td;
-
-	tg->bps[READ] = -1;
-	tg->bps[WRITE] = -1;
-	tg->iops[READ] = -1;
-	tg->iops[WRITE] = -1;
 }
 
 /*
@@ -436,17 +426,11 @@ static void throtl_pd_online(struct blkcg_gq *blkg)
 	tg_update_has_rules(blkg_to_tg(blkg));
 }
 
-static void throtl_pd_exit(struct blkcg_gq *blkg)
-{
-	struct throtl_grp *tg = blkg_to_tg(blkg);
-
-	throtl_service_queue_exit(&tg->service_queue);
-}
-
 static void throtl_pd_free(struct blkg_policy_data *pd)
 {
 	struct throtl_grp *tg = pd_to_tg(pd);
 
+	del_timer_sync(&tg->service_queue.pending_timer);
 	free_percpu(tg->stats_cpu);
 	kfree(tg);
 }
@@ -1421,7 +1405,6 @@ static struct blkcg_policy blkcg_policy_throtl = {
 	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
-	.pd_exit_fn		= throtl_pd_exit,
 	.pd_free_fn		= throtl_pd_free,
 	.pd_reset_stats_fn	= throtl_pd_reset_stats,
 };
@@ -1616,7 +1599,7 @@ int blk_throtl_init(struct request_queue *q)
 		return -ENOMEM;
 
 	INIT_WORK(&td->dispatch_work, blk_throtl_dispatch_work_fn);
-	throtl_service_queue_init(&td->service_queue, NULL);
+	throtl_service_queue_init(&td->service_queue);
 
 	q->td = td;
 	td->queue = q;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 69ce288..4b795c7 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1584,7 +1584,17 @@ static void cfq_cpd_init(const struct blkcg *blkcg)
 
 static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
 {
-	return kzalloc_node(sizeof(struct cfq_group), gfp, node);
+	struct cfq_group *cfqg;
+
+	cfqg = kzalloc_node(sizeof(*cfqg), gfp, node);
+	if (!cfqg)
+		return NULL;
+
+	cfq_init_cfqg_base(cfqg);
+	cfqg_stats_init(&cfqg->stats);
+	cfqg_stats_init(&cfqg->dead_stats);
+
+	return &cfqg->pd;
 }
 
 static void cfq_pd_init(struct blkcg_gq *blkg)
@@ -1592,11 +1602,8 @@ static void cfq_pd_init(struct blkcg_gq *blkg)
 	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
 	struct cfq_group_data *cgd = blkcg_to_cfqgd(blkg->blkcg);
 
-	cfq_init_cfqg_base(cfqg);
 	cfqg->weight = cgd->weight;
 	cfqg->leaf_weight = cgd->leaf_weight;
-	cfqg_stats_init(&cfqg->stats);
-	cfqg_stats_init(&cfqg->dead_stats);
 }
 
 static void cfq_pd_offline(struct blkcg_gq *blkg)
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index bd173ea..9879469 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -128,7 +128,6 @@ typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_online_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_offline_pd_fn)(struct blkcg_gq *blkg);
-typedef void (blkcg_pol_exit_pd_fn)(struct blkcg_gq *blkg);
 typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkcg_gq *blkg);
 
@@ -145,7 +144,6 @@ struct blkcg_policy {
 	blkcg_pol_init_pd_fn		*pd_init_fn;
 	blkcg_pol_online_pd_fn		*pd_online_fn;
 	blkcg_pol_offline_pd_fn		*pd_offline_fn;
-	blkcg_pol_exit_pd_fn		*pd_exit_fn;
 	blkcg_pol_free_pd_fn		*pd_free_fn;
 	blkcg_pol_reset_pd_stats_fn	*pd_reset_stats_fn;
 };
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 09/11] blkcg: make blkcg_policy methods take a pointer to blkcg_policy_data
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (7 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 08/11] blk-throttle: clean up blkg_policy_data alloc/init/exit/free methods Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-11 18:00 ` [PATCH 10/11] blkcg: cosmetic updates about blkcg_policy_data Tejun Heo
  2015-07-11 18:00 ` [PATCH 11/11] blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methods Tejun Heo
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

The newly added ->pd_alloc_fn() and ->pd_free_fn() deal with pd
(blkg_policy_data) while the older ones use blkg (blkcg_gq).  As using
blkg doesn't make sense for ->pd_alloc_fn() and after allocation pd
can always be mapped to blkg and given that these are policy-specific
methods, it makes sense to converge on pd.

This patch makes all methods deal with pd instead of blkg.  Most
conversions are trivial.  In blk-cgroup.c, a couple method invocation
sites now test whether pd exists instead of policy state for
consistency.  This shouldn't cause any behavioral differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-cgroup.c         | 18 ++++++++----------
 block/blk-throttle.c       | 13 +++++++------
 block/cfq-iosched.c        | 14 +++++++-------
 include/linux/blk-cgroup.h |  8 ++++----
 4 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index e509bc8..d18cdb6 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -242,7 +242,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
 		struct blkcg_policy *pol = blkcg_policy[i];
 
 		if (blkg->pd[i] && pol->pd_init_fn)
-			pol->pd_init_fn(blkg);
+			pol->pd_init_fn(blkg->pd[i]);
 	}
 
 	/* insert */
@@ -256,7 +256,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
 			struct blkcg_policy *pol = blkcg_policy[i];
 
 			if (blkg->pd[i] && pol->pd_online_fn)
-				pol->pd_online_fn(blkg);
+				pol->pd_online_fn(blkg->pd[i]);
 		}
 	}
 	blkg->online = true;
@@ -347,7 +347,7 @@ static void blkg_destroy(struct blkcg_gq *blkg)
 		struct blkcg_policy *pol = blkcg_policy[i];
 
 		if (blkg->pd[i] && pol->pd_offline_fn)
-			pol->pd_offline_fn(blkg);
+			pol->pd_offline_fn(blkg->pd[i]);
 	}
 	blkg->online = false;
 
@@ -468,9 +468,8 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
 		for (i = 0; i < BLKCG_MAX_POLS; i++) {
 			struct blkcg_policy *pol = blkcg_policy[i];
 
-			if (blkcg_policy_enabled(blkg->q, pol) &&
-			    pol->pd_reset_stats_fn)
-				pol->pd_reset_stats_fn(blkg);
+			if (blkg->pd[i] && pol->pd_reset_stats_fn)
+				pol->pd_reset_stats_fn(blkg->pd[i]);
 		}
 	}
 
@@ -1072,7 +1071,7 @@ int blkcg_activate_policy(struct request_queue *q,
 		pd->blkg = blkg;
 		pd->plid = pol->plid;
 		if (pol->pd_init_fn)
-			pol->pd_init_fn(blkg);
+			pol->pd_init_fn(pd);
 	}
 
 	__set_bit(pol->plid, q->blkcg_pols);
@@ -1112,10 +1111,9 @@ void blkcg_deactivate_policy(struct request_queue *q,
 		/* grab blkcg lock too while removing @pd from @blkg */
 		spin_lock(&blkg->blkcg->lock);
 
-		if (pol->pd_offline_fn)
-			pol->pd_offline_fn(blkg);
-
 		if (blkg->pd[pol->plid]) {
+			if (pol->pd_offline_fn)
+				pol->pd_offline_fn(blkg->pd[pol->plid]);
 			pol->pd_free_fn(blkg->pd[pol->plid]);
 			blkg->pd[pol->plid] = NULL;
 		}
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index c3a235b..c2c7547 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -377,9 +377,10 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t gfp, int node)
 	return &tg->pd;
 }
 
-static void throtl_pd_init(struct blkcg_gq *blkg)
+static void throtl_pd_init(struct blkg_policy_data *pd)
 {
-	struct throtl_grp *tg = blkg_to_tg(blkg);
+	struct throtl_grp *tg = pd_to_tg(pd);
+	struct blkcg_gq *blkg = tg_to_blkg(tg);
 	struct throtl_data *td = blkg->q->td;
 	struct throtl_service_queue *sq = &tg->service_queue;
 
@@ -417,13 +418,13 @@ static void tg_update_has_rules(struct throtl_grp *tg)
 				    (tg->bps[rw] != -1 || tg->iops[rw] != -1);
 }
 
-static void throtl_pd_online(struct blkcg_gq *blkg)
+static void throtl_pd_online(struct blkg_policy_data *pd)
 {
 	/*
 	 * We don't want new groups to escape the limits of its ancestors.
 	 * Update has_rules[] after a new group is brought online.
 	 */
-	tg_update_has_rules(blkg_to_tg(blkg));
+	tg_update_has_rules(pd_to_tg(pd));
 }
 
 static void throtl_pd_free(struct blkg_policy_data *pd)
@@ -435,9 +436,9 @@ static void throtl_pd_free(struct blkg_policy_data *pd)
 	kfree(tg);
 }
 
-static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
+static void throtl_pd_reset_stats(struct blkg_policy_data *pd)
 {
-	struct throtl_grp *tg = blkg_to_tg(blkg);
+	struct throtl_grp *tg = pd_to_tg(pd);
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 4b795c7..95e6b0c 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1597,18 +1597,18 @@ static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
 	return &cfqg->pd;
 }
 
-static void cfq_pd_init(struct blkcg_gq *blkg)
+static void cfq_pd_init(struct blkg_policy_data *pd)
 {
-	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
-	struct cfq_group_data *cgd = blkcg_to_cfqgd(blkg->blkcg);
+	struct cfq_group *cfqg = pd_to_cfqg(pd);
+	struct cfq_group_data *cgd = blkcg_to_cfqgd(pd->blkg->blkcg);
 
 	cfqg->weight = cgd->weight;
 	cfqg->leaf_weight = cgd->leaf_weight;
 }
 
-static void cfq_pd_offline(struct blkcg_gq *blkg)
+static void cfq_pd_offline(struct blkg_policy_data *pd)
 {
-	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
+	struct cfq_group *cfqg = pd_to_cfqg(pd);
 	int i;
 
 	for (i = 0; i < IOPRIO_BE_NR; i++) {
@@ -1661,9 +1661,9 @@ static struct blkg_rwstat cfqg_rwstat_pd_recursive_sum(struct blkg_policy_data *
 	return a;
 }
 
-static void cfq_pd_reset_stats(struct blkcg_gq *blkg)
+static void cfq_pd_reset_stats(struct blkg_policy_data *pd)
 {
-	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
+	struct cfq_group *cfqg = pd_to_cfqg(pd);
 
 	cfqg_stats_reset(&cfqg->stats);
 	cfqg_stats_reset(&cfqg->dead_stats);
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 9879469..ddd4b8b 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -125,11 +125,11 @@ struct blkcg_gq {
 
 typedef void (blkcg_pol_init_cpd_fn)(const struct blkcg *blkcg);
 typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
-typedef void (blkcg_pol_init_pd_fn)(struct blkcg_gq *blkg);
-typedef void (blkcg_pol_online_pd_fn)(struct blkcg_gq *blkg);
-typedef void (blkcg_pol_offline_pd_fn)(struct blkcg_gq *blkg);
+typedef void (blkcg_pol_init_pd_fn)(struct blkg_policy_data *pd);
+typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd);
+typedef void (blkcg_pol_offline_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
-typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkcg_gq *blkg);
+typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkg_policy_data *pd);
 
 struct blkcg_policy {
 	int				plid;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 10/11] blkcg: cosmetic updates about blkcg_policy_data
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (8 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 09/11] blkcg: make blkcg_policy methods take a pointer to blkcg_policy_data Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  2015-07-30 22:57   ` [PATCH 10/11] blkcg: minor updates around blkcg_policy_data Tejun Heo
  2015-07-11 18:00 ` [PATCH 11/11] blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methods Tejun Heo
  10 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

* Rename blkcg->pd[] to blkcg->cpd[] so that cpd is consistently used
  for blkcg_policy_data.

* Make blkcg_policy->cpd_init_fn() take blkcg_policy_data instead of
  blkcg.  This makes it consistent with blkg_policy_data methods and
  to-be-added cpd alloc/free methods.

* blkcg_policy_data->blkcg and cpd_to_blkcg() added so that
  cpd_init_fn() can determine the associated blkcg from
  blkcg_policy_data.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
---
 block/blk-cgroup.c         | 22 +++++++++++-----------
 block/cfq-iosched.c        | 11 +++++------
 include/linux/blk-cgroup.h | 14 ++++++++++----
 3 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index d18cdb6..8173e06 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -817,7 +817,7 @@ static void blkcg_css_free(struct cgroup_subsys_state *css)
 	mutex_unlock(&blkcg_pol_mutex);
 
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
-		kfree(blkcg->pd[i]);
+		kfree(blkcg->cpd[i]);
 	kfree(blkcg);
 }
 
@@ -853,15 +853,15 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 		if (!pol || !pol->cpd_size)
 			continue;
 
-		BUG_ON(blkcg->pd[i]);
+		BUG_ON(blkcg->cpd[i]);
 		cpd = kzalloc(pol->cpd_size, GFP_KERNEL);
 		if (!cpd) {
 			ret = ERR_PTR(-ENOMEM);
 			goto free_pd_blkcg;
 		}
-		blkcg->pd[i] = cpd;
+		blkcg->cpd[i] = cpd;
 		cpd->plid = i;
-		pol->cpd_init_fn(blkcg);
+		pol->cpd_init_fn(cpd);
 	}
 
 	spin_lock_init(&blkcg->lock);
@@ -877,7 +877,7 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 
 free_pd_blkcg:
 	for (i--; i >= 0; i--)
-		kfree(blkcg->pd[i]);
+		kfree(blkcg->cpd[i]);
 free_blkcg:
 	kfree(blkcg);
 	mutex_unlock(&blkcg_pol_mutex);
@@ -1164,9 +1164,9 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 				goto err_free_cpds;
 			}
 
-			blkcg->pd[pol->plid] = cpd;
+			blkcg->cpd[pol->plid] = cpd;
 			cpd->plid = pol->plid;
-			pol->cpd_init_fn(blkcg);
+			pol->cpd_init_fn(cpd);
 		}
 	}
 
@@ -1182,8 +1182,8 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 err_free_cpds:
 	if (pol->cpd_size) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
-			kfree(blkcg->pd[pol->plid]);
-			blkcg->pd[pol->plid] = NULL;
+			kfree(blkcg->cpd[pol->plid]);
+			blkcg->cpd[pol->plid] = NULL;
 		}
 	}
 	blkcg_policy[pol->plid] = NULL;
@@ -1218,8 +1218,8 @@ void blkcg_policy_unregister(struct blkcg_policy *pol)
 
 	if (pol->cpd_size) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
-			kfree(blkcg->pd[pol->plid]);
-			blkcg->pd[pol->plid] = NULL;
+			kfree(blkcg->cpd[pol->plid]);
+			blkcg->cpd[pol->plid] = NULL;
 		}
 	}
 	blkcg_policy[pol->plid] = NULL;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 95e6b0c..dd6ea9e 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -220,7 +220,7 @@ struct cfqg_stats {
 /* Per-cgroup data */
 struct cfq_group_data {
 	/* must be the first member */
-	struct blkcg_policy_data pd;
+	struct blkcg_policy_data cpd;
 
 	unsigned int weight;
 	unsigned int leaf_weight;
@@ -612,7 +612,7 @@ static inline struct cfq_group *pd_to_cfqg(struct blkg_policy_data *pd)
 static struct cfq_group_data
 *cpd_to_cfqgd(struct blkcg_policy_data *cpd)
 {
-	return cpd ? container_of(cpd, struct cfq_group_data, pd) : NULL;
+	return cpd ? container_of(cpd, struct cfq_group_data, cpd) : NULL;
 }
 
 static inline struct blkcg_gq *cfqg_to_blkg(struct cfq_group *cfqg)
@@ -1568,12 +1568,11 @@ static void cfqg_stats_init(struct cfqg_stats *stats)
 #endif
 }
 
-static void cfq_cpd_init(const struct blkcg *blkcg)
+static void cfq_cpd_init(struct blkcg_policy_data *cpd)
 {
-	struct cfq_group_data *cgd =
-		cpd_to_cfqgd(blkcg->pd[blkcg_policy_cfq.plid]);
+	struct cfq_group_data *cgd = cpd_to_cfqgd(cpd);
 
-	if (blkcg == &blkcg_root) {
+	if (cpd_to_blkcg(cpd) == &blkcg_root) {
 		cgd->weight = 2 * CFQ_WEIGHT_DEFAULT;
 		cgd->leaf_weight = 2 * CFQ_WEIGHT_DEFAULT;
 	} else {
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index ddd4b8b..7988d47 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -45,7 +45,7 @@ struct blkcg {
 	struct blkcg_gq			*blkg_hint;
 	struct hlist_head		blkg_list;
 
-	struct blkcg_policy_data	*pd[BLKCG_MAX_POLS];
+	struct blkcg_policy_data	*cpd[BLKCG_MAX_POLS];
 
 	struct list_head		all_blkcgs_node;
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -88,7 +88,8 @@ struct blkg_policy_data {
  * each policy handle per-blkcg data.
  */
 struct blkcg_policy_data {
-	/* the policy id this per-policy data belongs to */
+	/* the blkcg and policy id this per-policy data belongs to */
+	struct blkcg			*blkcg;
 	int				plid;
 };
 
@@ -123,7 +124,7 @@ struct blkcg_gq {
 	struct rcu_head			rcu_head;
 };
 
-typedef void (blkcg_pol_init_cpd_fn)(const struct blkcg *blkcg);
+typedef void (blkcg_pol_init_cpd_fn)(struct blkcg_policy_data *cpd);
 typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd);
@@ -243,7 +244,7 @@ static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg,
 static inline struct blkcg_policy_data *blkcg_to_cpd(struct blkcg *blkcg,
 						     struct blkcg_policy *pol)
 {
-	return blkcg ? blkcg->pd[pol->plid] : NULL;
+	return blkcg ? blkcg->cpd[pol->plid] : NULL;
 }
 
 /**
@@ -257,6 +258,11 @@ static inline struct blkcg_gq *pd_to_blkg(struct blkg_policy_data *pd)
 	return pd ? pd->blkg : NULL;
 }
 
+static inline struct blkcg *cpd_to_blkcg(struct blkcg_policy_data *cpd)
+{
+	return cpd ? cpd->blkcg : NULL;
+}
+
 /**
  * blkg_path - format cgroup path of blkg
  * @blkg: blkg of interest
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 11/11] blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methods
  2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
                   ` (9 preceding siblings ...)
  2015-07-11 18:00 ` [PATCH 10/11] blkcg: cosmetic updates about blkcg_policy_data Tejun Heo
@ 2015-07-11 18:00 ` Tejun Heo
  10 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-11 18:00 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna, Tejun Heo

Each active policy has a cpd (blkcg_policy_data) on each blkcg.  The
cpd's were allocated by blkcg core and each policy could request to
allocate extra space at the end by setting blkcg_policy->cpd_size
larger than the size of cpd.

This is a bit unusual but blkg (blkcg_gq) policy data used to be
handled this way too so it made sense to be consistent; however, blkg
policy data switched to alloc/free callbacks.

This patch makes similar changes to cpd handling.
blkcg_policy->cpd_alloc/free_fn() are added to replace ->cpd_size.  As
cpd allocation is now done from policy side, it can simply allocate a
larger area which embeds cpd at the beginning.

As ->cpd_alloc_fn() may be able to perform all necessary
initializations, this patch makes ->cpd_init_fn() optional.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
---
 block/blk-cgroup.c         | 39 ++++++++++++++++++++++++---------------
 block/cfq-iosched.c        | 19 ++++++++++++++++++-
 include/linux/blk-cgroup.h | 17 ++++++++++-------
 3 files changed, 52 insertions(+), 23 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8173e06..48d95ca 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -813,11 +813,15 @@ static void blkcg_css_free(struct cgroup_subsys_state *css)
 	int i;
 
 	mutex_lock(&blkcg_pol_mutex);
+
 	list_del(&blkcg->all_blkcgs_node);
-	mutex_unlock(&blkcg_pol_mutex);
 
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
-		kfree(blkcg->cpd[i]);
+		if (blkcg->cpd[i])
+			blkcg_policy[i]->cpd_free_fn(blkcg->cpd[i]);
+
+	mutex_unlock(&blkcg_pol_mutex);
+
 	kfree(blkcg);
 }
 
@@ -850,18 +854,18 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 		 * check if the policy requires any specific per-cgroup
 		 * data: if it does, allocate and initialize it.
 		 */
-		if (!pol || !pol->cpd_size)
+		if (!pol || !pol->cpd_alloc_fn)
 			continue;
 
-		BUG_ON(blkcg->cpd[i]);
-		cpd = kzalloc(pol->cpd_size, GFP_KERNEL);
+		cpd = pol->cpd_alloc_fn(GFP_KERNEL);
 		if (!cpd) {
 			ret = ERR_PTR(-ENOMEM);
 			goto free_pd_blkcg;
 		}
 		blkcg->cpd[i] = cpd;
 		cpd->plid = i;
-		pol->cpd_init_fn(cpd);
+		if (pol->cpd_init_fn)
+			pol->cpd_init_fn(cpd);
 	}
 
 	spin_lock_init(&blkcg->lock);
@@ -877,7 +881,8 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 
 free_pd_blkcg:
 	for (i--; i >= 0; i--)
-		kfree(blkcg->cpd[i]);
+		if (blkcg->cpd[i])
+			blkcg_policy[i]->cpd_free_fn(blkcg->cpd[i]);
 free_blkcg:
 	kfree(blkcg);
 	mutex_unlock(&blkcg_pol_mutex);
@@ -1154,11 +1159,11 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 	blkcg_policy[pol->plid] = pol;
 
 	/* allocate and install cpd's */
-	if (pol->cpd_size) {
+	if (pol->cpd_alloc_fn) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
 			struct blkcg_policy_data *cpd;
 
-			cpd = kzalloc(pol->cpd_size, GFP_KERNEL);
+			cpd = pol->cpd_alloc_fn(GFP_KERNEL);
 			if (!cpd) {
 				mutex_unlock(&blkcg_pol_mutex);
 				goto err_free_cpds;
@@ -1180,10 +1185,12 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 	return 0;
 
 err_free_cpds:
-	if (pol->cpd_size) {
+	if (pol->cpd_alloc_fn) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
-			kfree(blkcg->cpd[pol->plid]);
-			blkcg->cpd[pol->plid] = NULL;
+			if (blkcg->cpd[pol->plid]) {
+				pol->cpd_free_fn(blkcg->cpd[pol->plid]);
+				blkcg->cpd[pol->plid] = NULL;
+			}
 		}
 	}
 	blkcg_policy[pol->plid] = NULL;
@@ -1216,10 +1223,12 @@ void blkcg_policy_unregister(struct blkcg_policy *pol)
 	/* remove cpds and unregister */
 	mutex_lock(&blkcg_pol_mutex);
 
-	if (pol->cpd_size) {
+	if (pol->cpd_alloc_fn) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
-			kfree(blkcg->cpd[pol->plid]);
-			blkcg->cpd[pol->plid] = NULL;
+			if (blkcg->cpd[pol->plid]) {
+				pol->cpd_free_fn(blkcg->cpd[pol->plid]);
+				blkcg->cpd[pol->plid] = NULL;
+			}
 		}
 	}
 	blkcg_policy[pol->plid] = NULL;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index dd6ea9e..a4429b3 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1568,6 +1568,16 @@ static void cfqg_stats_init(struct cfqg_stats *stats)
 #endif
 }
 
+static struct blkcg_policy_data *cfq_cpd_alloc(gfp_t gfp)
+{
+	struct cfq_group_data *cgd;
+
+	cgd = kzalloc(sizeof(*cgd), GFP_KERNEL);
+	if (!cgd)
+		return NULL;
+	return &cgd->cpd;
+}
+
 static void cfq_cpd_init(struct blkcg_policy_data *cpd)
 {
 	struct cfq_group_data *cgd = cpd_to_cfqgd(cpd);
@@ -1581,6 +1591,11 @@ static void cfq_cpd_init(struct blkcg_policy_data *cpd)
 	}
 }
 
+static void cfq_cpd_free(struct blkcg_policy_data *cpd)
+{
+	kfree(cpd_to_cfqgd(cpd));
+}
+
 static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
 {
 	struct cfq_group *cfqg;
@@ -4649,10 +4664,12 @@ static struct elevator_type iosched_cfq = {
 
 #ifdef CONFIG_CFQ_GROUP_IOSCHED
 static struct blkcg_policy blkcg_policy_cfq = {
-	.cpd_size		= sizeof(struct cfq_group_data),
 	.cftypes		= cfq_blkcg_files,
 
+	.cpd_alloc_fn		= cfq_cpd_alloc,
 	.cpd_init_fn		= cfq_cpd_init,
+	.cpd_free_fn		= cfq_cpd_free,
+
 	.pd_alloc_fn		= cfq_pd_alloc,
 	.pd_init_fn		= cfq_pd_init,
 	.pd_offline_fn		= cfq_pd_offline,
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 7988d47..15f2382 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -81,11 +81,11 @@ struct blkg_policy_data {
 };
 
 /*
- * Policies that need to keep per-blkcg data which is independent
- * from any request_queue associated to it must specify its size
- * with the cpd_size field of the blkcg_policy structure and
- * embed a blkcg_policy_data in it.  cpd_init() is invoked to let
- * each policy handle per-blkcg data.
+ * Policies that need to keep per-blkcg data which is independent from any
+ * request_queue associated to it should implement cpd_alloc/free_fn()
+ * methods.  A policy can allocate private data area by allocating larger
+ * data structure which embeds blkcg_policy_data at the beginning.
+ * cpd_init() is invoked to let each policy handle per-blkcg data.
  */
 struct blkcg_policy_data {
 	/* the blkcg and policy id this per-policy data belongs to */
@@ -124,7 +124,9 @@ struct blkcg_gq {
 	struct rcu_head			rcu_head;
 };
 
+typedef struct blkcg_policy_data *(blkcg_pol_alloc_cpd_fn)(gfp_t gfp);
 typedef void (blkcg_pol_init_cpd_fn)(struct blkcg_policy_data *cpd);
+typedef void (blkcg_pol_free_cpd_fn)(struct blkcg_policy_data *cpd);
 typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd);
@@ -134,13 +136,14 @@ typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkg_policy_data *pd);
 
 struct blkcg_policy {
 	int				plid;
-	/* policy specific per-blkcg data size */
-	size_t				cpd_size;
 	/* cgroup files for the policy */
 	struct cftype			*cftypes;
 
 	/* operations */
+	blkcg_pol_alloc_cpd_fn		*cpd_alloc_fn;
 	blkcg_pol_init_cpd_fn		*cpd_init_fn;
+	blkcg_pol_free_cpd_fn		*cpd_free_fn;
+
 	blkcg_pol_alloc_pd_fn		*pd_alloc_fn;
 	blkcg_pol_init_pd_fn		*pd_init_fn;
 	blkcg_pol_online_pd_fn		*pd_online_fn;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 10/11] blkcg: minor updates around blkcg_policy_data
  2015-07-11 18:00 ` [PATCH 10/11] blkcg: cosmetic updates about blkcg_policy_data Tejun Heo
@ 2015-07-30 22:57   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2015-07-30 22:57 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, kernel-team, vgoyal, avanzini.arianna

>From 4b246538f4bdb143de8faec8b8ed4e4c18a862a8 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Thu, 30 Jul 2015 18:51:50 -0400

* Rename blkcg->pd[] to blkcg->cpd[] so that cpd is consistently used
  for blkcg_policy_data.

* Make blkcg_policy->cpd_init_fn() take blkcg_policy_data instead of
  blkcg.  This makes it consistent with blkg_policy_data methods and
  to-be-added cpd alloc/free methods.

* blkcg_policy_data->blkcg and cpd_to_blkcg() added so that
  cpd_init_fn() can determine the associated blkcg from
  blkcg_policy_data.

v2: blkcg_policy_data->blkcg initializations were missing.  Added.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
---
git branch updated accordingly.

Thanks.

 block/blk-cgroup.c         | 24 +++++++++++++-----------
 block/cfq-iosched.c        | 11 +++++------
 include/linux/blk-cgroup.h | 14 ++++++++++----
 3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8343450..247c42c 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -821,7 +821,7 @@ static void blkcg_css_free(struct cgroup_subsys_state *css)
 	mutex_unlock(&blkcg_pol_mutex);
 
 	for (i = 0; i < BLKCG_MAX_POLS; i++)
-		kfree(blkcg->pd[i]);
+		kfree(blkcg->cpd[i]);
 	kfree(blkcg);
 }
 
@@ -857,15 +857,16 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 		if (!pol || !pol->cpd_size)
 			continue;
 
-		BUG_ON(blkcg->pd[i]);
+		BUG_ON(blkcg->cpd[i]);
 		cpd = kzalloc(pol->cpd_size, GFP_KERNEL);
 		if (!cpd) {
 			ret = ERR_PTR(-ENOMEM);
 			goto free_pd_blkcg;
 		}
-		blkcg->pd[i] = cpd;
+		blkcg->cpd[i] = cpd;
+		cpd->blkcg = blkcg;
 		cpd->plid = i;
-		pol->cpd_init_fn(blkcg);
+		pol->cpd_init_fn(cpd);
 	}
 
 	spin_lock_init(&blkcg->lock);
@@ -881,7 +882,7 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 
 free_pd_blkcg:
 	for (i--; i >= 0; i--)
-		kfree(blkcg->pd[i]);
+		kfree(blkcg->cpd[i]);
 free_blkcg:
 	kfree(blkcg);
 	mutex_unlock(&blkcg_pol_mutex);
@@ -1168,9 +1169,10 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 				goto err_free_cpds;
 			}
 
-			blkcg->pd[pol->plid] = cpd;
+			blkcg->cpd[pol->plid] = cpd;
+			cpd->blkcg = blkcg;
 			cpd->plid = pol->plid;
-			pol->cpd_init_fn(blkcg);
+			pol->cpd_init_fn(cpd);
 		}
 	}
 
@@ -1186,8 +1188,8 @@ int blkcg_policy_register(struct blkcg_policy *pol)
 err_free_cpds:
 	if (pol->cpd_size) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
-			kfree(blkcg->pd[pol->plid]);
-			blkcg->pd[pol->plid] = NULL;
+			kfree(blkcg->cpd[pol->plid]);
+			blkcg->cpd[pol->plid] = NULL;
 		}
 	}
 	blkcg_policy[pol->plid] = NULL;
@@ -1222,8 +1224,8 @@ void blkcg_policy_unregister(struct blkcg_policy *pol)
 
 	if (pol->cpd_size) {
 		list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node) {
-			kfree(blkcg->pd[pol->plid]);
-			blkcg->pd[pol->plid] = NULL;
+			kfree(blkcg->cpd[pol->plid]);
+			blkcg->cpd[pol->plid] = NULL;
 		}
 	}
 	blkcg_policy[pol->plid] = NULL;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 95e6b0c..dd6ea9e 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -220,7 +220,7 @@ struct cfqg_stats {
 /* Per-cgroup data */
 struct cfq_group_data {
 	/* must be the first member */
-	struct blkcg_policy_data pd;
+	struct blkcg_policy_data cpd;
 
 	unsigned int weight;
 	unsigned int leaf_weight;
@@ -612,7 +612,7 @@ static inline struct cfq_group *pd_to_cfqg(struct blkg_policy_data *pd)
 static struct cfq_group_data
 *cpd_to_cfqgd(struct blkcg_policy_data *cpd)
 {
-	return cpd ? container_of(cpd, struct cfq_group_data, pd) : NULL;
+	return cpd ? container_of(cpd, struct cfq_group_data, cpd) : NULL;
 }
 
 static inline struct blkcg_gq *cfqg_to_blkg(struct cfq_group *cfqg)
@@ -1568,12 +1568,11 @@ static void cfqg_stats_init(struct cfqg_stats *stats)
 #endif
 }
 
-static void cfq_cpd_init(const struct blkcg *blkcg)
+static void cfq_cpd_init(struct blkcg_policy_data *cpd)
 {
-	struct cfq_group_data *cgd =
-		cpd_to_cfqgd(blkcg->pd[blkcg_policy_cfq.plid]);
+	struct cfq_group_data *cgd = cpd_to_cfqgd(cpd);
 
-	if (blkcg == &blkcg_root) {
+	if (cpd_to_blkcg(cpd) == &blkcg_root) {
 		cgd->weight = 2 * CFQ_WEIGHT_DEFAULT;
 		cgd->leaf_weight = 2 * CFQ_WEIGHT_DEFAULT;
 	} else {
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index ddd4b8b..7988d47 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -45,7 +45,7 @@ struct blkcg {
 	struct blkcg_gq			*blkg_hint;
 	struct hlist_head		blkg_list;
 
-	struct blkcg_policy_data	*pd[BLKCG_MAX_POLS];
+	struct blkcg_policy_data	*cpd[BLKCG_MAX_POLS];
 
 	struct list_head		all_blkcgs_node;
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -88,7 +88,8 @@ struct blkg_policy_data {
  * each policy handle per-blkcg data.
  */
 struct blkcg_policy_data {
-	/* the policy id this per-policy data belongs to */
+	/* the blkcg and policy id this per-policy data belongs to */
+	struct blkcg			*blkcg;
 	int				plid;
 };
 
@@ -123,7 +124,7 @@ struct blkcg_gq {
 	struct rcu_head			rcu_head;
 };
 
-typedef void (blkcg_pol_init_cpd_fn)(const struct blkcg *blkcg);
+typedef void (blkcg_pol_init_cpd_fn)(struct blkcg_policy_data *cpd);
 typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(gfp_t gfp, int node);
 typedef void (blkcg_pol_init_pd_fn)(struct blkg_policy_data *pd);
 typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd);
@@ -243,7 +244,7 @@ static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg,
 static inline struct blkcg_policy_data *blkcg_to_cpd(struct blkcg *blkcg,
 						     struct blkcg_policy *pol)
 {
-	return blkcg ? blkcg->pd[pol->plid] : NULL;
+	return blkcg ? blkcg->cpd[pol->plid] : NULL;
 }
 
 /**
@@ -257,6 +258,11 @@ static inline struct blkcg_gq *pd_to_blkg(struct blkg_policy_data *pd)
 	return pd ? pd->blkg : NULL;
 }
 
+static inline struct blkcg *cpd_to_blkcg(struct blkcg_policy_data *cpd)
+{
+	return cpd ? cpd->blkcg : NULL;
+}
+
 /**
  * blkg_path - format cgroup path of blkg
  * @blkg: blkg of interest
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-07-30 22:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-11 18:00 [PATCHSET v3 block/for-4.3] blkcg: blkcg policy methods and data handling cleanup Tejun Heo
2015-07-11 18:00 ` [PATCH 01/11] blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() Tejun Heo
2015-07-11 18:00 ` [PATCH 02/11] blkcg: use blkg_free() in blkcg_init_queue() failure path Tejun Heo
2015-07-11 18:00 ` [PATCH 03/11] blkcg: remove unnecessary blkcg_root handling from css_alloc/free paths Tejun Heo
2015-07-11 18:00 ` [PATCH 04/11] blkcg: restructure blkg_policy_data allocation in blkcg_activate_policy() Tejun Heo
2015-07-11 18:00 ` [PATCH 05/11] blkcg: make blkcg_activate_policy() allow NULL ->pd_init_fn Tejun Heo
2015-07-11 18:00 ` [PATCH 06/11] blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methods Tejun Heo
2015-07-11 18:00 ` [PATCH 07/11] blk-throttle: remove asynchrnous percpu stats allocation mechanism Tejun Heo
2015-07-11 18:00 ` [PATCH 08/11] blk-throttle: clean up blkg_policy_data alloc/init/exit/free methods Tejun Heo
2015-07-11 18:00 ` [PATCH 09/11] blkcg: make blkcg_policy methods take a pointer to blkcg_policy_data Tejun Heo
2015-07-11 18:00 ` [PATCH 10/11] blkcg: cosmetic updates about blkcg_policy_data Tejun Heo
2015-07-30 22:57   ` [PATCH 10/11] blkcg: minor updates around blkcg_policy_data Tejun Heo
2015-07-11 18:00 ` [PATCH 11/11] blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methods Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).