linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET] cgroup: cftype based file interface
@ 2012-03-16 23:35 Tejun Heo
  2012-03-16 23:35 ` [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
                   ` (10 more replies)
  0 siblings, 11 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups; +Cc: linux-kernel, fweisbec, rni, ctalbott

Hello, guys.

This patch replaces cgroup file interface with cftype based one which
allows dynamic additions and removals of cftype arrays whether the
target subsystem is attached or not.  This can be used to make subsys
rebinding via remount work properly but I intentionally avoided doing
that at the moment.

This makes cgroup population simpler for controllers and will be used
to allow controllers to be more dynamic.  e.g. blkio subsys has
sub-policies which may come and go while blkio subsys is attached and
it currently uses fixed set of files which stays blank if not in use.
This will also be useful for implementing unified hierarchy.

This patchset contains the following patches.

 0001-cgroup-move-cgroup_clear_directory-call-out-of-cgrou.patch
 0002-cgroup-build-list-of-all-cgroups-under-a-given-cgrou.patch
 0003-cgroup-implement-cgroup_add_cftypes-and-friends.patch
 0004-cgroup-merge-cft_release_agent-cftype-array-into-the.patch
 0005-cgroup-convert-all-non-memcg-controllers-to-the-new-.patch
 0006-cgroup-convert-memcg-controller-to-the-new-cftype-in.patch
 0007-cgroup-remove-cgroup_add_file-s.patch
 0008-cgroup-relocate-__d_cgrp-and-__d_cft.patch
 0009-cgroup-introduce-struct-cfent.patch
 0010-cgroup-implement-cgroup_rm_cftypes.patch

and is on top of

  cgroup/for-3.4 3ce3230a0cff484e5130153f244d4fb8a56b3a8b
+ [1] cgroup: deprecate remount option changes mount option

and is also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype

Glauber, can you please try to move net kmem stuff out of
->populate().  If ->create() doesn't work for whatever reason, can you
please explain it to me so that we can find a proper solution?

diffstat follows.

 block/blk-cgroup.c        |   10 -
 include/linux/cgroup.h    |   56 +++++--
 kernel/cgroup.c           |  351 ++++++++++++++++++++++++++++++++++------------
 kernel/cgroup_freezer.c   |   11 -
 kernel/cpuset.c           |   31 +---
 kernel/sched/core.c       |   16 --
 mm/memcontrol.c           |   31 ----
 net/core/netprio_cgroup.c |    9 -
 net/ipv4/tcp_memcontrol.c |    8 -
 net/sched/cls_cgroup.c    |    9 -
 security/device_cgroup.c  |   10 -
 11 files changed, 341 insertions(+), 201 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel.cgroups/1192/focus=22611

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
@ 2012-03-16 23:35 ` Tejun Heo
  2012-03-19 10:25   ` Glauber Costa
  2012-03-16 23:35 ` [PATCH 02/10] cgroup: build list of all cgroups under a given cgroupfs_root Tejun Heo
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

cgroup_populate_dir() currently clears all files and then repopulate
the directory; however, the clearing part is only useful when it's
called from cgroup_remount().  Relocate the invocation to
cgroup_remount().

This is to prepare for further cgroup file handling updates.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index deeab87..2e2f93b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1313,7 +1313,8 @@ static int cgroup_remount(struct super_block *sb, int *flags, char *data)
 		goto out_unlock;
 	}
 
-	/* (re)populate subsystem files */
+	/* clear out any existing files and repopulate subsystem files */
+	cgroup_clear_directory(cgrp->dentry);
 	cgroup_populate_dir(cgrp);
 
 	if (opts.release_agent)
@@ -3648,9 +3649,6 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	int err;
 	struct cgroup_subsys *ss;
 
-	/* First clear out any existing files */
-	cgroup_clear_directory(cgrp->dentry);
-
 	err = cgroup_add_files(cgrp, NULL, files, ARRAY_SIZE(files));
 	if (err < 0)
 		return err;
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 02/10] cgroup: build list of all cgroups under a given cgroupfs_root
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
  2012-03-16 23:35 ` [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
@ 2012-03-16 23:35 ` Tejun Heo
  2012-03-16 23:35 ` [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Build a list of all cgroups anchored at cgroupfs_root->allcg_list and
going through cgroup->allcg_node.  The list is protected by
cgroup_mutex and will be used to improve cgroup file handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |    2 ++
 kernel/cgroup.c        |   10 ++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 501adb1..908f26f 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -191,6 +191,8 @@ struct cgroup {
 	 */
 	struct list_head css_sets;
 
+	struct list_head allcg_node;	/* cgroupfs_root->allcg_list */
+
 	/*
 	 * Linked list running through all cgroups that can
 	 * potentially be reaped by the release agent. Protected by
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2e2f93b..fceee52 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -127,6 +127,9 @@ struct cgroupfs_root {
 	/* A list running through the active hierarchies */
 	struct list_head root_list;
 
+	/* All cgroups on this root, cgroup_mutex protected */
+	struct list_head allcg_list;
+
 	/* Hierarchy-specific flags */
 	unsigned long flags;
 
@@ -1350,11 +1353,14 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 static void init_cgroup_root(struct cgroupfs_root *root)
 {
 	struct cgroup *cgrp = &root->top_cgroup;
+
 	INIT_LIST_HEAD(&root->subsys_list);
 	INIT_LIST_HEAD(&root->root_list);
+	INIT_LIST_HEAD(&root->allcg_list);
 	root->number_of_cgroups = 1;
 	cgrp->root = root;
 	cgrp->top_cgroup = cgrp;
+	list_add_tail(&cgrp->allcg_node, &root->allcg_list);
 	init_cgroup_housekeeping(cgrp);
 }
 
@@ -3794,6 +3800,8 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 	/* The cgroup directory was pre-locked for us */
 	BUG_ON(!mutex_is_locked(&cgrp->dentry->d_inode->i_mutex));
 
+	list_add_tail(&cgrp->allcg_node, &root->allcg_list);
+
 	err = cgroup_populate_dir(cgrp);
 	/* If err < 0, we have a half-filled directory - oh well ;) */
 
@@ -4002,6 +4010,8 @@ again:
 	list_del_init(&cgrp->sibling);
 	cgroup_unlock_hierarchy(cgrp->root);
 
+	list_del_init(&cgrp->allcg_node);
+
 	d = dget(cgrp->dentry);
 
 	cgroup_d_remove_dir(d);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
  2012-03-16 23:35 ` [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
  2012-03-16 23:35 ` [PATCH 02/10] cgroup: build list of all cgroups under a given cgroupfs_root Tejun Heo
@ 2012-03-16 23:35 ` Tejun Heo
  2012-03-20  8:52   ` Aneesh Kumar K.V
  2012-03-16 23:35 ` [PATCH 04/10] cgroup: merge cft_release_agent cftype array into the base files array Tejun Heo
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Currently, cgroup directories are populated by subsys->populate()
callback explicitly creating files on each cgroup creation.  This
level of flexibility isn't needed or desirable.  It provides largely
unused flexibility which call for abuses while severely limiting what
the core layer can do through the lack of structure and conventions.

Per each cgroup file type, the only distinction that cgroup users is
making is whether a cgroup is root or not, which can easily be
expressed with flags.

This patch introduces cgroup_add_cftypes() and its wrapper macros -
CGROUP_SUBSYS_CFTYPES[_COND]().  These deal with cftypes instead of
individual files - controllers indicate that certain types of files
exist for certain subsystem.  Newly added CFTYPE_*_ON_ROOT flags
indicate whether a cftype should be excluded or created only on the
root cgroup.

cgroup_add_cftypes() can be called any time whether the target
subsystem is currently attached or not.  cgroup core will create files
on the existing cgroups as necessary.  CGROUP_SUBSYS_CFTYPES[_COND]()
are convenience macros controllers so that cftypes can be declared to
belong to certain cgroup.  The COND variant is useful for cases where
certain files are dependent on boot time parameter.

Further patches will convert the existing users and remove the file
based interface.  Note that this interface allows dynamic addition of
files to an active controller.  This will be used for sub-controller
modularity and unified hierarchy in the longer term.

This patch implements the new mechanism but doesn't apply it to any
user.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |   38 +++++++++++++-
 kernel/cgroup.c        |  128 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 163 insertions(+), 3 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 908f26f..a78a6a8 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -192,6 +192,7 @@ struct cgroup {
 	struct list_head css_sets;
 
 	struct list_head allcg_node;	/* cgroupfs_root->allcg_list */
+	struct list_head cft_q_node;	/* used during cftype add/rm */
 
 	/*
 	 * Linked list running through all cgroups that can
@@ -277,11 +278,17 @@ struct cgroup_map_cb {
  *	- the 'cftype' of the file is file->f_dentry->d_fsdata
  */
 
-#define MAX_CFTYPE_NAME 64
+/* cftype->flags */
+#define CFTYPE_ONLY_ON_ROOT	(1U << 0)	/* only create on root cg */
+#define CFTYPE_NOT_ON_ROOT	(1U << 1)	/* don't create onp root cg */
+
+#define MAX_CFTYPE_NAME		64
+
 struct cftype {
 	/*
 	 * By convention, the name should begin with the name of the
-	 * subsystem, followed by a period
+	 * subsystem, followed by a period.  Zero length string indicates
+	 * end of cftype array.
 	 */
 	char name[MAX_CFTYPE_NAME];
 	int private;
@@ -297,6 +304,9 @@ struct cftype {
 	 */
 	size_t max_write_len;
 
+	/* CFTYPE_* flags */
+	unsigned int flags;
+
 	int (*open)(struct inode *inode, struct file *file);
 	ssize_t (*read)(struct cgroup *cgrp, struct cftype *cft,
 			struct file *file,
@@ -375,6 +385,25 @@ struct cftype {
 			struct eventfd_ctx *eventfd);
 };
 
+/*
+ * Declare cftype array @cfts for cgroup subsys @ss if @cond is %true.
+ * Useful if the files are dependent on boot time parameter.
+ */
+#define CGROUP_SUBSYS_CFTYPES_COND(ss, cfts, cond)			\
+	static int __init __cgroup_cfts_init_##ss_##cfts(void)		\
+	{								\
+		if ((cond))						\
+			WARN_ON(cgroup_add_cftypes(&ss, cfts));		\
+		return 0;						\
+	}								\
+	fs_initcall(__cgroup_cfts_init_##ss_##cfts);
+
+/*
+ * Declare cftype array @cfts for cgroup subsys @ss.
+ */
+#define CGROUP_SUBSYS_CFTYPES(ss, cfts)					\
+	CGROUP_SUBSYS_CFTYPES_COND(ss, cfts, true)
+
 struct cgroup_scanner {
 	struct cgroup *cg;
 	int (*test_task)(struct task_struct *p, struct cgroup_scanner *scan);
@@ -400,6 +429,8 @@ int cgroup_add_files(struct cgroup *cgrp,
 			const struct cftype cft[],
 			int count);
 
+int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
+
 int cgroup_is_removed(const struct cgroup *cgrp);
 
 int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen);
@@ -467,6 +498,9 @@ struct cgroup_subsys {
 	void (*post_clone)(struct cgroup *cgrp);
 	void (*bind)(struct cgroup *root);
 
+	/* list of cftype_sets */
+	struct list_head cftsets;
+
 	int subsys_id;
 	int active;
 	int disabled;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index fceee52..2b1a209 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -148,6 +148,16 @@ struct cgroupfs_root {
 static struct cgroupfs_root rootnode;
 
 /*
+ * cftype_sets describe cftypes belonging to a subsystem and are chained at
+ * cgroup_subsys->cftsets.  Each cftset points to an array of cftypes
+ * terminated by zero length name.
+ */
+struct cftype_set {
+	struct list_head		node;	/* chained at subsys->cftsets */
+	const struct cftype		*cfts;
+};
+
+/*
  * CSS ID -- ID per subsys's Cgroup Subsys State(CSS). used only when
  * cgroup_subsys->use_id != 0.
  */
@@ -2627,8 +2637,14 @@ int cgroup_add_file(struct cgroup *cgrp,
 	struct dentry *dentry;
 	int error;
 	umode_t mode;
-
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
+
+	/* does @cft->flags tell us to skip creation on @cgrp? */
+	if ((cft->flags & CFTYPE_NOT_ON_ROOT) && !cgrp->parent)
+		return 0;
+	if ((cft->flags & CFTYPE_ONLY_ON_ROOT) && cgrp->parent)
+		return 0;
+
 	if (subsys && !test_bit(ROOT_NOPREFIX, &cgrp->root->flags)) {
 		strcpy(name, subsys->name);
 		strcat(name, ".");
@@ -2664,6 +2680,97 @@ int cgroup_add_files(struct cgroup *cgrp,
 }
 EXPORT_SYMBOL_GPL(cgroup_add_files);
 
+static DEFINE_MUTEX(cgroup_cft_mutex);
+
+static void cgroup_cfts_prepare(void)
+	__acquires(&cgroup_cft_mutex) __acquires(&cgroup_mutex)
+{
+	/*
+	 * Thanks to the entanglement with vfs inode locking, we can't walk
+	 * the existing cgroups under cgroup_mutex and create files.
+	 * Instead, we increment reference on all cgroups and build list of
+	 * them using @cgrp->cft_q_node.  Grab cgroup_cft_mutex to ensure
+	 * exclusive access to the field.
+	 */
+	mutex_lock(&cgroup_cft_mutex);
+	mutex_lock(&cgroup_mutex);
+}
+
+static void cgroup_cfts_commit(struct cgroup_subsys *ss,
+			       const struct cftype *cfts)
+	__releases(&cgroup_mutex) __releases(&cgroup_cft_mutex)
+{
+	LIST_HEAD(pending);
+	struct cgroup *cgrp, *n;
+	int count = 0;
+
+	while (cfts[count].name[0] != '\0')
+		count++;
+
+	/* %NULL @cfts indicates abort and don't bother if @ss isn't attached */
+	if (cfts && ss->root != &rootnode) {
+		list_for_each_entry(cgrp, &ss->root->allcg_list, allcg_node) {
+			dget(cgrp->dentry);
+			list_add_tail(&cgrp->cft_q_node, &pending);
+		}
+	}
+
+	mutex_unlock(&cgroup_mutex);
+
+	/*
+	 * All new cgroups will see @cfts update on @ss->cftsets.  Add/rm
+	 * files for all cgroups which were created before.
+	 */
+	list_for_each_entry_safe(cgrp, n, &pending, cft_q_node) {
+		struct inode *inode = cgrp->dentry->d_inode;
+
+		mutex_lock(&inode->i_mutex);
+		mutex_lock(&cgroup_mutex);
+		if (!cgroup_is_removed(cgrp))
+			cgroup_add_files(cgrp, ss, cfts, count);
+		mutex_unlock(&cgroup_mutex);
+		mutex_unlock(&inode->i_mutex);
+
+		list_del_init(&cgrp->cft_q_node);
+		dput(cgrp->dentry);
+	}
+
+	mutex_unlock(&cgroup_cft_mutex);
+}
+
+/**
+ * cgroup_add_cftypes - add an array of cftypes to a subsystem
+ * @ss: target cgroup subsystem
+ * @cfts: zero-length name terminated array of cftypes
+ *
+ * Register @cfts to @ss.  Files described by @cfts are created for all
+ * existing cgroups to which @ss is attached and all future cgroups will
+ * have them too.  This function can be called anytime after
+ * subsys_initcall whether @ss is attached or not.
+ *
+ * Returns 0 on successful registration, -errno on failure.  Note that this
+ * function currently returns 0 as long as @cfts registration is successful
+ * even if some file creation attempts on existing cgroups fail.
+ */
+int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts)
+{
+	struct cftype_set *set;
+
+	set = kzalloc(sizeof(*set), GFP_KERNEL);
+	if (!set)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&set->node);
+	set->cfts = cfts;
+
+	cgroup_cfts_prepare();
+	list_add_tail(&set->node, &ss->cftsets);
+	cgroup_cfts_commit(ss, cfts);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
+
 /**
  * cgroup_task_count - count the number of tasks in a cgroup.
  * @cgrp: the cgroup in question
@@ -3664,10 +3771,25 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 			return err;
 	}
 
+	/* process cftsets of each subsystem */
 	for_each_subsys(cgrp->root, ss) {
+		struct cftype_set *set;
+
 		if (ss->populate && (err = ss->populate(ss, cgrp)) < 0)
 			return err;
+
+		list_for_each_entry(set, &ss->cftsets, node) {
+			const struct cftype *cft;
+
+			for (cft = set->cfts; cft->name[0] != '\0'; cft++) {
+				err = cgroup_add_file(cgrp, ss, cft);
+				if (err)
+					pr_warning("cgroup_populate_dir: failed to create %s, err=%d\n",
+						   cft->name, err);
+			}
+		}
 	}
+
 	/* This cgroup is ready now */
 	for_each_subsys(cgrp->root, ss) {
 		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
@@ -4044,6 +4166,8 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
 
 	printk(KERN_INFO "Initializing cgroup subsys %s\n", ss->name);
 
+	INIT_LIST_HEAD(&ss->cftsets);
+
 	/* Create the top cgroup state for this subsystem */
 	list_add(&ss->sibling, &rootnode.subsys_list);
 	ss->root = &rootnode;
@@ -4113,6 +4237,8 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
 		return 0;
 	}
 
+	INIT_LIST_HEAD(&ss->cftsets);
+
 	/*
 	 * need to register a subsys id before anything else - for example,
 	 * init_cgroup_css needs it.
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 04/10] cgroup: merge cft_release_agent cftype array into the base files array
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (2 preceding siblings ...)
  2012-03-16 23:35 ` [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
@ 2012-03-16 23:35 ` Tejun Heo
  2012-03-16 23:35 ` [PATCH 05/10] cgroup: convert all non-memcg controllers to the new cftype interface Tejun Heo
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Now that cftype can express whether a file should only be on root,
cft_release_agent can be merged into the base files cftypes array.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |   19 +++++++------------
 1 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2b1a209..6c396f3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3748,13 +3748,13 @@ static struct cftype files[] = {
 		.read_u64 = cgroup_clone_children_read,
 		.write_u64 = cgroup_clone_children_write,
 	},
-};
-
-static struct cftype cft_release_agent = {
-	.name = "release_agent",
-	.read_seq_string = cgroup_release_agent_show,
-	.write_string = cgroup_release_agent_write,
-	.max_write_len = PATH_MAX,
+	{
+		.name = "release_agent",
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.read_seq_string = cgroup_release_agent_show,
+		.write_string = cgroup_release_agent_write,
+		.max_write_len = PATH_MAX,
+	},
 };
 
 static int cgroup_populate_dir(struct cgroup *cgrp)
@@ -3766,11 +3766,6 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	if (err < 0)
 		return err;
 
-	if (cgrp == cgrp->top_cgroup) {
-		if ((err = cgroup_add_file(cgrp, NULL, &cft_release_agent)) < 0)
-			return err;
-	}
-
 	/* process cftsets of each subsystem */
 	for_each_subsys(cgrp->root, ss) {
 		struct cftype_set *set;
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 05/10] cgroup: convert all non-memcg controllers to the new cftype interface
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (3 preceding siblings ...)
  2012-03-16 23:35 ` [PATCH 04/10] cgroup: merge cft_release_agent cftype array into the base files array Tejun Heo
@ 2012-03-16 23:35 ` Tejun Heo
  2012-03-16 23:35 ` [PATCH 06/10] cgroup: convert memcg controller " Tejun Heo
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo, Paul Menage,
	Ingo Molnar, Peter Zijlstra, David S. Miller, Vivek Goyal

Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio,
net_cls and device controllers to use the new cftype based interface.
Termination entry is added to cftype arrays and populate callbacks are
replaced with CGROUP_SUBSYS_CFTYPES() macro invocations.

This is functionally identical transformation.  There shouldn't be any
visible behavior change.

memcg is rather special and will be converted separately.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 block/blk-cgroup.c        |   10 ++--------
 kernel/cgroup.c           |   10 +++-------
 kernel/cgroup_freezer.c   |   11 +++--------
 kernel/cpuset.c           |   31 ++++++++++---------------------
 kernel/sched/core.c       |   16 ++++------------
 net/core/netprio_cgroup.c |    9 ++-------
 net/sched/cls_cgroup.c    |    9 ++-------
 security/device_cgroup.c  |   10 ++--------
 8 files changed, 28 insertions(+), 78 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 1359d63..2a542b3 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -32,7 +32,6 @@ static struct cgroup_subsys_state *blkiocg_create(struct cgroup *);
 static int blkiocg_can_attach(struct cgroup *, struct cgroup_taskset *);
 static void blkiocg_attach(struct cgroup *, struct cgroup_taskset *);
 static void blkiocg_destroy(struct cgroup *);
-static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
 
 /* for encoding cft->private value on file */
 #define BLKIOFILE_PRIVATE(x, val)	(((x) << 16) | (val))
@@ -46,7 +45,6 @@ struct cgroup_subsys blkio_subsys = {
 	.can_attach = blkiocg_can_attach,
 	.attach = blkiocg_attach,
 	.destroy = blkiocg_destroy,
-	.populate = blkiocg_populate,
 #ifdef CONFIG_BLK_CGROUP
 	/* note: blkio_subsys_id is otherwise defined in blk-cgroup.h */
 	.subsys_id = blkio_subsys_id,
@@ -1537,13 +1535,9 @@ struct cftype blkio_files[] = {
 		.read_map = blkiocg_file_read_map,
 	},
 #endif
+	{ }	/* terminate */
 };
-
-static int blkiocg_populate(struct cgroup_subsys *subsys, struct cgroup *cgroup)
-{
-	return cgroup_add_files(cgroup, subsys, blkio_files,
-				ARRAY_SIZE(blkio_files));
-}
+CGROUP_SUBSYS_CFTYPES(blkio_subsys, blkio_files);
 
 static void blkiocg_destroy(struct cgroup *cgroup)
 {
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 6c396f3..20fdff9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5350,19 +5350,15 @@ static struct cftype debug_files[] =  {
 		.name = "releasable",
 		.read_u64 = releasable_read,
 	},
-};
 
-static int debug_populate(struct cgroup_subsys *ss, struct cgroup *cont)
-{
-	return cgroup_add_files(cont, ss, debug_files,
-				ARRAY_SIZE(debug_files));
-}
+	{ }	/* terminate */
+};
+CGROUP_SUBSYS_CFTYPES(debug_subsys, debug_files);
 
 struct cgroup_subsys debug_subsys = {
 	.name = "debug",
 	.create = debug_create,
 	.destroy = debug_destroy,
-	.populate = debug_populate,
 	.subsys_id = debug_subsys_id,
 };
 #endif /* CONFIG_CGROUP_DEBUG */
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index f86e939..b89f7c4 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -358,23 +358,18 @@ static int freezer_write(struct cgroup *cgroup,
 static struct cftype files[] = {
 	{
 		.name = "state",
+		.flags = CFTYPE_NOT_ON_ROOT,
 		.read_seq_string = freezer_read,
 		.write_string = freezer_write,
 	},
+	{ }	/* terminate */
 };
-
-static int freezer_populate(struct cgroup_subsys *ss, struct cgroup *cgroup)
-{
-	if (!cgroup->parent)
-		return 0;
-	return cgroup_add_files(cgroup, ss, files, ARRAY_SIZE(files));
-}
+CGROUP_SUBSYS_CFTYPES(freezer_subsys, files);
 
 struct cgroup_subsys freezer_subsys = {
 	.name		= "freezer",
 	.create		= freezer_create,
 	.destroy	= freezer_destroy,
-	.populate	= freezer_populate,
 	.subsys_id	= freezer_subsys_id,
 	.can_attach	= freezer_can_attach,
 	.fork		= freezer_fork,
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 5d57583..2e38198 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1792,28 +1792,18 @@ static struct cftype files[] = {
 		.write_u64 = cpuset_write_u64,
 		.private = FILE_SPREAD_SLAB,
 	},
-};
-
-static struct cftype cft_memory_pressure_enabled = {
-	.name = "memory_pressure_enabled",
-	.read_u64 = cpuset_read_u64,
-	.write_u64 = cpuset_write_u64,
-	.private = FILE_MEMORY_PRESSURE_ENABLED,
-};
 
-static int cpuset_populate(struct cgroup_subsys *ss, struct cgroup *cont)
-{
-	int err;
+	{
+		.name = "memory_pressure_enabled",
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.read_u64 = cpuset_read_u64,
+		.write_u64 = cpuset_write_u64,
+		.private = FILE_MEMORY_PRESSURE_ENABLED,
+	},
 
-	err = cgroup_add_files(cont, ss, files, ARRAY_SIZE(files));
-	if (err)
-		return err;
-	/* memory_pressure_enabled is in root cpuset only */
-	if (!cont->parent)
-		err = cgroup_add_file(cont, ss,
-				      &cft_memory_pressure_enabled);
-	return err;
-}
+	{ }	/* terminate */
+};
+CGROUP_SUBSYS_CFTYPES(cpuset_subsys, files);
 
 /*
  * post_clone() is called during cgroup_create() when the
@@ -1914,7 +1904,6 @@ struct cgroup_subsys cpuset_subsys = {
 	.destroy = cpuset_destroy,
 	.can_attach = cpuset_can_attach,
 	.attach = cpuset_attach,
-	.populate = cpuset_populate,
 	.post_clone = cpuset_post_clone,
 	.subsys_id = cpuset_subsys_id,
 	.early_init = 1,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ff12f72..4d69359 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7902,12 +7902,9 @@ static struct cftype cpu_files[] = {
 		.write_u64 = cpu_rt_period_write_uint,
 	},
 #endif
+	{ }	/* terminate */
 };
-
-static int cpu_cgroup_populate(struct cgroup_subsys *ss, struct cgroup *cont)
-{
-	return cgroup_add_files(cont, ss, cpu_files, ARRAY_SIZE(cpu_files));
-}
+CGROUP_SUBSYS_CFTYPES(cpu_cgroup_subsys, cpu_files);
 
 struct cgroup_subsys cpu_cgroup_subsys = {
 	.name		= "cpu",
@@ -7916,7 +7913,6 @@ struct cgroup_subsys cpu_cgroup_subsys = {
 	.can_attach	= cpu_cgroup_can_attach,
 	.attach		= cpu_cgroup_attach,
 	.exit		= cpu_cgroup_exit,
-	.populate	= cpu_cgroup_populate,
 	.subsys_id	= cpu_cgroup_subsys_id,
 	.early_init	= 1,
 };
@@ -8102,12 +8098,9 @@ static struct cftype files[] = {
 		.name = "stat",
 		.read_map = cpuacct_stats_show,
 	},
+	{ }	/* terminate */
 };
-
-static int cpuacct_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
-{
-	return cgroup_add_files(cgrp, ss, files, ARRAY_SIZE(files));
-}
+CGROUP_SUBSYS_CFTYPES(cpuacct_subsys, files);
 
 /*
  * charge this task's execution time to its accounting group.
@@ -8140,7 +8133,6 @@ struct cgroup_subsys cpuacct_subsys = {
 	.name = "cpuacct",
 	.create = cpuacct_create,
 	.destroy = cpuacct_destroy,
-	.populate = cpuacct_populate,
 	.subsys_id = cpuacct_subsys_id,
 };
 #endif	/* CONFIG_CGROUP_CPUACCT */
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 22036ab..1aee978 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -25,13 +25,11 @@
 
 static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp);
 static void cgrp_destroy(struct cgroup *cgrp);
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp);
 
 struct cgroup_subsys net_prio_subsys = {
 	.name		= "net_prio",
 	.create		= cgrp_create,
 	.destroy	= cgrp_destroy,
-	.populate	= cgrp_populate,
 #ifdef CONFIG_NETPRIO_CGROUP
 	.subsys_id	= net_prio_subsys_id,
 #endif
@@ -256,12 +254,9 @@ static struct cftype ss_files[] = {
 		.read_map = read_priomap,
 		.write_string = write_priomap,
 	},
+	{ }	/* terminate */
 };
-
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
-{
-	return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
-}
+CGROUP_SUBSYS_CFTYPES(net_prio_subsys, ss_files);
 
 static int netprio_device_event(struct notifier_block *unused,
 				unsigned long event, void *ptr)
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 1afaa28..1a856d8 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -24,13 +24,11 @@
 
 static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp);
 static void cgrp_destroy(struct cgroup *cgrp);
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp);
 
 struct cgroup_subsys net_cls_subsys = {
 	.name		= "net_cls",
 	.create		= cgrp_create,
 	.destroy	= cgrp_destroy,
-	.populate	= cgrp_populate,
 #ifdef CONFIG_NET_CLS_CGROUP
 	.subsys_id	= net_cls_subsys_id,
 #endif
@@ -86,12 +84,9 @@ static struct cftype ss_files[] = {
 		.read_u64 = read_classid,
 		.write_u64 = write_classid,
 	},
+	{ }	/* terminate */
 };
-
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
-{
-	return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
-}
+CGROUP_SUBSYS_CFTYPES(net_cls_subsys, ss_files);
 
 struct cls_cgroup_head {
 	u32			handle;
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index c43a332..11cf061 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -447,21 +447,15 @@ static struct cftype dev_cgroup_files[] = {
 		.read_seq_string = devcgroup_seq_read,
 		.private = DEVCG_LIST,
 	},
+	{ }	/* terminate */
 };
-
-static int devcgroup_populate(struct cgroup_subsys *ss,
-				struct cgroup *cgroup)
-{
-	return cgroup_add_files(cgroup, ss, dev_cgroup_files,
-					ARRAY_SIZE(dev_cgroup_files));
-}
+CGROUP_SUBSYS_CFTYPES(devices_subsys, dev_cgroup_files);
 
 struct cgroup_subsys devices_subsys = {
 	.name = "devices",
 	.can_attach = devcgroup_can_attach,
 	.create = devcgroup_create,
 	.destroy = devcgroup_destroy,
-	.populate = devcgroup_populate,
 	.subsys_id = devices_subsys_id,
 };
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (4 preceding siblings ...)
  2012-03-16 23:35 ` [PATCH 05/10] cgroup: convert all non-memcg controllers to the new cftype interface Tejun Heo
@ 2012-03-16 23:35 ` Tejun Heo
  2012-03-19  4:26   ` KAMEZAWA Hiroyuki
  2012-03-16 23:36 ` [PATCH 07/10] cgroup: remove cgroup_add_file[s]() Tejun Heo
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:35 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo,
	Johannes Weiner, Michal Hocko, Balbir Singh, KAMEZAWA Hiroyuki,
	Hugh Dickins, Greg Thelen

Convert memcg to use the new cftype based interface. memcg is rather
special in that

* memsw_cgroup_files creation is dependent on do_swap_account.

* ->populate() is abused for mem_cgroup_sockets_init().

memsw_cgroup_files is converted to use CGROUP_SUBSYS_CFTYPES_COND(),
where the condition is really_do_swap_account.

->populate() is preserved for register_kmem_files() invocation but I
strongly urge moving the initialization to the tail of ->create().  If
there's something which can't be done from ->create(), which isn't
clear from the comment, please let me know.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Greg Thelen <gthelen@google.com>
---
 mm/memcontrol.c           |   31 ++++++-------------------------
 net/ipv4/tcp_memcontrol.c |    8 ++++----
 2 files changed, 10 insertions(+), 29 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ae2f0a8..f2221ce 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4665,7 +4665,9 @@ static struct cftype mem_cgroup_files[] = {
 		.mode = S_IRUGO,
 	},
 #endif
+	{ },	/* terminate */
 };
+CGROUP_SUBSYS_CFTYPES(mem_cgroup_subsys, mem_cgroup_files);
 
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
 static struct cftype memsw_cgroup_files[] = {
@@ -4694,20 +4696,10 @@ static struct cftype memsw_cgroup_files[] = {
 		.trigger = mem_cgroup_reset,
 		.read_u64 = mem_cgroup_read,
 	},
+	{ }	/* terminate */
 };
-
-static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss)
-{
-	if (!do_swap_account)
-		return 0;
-	return cgroup_add_files(cont, ss, memsw_cgroup_files,
-				ARRAY_SIZE(memsw_cgroup_files));
-};
-#else
-static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss)
-{
-	return 0;
-}
+CGROUP_SUBSYS_CFTYPES_COND(mem_cgroup_subsys, memsw_cgroup_files,
+			   really_do_swap_account);
 #endif
 
 static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
@@ -4963,18 +4955,7 @@ static void mem_cgroup_destroy(struct cgroup *cont)
 static int mem_cgroup_populate(struct cgroup_subsys *ss,
 				struct cgroup *cont)
 {
-	int ret;
-
-	ret = cgroup_add_files(cont, ss, mem_cgroup_files,
-				ARRAY_SIZE(mem_cgroup_files));
-
-	if (!ret)
-		ret = register_memsw_files(cont, ss);
-
-	if (!ret)
-		ret = register_kmem_files(cont, ss);
-
-	return ret;
+	return register_kmem_files(cont, ss);
 }
 
 #ifdef CONFIG_MMU
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index e714c68..21f48f3 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -35,7 +35,9 @@ static struct cftype tcp_files[] = {
 		.trigger = tcp_cgroup_reset,
 		.read_u64 = tcp_cgroup_read,
 	},
+	{ }	/* terminate */
 };
+CGROUP_SUBSYS_CFTYPES(mem_cgroup_subsys, tcp_files);
 
 static inline struct tcp_memcontrol *tcp_from_cgproto(struct cg_proto *cg_proto)
 {
@@ -65,7 +67,7 @@ int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss)
 
 	cg_proto = tcp_prot.proto_cgroup(memcg);
 	if (!cg_proto)
-		goto create_files;
+		return 0;
 
 	tcp = tcp_from_cgproto(cg_proto);
 
@@ -88,9 +90,7 @@ int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss)
 	cg_proto->sockets_allocated = &tcp->tcp_sockets_allocated;
 	cg_proto->memcg = memcg;
 
-create_files:
-	return cgroup_add_files(cgrp, ss, tcp_files,
-				ARRAY_SIZE(tcp_files));
+	return 0;
 }
 EXPORT_SYMBOL(tcp_init_cgroup);
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 07/10] cgroup: remove cgroup_add_file[s]()
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (5 preceding siblings ...)
  2012-03-16 23:35 ` [PATCH 06/10] cgroup: convert memcg controller " Tejun Heo
@ 2012-03-16 23:36 ` Tejun Heo
  2012-03-20  8:47   ` Aneesh Kumar K.V
  2012-03-16 23:36 ` [PATCH 08/10] cgroup: relocate __d_cgrp() and __d_cft() Tejun Heo
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:36 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

No controller is using cgroup_add_files[s]().  Unexport them, and
convert cgroup_add_files() to handle NULL entry terminated array
instead of taking count explicitly and continue creation on failure
for internal use.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |   16 ---------------
 kernel/cgroup.c        |   51 ++++++++++++++++++-----------------------------
 2 files changed, 20 insertions(+), 47 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index a78a6a8..05ddf59 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -413,22 +413,6 @@ struct cgroup_scanner {
 	void *data;
 };
 
-/*
- * Add a new file to the given cgroup directory. Should only be
- * called by subsystems from within a populate() method
- */
-int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-		       const struct cftype *cft);
-
-/*
- * Add a set of new files to the given cgroup directory. Should
- * only be called by subsystems from within a populate() method
- */
-int cgroup_add_files(struct cgroup *cgrp,
-			struct cgroup_subsys *subsys,
-			const struct cftype cft[],
-			int count);
-
 int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
 
 int cgroup_is_removed(const struct cgroup *cgrp);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 20fdff9..33383cb 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2629,9 +2629,8 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
 	return mode;
 }
 
-int cgroup_add_file(struct cgroup *cgrp,
-		       struct cgroup_subsys *subsys,
-		       const struct cftype *cft)
+static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
+			   const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
 	struct dentry *dentry;
@@ -2663,22 +2662,23 @@ int cgroup_add_file(struct cgroup *cgrp,
 		error = PTR_ERR(dentry);
 	return error;
 }
-EXPORT_SYMBOL_GPL(cgroup_add_file);
 
-int cgroup_add_files(struct cgroup *cgrp,
-			struct cgroup_subsys *subsys,
-			const struct cftype cft[],
-			int count)
+static int cgroup_add_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
+			    const struct cftype cfts[])
 {
-	int i, err;
-	for (i = 0; i < count; i++) {
-		err = cgroup_add_file(cgrp, subsys, &cft[i]);
-		if (err)
-			return err;
+	const struct cftype *cft;
+	int err, ret = 0;
+
+	for (cft = cfts; cft->name[0] != '\0'; cft++) {
+		err = cgroup_add_file(cgrp, subsys, cft);
+		if (err) {
+			pr_warning("cgroup_add_files: failed to create %s, err=%d\n",
+				   cft->name, err);
+			ret = err;
+		}
 	}
-	return 0;
+	return ret;
 }
-EXPORT_SYMBOL_GPL(cgroup_add_files);
 
 static DEFINE_MUTEX(cgroup_cft_mutex);
 
@@ -2702,10 +2702,6 @@ static void cgroup_cfts_commit(struct cgroup_subsys *ss,
 {
 	LIST_HEAD(pending);
 	struct cgroup *cgrp, *n;
-	int count = 0;
-
-	while (cfts[count].name[0] != '\0')
-		count++;
 
 	/* %NULL @cfts indicates abort and don't bother if @ss isn't attached */
 	if (cfts && ss->root != &rootnode) {
@@ -2727,7 +2723,7 @@ static void cgroup_cfts_commit(struct cgroup_subsys *ss,
 		mutex_lock(&inode->i_mutex);
 		mutex_lock(&cgroup_mutex);
 		if (!cgroup_is_removed(cgrp))
-			cgroup_add_files(cgrp, ss, cfts, count);
+			cgroup_add_files(cgrp, ss, cfts);
 		mutex_unlock(&cgroup_mutex);
 		mutex_unlock(&inode->i_mutex);
 
@@ -3755,6 +3751,7 @@ static struct cftype files[] = {
 		.write_string = cgroup_release_agent_write,
 		.max_write_len = PATH_MAX,
 	},
+	{ }	/* terminate */
 };
 
 static int cgroup_populate_dir(struct cgroup *cgrp)
@@ -3762,7 +3759,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	int err;
 	struct cgroup_subsys *ss;
 
-	err = cgroup_add_files(cgrp, NULL, files, ARRAY_SIZE(files));
+	err = cgroup_add_files(cgrp, NULL, files);
 	if (err < 0)
 		return err;
 
@@ -3773,16 +3770,8 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 		if (ss->populate && (err = ss->populate(ss, cgrp)) < 0)
 			return err;
 
-		list_for_each_entry(set, &ss->cftsets, node) {
-			const struct cftype *cft;
-
-			for (cft = set->cfts; cft->name[0] != '\0'; cft++) {
-				err = cgroup_add_file(cgrp, ss, cft);
-				if (err)
-					pr_warning("cgroup_populate_dir: failed to create %s, err=%d\n",
-						   cft->name, err);
-			}
-		}
+		list_for_each_entry(set, &ss->cftsets, node)
+			cgroup_add_files(cgrp, ss, set->cfts);
 	}
 
 	/* This cgroup is ready now */
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 08/10] cgroup: relocate __d_cgrp() and __d_cft()
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (6 preceding siblings ...)
  2012-03-16 23:36 ` [PATCH 07/10] cgroup: remove cgroup_add_file[s]() Tejun Heo
@ 2012-03-16 23:36 ` Tejun Heo
  2012-03-16 23:36 ` [PATCH 09/10] cgroup: introduce struct cfent Tejun Heo
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:36 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Move the two macros upwards as they'll be used earlier in the file.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 33383cb..765d9f5 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -292,6 +292,16 @@ list_for_each_entry(_ss, &_root->subsys_list, sibling)
 #define for_each_active_root(_root) \
 list_for_each_entry(_root, &roots, root_list)
 
+static inline struct cgroup *__d_cgrp(struct dentry *dentry)
+{
+	return dentry->d_fsdata;
+}
+
+static inline struct cftype *__d_cft(struct dentry *dentry)
+{
+	return dentry->d_fsdata;
+}
+
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -1718,16 +1728,6 @@ static struct file_system_type cgroup_fs_type = {
 
 static struct kobject *cgroup_kobj;
 
-static inline struct cgroup *__d_cgrp(struct dentry *dentry)
-{
-	return dentry->d_fsdata;
-}
-
-static inline struct cftype *__d_cft(struct dentry *dentry)
-{
-	return dentry->d_fsdata;
-}
-
 /**
  * cgroup_path - generate the path of a cgroup
  * @cgrp: the cgroup in question
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (7 preceding siblings ...)
  2012-03-16 23:36 ` [PATCH 08/10] cgroup: relocate __d_cgrp() and __d_cft() Tejun Heo
@ 2012-03-16 23:36 ` Tejun Heo
  2012-03-20 14:05   ` Glauber Costa
  2012-03-20 18:06   ` [PATCH UPDATED " Tejun Heo
  2012-03-16 23:36 ` [PATCH 10/10] cgroup: implement cgroup_rm_cftypes() Tejun Heo
  2012-03-19 10:22 ` [PATCHSET] cgroup: cftype based file interface Glauber Costa
  10 siblings, 2 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:36 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

This patch adds cfent (cgroup file entry) which is the association
between a cgroup and a file.  This is in-cgroup representation of
files under a cgroup directory.  This simplifies walking walking
cgroup files and thus cgroup_clear_directory(), which is now
implemented in two parts - cgroup_rm_file() and a loop around it.

cgroup_rm_file() will be used to implement cftype removal and cfent is
scheduled to serve cgroup specific per-file data (e.g. for sysfs-like
"sever" semantics).

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |    1 +
 kernel/cgroup.c        |  107 ++++++++++++++++++++++++++++++++----------------
 2 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 05ddf59..caeaafc 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -175,6 +175,7 @@ struct cgroup {
 	 */
 	struct list_head sibling;	/* my parent's children */
 	struct list_head children;	/* my children */
+	struct list_head files;		/* my files */
 
 	struct cgroup *parent;		/* my parent */
 	struct dentry __rcu *dentry;	/* cgroup fs entry, RCU protected */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 765d9f5..c0c61b9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -158,6 +158,15 @@ struct cftype_set {
 };
 
 /*
+ * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.
+ */
+struct cfent {
+	struct list_head		node;
+	struct dentry			*dentry;
+	struct cftype			*type;
+};
+
+/*
  * CSS ID -- ID per subsys's Cgroup Subsys State(CSS). used only when
  * cgroup_subsys->use_id != 0.
  */
@@ -297,11 +306,16 @@ static inline struct cgroup *__d_cgrp(struct dentry *dentry)
 	return dentry->d_fsdata;
 }
 
-static inline struct cftype *__d_cft(struct dentry *dentry)
+static inline struct cfent *__d_cfe(struct dentry *dentry)
 {
 	return dentry->d_fsdata;
 }
 
+static inline struct cftype *__d_cft(struct dentry *dentry)
+{
+	return __d_cfe(dentry)->type;
+}
+
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -905,34 +919,38 @@ static void remove_dir(struct dentry *d)
 	dput(parent);
 }
 
-static void cgroup_clear_directory(struct dentry *dentry)
-{
-	struct list_head *node;
-
-	BUG_ON(!mutex_is_locked(&dentry->d_inode->i_mutex));
-	spin_lock(&dentry->d_lock);
-	node = dentry->d_subdirs.next;
-	while (node != &dentry->d_subdirs) {
-		struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
-
-		spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
-		list_del_init(node);
-		if (d->d_inode) {
-			/* This should never be called on a cgroup
-			 * directory with child cgroups */
-			BUG_ON(d->d_inode->i_mode & S_IFDIR);
-			dget_dlock(d);
-			spin_unlock(&d->d_lock);
-			spin_unlock(&dentry->d_lock);
-			d_delete(d);
-			simple_unlink(dentry->d_inode, d);
-			dput(d);
-			spin_lock(&dentry->d_lock);
-		} else
-			spin_unlock(&d->d_lock);
-		node = dentry->d_subdirs.next;
+static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
+{
+	struct cfent *cfe;
+
+	lockdep_assert_held(&cgrp->dentry->d_inode->i_mutex);
+	lockdep_assert_held(&cgroup_mutex);
+
+	list_for_each_entry(cfe, &cgrp->files, node) {
+		struct dentry *d = cfe->dentry;
+
+		if (cft && cfe->type != cft)
+			continue;
+
+		dget(d);
+		d_delete(d);
+		simple_unlink(d->d_inode, d);
+		dput(d);
+
+		list_del_init(&cfe->node);
+		kfree(cfe);
+		return 0;
 	}
-	spin_unlock(&dentry->d_lock);
+	return -ENOENT;
+}
+
+static void cgroup_clear_directory(struct dentry *dir)
+{
+	struct cgroup *cgrp = __d_cgrp(dir);
+
+	while (!list_empty(&cgrp->files))
+		cgroup_rm_file(cgrp, NULL);
+	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
 }
 
 /*
@@ -1362,6 +1380,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 {
 	INIT_LIST_HEAD(&cgrp->sibling);
 	INIT_LIST_HEAD(&cgrp->children);
+	INIT_LIST_HEAD(&cgrp->files);
 	INIT_LIST_HEAD(&cgrp->css_sets);
 	INIT_LIST_HEAD(&cgrp->release_list);
 	INIT_LIST_HEAD(&cgrp->pidlists);
@@ -2633,7 +2652,9 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 			   const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
+	struct cgroup *parent = __d_cgrp(dir);
 	struct dentry *dentry;
+	struct cfent *cfe;
 	int error;
 	umode_t mode;
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
@@ -2649,17 +2670,31 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 		strcat(name, ".");
 	}
 	strcat(name, cft->name);
+
 	BUG_ON(!mutex_is_locked(&dir->d_inode->i_mutex));
+
+	cfe = kzalloc(sizeof(*cfe), GFP_KERNEL);
+	if (!cfe)
+		return -ENOMEM;
+
 	dentry = lookup_one_len(name, dir, strlen(name));
-	if (!IS_ERR(dentry)) {
-		mode = cgroup_file_mode(cft);
-		error = cgroup_create_file(dentry, mode | S_IFREG,
-						cgrp->root->sb);
-		if (!error)
-			dentry->d_fsdata = (void *)cft;
-		dput(dentry);
-	} else
+	if (IS_ERR(dentry)) {
 		error = PTR_ERR(dentry);
+		goto out;
+	}
+
+	mode = cgroup_file_mode(cft);
+	error = cgroup_create_file(dentry, mode | S_IFREG, cgrp->root->sb);
+	if (!error) {
+		cfe->type = (void *)cft;
+		cfe->dentry = dentry;
+		dentry->d_fsdata = cfe;
+		list_add_tail(&cfe->node, &parent->files);
+		cfe = NULL;
+	}
+	dput(dentry);
+out:
+	kfree(cfe);
 	return error;
 }
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 10/10] cgroup: implement cgroup_rm_cftypes()
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (8 preceding siblings ...)
  2012-03-16 23:36 ` [PATCH 09/10] cgroup: introduce struct cfent Tejun Heo
@ 2012-03-16 23:36 ` Tejun Heo
  2012-03-19 10:22 ` [PATCHSET] cgroup: cftype based file interface Glauber Costa
  10 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-16 23:36 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Implement cgroup_rm_cftypes() which removes an array of cftypes from a
subsystem.  It can be called whether the target subsys is attached or
not.  cgroup core will remove the specified file from all existing
cgroups.

This will be used to improve sub-subsys modularity and will be helpful
for unified hierarchy.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |    1 +
 kernel/cgroup.c        |   54 +++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index caeaafc..4f3a68b 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -415,6 +415,7 @@ struct cgroup_scanner {
 };
 
 int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
+int cgroup_rm_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
 
 int cgroup_is_removed(const struct cgroup *cgrp);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index c0c61b9..cf7b298 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2698,17 +2698,20 @@ out:
 	return error;
 }
 
-static int cgroup_add_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-			    const struct cftype cfts[])
+static int cgroup_addrm_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
+			      const struct cftype cfts[], bool is_add)
 {
 	const struct cftype *cft;
 	int err, ret = 0;
 
 	for (cft = cfts; cft->name[0] != '\0'; cft++) {
-		err = cgroup_add_file(cgrp, subsys, cft);
+		if (is_add)
+			err = cgroup_add_file(cgrp, subsys, cft);
+		else
+			err = cgroup_rm_file(cgrp, cft);
 		if (err) {
-			pr_warning("cgroup_add_files: failed to create %s, err=%d\n",
-				   cft->name, err);
+			pr_warning("cgroup_addrm_files: failed to %s %s, err=%d\n",
+				   is_add ? "add" : "remove", cft->name, err);
 			ret = err;
 		}
 	}
@@ -2732,7 +2735,7 @@ static void cgroup_cfts_prepare(void)
 }
 
 static void cgroup_cfts_commit(struct cgroup_subsys *ss,
-			       const struct cftype *cfts)
+			       const struct cftype *cfts, bool is_add)
 	__releases(&cgroup_mutex) __releases(&cgroup_cft_mutex)
 {
 	LIST_HEAD(pending);
@@ -2758,7 +2761,7 @@ static void cgroup_cfts_commit(struct cgroup_subsys *ss,
 		mutex_lock(&inode->i_mutex);
 		mutex_lock(&cgroup_mutex);
 		if (!cgroup_is_removed(cgrp))
-			cgroup_add_files(cgrp, ss, cfts);
+			cgroup_addrm_files(cgrp, ss, cfts, is_add);
 		mutex_unlock(&cgroup_mutex);
 		mutex_unlock(&inode->i_mutex);
 
@@ -2796,13 +2799,44 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts)
 
 	cgroup_cfts_prepare();
 	list_add_tail(&set->node, &ss->cftsets);
-	cgroup_cfts_commit(ss, cfts);
+	cgroup_cfts_commit(ss, cfts, true);
 
 	return 0;
 }
 EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
 
 /**
+ * cgroup_rm_cftypes - remove an array of cftypes from a subsystem
+ * @ss: target cgroup subsystem
+ * @cfts: zero-length name terminated array of cftypes
+ *
+ * Unregister @cfts from @ss.  Files described by @cfts are removed from
+ * all existing cgroups to which @ss is attached and all future cgroups
+ * won't have them either.  This function can be called anytime after
+ * subsys_initcall whether @ss is attached or not.
+ *
+ * Returns 0 on successful unregistration, -ENOENT if @cfts is not
+ * registered with @ss.
+ */
+int cgroup_rm_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts)
+{
+	struct cftype_set *set;
+
+	cgroup_cfts_prepare();
+
+	list_for_each_entry(set, &ss->cftsets, node) {
+		if (set->cfts == cfts) {
+			list_del_init(&set->node);
+			cgroup_cfts_commit(ss, cfts, false);
+			return 0;
+		}
+	}
+
+	cgroup_cfts_commit(ss, NULL, false);
+	return -ENOENT;
+}
+
+/**
  * cgroup_task_count - count the number of tasks in a cgroup.
  * @cgrp: the cgroup in question
  *
@@ -3794,7 +3828,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	int err;
 	struct cgroup_subsys *ss;
 
-	err = cgroup_add_files(cgrp, NULL, files);
+	err = cgroup_addrm_files(cgrp, NULL, files, true);
 	if (err < 0)
 		return err;
 
@@ -3806,7 +3840,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 			return err;
 
 		list_for_each_entry(set, &ss->cftsets, node)
-			cgroup_add_files(cgrp, ss, set->cfts);
+			cgroup_addrm_files(cgrp, ss, set->cfts, true);
 	}
 
 	/* This cgroup is ready now */
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-16 23:35 ` [PATCH 06/10] cgroup: convert memcg controller " Tejun Heo
@ 2012-03-19  4:26   ` KAMEZAWA Hiroyuki
  2012-03-19 10:43     ` Glauber Costa
  2012-03-19 16:10     ` Tejun Heo
  0 siblings, 2 replies; 33+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-19  4:26 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott, Johannes Weiner, Michal Hocko, Balbir Singh,
	Hugh Dickins, Greg Thelen

(2012/03/17 8:35), Tejun Heo wrote:

> Convert memcg to use the new cftype based interface. memcg is rather
> special in that
> 
> * memsw_cgroup_files creation is dependent on do_swap_account.
> 
> * ->populate() is abused for mem_cgroup_sockets_init().
> 
> memsw_cgroup_files is converted to use CGROUP_SUBSYS_CFTYPES_COND(),
> where the condition is really_do_swap_account.
> 
> ->populate() is preserved for register_kmem_files() invocation but I
> strongly urge moving the initialization to the tail of ->create().  If
> there's something which can't be done from ->create(), which isn't
> clear from the comment, please let me know.
> 

Okay, then, cgroup files are created in kernel/cgroup.c and memcg just
registers entries.



> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Glauber Costa <glommer@parallels.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Greg Thelen <gthelen@google.com>
> ---
>  mm/memcontrol.c           |   31 ++++++-------------------------
>  net/ipv4/tcp_memcontrol.c |    8 ++++----
>  2 files changed, 10 insertions(+), 29 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ae2f0a8..f2221ce 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4665,7 +4665,9 @@ static struct cftype mem_cgroup_files[] = {
>  		.mode = S_IRUGO,
>  	},
>  #endif
> +	{ },	/* terminate */
>  };
> +CGROUP_SUBSYS_CFTYPES(mem_cgroup_subsys, mem_cgroup_files);
>  
>  #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
>  static struct cftype memsw_cgroup_files[] = {
> @@ -4694,20 +4696,10 @@ static struct cftype memsw_cgroup_files[] = {
>  		.trigger = mem_cgroup_reset,
>  		.read_u64 = mem_cgroup_read,
>  	},
> +	{ }	/* terminate */
>  };
> -
> -static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss)
> -{
> -	if (!do_swap_account)
> -		return 0;
> -	return cgroup_add_files(cont, ss, memsw_cgroup_files,
> -				ARRAY_SIZE(memsw_cgroup_files));
> -};
> -#else
> -static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss)
> -{
> -	return 0;
> -}
> +CGROUP_SUBSYS_CFTYPES_COND(mem_cgroup_subsys, memsw_cgroup_files,
> +			   really_do_swap_account);


I'm sorry but why you use really_do_swap_account instead of do_swap_account ?

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface
  2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
                   ` (9 preceding siblings ...)
  2012-03-16 23:36 ` [PATCH 10/10] cgroup: implement cgroup_rm_cftypes() Tejun Heo
@ 2012-03-19 10:22 ` Glauber Costa
  2012-03-19 16:05   ` Tejun Heo
  10 siblings, 1 reply; 33+ messages in thread
From: Glauber Costa @ 2012-03-19 10:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/17/2012 03:35 AM, Tejun Heo wrote:
> Hello, guys.
>
> This patch replaces cgroup file interface with cftype based one which
> allows dynamic additions and removals of cftype arrays whether the
> target subsystem is attached or not.

Great

> This can be used to make subsys
> rebinding via remount work properly but I intentionally avoided doing
> that at the moment.
>
> This makes cgroup population simpler for controllers and will be used
> to allow controllers to be more dynamic.  e.g. blkio subsys has
> sub-policies which may come and go while blkio subsys is attached and
> it currently uses fixed set of files which stays blank if not in use.
> This will also be useful for implementing unified hierarchy.
>
> This patchset contains the following patches.
>
>   0001-cgroup-move-cgroup_clear_directory-call-out-of-cgrou.patch
>   0002-cgroup-build-list-of-all-cgroups-under-a-given-cgrou.patch
>   0003-cgroup-implement-cgroup_add_cftypes-and-friends.patch
>   0004-cgroup-merge-cft_release_agent-cftype-array-into-the.patch
>   0005-cgroup-convert-all-non-memcg-controllers-to-the-new-.patch
>   0006-cgroup-convert-memcg-controller-to-the-new-cftype-in.patch
>   0007-cgroup-remove-cgroup_add_file-s.patch
>   0008-cgroup-relocate-__d_cgrp-and-__d_cft.patch
>   0009-cgroup-introduce-struct-cfent.patch
>   0010-cgroup-implement-cgroup_rm_cftypes.patch
>
> and is on top of
>
>    cgroup/for-3.4 3ce3230a0cff484e5130153f244d4fb8a56b3a8b
> + [1] cgroup: deprecate remount option changes mount option
>
> and is also available in the following git branch.
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype
>
> Glauber, can you please try to move net kmem stuff out of
> ->populate().  If ->create() doesn't work for whatever reason, can you
> please explain it to me so that we can find a proper solution?

The main reason is twofold:

It first had to do with the order in which registerings take place at 
the kernel. But this matter most for the root cgroup. For the children, 
it should be all initialized anyway. So we can special case whatever is 
needed.

Another point was not to bloat the socket structures with more function 
calls, for populate and create. But we can possibly be able to store 
some data ourselves, and figure it out.

How should I do it? Do you want me to provide a patch ontop of your tree ?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()
  2012-03-16 23:35 ` [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
@ 2012-03-19 10:25   ` Glauber Costa
  0 siblings, 0 replies; 33+ messages in thread
From: Glauber Costa @ 2012-03-19 10:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/17/2012 03:35 AM, Tejun Heo wrote:
> cgroup_populate_dir() currently clears all files and then repopulate
> the directory; however, the clearing part is only useful when it's
> called from cgroup_remount().  Relocate the invocation to
> cgroup_remount().
>
> This is to prepare for further cgroup file handling updates.
>
> Signed-off-by: Tejun Heo<tj@kernel.org>

Acked-by: Glauber Costa <glommer@parallels.com>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-19  4:26   ` KAMEZAWA Hiroyuki
@ 2012-03-19 10:43     ` Glauber Costa
  2012-03-19 16:15       ` Tejun Heo
  2012-03-19 16:10     ` Tejun Heo
  1 sibling, 1 reply; 33+ messages in thread
From: Glauber Costa @ 2012-03-19 10:43 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Tejun Heo, lizf, containers, cgroups, linux-kernel, fweisbec,
	rni, ctalbott, Johannes Weiner, Michal Hocko, Balbir Singh,
	Hugh Dickins, Greg Thelen

On 03/19/2012 08:26 AM, KAMEZAWA Hiroyuki wrote:
>> ->populate() is preserved for register_kmem_files() invocation but I
>> >  strongly urge moving the initialization to the tail of ->create().  If
>> >  there's something which can't be done from ->create(), which isn't
>> >  clear from the comment, please let me know.
>> >  
> Okay, then, cgroup files are created in kernel/cgroup.c and memcg just
> registers entries.
> 
> 
> 
I am still in the middle of the review. Is it possible to dynamically
register entries? (right now, I mean)

If yes - which seems to be a bit of the point of the exercise, so it
should be totally okay from my PoV. I have that call in populate because
which files will be created depends on which protocols you have registered.

The trick here, is that doesn't need to happen at all cgroup creations.
But it can't happen at root's either, because this is quite fragile:
some protocols may only be registered after root memcg is created.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface
  2012-03-19 10:22 ` [PATCHSET] cgroup: cftype based file interface Glauber Costa
@ 2012-03-19 16:05   ` Tejun Heo
  2012-03-19 16:12     ` Glauber Costa
  0 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-19 16:05 UTC (permalink / raw)
  To: Glauber Costa
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

Hello, Glauber.

On Mon, Mar 19, 2012 at 02:22:44PM +0400, Glauber Costa wrote:
> >Glauber, can you please try to move net kmem stuff out of
> >->populate().  If ->create() doesn't work for whatever reason, can you
> >please explain it to me so that we can find a proper solution?
> 
> The main reason is twofold:
> 
> It first had to do with the order in which registerings take place
> at the kernel. But this matter most for the root cgroup. For the
> children, it should be all initialized anyway. So we can special
> case whatever is needed.

I see.

> Another point was not to bloat the socket structures with more
> function calls, for populate and create. But we can possibly be able
> to store some data ourselves, and figure it out.
>
> How should I do it? Do you want me to provide a patch ontop of your tree ?

Yeah, that would be great.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-19  4:26   ` KAMEZAWA Hiroyuki
  2012-03-19 10:43     ` Glauber Costa
@ 2012-03-19 16:10     ` Tejun Heo
  2012-03-21  4:42       ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-19 16:10 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott, Johannes Weiner, Michal Hocko, Balbir Singh,
	Hugh Dickins, Greg Thelen

Hello,

On Mon, Mar 19, 2012 at 01:26:02PM +0900, KAMEZAWA Hiroyuki wrote:
> > +CGROUP_SUBSYS_CFTYPES_COND(mem_cgroup_subsys, memsw_cgroup_files,
> > +			   really_do_swap_account);
> 
> 
> I'm sorry but why you use really_do_swap_account instead of do_swap_account ?

Because do_swap_account is initialized too late.
CGROUP_SUBSYS_CFTYPES() are processed via fs_initcall().  AFAICS,
do_swap_account may not have been initialized by then.  Also, if memcg
as whole is disabled, it doesn't matter whether those files are
registered or not, right?  We probably need some comment there tho.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface
  2012-03-19 16:05   ` Tejun Heo
@ 2012-03-19 16:12     ` Glauber Costa
  2012-03-19 16:15       ` Tejun Heo
  0 siblings, 1 reply; 33+ messages in thread
From: Glauber Costa @ 2012-03-19 16:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/19/2012 08:05 PM, Tejun Heo wrote:
> Hello, Glauber.
>
> On Mon, Mar 19, 2012 at 02:22:44PM +0400, Glauber
Hello, Tejun! =)


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-19 10:43     ` Glauber Costa
@ 2012-03-19 16:15       ` Tejun Heo
  0 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-19 16:15 UTC (permalink / raw)
  To: Glauber Costa
  Cc: KAMEZAWA Hiroyuki, lizf, containers, cgroups, linux-kernel,
	fweisbec, rni, ctalbott, Johannes Weiner, Michal Hocko,
	Balbir Singh, Hugh Dickins, Greg Thelen

Hey,

On Mon, Mar 19, 2012 at 02:43:14PM +0400, Glauber Costa wrote:
> I am still in the middle of the review. Is it possible to dynamically
> register entries? (right now, I mean)

Before the patchset, no.  After the patchset, yes.

> If yes - which seems to be a bit of the point of the exercise, so it
> should be totally okay from my PoV. I have that call in populate because
> which files will be created depends on which protocols you have registered.
> 
> The trick here, is that doesn't need to happen at all cgroup creations.
> But it can't happen at root's either, because this is quite fragile:
> some protocols may only be registered after root memcg is created.

Yeah, it can happen anytime but it shouldn't be called from ->create()
or other callbacks which are invoked under cgroup_mutex.  We can
provide an alternative interface or other locking trickery to avoid
this, but I don't think I'll take that path unless there are very
strong rationale.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface
  2012-03-19 16:12     ` Glauber Costa
@ 2012-03-19 16:15       ` Tejun Heo
  0 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-19 16:15 UTC (permalink / raw)
  To: Glauber Costa
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On Mon, Mar 19, 2012 at 08:12:49PM +0400, Glauber Costa wrote:
> On 03/19/2012 08:05 PM, Tejun Heo wrote:
> >Hello, Glauber.
> >
> >On Mon, Mar 19, 2012 at 02:22:44PM +0400, Glauber
> Hello, Tejun! =)

Heh, =)

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 07/10] cgroup: remove cgroup_add_file[s]()
  2012-03-16 23:36 ` [PATCH 07/10] cgroup: remove cgroup_add_file[s]() Tejun Heo
@ 2012-03-20  8:47   ` Aneesh Kumar K.V
  2012-03-20 16:01     ` Tejun Heo
  0 siblings, 1 reply; 33+ messages in thread
From: Aneesh Kumar K.V @ 2012-03-20  8:47 UTC (permalink / raw)
  To: Tejun Heo, glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Tejun Heo <tj@kernel.org> writes:

> No controller is using cgroup_add_files[s]().  Unexport them, and
> convert cgroup_add_files() to handle NULL entry terminated array
> instead of taking count explicitly and continue creation on failure
> for internal use.
>

I am planning to use that in the HugeTLB extension of memcg.

http://article.gmane.org/gmane.linux.kernel.mm/75470

We need to have different limit and usage file for different HugeTLB
page size supported.

-aneesh


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends
  2012-03-16 23:35 ` [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
@ 2012-03-20  8:52   ` Aneesh Kumar K.V
  2012-03-20 16:03     ` Tejun Heo
  0 siblings, 1 reply; 33+ messages in thread
From: Aneesh Kumar K.V @ 2012-03-20  8:52 UTC (permalink / raw)
  To: Tejun Heo, glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Tejun Heo <tj@kernel.org> writes:

> Currently, cgroup directories are populated by subsys->populate()
> callback explicitly creating files on each cgroup creation.  This
> level of flexibility isn't needed or desirable.  It provides largely
> unused flexibility which call for abuses while severely limiting what
> the core layer can do through the lack of structure and conventions.
>
> Per each cgroup file type, the only distinction that cgroup users is
> making is whether a cgroup is root or not, which can easily be
> expressed with flags.
>
> This patch introduces cgroup_add_cftypes() and its wrapper macros -
> CGROUP_SUBSYS_CFTYPES[_COND]().  These deal with cftypes instead of
> individual files - controllers indicate that certain types of files
> exist for certain subsystem.  Newly added CFTYPE_*_ON_ROOT flags
> indicate whether a cftype should be excluded or created only on the
> root cgroup.
>
> cgroup_add_cftypes() can be called any time whether the target
> subsystem is currently attached or not.  cgroup core will create files
> on the existing cgroups as necessary.  CGROUP_SUBSYS_CFTYPES[_COND]()
> are convenience macros controllers so that cftypes can be declared to
> belong to certain cgroup.  The COND variant is useful for cases where
> certain files are dependent on boot time parameter.
>
> Further patches will convert the existing users and remove the file
> based interface.  Note that this interface allows dynamic addition of
> files to an active controller.  This will be used for sub-controller
> modularity and unified hierarchy in the longer term.
>
> This patch implements the new mechanism but doesn't apply it to any
> user.
>

.....

> +/*
> + * Declare cftype array @cfts for cgroup subsys @ss if @cond is %true.
> + * Useful if the files are dependent on boot time parameter.
> + */
> +#define CGROUP_SUBSYS_CFTYPES_COND(ss, cfts, cond)			\
> +	static int __init __cgroup_cfts_init_##ss_##cfts(void)		\
> +	{								\
> +		if ((cond))						\
> +			WARN_ON(cgroup_add_cftypes(&ss, cfts));		\
> +		return 0;						\
> +	}								\
> +	fs_initcall(__cgroup_cfts_init_##ss_##cfts);
> +

Instead of using CGROUP_SUBSYS_CFTYPES_COND, are you ok if other
subsystem called cgroup_add_cftypes ?. I am looking at how this will
impact http://article.gmane.org/gmane.linux.kernel.mm/75470 . With
HugeTLB we cannot do the above. 

-aneesh


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-16 23:36 ` [PATCH 09/10] cgroup: introduce struct cfent Tejun Heo
@ 2012-03-20 14:05   ` Glauber Costa
  2012-03-20 16:02     ` Tejun Heo
  2012-03-20 18:06   ` [PATCH UPDATED " Tejun Heo
  1 sibling, 1 reply; 33+ messages in thread
From: Glauber Costa @ 2012-03-20 14:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/17/2012 03:36 AM, Tejun Heo wrote:
> +static void cgroup_clear_directory(struct dentry *dir)
> +{
> +	struct cgroup *cgrp = __d_cgrp(dir);
> +
> +	while (!list_empty(&cgrp->files))
> +		cgroup_rm_file(cgrp, NULL);
> +	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
>   }


I am hitting this warning here under pretty much normal conditions.
You forgot to account for trivial directories like "."

I'll send a patch ontop of what you have.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 07/10] cgroup: remove cgroup_add_file[s]()
  2012-03-20  8:47   ` Aneesh Kumar K.V
@ 2012-03-20 16:01     ` Tejun Heo
  0 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-20 16:01 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

On Tue, Mar 20, 2012 at 02:17:51PM +0530, Aneesh Kumar K.V wrote:
> Tejun Heo <tj@kernel.org> writes:
> 
> > No controller is using cgroup_add_files[s]().  Unexport them, and
> > convert cgroup_add_files() to handle NULL entry terminated array
> > instead of taking count explicitly and continue creation on failure
> > for internal use.
> >
> 
> I am planning to use that in the HugeTLB extension of memcg.
> 
> http://article.gmane.org/gmane.linux.kernel.mm/75470
> 
> We need to have different limit and usage file for different HugeTLB
> page size supported.

Unless you have to alter available files cgroup-by-cgroup, you should
be fine.  If you're trying to create different set of files depending
on cgroup, it's confusing - change your interface.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-20 14:05   ` Glauber Costa
@ 2012-03-20 16:02     ` Tejun Heo
  2012-03-20 16:03       ` Glauber Costa
  0 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-20 16:02 UTC (permalink / raw)
  To: Glauber Costa
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On Tue, Mar 20, 2012 at 06:05:29PM +0400, Glauber Costa wrote:
> On 03/17/2012 03:36 AM, Tejun Heo wrote:
> >+static void cgroup_clear_directory(struct dentry *dir)
> >+{
> >+	struct cgroup *cgrp = __d_cgrp(dir);
> >+
> >+	while (!list_empty(&cgrp->files))
> >+		cgroup_rm_file(cgrp, NULL);
> >+	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
> >  }
> 
> 
> I am hitting this warning here under pretty much normal conditions.
> You forgot to account for trivial directories like "."
> 
> I'll send a patch ontop of what you have.

Heh, yeah, I hit that yesterday too and was gonna look into that
today.  Is the patch coming? :)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-20 16:02     ` Tejun Heo
@ 2012-03-20 16:03       ` Glauber Costa
  2012-03-20 16:11         ` Glauber Costa
  2012-03-20 16:49         ` Tejun Heo
  0 siblings, 2 replies; 33+ messages in thread
From: Glauber Costa @ 2012-03-20 16:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

[-- Attachment #1: Type: text/plain, Size: 967 bytes --]

On 03/20/2012 08:02 PM, Tejun Heo wrote:
> On Tue, Mar 20, 2012 at 06:05:29PM +0400, Glauber Costa wrote:
>> On 03/17/2012 03:36 AM, Tejun Heo wrote:
>>> +static void cgroup_clear_directory(struct dentry *dir)
>>> +{
>>> +	struct cgroup *cgrp = __d_cgrp(dir);
>>> +
>>> +	while (!list_empty(&cgrp->files))
>>> +		cgroup_rm_file(cgrp, NULL);
>>> +	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
>>>   }
>>
>>
>> I am hitting this warning here under pretty much normal conditions.
>> You forgot to account for trivial directories like "."
>>
>> I'll send a patch ontop of what you have.
>
> Heh, yeah, I hit that yesterday too and was gonna look into that
> today.  Is the patch coming? :)
>
> Thanks.

I was about to send it together with my proposal for sock memcg, so I'm 
compile testing that under multiple config options just to be sure we're 
fine.

That's more or less what I have, see if you agree. I tried to keep the 
warning itself, because it is valuable...


[-- Attachment #2: 0001-don-t-trigger-warning-when-d_subdirs-is-not-empty.patch --]
[-- Type: text/x-patch, Size: 1158 bytes --]

>From 04604201411ab0a14fa1447072cd1b273d0774ed Mon Sep 17 00:00:00 2001
From: Glauber Costa <glommer@parallels.com>
Date: Tue, 20 Mar 2012 18:12:55 +0400
Subject: [PATCH] don't trigger warning when d_subdirs is not empty.

It is never empty at this point, because of the self references.
a better test is to see if any of them gets d_inode set.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index cf7b298..b45a653 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -947,10 +947,16 @@ static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 static void cgroup_clear_directory(struct dentry *dir)
 {
 	struct cgroup *cgrp = __d_cgrp(dir);
+	struct list_head *child;
 
 	while (!list_empty(&cgrp->files))
 		cgroup_rm_file(cgrp, NULL);
-	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
+
+	list_for_each(child, &dir->d_subdirs) {
+		struct dentry *d;
+		d = list_entry(child, struct dentry, d_u.d_child);
+		WARN_ON_ONCE(d->d_inode);
+	}
 }
 
 /*
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends
  2012-03-20  8:52   ` Aneesh Kumar K.V
@ 2012-03-20 16:03     ` Tejun Heo
  0 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-20 16:03 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

Hello, Aneesh.

On Tue, Mar 20, 2012 at 02:22:02PM +0530, Aneesh Kumar K.V wrote:
> Instead of using CGROUP_SUBSYS_CFTYPES_COND, are you ok if other
> subsystem called cgroup_add_cftypes ?. I am looking at how this will
> impact http://article.gmane.org/gmane.linux.kernel.mm/75470 . With
> HugeTLB we cannot do the above. 

Yeah, sure.  Allowing dyanmic additions and removals of cftypes is the
whole point of the patchset.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-20 16:03       ` Glauber Costa
@ 2012-03-20 16:11         ` Glauber Costa
  2012-03-20 16:49         ` Tejun Heo
  1 sibling, 0 replies; 33+ messages in thread
From: Glauber Costa @ 2012-03-20 16:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/20/2012 08:03 PM, Glauber Costa wrote:
> Signed-off-by: Glauber Costa<glommer@parallels.com>
> Signed-off-by: Tejun Heo<tj@kernel.org>

Just one tiny thing to fix, I meant CC, here, of course.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-20 16:03       ` Glauber Costa
  2012-03-20 16:11         ` Glauber Costa
@ 2012-03-20 16:49         ` Tejun Heo
  2012-03-20 16:51           ` Glauber Costa
  1 sibling, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2012-03-20 16:49 UTC (permalink / raw)
  To: Glauber Costa
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On Tue, Mar 20, 2012 at 08:03:03PM +0400, Glauber Costa wrote:
> From 04604201411ab0a14fa1447072cd1b273d0774ed Mon Sep 17 00:00:00 2001
> From: Glauber Costa <glommer@parallels.com>
> Date: Tue, 20 Mar 2012 18:12:55 +0400
> Subject: [PATCH] don't trigger warning when d_subdirs is not empty.
> 
> It is never empty at this point, because of the self references.
> a better test is to see if any of them gets d_inode set.
> 
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  kernel/cgroup.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index cf7b298..b45a653 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -947,10 +947,16 @@ static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
>  static void cgroup_clear_directory(struct dentry *dir)
>  {
>  	struct cgroup *cgrp = __d_cgrp(dir);
> +	struct list_head *child;
>  
>  	while (!list_empty(&cgrp->files))
>  		cgroup_rm_file(cgrp, NULL);
> -	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
> +
> +	list_for_each(child, &dir->d_subdirs) {
> +		struct dentry *d;
> +		d = list_entry(child, struct dentry, d_u.d_child);
> +		WARN_ON_ONCE(d->d_inode);
> +	}

This seems correct and I'll apply this but I still can't reproduce the
warning at will before the patch.  How do I a get negative dentry on
the d_subdirs when d_delete always returns 1?  Glauber, can you
trigger the warning at will?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/10] cgroup: introduce struct cfent
  2012-03-20 16:49         ` Tejun Heo
@ 2012-03-20 16:51           ` Glauber Costa
  0 siblings, 0 replies; 33+ messages in thread
From: Glauber Costa @ 2012-03-20 16:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/20/2012 08:49 PM, Tejun Heo wrote:
> On Tue, Mar 20, 2012 at 08:03:03PM +0400, Glauber Costa wrote:
>>  From 04604201411ab0a14fa1447072cd1b273d0774ed Mon Sep 17 00:00:00 2001
>> From: Glauber Costa<glommer@parallels.com>
>> Date: Tue, 20 Mar 2012 18:12:55 +0400
>> Subject: [PATCH] don't trigger warning when d_subdirs is not empty.
>>
>> It is never empty at this point, because of the self references.
>> a better test is to see if any of them gets d_inode set.
>>
>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>> Signed-off-by: Tejun Heo<tj@kernel.org>
>> ---
>>   kernel/cgroup.c |    8 +++++++-
>>   1 files changed, 7 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>> index cf7b298..b45a653 100644
>> --- a/kernel/cgroup.c
>> +++ b/kernel/cgroup.c
>> @@ -947,10 +947,16 @@ static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
>>   static void cgroup_clear_directory(struct dentry *dir)
>>   {
>>   	struct cgroup *cgrp = __d_cgrp(dir);
>> +	struct list_head *child;
>>
>>   	while (!list_empty(&cgrp->files))
>>   		cgroup_rm_file(cgrp, NULL);
>> -	WARN_ON_ONCE(!list_empty(&dir->d_subdirs));
>> +
>> +	list_for_each(child,&dir->d_subdirs) {
>> +		struct dentry *d;
>> +		d = list_entry(child, struct dentry, d_u.d_child);
>> +		WARN_ON_ONCE(d->d_inode);
>> +	}
>
> This seems correct and I'll apply this but I still can't reproduce the
> warning at will before the patch.  How do I a get negative dentry on
> the d_subdirs when d_delete always returns 1?  Glauber, can you
> trigger the warning at will?
>
> Thanks.
Yes, everytime I reboot the machine.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH UPDATED 09/10] cgroup: introduce struct cfent
  2012-03-16 23:36 ` [PATCH 09/10] cgroup: introduce struct cfent Tejun Heo
  2012-03-20 14:05   ` Glauber Costa
@ 2012-03-20 18:06   ` Tejun Heo
  1 sibling, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-20 18:06 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups; +Cc: linux-kernel, fweisbec, rni, ctalbott

This patch adds cfent (cgroup file entry) which is the association
between a cgroup and a file.  This is in-cgroup representation of
files under a cgroup directory.  This simplifies walking walking
cgroup files and thus cgroup_clear_directory(), which is now
implemented in two parts - cgroup_rm_file() and a loop around it.

cgroup_rm_file() will be used to implement cftype removal and cfent is
scheduled to serve cgroup specific per-file data (e.g. for sysfs-like
"sever" semantics).

v2: - cfe was freed from cgroup_rm_file() which led to use-after-free
      if the file had openers at the time of removal.  Moved to
      cgroup_diput().

    - cgroup_clear_directory() triggered WARN_ON_ONCE() if d_subdirs
      wasn't empty after removing all files.  This triggered
      spuriously if some files were open during directory clearing.
      Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Glauber Costa <glommer@parallels.com>
---
Glauber, we may have positive child dentries for open files so your
fix isn't enough.  I just removed the WARN_ON_ONCE().

Thanks.

 include/linux/cgroup.h |    1 
 kernel/cgroup.c        |  107 +++++++++++++++++++++++++++++++++----------------
 2 files changed, 74 insertions(+), 34 deletions(-)

Index: work/include/linux/cgroup.h
===================================================================
--- work.orig/include/linux/cgroup.h
+++ work/include/linux/cgroup.h
@@ -175,6 +175,7 @@ struct cgroup {
 	 */
 	struct list_head sibling;	/* my parent's children */
 	struct list_head children;	/* my children */
+	struct list_head files;		/* my files */
 
 	struct cgroup *parent;		/* my parent */
 	struct dentry __rcu *dentry;	/* cgroup fs entry, RCU protected */
Index: work/kernel/cgroup.c
===================================================================
--- work.orig/kernel/cgroup.c
+++ work/kernel/cgroup.c
@@ -158,6 +158,15 @@ struct cftype_set {
 };
 
 /*
+ * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.
+ */
+struct cfent {
+	struct list_head		node;
+	struct dentry			*dentry;
+	struct cftype			*type;
+};
+
+/*
  * CSS ID -- ID per subsys's Cgroup Subsys State(CSS). used only when
  * cgroup_subsys->use_id != 0.
  */
@@ -297,11 +306,16 @@ static inline struct cgroup *__d_cgrp(st
 	return dentry->d_fsdata;
 }
 
-static inline struct cftype *__d_cft(struct dentry *dentry)
+static inline struct cfent *__d_cfe(struct dentry *dentry)
 {
 	return dentry->d_fsdata;
 }
 
+static inline struct cftype *__d_cft(struct dentry *dentry)
+{
+	return __d_cfe(dentry)->type;
+}
+
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -887,6 +901,12 @@ static void cgroup_diput(struct dentry *
 		BUG_ON(!list_empty(&cgrp->pidlists));
 
 		kfree_rcu(cgrp, rcu_head);
+	} else {
+		struct cfent *cfe = __d_cfe(dentry);
+
+		WARN_ONCE(!list_empty(&cfe->node),
+			  "cfe still linked for %s\n", cfe->type->name);
+		kfree(cfe);
 	}
 	iput(inode);
 }
@@ -905,34 +925,36 @@ static void remove_dir(struct dentry *d)
 	dput(parent);
 }
 
-static void cgroup_clear_directory(struct dentry *dentry)
+static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 {
-	struct list_head *node;
+	struct cfent *cfe;
+
+	lockdep_assert_held(&cgrp->dentry->d_inode->i_mutex);
+	lockdep_assert_held(&cgroup_mutex);
+
+	list_for_each_entry(cfe, &cgrp->files, node) {
+		struct dentry *d = cfe->dentry;
 
-	BUG_ON(!mutex_is_locked(&dentry->d_inode->i_mutex));
-	spin_lock(&dentry->d_lock);
-	node = dentry->d_subdirs.next;
-	while (node != &dentry->d_subdirs) {
-		struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
-
-		spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
-		list_del_init(node);
-		if (d->d_inode) {
-			/* This should never be called on a cgroup
-			 * directory with child cgroups */
-			BUG_ON(d->d_inode->i_mode & S_IFDIR);
-			dget_dlock(d);
-			spin_unlock(&d->d_lock);
-			spin_unlock(&dentry->d_lock);
-			d_delete(d);
-			simple_unlink(dentry->d_inode, d);
-			dput(d);
-			spin_lock(&dentry->d_lock);
-		} else
-			spin_unlock(&d->d_lock);
-		node = dentry->d_subdirs.next;
+		if (cft && cfe->type != cft)
+			continue;
+
+		dget(d);
+		d_delete(d);
+		simple_unlink(d->d_inode, d);
+		list_del_init(&cfe->node);
+		dput(d);
+
+		return 0;
 	}
-	spin_unlock(&dentry->d_lock);
+	return -ENOENT;
+}
+
+static void cgroup_clear_directory(struct dentry *dir)
+{
+	struct cgroup *cgrp = __d_cgrp(dir);
+
+	while (!list_empty(&cgrp->files))
+		cgroup_rm_file(cgrp, NULL);
 }
 
 /*
@@ -1362,6 +1384,7 @@ static void init_cgroup_housekeeping(str
 {
 	INIT_LIST_HEAD(&cgrp->sibling);
 	INIT_LIST_HEAD(&cgrp->children);
+	INIT_LIST_HEAD(&cgrp->files);
 	INIT_LIST_HEAD(&cgrp->css_sets);
 	INIT_LIST_HEAD(&cgrp->release_list);
 	INIT_LIST_HEAD(&cgrp->pidlists);
@@ -2633,7 +2656,9 @@ static int cgroup_add_file(struct cgroup
 			   const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
+	struct cgroup *parent = __d_cgrp(dir);
 	struct dentry *dentry;
+	struct cfent *cfe;
 	int error;
 	umode_t mode;
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
@@ -2649,17 +2674,31 @@ static int cgroup_add_file(struct cgroup
 		strcat(name, ".");
 	}
 	strcat(name, cft->name);
+
 	BUG_ON(!mutex_is_locked(&dir->d_inode->i_mutex));
+
+	cfe = kzalloc(sizeof(*cfe), GFP_KERNEL);
+	if (!cfe)
+		return -ENOMEM;
+
 	dentry = lookup_one_len(name, dir, strlen(name));
-	if (!IS_ERR(dentry)) {
-		mode = cgroup_file_mode(cft);
-		error = cgroup_create_file(dentry, mode | S_IFREG,
-						cgrp->root->sb);
-		if (!error)
-			dentry->d_fsdata = (void *)cft;
-		dput(dentry);
-	} else
+	if (IS_ERR(dentry)) {
 		error = PTR_ERR(dentry);
+		goto out;
+	}
+
+	mode = cgroup_file_mode(cft);
+	error = cgroup_create_file(dentry, mode | S_IFREG, cgrp->root->sb);
+	if (!error) {
+		cfe->type = (void *)cft;
+		cfe->dentry = dentry;
+		dentry->d_fsdata = cfe;
+		list_add_tail(&cfe->node, &parent->files);
+		cfe = NULL;
+	}
+	dput(dentry);
+out:
+	kfree(cfe);
 	return error;
 }
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-19 16:10     ` Tejun Heo
@ 2012-03-21  4:42       ` KAMEZAWA Hiroyuki
  2012-03-21  5:08         ` Tejun Heo
  0 siblings, 1 reply; 33+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-21  4:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott, Johannes Weiner, Michal Hocko, Balbir Singh,
	Hugh Dickins, Greg Thelen

(2012/03/20 1:10), Tejun Heo wrote:

> Hello,
> 
> On Mon, Mar 19, 2012 at 01:26:02PM +0900, KAMEZAWA Hiroyuki wrote:
>>> +CGROUP_SUBSYS_CFTYPES_COND(mem_cgroup_subsys, memsw_cgroup_files,
>>> +			   really_do_swap_account);
>>
>>
>> I'm sorry but why you use really_do_swap_account instead of do_swap_account ?
> 
> Because do_swap_account is initialized too late.
> CGROUP_SUBSYS_CFTYPES() are processed via fs_initcall().  AFAICS,
> do_swap_account may not have been initialized by then.  Also, if memcg
> as whole is disabled, it doesn't matter whether those files are
> registered or not, right?  We probably need some comment there tho.
> 


I think creating file based on really_do_swap_account is wrong.
(I'm sorry for this complication.)

How about creating this file unconditionally ? Returning -ENOTSUP in memcg's
handler will work well...anyway it's configured, no reason to show them is bad.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/10] cgroup: convert memcg controller to the new cftype interface
  2012-03-21  4:42       ` KAMEZAWA Hiroyuki
@ 2012-03-21  5:08         ` Tejun Heo
  0 siblings, 0 replies; 33+ messages in thread
From: Tejun Heo @ 2012-03-21  5:08 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott, Johannes Weiner, Michal Hocko, Balbir Singh,
	Hugh Dickins, Greg Thelen

Hello,

On Tue, Mar 20, 2012 at 9:42 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> I think creating file based on really_do_swap_account is wrong.
> (I'm sorry for this complication.)
>
> How about creating this file unconditionally ? Returning -ENOTSUP in memcg's
> handler will work well...anyway it's configured, no reason to show them is bad.

That works for me. Then, I can also remove the _COND thing which isn't
pretty anyway. I'll update the patchset.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2012-03-21  5:08 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-16 23:35 [PATCHSET] cgroup: cftype based file interface Tejun Heo
2012-03-16 23:35 ` [PATCH 01/10] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
2012-03-19 10:25   ` Glauber Costa
2012-03-16 23:35 ` [PATCH 02/10] cgroup: build list of all cgroups under a given cgroupfs_root Tejun Heo
2012-03-16 23:35 ` [PATCH 03/10] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
2012-03-20  8:52   ` Aneesh Kumar K.V
2012-03-20 16:03     ` Tejun Heo
2012-03-16 23:35 ` [PATCH 04/10] cgroup: merge cft_release_agent cftype array into the base files array Tejun Heo
2012-03-16 23:35 ` [PATCH 05/10] cgroup: convert all non-memcg controllers to the new cftype interface Tejun Heo
2012-03-16 23:35 ` [PATCH 06/10] cgroup: convert memcg controller " Tejun Heo
2012-03-19  4:26   ` KAMEZAWA Hiroyuki
2012-03-19 10:43     ` Glauber Costa
2012-03-19 16:15       ` Tejun Heo
2012-03-19 16:10     ` Tejun Heo
2012-03-21  4:42       ` KAMEZAWA Hiroyuki
2012-03-21  5:08         ` Tejun Heo
2012-03-16 23:36 ` [PATCH 07/10] cgroup: remove cgroup_add_file[s]() Tejun Heo
2012-03-20  8:47   ` Aneesh Kumar K.V
2012-03-20 16:01     ` Tejun Heo
2012-03-16 23:36 ` [PATCH 08/10] cgroup: relocate __d_cgrp() and __d_cft() Tejun Heo
2012-03-16 23:36 ` [PATCH 09/10] cgroup: introduce struct cfent Tejun Heo
2012-03-20 14:05   ` Glauber Costa
2012-03-20 16:02     ` Tejun Heo
2012-03-20 16:03       ` Glauber Costa
2012-03-20 16:11         ` Glauber Costa
2012-03-20 16:49         ` Tejun Heo
2012-03-20 16:51           ` Glauber Costa
2012-03-20 18:06   ` [PATCH UPDATED " Tejun Heo
2012-03-16 23:36 ` [PATCH 10/10] cgroup: implement cgroup_rm_cftypes() Tejun Heo
2012-03-19 10:22 ` [PATCHSET] cgroup: cftype based file interface Glauber Costa
2012-03-19 16:05   ` Tejun Heo
2012-03-19 16:12     ` Glauber Costa
2012-03-19 16:15       ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).