* [PATCH] cgroup: fix top cgroup refcnt leak
@ 2014-02-14  9:36 ` Li Zefan
  0 siblings, 0 replies; 6+ messages in thread
From: Li Zefan @ 2014-02-14  9:36 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML, Cgroups

If we mount the same cgroupfs at several mount points and then umount
all of them, kill_sb() will be called only once.

Therefore it's wrong to increment top_cgroup's refcnt when we find
an existing cgroup_root.

Try:
	# mount -t cgroup -o cpuacct xxx /cgroup
	# mount -t cgroup -o cpuacct xxx /cgroup2
	# cat /proc/cgroups | grep cpuacct
	cpuacct 2       1       1
	# umount /cgroup
	# umount /cgroup2
	# cat /proc/cgroups | grep cpuacct
	cpuacct 2       1       1

You'll see cgroupfs will never be freed.

Also move this chunk of code upwards.

Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cgroup.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 37d94a2..5bfe738 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1498,6 +1498,22 @@ retry:
 		bool name_match = false;
 
 		/*
+		 * A root's lifetime is governed by its top cgroup.  Zero
+		 * ref indicate that the root is being destroyed.  Wait for
+		 * destruction to complete so that the subsystems are free.
+		 * We can use wait_queue for the wait but this path is
+		 * super cold.  Let's just sleep for a bit and retry.
+		 */
+		if (!atomic_read(&root->top_cgroup.refcnt)) {
+			mutex_unlock(&cgroup_mutex);
+			mutex_unlock(&cgroup_tree_mutex);
+			kfree(opts.release_agent);
+			kfree(opts.name);
+			msleep(10);
+			goto retry;
+		}
+
+		/*
 		 * If we asked for a name then it must match.  Also, if
 		 * name matches but sybsys_mask doesn't, we should fail.
 		 * Remember whether name matched.
@@ -1530,22 +1546,6 @@ retry:
 			}
 		}
 
-		/*
-		 * A root's lifetime is governed by its top cgroup.  Zero
-		 * ref indicate that the root is being destroyed.  Wait for
-		 * destruction to complete so that the subsystems are free.
-		 * We can use wait_queue for the wait but this path is
-		 * super cold.  Let's just sleep for a bit and retry.
-		 */
-		if (!atomic_inc_not_zero(&root->top_cgroup.refcnt)) {
-			mutex_unlock(&cgroup_mutex);
-			mutex_unlock(&cgroup_tree_mutex);
-			kfree(opts.release_agent);
-			kfree(opts.name);
-			msleep(10);
-			goto retry;
-		}
-
 		ret = 0;
 		goto out_unlock;
 	}
-- 
1.8.0.2


* Re: [PATCH] cgroup: fix top cgroup refcnt leak
@ 2014-02-14 11:15   ` Li Zefan
  0 siblings, 0 replies; 6+ messages in thread
From: Li Zefan @ 2014-02-14 11:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Li Zefan, LKML, Cgroups

On 2014-02-14 17:36, Li Zefan wrote:
> If we mount the same cgroupfs at several mount points and then umount
> all of them, kill_sb() will be called only once.
> 
> Therefore it's wrong to increment top_cgroup's refcnt when we find
> an existing cgroup_root.
> 
> Try:
> 	# mount -t cgroup -o cpuacct xxx /cgroup
> 	# mount -t cgroup -o cpuacct xxx /cgroup2
> 	# cat /proc/cgroups | grep cpuacct
> 	cpuacct 2       1       1
> 	# umount /cgroup
> 	# umount /cgroup2
> 	# cat /proc/cgroups | grep cpuacct
> 	cpuacct 2       1       1
> 
> You'll see cgroupfs will never be freed.
> 
> Also move this chunk of code upwards.
> 
> Signed-off-by: Li Zefan <lizefan@huawei.com>
> ---
>  kernel/cgroup.c | 32 ++++++++++++++++----------------
>  1 file changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 37d94a2..5bfe738 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1498,6 +1498,22 @@ retry:
>  		bool name_match = false;
>  
>  		/*
> +		 * A root's lifetime is governed by its top cgroup.  Zero
> +		 * ref indicate that the root is being destroyed.  Wait for
> +		 * destruction to complete so that the subsystems are free.
> +		 * We can use wait_queue for the wait but this path is
> +		 * super cold.  Let's just sleep for a bit and retry.
> +		 */
> +		if (!atomic_read(&root->top_cgroup.refcnt)) {

Oops, this fix is wrong. We call kernfs_mount() without holding the cgroup
locks, and it drops the cgroup refcnt on failure.

I guess we need to bump the refcnt and then drop it after kernfs_mount().


* Re: [PATCH] cgroup: fix top cgroup refcnt leak
  2014-02-14 11:15   ` Li Zefan
@ 2014-02-14 16:02   ` Tejun Heo
  -1 siblings, 0 replies; 6+ messages in thread
From: Tejun Heo @ 2014-02-14 16:02 UTC (permalink / raw)
  To: Li Zefan; +Cc: Li Zefan, LKML, Cgroups

On Fri, Feb 14, 2014 at 07:15:18PM +0800, Li Zefan wrote:
> >  		/*
> > +		 * A root's lifetime is governed by its top cgroup.  Zero
> > +		 * ref indicate that the root is being destroyed.  Wait for
> > +		 * destruction to complete so that the subsystems are free.
> > +		 * We can use wait_queue for the wait but this path is
> > +		 * super cold.  Let's just sleep for a bit and retry.
> > +		 */
> > +		if (!atomic_read(&root->top_cgroup.refcnt)) {
> 
> oops, this fix is wrong. We call kernfs_mount() without cgroup locks and it
> drops cgroup refcnt if failed.
> 
> I guess we need to bump the refcnt and then drop it after kernfs_mount().

Alright, will wait for the updated fix.

Thanks!

-- 
tejun


* [PATCH] cgroup: fix top cgroup refcnt leak
@ 2018-12-28 23:59 Andrei Vagin
  0 siblings, 0 replies; 6+ messages in thread
From: Andrei Vagin @ 2018-12-28 23:59 UTC (permalink / raw)
  To: Alexander Viro, David Howells
  Cc: linux-fsdevel, cgroups, Andrei Vagin, Li Zefan

It looks like the c6b3d5bcd67c ("cgroup: fix top cgroup refcnt leak")
commit was reverted by mistake.

$ mkdir /tmp/cgroup
$ mkdir /tmp/cgroup2
$ mount -t cgroup -o none,name=test test /tmp/cgroup
$ mount -t cgroup -o none,name=test test /tmp/cgroup2
$ umount /tmp/cgroup
$ umount /tmp/cgroup2
$ cat /proc/self/cgroup | grep test
12:name=test:/

You can see the test cgroup was not freed.

Cc: Li Zefan <lizefan@huawei.com>
Fixes: aea3f2676c83 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
---
 kernel/cgroup/cgroup.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index fb0717696895..dbb8805bf66c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2045,8 +2045,11 @@ int cgroup_do_get_tree(struct fs_context *fc)
 	}
 
 	ret = 0;
-	if (ctx->kfc.new_sb_created)
+	if (ctx->kfc.new_sb_created) {
 		goto out_cgrp;
+	} else {
+		cgroup_put(&ctx->root->cgrp);
+	}
 	apply_cgroup_root_flags(ctx->flags);
 	return 0;
 
-- 
2.17.2

