All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] cgroup: delay the clearing of cgrp->kn->priv
@ 2014-09-04  6:43 ` Li Zefan
  0 siblings, 0 replies; 5+ messages in thread
From: Li Zefan @ 2014-09-04  6:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Toralf Förster, LKML, Cgroups

Run these two scripts concurrently:

    for ((; ;))
    {
        mkdir /cgroup/sub
        rmdir /cgroup/sub
    }

    for ((; ;))
    {
        echo $$ > /cgroup/sub/cgroup.procs
        echo $$ > /cgroup/cgroup.procs
    }

A kernel bug will be triggered:

BUG: unable to handle kernel NULL pointer dereference at 00000038
IP: [<c10bbd69>] cgroup_put+0x9/0x80
...
Call Trace:
 [<c10bbe19>] cgroup_kn_unlock+0x39/0x50
 [<c10bbe91>] cgroup_kn_lock_live+0x61/0x70
 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
 [<c10be5b2>] cgroup_tasks_write+0x12/0x20
 [<c10bb7b0>] cgroup_file_write+0x40/0x130
 [<c11aee71>] kernfs_fop_write+0xd1/0x160
 [<c1148e58>] vfs_write+0x98/0x1e0
 [<c114934d>] SyS_write+0x4d/0xa0
 [<c16f656b>] sysenter_do_call+0x12/0x12

We clear cgrp->kn->priv in the end of cgroup_rmdir(), but another
concurrent thread can access kn->priv after the clearing.

We should move the clearing to css_release_work_fn(). At that time
no one is holding reference to the cgroup and no one can gain a new
reference to access it.

v2:
- remove RCU_INIT_POINTER() into the else block. (Tejun)
- remove the cgroup_parent() check. (Tejun)
- update the comment in css_tryget_online_from_dir().

Cc: <stable@vger.kernel.org> # 3.15+
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Zefan Li <lizefan@huawei.com>
---
 kernel/cgroup.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1c56924..205f793 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4181,6 +4181,15 @@ static void css_release_work_fn(struct work_struct *work)
 		/* cgroup release path */
 		cgroup_idr_remove(&cgrp->root->cgroup_idr, cgrp->id);
 		cgrp->id = -1;
+
+		/*
+		 * There are two control paths which try to determine
+		 * cgroup from dentry without going through kernfs -
+		 * cgroupstats_build() and css_tryget_online_from_dir().
+		 * Those are supported by RCU protecting clearing of
+		 * cgrp->kn->priv backpointer.
+		 */
+		RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
 	}
 
 	mutex_unlock(&cgroup_mutex);
@@ -4601,16 +4610,6 @@ static int cgroup_rmdir(struct kernfs_node *kn)
 
 	cgroup_kn_unlock(kn);
 
-	/*
-	 * There are two control paths which try to determine cgroup from
-	 * dentry without going through kernfs - cgroupstats_build() and
-	 * css_tryget_online_from_dir().  Those are supported by RCU
-	 * protecting clearing of cgrp->kn->priv backpointer, which should
-	 * happen after all files under it have been removed.
-	 */
-	if (!ret)
-		RCU_INIT_POINTER(*(void __rcu __force **)&kn->priv, NULL);
-
 	cgroup_put(cgrp);
 	return ret;
 }
@@ -5175,7 +5174,7 @@ struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry,
 	/*
 	 * This path doesn't originate from kernfs and @kn could already
 	 * have been or be removed at any point.  @kn->priv is RCU
-	 * protected for this access.  See cgroup_rmdir() for details.
+	 * protected for this access.  See css_release_work_fn() for details.
 	 */
 	cgrp = rcu_dereference(kn->priv);
 	if (cgrp)
-- 
1.8.0.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 1/2] cgroup: delay the clearing of cgrp->kn->priv
@ 2014-09-04  6:43 ` Li Zefan
  0 siblings, 0 replies; 5+ messages in thread
From: Li Zefan @ 2014-09-04  6:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Toralf Förster, LKML, Cgroups

Run these two scripts concurrently:

    for ((; ;))
    {
        mkdir /cgroup/sub
        rmdir /cgroup/sub
    }

    for ((; ;))
    {
        echo $$ > /cgroup/sub/cgroup.procs
        echo $$ > /cgroup/cgroup.procs
    }

A kernel bug will be triggered:

BUG: unable to handle kernel NULL pointer dereference at 00000038
IP: [<c10bbd69>] cgroup_put+0x9/0x80
...
Call Trace:
 [<c10bbe19>] cgroup_kn_unlock+0x39/0x50
 [<c10bbe91>] cgroup_kn_lock_live+0x61/0x70
 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
 [<c10be5b2>] cgroup_tasks_write+0x12/0x20
 [<c10bb7b0>] cgroup_file_write+0x40/0x130
 [<c11aee71>] kernfs_fop_write+0xd1/0x160
 [<c1148e58>] vfs_write+0x98/0x1e0
 [<c114934d>] SyS_write+0x4d/0xa0
 [<c16f656b>] sysenter_do_call+0x12/0x12

We clear cgrp->kn->priv in the end of cgroup_rmdir(), but another
concurrent thread can access kn->priv after the clearing.

We should move the clearing to css_release_work_fn(). At that time
no one is holding reference to the cgroup and no one can gain a new
reference to access it.

v2:
- remove RCU_INIT_POINTER() into the else block. (Tejun)
- remove the cgroup_parent() check. (Tejun)
- update the comment in css_tryget_online_from_dir().

Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.15+
Reported-by: Toralf Förster <toralf.foerster-Mmb7MZpHnFY@public.gmane.org>
Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 kernel/cgroup.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1c56924..205f793 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4181,6 +4181,15 @@ static void css_release_work_fn(struct work_struct *work)
 		/* cgroup release path */
 		cgroup_idr_remove(&cgrp->root->cgroup_idr, cgrp->id);
 		cgrp->id = -1;
+
+		/*
+		 * There are two control paths which try to determine
+		 * cgroup from dentry without going through kernfs -
+		 * cgroupstats_build() and css_tryget_online_from_dir().
+		 * Those are supported by RCU protecting clearing of
+		 * cgrp->kn->priv backpointer.
+		 */
+		RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
 	}
 
 	mutex_unlock(&cgroup_mutex);
@@ -4601,16 +4610,6 @@ static int cgroup_rmdir(struct kernfs_node *kn)
 
 	cgroup_kn_unlock(kn);
 
-	/*
-	 * There are two control paths which try to determine cgroup from
-	 * dentry without going through kernfs - cgroupstats_build() and
-	 * css_tryget_online_from_dir().  Those are supported by RCU
-	 * protecting clearing of cgrp->kn->priv backpointer, which should
-	 * happen after all files under it have been removed.
-	 */
-	if (!ret)
-		RCU_INIT_POINTER(*(void __rcu __force **)&kn->priv, NULL);
-
 	cgroup_put(cgrp);
 	return ret;
 }
@@ -5175,7 +5174,7 @@ struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry,
 	/*
 	 * This path doesn't originate from kernfs and @kn could already
 	 * have been or be removed at any point.  @kn->priv is RCU
-	 * protected for this access.  See cgroup_rmdir() for details.
+	 * protected for this access.  See css_release_work_fn() for details.
 	 */
 	cgrp = rcu_dereference(kn->priv);
 	if (cgrp)
-- 
1.8.0.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] cgroup: check cgroup liveliness before unbreaking kernfs
@ 2014-09-04  6:43   ` Li Zefan
  0 siblings, 0 replies; 5+ messages in thread
From: Li Zefan @ 2014-09-04  6:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Toralf Förster, LKML, Cgroups

When cgroup_kn_lock_live() is called through some kernfs operation and
another thread is calling cgroup_rmdir(), we'll trigger the warning in
cgroup_get().

------------[ cut here ]------------
WARNING: CPU: 1 PID: 1228 at kernel/cgroup.c:1034 cgroup_get+0x89/0xa0()
...
Call Trace:
 [<c16ee73d>] dump_stack+0x41/0x52
 [<c10468ef>] warn_slowpath_common+0x7f/0xa0
 [<c104692d>] warn_slowpath_null+0x1d/0x20
 [<c10bb999>] cgroup_get+0x89/0xa0
 [<c10bbe58>] cgroup_kn_lock_live+0x28/0x70
 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
 [<c10be5b2>] cgroup_tasks_write+0x12/0x20
 [<c10bb7b0>] cgroup_file_write+0x40/0x130
 [<c11aee71>] kernfs_fop_write+0xd1/0x160
 [<c1148e58>] vfs_write+0x98/0x1e0
 [<c114934d>] SyS_write+0x4d/0xa0
 [<c16f656b>] sysenter_do_call+0x12/0x12
---[ end trace 6f2e0c38c2108a74 ]---

Fix this by calling css_tryget() instead of cgroup_get().

v2:
- move cgroup_tryget() right below cgroup_get() definition. (Tejun)

Cc: <stable@vger.kernel.org> # 3.15+
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Zefan Li <lizefan@huawei.com>
---
 kernel/cgroup.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 709a6a0..51dd46e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1031,6 +1031,11 @@ static void cgroup_get(struct cgroup *cgrp)
 	css_get(&cgrp->self);
 }
 
+static bool cgroup_tryget(struct cgroup *cgrp)
+{
+	return css_tryget(&cgrp->self);
+}
+
 static void cgroup_put(struct cgroup *cgrp)
 {
 	css_put(&cgrp->self);
@@ -1091,7 +1096,8 @@ static struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn)
 	 * protection against removal.  Ensure @cgrp stays accessible and
 	 * break the active_ref protection.
 	 */
-	cgroup_get(cgrp);
+	if (!cgroup_tryget(cgrp))
+		return NULL;
 	kernfs_break_active_protection(kn);
 
 	mutex_lock(&cgroup_mutex);
-- 
1.8.0.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] cgroup: check cgroup liveliness before unbreaking kernfs
@ 2014-09-04  6:43   ` Li Zefan
  0 siblings, 0 replies; 5+ messages in thread
From: Li Zefan @ 2014-09-04  6:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Toralf Förster, LKML, Cgroups

When cgroup_kn_lock_live() is called through some kernfs operation and
another thread is calling cgroup_rmdir(), we'll trigger the warning in
cgroup_get().

------------[ cut here ]------------
WARNING: CPU: 1 PID: 1228 at kernel/cgroup.c:1034 cgroup_get+0x89/0xa0()
...
Call Trace:
 [<c16ee73d>] dump_stack+0x41/0x52
 [<c10468ef>] warn_slowpath_common+0x7f/0xa0
 [<c104692d>] warn_slowpath_null+0x1d/0x20
 [<c10bb999>] cgroup_get+0x89/0xa0
 [<c10bbe58>] cgroup_kn_lock_live+0x28/0x70
 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
 [<c10be5b2>] cgroup_tasks_write+0x12/0x20
 [<c10bb7b0>] cgroup_file_write+0x40/0x130
 [<c11aee71>] kernfs_fop_write+0xd1/0x160
 [<c1148e58>] vfs_write+0x98/0x1e0
 [<c114934d>] SyS_write+0x4d/0xa0
 [<c16f656b>] sysenter_do_call+0x12/0x12
---[ end trace 6f2e0c38c2108a74 ]---

Fix this by calling css_tryget() instead of cgroup_get().

v2:
- move cgroup_tryget() right below cgroup_get() definition. (Tejun)

Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.15+
Reported-by: Toralf Förster <toralf.foerster-Mmb7MZpHnFY@public.gmane.org>
Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 kernel/cgroup.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 709a6a0..51dd46e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1031,6 +1031,11 @@ static void cgroup_get(struct cgroup *cgrp)
 	css_get(&cgrp->self);
 }
 
+static bool cgroup_tryget(struct cgroup *cgrp)
+{
+	return css_tryget(&cgrp->self);
+}
+
 static void cgroup_put(struct cgroup *cgrp)
 {
 	css_put(&cgrp->self);
@@ -1091,7 +1096,8 @@ static struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn)
 	 * protection against removal.  Ensure @cgrp stays accessible and
 	 * break the active_ref protection.
 	 */
-	cgroup_get(cgrp);
+	if (!cgroup_tryget(cgrp))
+		return NULL;
 	kernfs_break_active_protection(kn);
 
 	mutex_lock(&cgroup_mutex);
-- 
1.8.0.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 2/2] cgroup: check cgroup liveliness before unbreaking kernfs
  2014-09-04  6:43   ` Li Zefan
  (?)
@ 2014-09-04 16:38   ` Tejun Heo
  -1 siblings, 0 replies; 5+ messages in thread
From: Tejun Heo @ 2014-09-04 16:38 UTC (permalink / raw)
  To: Li Zefan; +Cc: Toralf Förster, LKML, Cgroups

On Thu, Sep 04, 2014 at 02:43:38PM +0800, Li Zefan wrote:
> When cgroup_kn_lock_live() is called through some kernfs operation and
> another thread is calling cgroup_rmdir(), we'll trigger the warning in
> cgroup_get().
> 
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 1228 at kernel/cgroup.c:1034 cgroup_get+0x89/0xa0()
> ...
> Call Trace:
>  [<c16ee73d>] dump_stack+0x41/0x52
>  [<c10468ef>] warn_slowpath_common+0x7f/0xa0
>  [<c104692d>] warn_slowpath_null+0x1d/0x20
>  [<c10bb999>] cgroup_get+0x89/0xa0
>  [<c10bbe58>] cgroup_kn_lock_live+0x28/0x70
>  [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
>  [<c10be5b2>] cgroup_tasks_write+0x12/0x20
>  [<c10bb7b0>] cgroup_file_write+0x40/0x130
>  [<c11aee71>] kernfs_fop_write+0xd1/0x160
>  [<c1148e58>] vfs_write+0x98/0x1e0
>  [<c114934d>] SyS_write+0x4d/0xa0
>  [<c16f656b>] sysenter_do_call+0x12/0x12
> ---[ end trace 6f2e0c38c2108a74 ]---
> 
> Fix this by calling css_tryget() instead of cgroup_get().
> 
> v2:
> - move cgroup_tryget() right below cgroup_get() definition. (Tejun)
> 
> Cc: <stable@vger.kernel.org> # 3.15+
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Zefan Li <lizefan@huawei.com>

Applied 1-2 to cgroup/for-3.17-fixes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-09-04 16:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-04  6:43 [PATCH v2 1/2] cgroup: delay the clearing of cgrp->kn->priv Li Zefan
2014-09-04  6:43 ` Li Zefan
2014-09-04  6:43 ` [PATCH v2 2/2] cgroup: check cgroup liveliness before unbreaking kernfs Li Zefan
2014-09-04  6:43   ` Li Zefan
2014-09-04 16:38   ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.