* [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv
@ 2014-09-02 10:56 Li Zefan
2014-09-02 10:57 ` [PATCH 2/2] cgroup: check cgroup liveliness before unbreaking kernfs protection Li Zefan
2014-09-02 15:33 ` [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv Tejun Heo
0 siblings, 2 replies; 5+ messages in thread
From: Li Zefan @ 2014-09-02 10:56 UTC (permalink / raw)
To: Tejun Heo; +Cc: Toralf Förster, LKML, cgroups
Run these two scripts concurrently:
for ((; ;))
{
mkdir /cgroup/sub
rmdir /cgroup/sub
}
for ((; ;))
{
echo $$ > /cgroup/sub/cgroup.procs
ech $$ > /cgce 6f2e0c38c2108a74 ]---
}
A kernel bug will be triggered:
BUG: unable to handle kernel NULL pointer dereference at 00000038
IP: [<c10bbd69>] cgroup_put+0x9/0x80
...
Call Trace:
[<c10bbe19>] cgroup_kn_unlock+0x39/0x50
[<c10bbe91>] cgroup_kn_lock_live+0x61/0x70
[<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
[<c10be5b2>] cgroup_tasks_write+0x12/0x20
[<c10bb7b0>] cgroup_file_write+0x40/0x130
[<c11aee71>] kernfs_fop_write+0xd1/0x160
[<c1148e58>] vfs_write+0x98/0x1e0
[<c114934d>] SyS_write+0x4d/0xa0
[<c16f656b>] sysenter_do_call+0x12/0x12
We clear cgrp->kn->priv at the end of cgroup_rmdir(), but another
concurrent thread can still access kn->priv after it has been cleared.
We should move the clearing to css_release_work_fn(). At that time
no one is holding a reference to the cgroup and no one can gain a new
reference to access it.
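The lifetime rule described above can be sketched in userspace C. This is
only an illustrative model, not kernel code: struct node, struct group and
the group_*() helpers are hypothetical stand-ins for kernfs_node, cgroup and
the cgroup reference helpers, and a plain atomic counter replaces the
kernel's percpu reference. It assumes the back-pointer is cleared only when
the last reference is dropped, which is what the patch aims for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's real types. */
struct node {                 /* plays the role of struct kernfs_node */
	void *priv;           /* back-pointer to the owning group */
};

struct group {                /* plays the role of struct cgroup */
	atomic_int refcnt;
	struct node *kn;
};

static void group_init(struct group *g, struct node *kn)
{
	atomic_init(&g->refcnt, 1);   /* base reference */
	g->kn = kn;
	kn->priv = g;
}

static void group_get(struct group *g)
{
	atomic_fetch_add(&g->refcnt, 1);
}

/*
 * Drop one reference. Only the final put clears the back-pointer, which
 * models moving the clearing into the release path. Returns true if this
 * call released the last reference.
 */
static bool group_put(struct group *g)
{
	if (atomic_fetch_sub(&g->refcnt, 1) != 1)
		return false;
	g->kn->priv = NULL;
	return true;
}

static void example(void)
{
	struct node kn;
	struct group g;

	group_init(&g, &kn);
	group_get(&g);                 /* a concurrent writer pins the group */
	assert(!group_put(&g));        /* rmdir path drops the base reference */
	assert(kn.priv == &g);         /* writer can still resolve kn->priv */
	assert(group_put(&g));         /* writer drops the last reference */
	assert(kn.priv == NULL);       /* only now is the back-pointer gone */
}
```

In this model a thread that still holds a reference never observes a NULL
back-pointer, which is the property the call trace above suggests was
violated when the clearing happened in cgroup_rmdir().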
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
Toralf, thanks for reporting the bug. I'm not able to reply to your email
because I was kicked off the cgroup mailing list and so didn't receive
emails from the list for a week.
---
kernel/cgroup.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1c56924..e03fc62 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4185,6 +4185,15 @@ static void css_release_work_fn(struct work_struct *work)
mutex_unlock(&cgroup_mutex);
+ /*
+ * There are two control paths which try to determine cgroup from
+ * dentry without going through kernfs - cgroupstats_build() and
+ * css_tryget_online_from_dir(). Those are supported by RCU
+ * protecting clearing of cgrp->kn->priv backpointer.
+ */
+ if (!ss && cgroup_parent(cgrp))
+ RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
+
call_rcu(&css->rcu_head, css_free_rcu_fn);
}
@@ -4601,16 +4610,6 @@ static int cgroup_rmdir(struct kernfs_node *kn)
cgroup_kn_unlock(kn);
- /*
- * There are two control paths which try to determine cgroup from
- * dentry without going through kernfs - cgroupstats_build() and
- * css_tryget_online_from_dir(). Those are supported by RCU
- * protecting clearing of cgrp->kn->priv backpointer, which should
- * happen after all files under it have been removed.
- */
- if (!ret)
- RCU_INIT_POINTER(*(void __rcu __force **)&kn->priv, NULL);
-
cgroup_put(cgrp);
return ret;
}
--
1.8.0.2
* [PATCH 2/2] cgroup: check cgroup liveliness before unbreaking kernfs protection
2014-09-02 10:56 [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv Li Zefan
@ 2014-09-02 10:57 ` Li Zefan
2014-09-02 15:51 ` Tejun Heo
2014-09-02 15:33 ` [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv Tejun Heo
1 sibling, 1 reply; 5+ messages in thread
From: Li Zefan @ 2014-09-02 10:57 UTC (permalink / raw)
To: Tejun Heo; +Cc: Toralf Förster, LKML, cgroups
When cgroup_kn_lock_live() is called through some kernfs operation and
another thread is calling cgroup_rmdir(), we may trigger the warning in
cgroup_get().
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1228 at kernel/cgroup.c:1034 cgroup_get+0x89/0xa0()
...
Call Trace:
[<c16ee73d>] dump_stack+0x41/0x52
[<c10468ef>] warn_slowpath_common+0x7f/0xa0
[<c104692d>] warn_slowpath_null+0x1d/0x20
[<c10bb999>] cgroup_get+0x89/0xa0
[<c10bbe58>] cgroup_kn_lock_live+0x28/0x70
[<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
[<c10be5b2>] cgroup_tasks_write+0x12/0x20
[<c10bb7b0>] cgroup_file_write+0x40/0x130
[<c11aee71>] kernfs_fop_write+0xd1/0x160
[<c1148e58>] vfs_write+0x98/0x1e0
[<c114934d>] SyS_write+0x4d/0xa0
[<c16f656b>] sysenter_do_call+0x12/0x12
---[ end trace 6f2e0c38c2108a74 ]---
Fix this by calling css_tryget() instead of cgroup_get().
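The difference between an unconditional get and a tryget can be sketched as
follows. This is a minimal userspace model under stated assumptions, not the
kernel implementation: struct ref and the ref_*() helpers are hypothetical
names, and a plain C11 atomic counter stands in for the percpu reference
that css_tryget() is built on. A count of zero is taken to mean the object
has already been released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter; a count of 0 means "already released". */
struct ref {
	atomic_int count;
};

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Unconditional get: only safe when the caller knows the object is alive. */
static void ref_get(struct ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Conditional get: take a reference only if the count has not hit zero. */
static bool ref_tryget(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns true if this call dropped the last reference. */
static bool ref_put(struct ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}

static void example(void)
{
	struct ref r;

	ref_init(&r);
	ref_get(&r);                /* caller knows the object is alive */
	assert(ref_tryget(&r));     /* succeeds while references remain */
	assert(!ref_put(&r));
	assert(!ref_put(&r));
	assert(ref_put(&r));        /* last reference dropped */
	assert(!ref_tryget(&r));    /* a released object cannot be revived */
}
```

A caller that cannot rule out a concurrent release checks the return value
and backs off on failure, much as the patched cgroup_kn_lock_live() returns
NULL when cgroup_tryget() fails.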
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
kernel/cgroup.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e03fc62..c8d07e5 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1025,6 +1025,11 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
return mode;
}
+static bool cgroup_tryget(struct cgroup *cgrp)
+{
+ return css_tryget(&cgrp->self);
+}
+
static void cgroup_get(struct cgroup *cgrp)
{
WARN_ON_ONCE(cgroup_is_dead(cgrp));
@@ -1091,7 +1096,8 @@ static struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn)
* protection against removal. Ensure @cgrp stays accessible and
* break the active_ref protection.
*/
- cgroup_get(cgrp);
+ if (!cgroup_tryget(cgrp))
+ return NULL;
kernfs_break_active_protection(kn);
mutex_lock(&cgroup_mutex);
--
1.8.0.2
* Re: [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv
2014-09-02 10:56 [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv Li Zefan
2014-09-02 10:57 ` [PATCH 2/2] cgroup: check cgroup liveliness before unbreaking kernfs protection Li Zefan
@ 2014-09-02 15:33 ` Tejun Heo
2014-09-04 3:35 ` Li Zefan
1 sibling, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2014-09-02 15:33 UTC (permalink / raw)
To: Li Zefan; +Cc: Toralf Förster, LKML, cgroups
Hello, Li.
On Tue, Sep 02, 2014 at 06:56:58PM +0800, Li Zefan wrote:
> for ((; ;))
> {
> echo $$ > /cgroup/sub/cgroup.procs
> ech $$ > /cgce 6f2e0c38c2108a74 ]---
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
copy & paste error?
...
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Li Zefan <lizefan@huawei.com>
> ---
>
> Toralf, Thanks for reporting the bug. I'm not able to reply to your email,
> because I was kicked out of the cgroup mailing list so didn't receive
> emails from mailing list for a week.
>
> ---
> kernel/cgroup.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1c56924..e03fc62 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4185,6 +4185,15 @@ static void css_release_work_fn(struct work_struct *work)
>
> mutex_unlock(&cgroup_mutex);
>
> + /*
> + * There are two control paths which try to determine cgroup from
> + * dentry without going through kernfs - cgroupstats_build() and
> + * css_tryget_online_from_dir(). Those are supported by RCU
> + * protecting clearing of cgrp->kn->priv backpointer.
> + */
> + if (!ss && cgroup_parent(cgrp))
> + RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
Can we move the above into the preceding else block? I don't think
holding cgroup_mutex or not makes any difference here. Also, why do
we need the cgroup_parent() check? Do we deref root's kn->priv in the
destruction path? If so, can you please note that in the comment?
Thanks.
--
tejun
* Re: [PATCH 2/2] cgroup: check cgroup liveliness before unbreaking kernfs protection
2014-09-02 10:57 ` [PATCH 2/2] cgroup: check cgroup liveliness before unbreaking kernfs protection Li Zefan
@ 2014-09-02 15:51 ` Tejun Heo
0 siblings, 0 replies; 5+ messages in thread
From: Tejun Heo @ 2014-09-02 15:51 UTC (permalink / raw)
To: Li Zefan; +Cc: Toralf Förster, LKML, cgroups
On Tue, Sep 02, 2014 at 06:57:54PM +0800, Li Zefan wrote:
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index e03fc62..c8d07e5 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1025,6 +1025,11 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
> return mode;
> }
>
> +static bool cgroup_tryget(struct cgroup *cgrp)
> +{
> + return css_tryget(&cgrp->self);
> +}
Can you please move this right below cgroup_get() definition?
Thanks.
--
tejun
* Re: [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv
2014-09-02 15:33 ` [PATCH 1/2] cgroup: Delay the clearing of cgrp->kn->priv Tejun Heo
@ 2014-09-04 3:35 ` Li Zefan
0 siblings, 0 replies; 5+ messages in thread
From: Li Zefan @ 2014-09-04 3:35 UTC (permalink / raw)
To: Tejun Heo; +Cc: Toralf Förster, LKML, cgroups
On 2014/9/2 23:33, Tejun Heo wrote:
> Hello, Li.
>
> On Tue, Sep 02, 2014 at 06:56:58PM +0800, Li Zefan wrote:
>> for ((; ;))
>> {
>> echo $$ > /cgroup/sub/cgroup.procs
>> ech $$ > /cgce 6f2e0c38c2108a74 ]---
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> copy & paste error?
> ...
oops
>> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
>> Signed-off-by: Li Zefan <lizefan@huawei.com>
>> ---
>>
>> Toralf, Thanks for reporting the bug. I'm not able to reply to your email,
>> because I was kicked out of the cgroup mailing list so didn't receive
>> emails from mailing list for a week.
>>
>> ---
>> kernel/cgroup.c | 19 +++++++++----------
>> 1 file changed, 9 insertions(+), 10 deletions(-)
>>
>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>> index 1c56924..e03fc62 100644
>> --- a/kernel/cgroup.c
>> +++ b/kernel/cgroup.c
>> @@ -4185,6 +4185,15 @@ static void css_release_work_fn(struct work_struct *work)
>>
>> mutex_unlock(&cgroup_mutex);
>>
>> + /*
>> + * There are two control paths which try to determine cgroup from
>> + * dentry without going through kernfs - cgroupstats_build() and
>> + * css_tryget_online_from_dir(). Those are supported by RCU
>> + * protecting clearing of cgrp->kn->priv backpointer.
>> + */
>> + if (!ss && cgroup_parent(cgrp))
>> + RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
>
> Can we move the above into the preceding else block? I don't think
> holding cgroup_mutex or not makes any difference here.
> Also, why do
> we need the cgroup_parent() check? Do we deref root's kn->priv in the
> destruction path? If so, can you please note that in the comment?
>
I think the check is not necessary. I was trying to keep the change small
relative to the original code, where RCU_INIT_POINTER() is in cgroup_rmdir(),
which won't be called on the root cgroup.