* [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() @ 2017-01-26 9:41 Konstantin Khlebnikov 2017-01-26 9:41 ` [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes Konstantin Khlebnikov ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Konstantin Khlebnikov @ 2017-01-26 9:41 UTC (permalink / raw) To: Tejun Heo, Peter Zijlstra, Ingo Molnar, linux-kernel; +Cc: stable Commit 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") moved sched_online_group() from css_online() to css_alloc(). It exposes half-baked task group into global lists before initializing generic cgroup stuff. LTP testcase (third in cgroup_regression_test) written for testing similar race in kernels 2.6.26-2.6.28 easily triggers this oops: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: kernfs_path_from_node_locked+0x260/0x320 CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4 Call Trace: ? kernfs_path_from_node+0x4f/0x60 kernfs_path_from_node+0x3e/0x60 print_rt_rq+0x44/0x2b0 print_rt_stats+0x7a/0xd0 print_cpu+0x2fc/0xe80 ? __might_sleep+0x4a/0x80 sched_debug_show+0x17/0x30 seq_read+0xf2/0x3b0 proc_reg_read+0x42/0x70 __vfs_read+0x28/0x130 ? security_file_permission+0x9b/0xc0 ? rw_verify_area+0x4e/0xb0 vfs_read+0xa5/0x170 SyS_read+0x46/0xa0 entry_SYSCALL_64_fastpath+0x1e/0xad Here task group already linked into global RCU-protected list task_groups but pointer css->cgroup is still NULL. This patch reverts this chunk and moves online back to css_online(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") Cc: <stable@vger.kernel.org> # 4.6+ --- kernel/sched/core.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index c56fb57f2991..a4020bec7a87 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8398,11 +8398,19 @@ cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) if (IS_ERR(tg)) return ERR_PTR(-ENOMEM); - sched_online_group(tg, parent); - return &tg->css; } +static int cpu_cgroup_css_online(struct cgroup_subsys_state *css) +{ + struct task_group *tg = css_tg(css); + struct task_group *parent = css_tg(css->parent); + + if (parent) + sched_online_group(tg, parent); + return 0; +} + static void cpu_cgroup_css_released(struct cgroup_subsys_state *css) { struct task_group *tg = css_tg(css); @@ -8805,6 +8813,7 @@ static struct cftype cpu_files[] = { struct cgroup_subsys cpu_cgrp_subsys = { .css_alloc = cpu_cgroup_css_alloc, + .css_online = cpu_cgroup_css_online, .css_released = cpu_cgroup_css_released, .css_free = cpu_cgroup_css_free, .fork = cpu_cgroup_fork, ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes 2017-01-26 9:41 [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Konstantin Khlebnikov @ 2017-01-26 9:41 ` Konstantin Khlebnikov 2017-01-26 19:37 ` Tejun Heo 2017-01-26 10:17 ` [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Peter Zijlstra 2017-02-08 11:27 ` [PATCH RESEND/add comment] " Konstantin Khlebnikov 2 siblings, 1 reply; 9+ messages in thread From: Konstantin Khlebnikov @ 2017-01-26 9:41 UTC (permalink / raw) To: Tejun Heo, Peter Zijlstra, Ingo Molnar, linux-kernel; +Cc: stable The same LTP testcase triggers NULL deref css->cgroup->kn in the same place. Commit a5bca2152036 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()") reordered cgroup construction. Now css created and set online twice: first in cgroup_create(), second in cgroup_mkdir(). First happens before creation of kernfs node. It seems safer and cleaner to handle NULL pointer right in kernfs than spreading this magic across all other code. Now all functions that prints kernfs path and name prints "(null)" for NULL pointer. Also it might be a good idea to reorder cgroup construction to make these NULL pointers unreachable: online css only after complete initialization. After that NULL checks in kernfs could be surrounded with WARN_ON(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: a5bca2152036 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()") Cc: <stable@vger.kernel.org> # 4.6+ --- fs/kernfs/dir.c | 6 ++++++ include/trace/events/cgroup.h | 20 ++++++-------------- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index cf4c636ff4da..2ba728d134b2 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -41,6 +41,9 @@ static bool kernfs_lockdep(struct kernfs_node *kn) static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen) { + if (!kn) + return strlcpy(buf, "(null)", buflen); + return strlcpy(buf, kn->parent ? kn->name : "/", buflen); } @@ -123,6 +126,9 @@ static int kernfs_path_from_node_locked(struct kernfs_node *kn_to, size_t depth_from, depth_to, len = 0; int i, j; + if (!kn_to) + return strlcpy(buf, "(null)", buflen); + if (!kn_from) kn_from = kernfs_root(kn_to)->kn; diff --git a/include/trace/events/cgroup.h b/include/trace/events/cgroup.h index ab68640a18d0..c226f50e88fa 100644 --- a/include/trace/events/cgroup.h +++ b/include/trace/events/cgroup.h @@ -61,19 +61,15 @@ DECLARE_EVENT_CLASS(cgroup, __field( int, id ) __field( int, level ) __dynamic_array(char, path, - cgrp->kn ? cgroup_path(cgrp, NULL, 0) + 1 - : strlen("(null)")) + cgroup_path(cgrp, NULL, 0) + 1) ), TP_fast_assign( __entry->root = cgrp->root->hierarchy_id; __entry->id = cgrp->id; __entry->level = cgrp->level; - if (cgrp->kn) - cgroup_path(cgrp, __get_dynamic_array(path), - __get_dynamic_array_len(path)); - else - __assign_str(path, "(null)"); + cgroup_path(cgrp, __get_dynamic_array(path), + __get_dynamic_array_len(path)); ), TP_printk("root=%d id=%d level=%d path=%s", @@ -119,8 +115,7 @@ DECLARE_EVENT_CLASS(cgroup_migrate, __field( int, dst_id ) __field( int, dst_level ) __dynamic_array(char, dst_path, - dst_cgrp->kn ? cgroup_path(dst_cgrp, NULL, 0) + 1 - : strlen("(null)")) + cgroup_path(dst_cgrp, NULL, 0) + 1) __field( int, pid ) __string( comm, task->comm ) ), @@ -129,11 +124,8 @@ DECLARE_EVENT_CLASS(cgroup_migrate, __entry->dst_root = dst_cgrp->root->hierarchy_id; __entry->dst_id = dst_cgrp->id; __entry->dst_level = dst_cgrp->level; - if (dst_cgrp->kn) - cgroup_path(dst_cgrp, __get_dynamic_array(dst_path), - __get_dynamic_array_len(dst_path)); - else - __assign_str(dst_path, "(null)"); + cgroup_path(dst_cgrp, __get_dynamic_array(dst_path), + __get_dynamic_array_len(dst_path)); __entry->pid = task->pid; __assign_str(comm, task->comm); ), ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes 2017-01-26 9:41 ` [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes Konstantin Khlebnikov @ 2017-01-26 19:37 ` Tejun Heo 2017-01-26 21:50 ` Tejun Heo 0 siblings, 1 reply; 9+ messages in thread From: Tejun Heo @ 2017-01-26 19:37 UTC (permalink / raw) To: Konstantin Khlebnikov; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, stable Hello, On Thu, Jan 26, 2017 at 12:41:48PM +0300, Konstantin Khlebnikov wrote: > The same LTP testcase triggers NULL deref css->cgroup->kn in the same place. > > Commit a5bca2152036 ("cgroup: factor out cgroup_create() out of > cgroup_mkdir()") reordered cgroup construction. Now css created and > set online twice: first in cgroup_create(), second in cgroup_mkdir(). > First happens before creation of kernfs node. It's isn't set online twice. It's onlined and then made visible later. The problem here is that cgroup_name/path() depend on a kernfs node being associated with it and there's a gap between onlining and kernfs_node association. It's fine in cgroup proper but cgroup_name/path() are used from other places too, so it is problematic. > It seems safer and cleaner to handle NULL pointer right in kernfs than > spreading this magic across all other code. Now all functions that > prints kernfs path and name prints "(null)" for NULL pointer. I'm not against this patch but yeah this doesn't address the root cause here. Can you please send the patch to Greg? Please feel free to add my Acked-by. > Also it might be a good idea to reorder cgroup construction to make these > NULL pointers unreachable: online css only after complete initialization. > After that NULL checks in kernfs could be surrounded with WARN_ON(). Oh yeah, definitely. The names should be available before online. Will look into it. Thanks. -- tejun ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH cgroup/for-4.10-fixes] cgroup: don't online subsystems before cgroup_name/path() are operational 2017-01-26 19:37 ` Tejun Heo @ 2017-01-26 21:50 ` Tejun Heo 0 siblings, 0 replies; 9+ messages in thread From: Tejun Heo @ 2017-01-26 21:50 UTC (permalink / raw) To: Li Zefan, Johannes Weiner, cgroups Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, stable, Konstantin Khlebnikov, kernel-team >From 07cd12945551b63ecb1a349d50a6d69d1d6feb4a Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj@kernel.org> Date: Thu, 26 Jan 2017 16:47:28 -0500 While refactoring cgroup creation, a5bca2152036 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()") incorrectly onlined subsystems before the new cgroup is associated with it kernfs_node. This is fine for cgroup proper but cgroup_name/path() depend on the associated kernfs_node and if a subsystem makes the new cgroup_subsys_state visible, which they're allowed to after onlining, it can lead to NULL dereference. The current code performs cgroup creation and subsystem onlining in cgroup_create() and cgroup_mkdir() makes the cgroup and subsystems visible afterwards. There's no reason to online the subsystems early and we can simply drop cgroup_apply_control_enable() call from cgroup_create() so that the subsystems are onlined and made visible at the same time. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: a5bca2152036 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()") Cc: stable@vger.kernel.org # v4.6+ --- Applied to cgroup/for-4.10-fixes. Thanks. kernel/cgroup.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 2ee9ec3..688dd02 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -5221,6 +5221,11 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp, return ERR_PTR(err); } +/* + * The returned cgroup is fully initialized including its control mask, but + * it isn't associated with its kernfs_node and doesn't have the control + * mask applied. + */ static struct cgroup *cgroup_create(struct cgroup *parent) { struct cgroup_root *root = parent->root; @@ -5288,11 +5293,6 @@ static struct cgroup *cgroup_create(struct cgroup *parent) cgroup_propagate_control(cgrp); - /* @cgrp doesn't have dir yet so the following will only create csses */ - ret = cgroup_apply_control_enable(cgrp); - if (ret) - goto out_destroy; - return cgrp; out_cancel_ref: @@ -5300,9 +5300,6 @@ static struct cgroup *cgroup_create(struct cgroup *parent) out_free_cgrp: kfree(cgrp); return ERR_PTR(ret); -out_destroy: - cgroup_destroy_locked(cgrp); - return ERR_PTR(ret); } static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, -- 2.9.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH cgroup/for-4.10-fixes] cgroup: don't online subsystems before cgroup_name/path() are operational @ 2017-01-26 21:50 ` Tejun Heo 0 siblings, 0 replies; 9+ messages in thread From: Tejun Heo @ 2017-01-26 21:50 UTC (permalink / raw) To: Li Zefan, Johannes Weiner, cgroups Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, stable, Konstantin Khlebnikov, kernel-team From 07cd12945551b63ecb1a349d50a6d69d1d6feb4a Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj@kernel.org> Date: Thu, 26 Jan 2017 16:47:28 -0500 While refactoring cgroup creation, a5bca2152036 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()") incorrectly onlined subsystems before the new cgroup is associated with it kernfs_node. This is fine for cgroup proper but cgroup_name/path() depend on the associated kernfs_node and if a subsystem makes the new cgroup_subsys_state visible, which they're allowed to after onlining, it can lead to NULL dereference. The current code performs cgroup creation and subsystem onlining in cgroup_create() and cgroup_mkdir() makes the cgroup and subsystems visible afterwards. There's no reason to online the subsystems early and we can simply drop cgroup_apply_control_enable() call from cgroup_create() so that the subsystems are onlined and made visible at the same time. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: a5bca2152036 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()") Cc: stable@vger.kernel.org # v4.6+ --- Applied to cgroup/for-4.10-fixes. Thanks. kernel/cgroup.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 2ee9ec3..688dd02 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -5221,6 +5221,11 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp, return ERR_PTR(err); } +/* + * The returned cgroup is fully initialized including its control mask, but + * it isn't associated with its kernfs_node and doesn't have the control + * mask applied. + */ static struct cgroup *cgroup_create(struct cgroup *parent) { struct cgroup_root *root = parent->root; @@ -5288,11 +5293,6 @@ static struct cgroup *cgroup_create(struct cgroup *parent) cgroup_propagate_control(cgrp); - /* @cgrp doesn't have dir yet so the following will only create csses */ - ret = cgroup_apply_control_enable(cgrp); - if (ret) - goto out_destroy; - return cgrp; out_cancel_ref: @@ -5300,9 +5300,6 @@ static struct cgroup *cgroup_create(struct cgroup *parent) out_free_cgrp: kfree(cgrp); return ERR_PTR(ret); -out_destroy: - cgroup_destroy_locked(cgrp); - return ERR_PTR(ret); } static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, -- 2.9.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() 2017-01-26 9:41 [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Konstantin Khlebnikov 2017-01-26 9:41 ` [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes Konstantin Khlebnikov @ 2017-01-26 10:17 ` Peter Zijlstra 2017-01-26 10:27 ` Konstantin Khlebnikov 2017-02-08 11:27 ` [PATCH RESEND/add comment] " Konstantin Khlebnikov 2 siblings, 1 reply; 9+ messages in thread From: Peter Zijlstra @ 2017-01-26 10:17 UTC (permalink / raw) To: Konstantin Khlebnikov; +Cc: Tejun Heo, Ingo Molnar, linux-kernel, stable On Thu, Jan 26, 2017 at 12:41:41PM +0300, Konstantin Khlebnikov wrote: > Commit 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") moved > sched_online_group() from css_online() to css_alloc(). It exposes half-baked > task group into global lists before initializing generic cgroup stuff. > > LTP testcase (third in cgroup_regression_test) written for testing > similar race in kernels 2.6.26-2.6.28 easily triggers this oops: > So nobody's run LTP against the kernel for almost a year? > > Here task group already linked into global RCU-protected list task_groups > but pointer css->cgroup is still NULL. > > This patch reverts this chunk and moves online back to css_online(). Maybe put a comment with it that explains why this is needed? Something along the lines of this perhaps? /* * Don't expose the cgroup until initialization of it is complete in the * cgroup core. Otherwise things like cgroup_path() will return NULL * pointers and the like. */ > +static int cpu_cgroup_css_online(struct cgroup_subsys_state *css) > +{ > + struct task_group *tg = css_tg(css); > + struct task_group *parent = css_tg(css->parent); > + > + if (parent) > + sched_online_group(tg, parent); > + return 0; > +} ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() 2017-01-26 10:17 ` [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Peter Zijlstra @ 2017-01-26 10:27 ` Konstantin Khlebnikov 0 siblings, 0 replies; 9+ messages in thread From: Konstantin Khlebnikov @ 2017-01-26 10:27 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Tejun Heo, Ingo Molnar, linux-kernel, stable On 26.01.2017 13:17, Peter Zijlstra wrote: > On Thu, Jan 26, 2017 at 12:41:41PM +0300, Konstantin Khlebnikov wrote: >> Commit 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") moved >> sched_online_group() from css_online() to css_alloc(). It exposes half-baked >> task group into global lists before initializing generic cgroup stuff. >> >> LTP testcase (third in cgroup_regression_test) written for testing >> similar race in kernels 2.6.26-2.6.28 easily triggers this oops: >> > > So nobody's run LTP against the kernel for almost a year? Yep. Nobody runs LTP =) CONFIG_RT_GROUP_SCHED must be y (which almost impossible to use IRL, I have some patches for it out of tree) Also systemd by default binds cgroups cpu and cpuacct together - this breaks testcase. > >> >> Here task group already linked into global RCU-protected list task_groups >> but pointer css->cgroup is still NULL. >> >> This patch reverts this chunk and moves online back to css_online(). > > Maybe put a comment with it that explains why this is needed? > > Something along the lines of this perhaps? > Actually online is called before complete initialization. See second patch. =) I don't know what to do with this. cgroups are messy as always. > /* > * Don't expose the cgroup until initialization of it is complete in the > * cgroup core. Otherwise things like cgroup_path() will return NULL > * pointers and the like. > */ > >> +static int cpu_cgroup_css_online(struct cgroup_subsys_state *css) >> +{ >> + struct task_group *tg = css_tg(css); >> + struct task_group *parent = css_tg(css->parent); >> + >> + if (parent) >> + sched_online_group(tg, parent); >> + return 0; >> +} ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH RESEND/add comment] sched/cgroup: move sched_online_group() back into css_online() 2017-01-26 9:41 [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Konstantin Khlebnikov 2017-01-26 9:41 ` [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes Konstantin Khlebnikov 2017-01-26 10:17 ` [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Peter Zijlstra @ 2017-02-08 11:27 ` Konstantin Khlebnikov 2017-02-24 9:19 ` [tip:sched/urgent] sched/cgroup: Move sched_online_group() back into css_online() to fix crash tip-bot for Konstantin Khlebnikov 2 siblings, 1 reply; 9+ messages in thread From: Konstantin Khlebnikov @ 2017-02-08 11:27 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, linux-kernel; +Cc: Tejun Heo Commit 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") moved sched_online_group() from css_online() to css_alloc(). It exposes half-baked task group into global lists before initializing generic cgroup stuff. LTP testcase (third in cgroup_regression_test) written for testing similar race in kernels 2.6.26-2.6.28 easily triggers this oops: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: kernfs_path_from_node_locked+0x260/0x320 CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4 Call Trace: ? kernfs_path_from_node+0x4f/0x60 kernfs_path_from_node+0x3e/0x60 print_rt_rq+0x44/0x2b0 print_rt_stats+0x7a/0xd0 print_cpu+0x2fc/0xe80 ? __might_sleep+0x4a/0x80 sched_debug_show+0x17/0x30 seq_read+0xf2/0x3b0 proc_reg_read+0x42/0x70 __vfs_read+0x28/0x130 ? security_file_permission+0x9b/0xc0 ? rw_verify_area+0x4e/0xb0 vfs_read+0xa5/0x170 SyS_read+0x46/0xa0 entry_SYSCALL_64_fastpath+0x1e/0xad Here task group already linked into global RCU-protected list task_groups but pointer css->cgroup is still NULL. This patch reverts this chunk and moves online back to css_online(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") Cc: <stable@vger.kernel.org> # 4.6+ --- kernel/sched/core.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index c56fb57f2991..5bfd9cc2aa29 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8398,11 +8398,20 @@ cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) if (IS_ERR(tg)) return ERR_PTR(-ENOMEM); - sched_online_group(tg, parent); - return &tg->css; } +/* Expose task group only after completing cgroup initialization */ +static int cpu_cgroup_css_online(struct cgroup_subsys_state *css) +{ + struct task_group *tg = css_tg(css); + struct task_group *parent = css_tg(css->parent); + + if (parent) + sched_online_group(tg, parent); + return 0; +} + static void cpu_cgroup_css_released(struct cgroup_subsys_state *css) { struct task_group *tg = css_tg(css); @@ -8805,6 +8814,7 @@ static struct cftype cpu_files[] = { struct cgroup_subsys cpu_cgrp_subsys = { .css_alloc = cpu_cgroup_css_alloc, + .css_online = cpu_cgroup_css_online, .css_released = cpu_cgroup_css_released, .css_free = cpu_cgroup_css_free, .fork = cpu_cgroup_fork, ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [tip:sched/urgent] sched/cgroup: Move sched_online_group() back into css_online() to fix crash 2017-02-08 11:27 ` [PATCH RESEND/add comment] " Konstantin Khlebnikov @ 2017-02-24 9:19 ` tip-bot for Konstantin Khlebnikov 0 siblings, 0 replies; 9+ messages in thread From: tip-bot for Konstantin Khlebnikov @ 2017-02-24 9:19 UTC (permalink / raw) To: linux-tip-commits Cc: khlebnikov, torvalds, tglx, peterz, hpa, tj, linux-kernel, mingo Commit-ID: 96b777452d8881480fd5be50112f791c17db4b6b Gitweb: http://git.kernel.org/tip/96b777452d8881480fd5be50112f791c17db4b6b Author: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> AuthorDate: Wed, 8 Feb 2017 14:27:27 +0300 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Fri, 24 Feb 2017 09:25:28 +0100 sched/cgroup: Move sched_online_group() back into css_online() to fix crash Commit: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") .. moved sched_online_group() from css_online() to css_alloc(). It exposes half-baked task group into global lists before initializing generic cgroup stuff. LTP testcase (third in cgroup_regression_test) written for testing similar race in kernels 2.6.26-2.6.28 easily triggers this oops: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: kernfs_path_from_node_locked+0x260/0x320 CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4 Call Trace: ? kernfs_path_from_node+0x4f/0x60 kernfs_path_from_node+0x3e/0x60 print_rt_rq+0x44/0x2b0 print_rt_stats+0x7a/0xd0 print_cpu+0x2fc/0xe80 ? __might_sleep+0x4a/0x80 sched_debug_show+0x17/0x30 seq_read+0xf2/0x3b0 proc_reg_read+0x42/0x70 __vfs_read+0x28/0x130 ? security_file_permission+0x9b/0xc0 ? rw_verify_area+0x4e/0xb0 vfs_read+0xa5/0x170 SyS_read+0x46/0xa0 entry_SYSCALL_64_fastpath+0x1e/0xad Here the task group is already linked into the global RCU-protected 'task_groups' list, but the css->cgroup pointer is still NULL. This patch reverts this chunk and moves online back to css_online(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") Link: http://lkml.kernel.org/r/148655324740.424917.5302984537258726349.stgit@buzz Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/core.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index cc1e3e0..e01bd80 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6811,11 +6811,20 @@ cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) if (IS_ERR(tg)) return ERR_PTR(-ENOMEM); - sched_online_group(tg, parent); - return &tg->css; } +/* Expose task group only after completing cgroup initialization */ +static int cpu_cgroup_css_online(struct cgroup_subsys_state *css) +{ + struct task_group *tg = css_tg(css); + struct task_group *parent = css_tg(css->parent); + + if (parent) + sched_online_group(tg, parent); + return 0; +} + static void cpu_cgroup_css_released(struct cgroup_subsys_state *css) { struct task_group *tg = css_tg(css); @@ -7221,6 +7230,7 @@ static struct cftype cpu_files[] = { struct cgroup_subsys cpu_cgrp_subsys = { .css_alloc = cpu_cgroup_css_alloc, + .css_online = cpu_cgroup_css_online, .css_released = cpu_cgroup_css_released, .css_free = cpu_cgroup_css_free, .fork = cpu_cgroup_fork, ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-02-24 9:19 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-01-26 9:41 [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Konstantin Khlebnikov 2017-01-26 9:41 ` [PATCH 2/2] kernfs: define name and path to "(null)" for NULL pointer kernfs nodes Konstantin Khlebnikov 2017-01-26 19:37 ` Tejun Heo 2017-01-26 21:50 ` [PATCH cgroup/for-4.10-fixes] cgroup: don't online subsystems before cgroup_name/path() are operational Tejun Heo 2017-01-26 21:50 ` Tejun Heo 2017-01-26 10:17 ` [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online() Peter Zijlstra 2017-01-26 10:27 ` Konstantin Khlebnikov 2017-02-08 11:27 ` [PATCH RESEND/add comment] " Konstantin Khlebnikov 2017-02-24 9:19 ` [tip:sched/urgent] sched/cgroup: Move sched_online_group() back into css_online() to fix crash tip-bot for Konstantin Khlebnikov
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.