* [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
@ 2017-07-20 13:53 Matt Fleming
2017-07-25 18:04 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Matt Fleming @ 2017-07-20 13:53 UTC (permalink / raw)
To: stable
Cc: linux-kernel, Konstantin Khlebnikov, Peter Zijlstra,
Linus Torvalds, Tejun Heo, Thomas Gleixner, Ingo Molnar,
Matt Fleming
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
commit 96b777452d8881480fd5be50112f791c17db4b6b upstream.
Commit:
2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
.. moved sched_online_group() from css_online() to css_alloc().
It exposes half-baked task group into global lists before initializing
generic cgroup stuff.
LTP testcase (third in cgroup_regression_test) written for testing
similar race in kernels 2.6.26-2.6.28 easily triggers this oops:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: kernfs_path_from_node_locked+0x260/0x320
CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4
Call Trace:
? kernfs_path_from_node+0x4f/0x60
kernfs_path_from_node+0x3e/0x60
print_rt_rq+0x44/0x2b0
print_rt_stats+0x7a/0xd0
print_cpu+0x2fc/0xe80
? __might_sleep+0x4a/0x80
sched_debug_show+0x17/0x30
seq_read+0xf2/0x3b0
proc_reg_read+0x42/0x70
__vfs_read+0x28/0x130
? security_file_permission+0x9b/0xc0
? rw_verify_area+0x4e/0xb0
vfs_read+0xa5/0x170
SyS_read+0x46/0xa0
entry_SYSCALL_64_fastpath+0x1e/0xad
Here the task group is already linked into the global RCU-protected 'task_groups'
list, but the css->cgroup pointer is still NULL.
This patch reverts this chunk and moves online back to css_online().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
Link: http://lkml.kernel.org/r/148655324740.424917.5302984537258726349.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
---
kernel/sched/core.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 20253dbc8610..7a5060bf3bbc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8237,11 +8237,20 @@ cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
if (IS_ERR(tg))
return ERR_PTR(-ENOMEM);
- sched_online_group(tg, parent);
-
return &tg->css;
}
+/* Expose task group only after completing cgroup initialization */
+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
+{
+ struct task_group *tg = css_tg(css);
+ struct task_group *parent = css_tg(css->parent);
+
+ if (parent)
+ sched_online_group(tg, parent);
+ return 0;
+}
+
static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
{
struct task_group *tg = css_tg(css);
@@ -8616,6 +8625,7 @@ static struct cftype cpu_files[] = {
struct cgroup_subsys cpu_cgrp_subsys = {
.css_alloc = cpu_cgroup_css_alloc,
+ .css_online = cpu_cgroup_css_online,
.css_released = cpu_cgroup_css_released,
.css_free = cpu_cgroup_css_free,
.fork = cpu_cgroup_fork,
--
2.12.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
2017-07-20 13:53 [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash Matt Fleming
@ 2017-07-25 18:04 ` Greg KH
2017-07-26 12:05 ` Matt Fleming
0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2017-07-25 18:04 UTC (permalink / raw)
To: Matt Fleming
Cc: stable, linux-kernel, Konstantin Khlebnikov, Peter Zijlstra,
Linus Torvalds, Tejun Heo, Thomas Gleixner, Ingo Molnar
On Thu, Jul 20, 2017 at 02:53:09PM +0100, Matt Fleming wrote:
> From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>
> commit 96b777452d8881480fd5be50112f791c17db4b6b upstream.
>
> Commit:
>
> 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
>
> .. moved sched_online_group() from css_online() to css_alloc().
> It exposes half-baked task group into global lists before initializing
> generic cgroup stuff.
>
> LTP testcase (third in cgroup_regression_test) written for testing
> similar race in kernels 2.6.26-2.6.28 easily triggers this oops:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: kernfs_path_from_node_locked+0x260/0x320
> CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4
> Call Trace:
> ? kernfs_path_from_node+0x4f/0x60
> kernfs_path_from_node+0x3e/0x60
> print_rt_rq+0x44/0x2b0
> print_rt_stats+0x7a/0xd0
> print_cpu+0x2fc/0xe80
> ? __might_sleep+0x4a/0x80
> sched_debug_show+0x17/0x30
> seq_read+0xf2/0x3b0
> proc_reg_read+0x42/0x70
> __vfs_read+0x28/0x130
> ? security_file_permission+0x9b/0xc0
> ? rw_verify_area+0x4e/0xb0
> vfs_read+0xa5/0x170
> SyS_read+0x46/0xa0
> entry_SYSCALL_64_fastpath+0x1e/0xad
>
> Here the task group is already linked into the global RCU-protected 'task_groups'
> list, but the css->cgroup pointer is still NULL.
>
> This patch reverts this chunk and moves online back to css_online().
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> Link: http://lkml.kernel.org/r/148655324740.424917.5302984537258726349.stgit@buzz
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
> ---
> kernel/sched/core.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
What about 4.9-stable, this should go there too, right?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
2017-07-25 18:04 ` Greg KH
@ 2017-07-26 12:05 ` Matt Fleming
2017-07-26 14:35 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Matt Fleming @ 2017-07-26 12:05 UTC (permalink / raw)
To: Greg KH
Cc: stable, linux-kernel, Konstantin Khlebnikov, Peter Zijlstra,
Linus Torvalds, Tejun Heo, Thomas Gleixner, Ingo Molnar
On Tue, 25 Jul, at 11:04:39AM, Greg KH wrote:
> On Thu, Jul 20, 2017 at 02:53:09PM +0100, Matt Fleming wrote:
> > From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> >
> > commit 96b777452d8881480fd5be50112f791c17db4b6b upstream.
> >
> > Commit:
> >
> > 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> >
> > .. moved sched_online_group() from css_online() to css_alloc().
> > It exposes half-baked task group into global lists before initializing
> > generic cgroup stuff.
> >
> > LTP testcase (third in cgroup_regression_test) written for testing
> > similar race in kernels 2.6.26-2.6.28 easily triggers this oops:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > IP: kernfs_path_from_node_locked+0x260/0x320
> > CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4
> > Call Trace:
> > ? kernfs_path_from_node+0x4f/0x60
> > kernfs_path_from_node+0x3e/0x60
> > print_rt_rq+0x44/0x2b0
> > print_rt_stats+0x7a/0xd0
> > print_cpu+0x2fc/0xe80
> > ? __might_sleep+0x4a/0x80
> > sched_debug_show+0x17/0x30
> > seq_read+0xf2/0x3b0
> > proc_reg_read+0x42/0x70
> > __vfs_read+0x28/0x130
> > ? security_file_permission+0x9b/0xc0
> > ? rw_verify_area+0x4e/0xb0
> > vfs_read+0xa5/0x170
> > SyS_read+0x46/0xa0
> > entry_SYSCALL_64_fastpath+0x1e/0xad
> >
> > Here the task group is already linked into the global RCU-protected 'task_groups'
> > list, but the css->cgroup pointer is still NULL.
> >
> > This patch reverts this chunk and moves online back to css_online().
> >
> > Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> > Link: http://lkml.kernel.org/r/148655324740.424917.5302984537258726349.stgit@buzz
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
> > ---
> > kernel/sched/core.c | 14 ++++++++++++--
> > 1 file changed, 12 insertions(+), 2 deletions(-)
>
> What about 4.9-stable, this should go there too, right?
Yes, good catch. Would you like me to send a separate patch?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
2017-07-26 12:05 ` Matt Fleming
@ 2017-07-26 14:35 ` Greg KH
2017-07-26 20:59 ` Matt Fleming
0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2017-07-26 14:35 UTC (permalink / raw)
To: Matt Fleming
Cc: stable, linux-kernel, Konstantin Khlebnikov, Peter Zijlstra,
Linus Torvalds, Tejun Heo, Thomas Gleixner, Ingo Molnar
On Wed, Jul 26, 2017 at 01:05:23PM +0100, Matt Fleming wrote:
> On Tue, 25 Jul, at 11:04:39AM, Greg KH wrote:
> > On Thu, Jul 20, 2017 at 02:53:09PM +0100, Matt Fleming wrote:
> > > From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > >
> > > commit 96b777452d8881480fd5be50112f791c17db4b6b upstream.
> > >
> > > Commit:
> > >
> > > 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> > >
> > > .. moved sched_online_group() from css_online() to css_alloc().
> > > It exposes half-baked task group into global lists before initializing
> > > generic cgroup stuff.
> > >
> > > LTP testcase (third in cgroup_regression_test) written for testing
> > > similar race in kernels 2.6.26-2.6.28 easily triggers this oops:
> > >
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > > IP: kernfs_path_from_node_locked+0x260/0x320
> > > CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4
> > > Call Trace:
> > > ? kernfs_path_from_node+0x4f/0x60
> > > kernfs_path_from_node+0x3e/0x60
> > > print_rt_rq+0x44/0x2b0
> > > print_rt_stats+0x7a/0xd0
> > > print_cpu+0x2fc/0xe80
> > > ? __might_sleep+0x4a/0x80
> > > sched_debug_show+0x17/0x30
> > > seq_read+0xf2/0x3b0
> > > proc_reg_read+0x42/0x70
> > > __vfs_read+0x28/0x130
> > > ? security_file_permission+0x9b/0xc0
> > > ? rw_verify_area+0x4e/0xb0
> > > vfs_read+0xa5/0x170
> > > SyS_read+0x46/0xa0
> > > entry_SYSCALL_64_fastpath+0x1e/0xad
> > >
> > > Here the task group is already linked into the global RCU-protected 'task_groups'
> > > list, but the css->cgroup pointer is still NULL.
> > >
> > > This patch reverts this chunk and moves online back to css_online().
> > >
> > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Tejun Heo <tj@kernel.org>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> > > Link: http://lkml.kernel.org/r/148655324740.424917.5302984537258726349.stgit@buzz
> > > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > > Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
> > > ---
> > > kernel/sched/core.c | 14 ++++++++++++--
> > > 1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > What about 4.9-stable, this should go there too, right?
>
> Yes, good catch. Would you like me to send a separate patch?
If it needs a backport and a simple cherry-pick does not work, yes
please.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
2017-07-26 14:35 ` Greg KH
@ 2017-07-26 20:59 ` Matt Fleming
2017-08-03 22:33 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Matt Fleming @ 2017-07-26 20:59 UTC (permalink / raw)
To: Greg KH
Cc: stable, linux-kernel, Konstantin Khlebnikov, Peter Zijlstra,
Linus Torvalds, Tejun Heo, Thomas Gleixner, Ingo Molnar
On Wed, 26 Jul, at 07:35:51AM, Greg KH wrote:
>
> If it needs a backport and a simple cherry-pick does not work, yes
> please.
Oh, it turns out cherry-picking commit 96b777452d88 to 4.9-stable
works just fine.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
2017-07-26 20:59 ` Matt Fleming
@ 2017-08-03 22:33 ` Greg KH
0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2017-08-03 22:33 UTC (permalink / raw)
To: Matt Fleming
Cc: stable, linux-kernel, Konstantin Khlebnikov, Peter Zijlstra,
Linus Torvalds, Tejun Heo, Thomas Gleixner, Ingo Molnar
On Wed, Jul 26, 2017 at 09:59:43PM +0100, Matt Fleming wrote:
> On Wed, 26 Jul, at 07:35:51AM, Greg KH wrote:
> >
> > If it needs a backport and a simple cherry-pick does not work, yes
> > please.
>
> Oh, it turns out cherry-picking commit 96b777452d88 to 4.9-stable
> works just fine.
Great, now queued up.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-08-03 22:33 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-20 13:53 [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash Matt Fleming
2017-07-25 18:04 ` Greg KH
2017-07-26 12:05 ` Matt Fleming
2017-07-26 14:35 ` Greg KH
2017-07-26 20:59 ` Matt Fleming
2017-08-03 22:33 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).