On 18.10.2012 10:27, cwillu wrote:
> On Tue, Jul 24, 2012 at 8:21 AM, tip-bot for Peter Zijlstra
> wrote:
>> Commit-ID: 8323f26ce3425460769605a6aece7a174edaa7d1
>> Gitweb: http://git.kernel.org/tip/8323f26ce3425460769605a6aece7a174edaa7d1
>> Author: Peter Zijlstra
>> AuthorDate: Fri, 22 Jun 2012 13:36:05 +0200
>> Committer: Ingo Molnar
>> CommitDate: Tue, 24 Jul 2012 13:58:20 +0200
>>
>> sched: Fix race in task_group()
>>
>> Stefan reported a crash on a kernel before a3e5d1091c1 ("sched:
>> Don't call task_group() too many times in set_task_rq()"), he
>> found the reason to be that the multiple task_group()
>> invocations in set_task_rq() returned different values.
>>
>> Looking at all that I found a lack of serialization and plain
>> wrong comments.
>>
>> The below tries to fix it using an extra pointer which is
>> updated under the appropriate scheduler locks. Its not pretty,
>> but I can't really see another way given how all the cgroup
>> stuff works.
>>
>> Reported-and-tested-by: Stefan Bader
>> Signed-off-by: Peter Zijlstra
>> Link: http://lkml.kernel.org/r/1340364965.18025.71.camel@twins
>> Signed-off-by: Ingo Molnar
>
> I just finished bisecting a crash on boot to this commit; booting with
> "noautogroup" brings it back.
>
> 3.5.4 is the latest -stable that still boots, and none of the 3.6 rc's
> boot at all.
>
> Photo of the bug (3.6.0next is 3.6 + btrfs's for-linus):
> https://lh5.googleusercontent.com/-0DY-YYhgvzs/UHdB-BQdzMI/AAAAAAAAAEg/QhY9rgxnv98/s811/2012-10-11
>

At a very quick glance, I wonder whether there might be a case where
sched_fork goes into set_task_cpu with a different cpu than the current
one but does not yet have task_group.sched_task_group set to something
valid...
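
To make the suspected sequence concrete, here is a toy user-space model of it. This is not kernel code and not taken from the commit: only the names task_group(), set_task_rq(), set_task_cpu and sched_task_group come from the discussion above; the struct layouts, NR_CPUS value and the *_checked / *_model helpers are made up for illustration. It only shows what would go wrong if the cached group pointer were still unset when a task is moved to another cpu, which is a guess at this point, not a confirmed diagnosis.

```c
#include <stddef.h>

#define NR_CPUS 2                        /* arbitrary size for the model */

struct cfs_rq { int cpu; };

struct task_group {
    struct cfs_rq *cfs_rq[NR_CPUS];      /* one runqueue per cpu */
};

struct task {
    struct task_group *sched_task_group; /* cached group pointer */
    struct cfs_rq *cfs_rq;               /* runqueue the task is attached to */
    int cpu;
};

/* Model of task_group(): just hand back the cached pointer. */
static struct task_group *task_group(const struct task *p)
{
    return p->sched_task_group;
}

/*
 * Model of set_task_rq() with an explicit guard (hypothetical helper).
 * Returns 0 on success, -1 if the cached pointer is still unset, which
 * is the point where an unguarded version would dereference NULL.
 */
static int set_task_rq_checked(struct task *p, int cpu)
{
    struct task_group *tg = task_group(p);

    if (tg == NULL)
        return -1;
    p->cfs_rq = tg->cfs_rq[cpu];
    return 0;
}

/* Model of set_task_cpu(): re-home the task, then record the new cpu. */
static int set_task_cpu_model(struct task *p, int new_cpu)
{
    if (set_task_rq_checked(p, new_cpu) != 0)
        return -1;
    p->cpu = new_cpu;
    return 0;
}
```

In this model, calling set_task_cpu_model() on a freshly created task whose sched_task_group is still NULL returns -1 and leaves the task where it was; once the pointer refers to a valid group the same call succeeds. Whether the real fork path can actually reach set_task_cpu in that state would need checking against the source.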