On 09/02/11 11:02 -0800, Sukadev Bhattiprolu wrote:
> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
> | > | Are we still getting it with 2.6.37 ?
> | >
> | > I am not currently getting the crash on 2.6.37 - I thought it was due to
> | > the following commit which removed the check for task_detached() in
> | > do_wait_thread().
> | >
> | > commit 9cd80bbb07fcd6d4d037fad4297496d3b132ac6b
> | > Author: Oleg Nesterov
> | > Date: Thu Dec 17 15:27:15 2009 -0800
> |
> | I don't think that this introduced the bug. The bug triggers with EXIT_DEAD
> | tasks, for which wait() must ignore (see below). So, the bug looks still there
> | in 2.6.37.
>
> Sorry, I did not mean to imply that the above commit caused the crash
> you saw in Jun 2010.
>
> I can reproduce a crash with 2.6.32 - where if container-init terminates
> before a detached child, we get a crash when the detached child calls
> proc_flush_mnt(). I suspected it was because do_wait_thread() skipped
> over detached tasks (in 2.6.32).
>
> The same test case does not crash on 2.6.37 - which includes the above commit.
> The removes the check for detached tasks, my initial guess is that the above
> commit, may have contributed to _fixing_ the crash in 2.6.37.

Hm, I don't see how this commit changed things for detached tasks, unless
ptrace is involved. Detached tasks go atomically from ->exit_state == 0 to
->exit_state == EXIT_DEAD in exit_notify(), because tracehook_notify_death()
returns DEATH_REAP for all not ptraced detached tasks.

What do you think has changed precisely?

Thanks,

Louis

-- 
Dr Louis Rilling                        Kerlabs
Skype: louis.rilling                    Batiment Germanium
Phone: (+33|0) 6 80 89 08 23            80 avenue des Buttes de Coesmes
http://www.kerlabs.com/                 35700 Rennes