On 09/07/10 6:05 -0700, Eric W. Biederman wrote:
> Louis Rilling writes:
>
> > On 08/07/10 21:39 -0700, Eric W. Biederman wrote:
> >>
> >> Currently it is possible to put proc_mnt before we have flushed the
> >> last process that will use the proc_mnt to flush it's proc entries.
> >>
> >> This race is fixed by not flushing proc entries for dead pid
> >> namespaces, and calling pid_ns_release_proc unconditionally from
> >> zap_pid_ns_processes after the pid namespace has been declared dead.
> >
> > One comment below.
> >
> >>
> >> To ensure we don't unnecessarily leak any dcache entries with skipped
> >> flushes pid_ns_release_proc flushes the entire proc_mnt when it is
> >> called.
> >>
> >> Signed-off-by: Eric W. Biederman
> >> ---
> >>  fs/proc/base.c         |    9 +++++----
> >>  fs/proc/root.c         |    3 +++
> >>  kernel/pid_namespace.c |    1 +
> >>  3 files changed, 9 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/fs/proc/base.c b/fs/proc/base.c
> >> index acb7ef8..e9d84e1 100644
> >> --- a/fs/proc/base.c
> >> +++ b/fs/proc/base.c
> >> @@ -2742,13 +2742,14 @@ void proc_flush_task(struct task_struct *task)
> >>
> >>  	for (i = 0; i <= pid->level; i++) {
> >>  		upid = &pid->numbers[i];
> >> +
> >> +		/* Don't bother flushing dead pid namespaces */
> >> +		if (test_bit(PIDNS_DEAD, &upid->ns->flags))
> >> +			continue;
> >> +
> >
> > IMHO, nothing prevents zap_pid_ns_processes() from setting PIDNS_DEAD and
> > calling pid_ns_release_proc() right now. zap_pid_ns_processes() does not wait
> > for EXIT_DEAD (self-reaping) children to be released.
>
> Good point we need something probably a lock to prevent proc_mnt from
> going away here.  We might do a little better if we were starting with
> a specific dentry, those at least have some rcu properties but that isn't
> a big help.
>
> Hmm.  Perhaps there is a way to completely restructure this flushing
> of dentries.  It is just an optimization after all so we don't get too many
> stale dentries building up.
>
> It might just be worth it simply kill proc_flush_mnt altogether.  I know
> it is measurable when we don't do the flushing but perhaps there can
> be a work struct that periodically wakes up and smacks stale proc dentries.
>
> Right now I really don't think proc_flush_task is worth the hassle it
> causes.

Indeed, proc_flush_task() seems to be the only bad guy trying to access
pid_ns->proc_mnt after the death of the init process. But I don't know enough
about the performance impact of removing it.

Louis

>
> Grumble, Grumble more thinking to do.
>
> Eric
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/containers

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes