On 08/07/10 21:39 -0700, Eric W. Biederman wrote:
> 
> Currently it is possible to put proc_mnt before we have flushed the
> last process that will use the proc_mnt to flush it's proc entries.
> 
> This race is fixed by not flushing proc entries for dead pid
> namespaces, and calling pid_ns_release_proc unconditionally from
> zap_pid_ns_processes after the pid namespace has been declared dead.

One comment below.

> 
> To ensure we don't unnecessarily leak any dcache entries with skipped
> flushes pid_ns_release_proc flushes the entire proc_mnt when it is
> called.
> 
> Signed-off-by: Eric W. Biederman
> ---
>  fs/proc/base.c         |    9 +++++----
>  fs/proc/root.c         |    3 +++
>  kernel/pid_namespace.c |    1 +
>  3 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index acb7ef8..e9d84e1 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2742,13 +2742,14 @@ void proc_flush_task(struct task_struct *task)
>  
>  	for (i = 0; i <= pid->level; i++) {
>  		upid = &pid->numbers[i];
> +
> +		/* Don't bother flushing dead pid namespaces */
> +		if (test_bit(PIDNS_DEAD, &upid->ns->flags))
> +			continue;
> +

IMHO, nothing prevents zap_pid_ns_processes() from setting PIDNS_DEAD and
calling pid_ns_release_proc() right now. zap_pid_ns_processes() does not wait
for EXIT_DEAD (self-reaping) children to be released.
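
To make the window concrete, here is a minimal single-threaded user-space
model of the interleaving I have in mind. It is only a sketch, not kernel
code: the model_* names, the plain bool standing in for PIDNS_DEAD, and the
integer standing in for the vfsmount reference count are all hypothetical,
and it assumes the flushing task holds no reference of its own on proc_mnt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct model_mnt {
	int refcount;	/* models the reference count on proc_mnt */
	bool freed;	/* set once the last reference has been dropped */
};

struct model_pid_ns {
	bool dead;			/* models the PIDNS_DEAD flag bit */
	struct model_mnt *proc_mnt;
};

/* Models mntput(): drop one reference, mark the mount freed at zero. */
static void model_mntput(struct model_mnt *mnt)
{
	if (--mnt->refcount == 0)
		mnt->freed = true;
}

/* Flusher, step 1: the PIDNS_DEAD test added to proc_flush_task(). */
static bool flusher_check_passes(const struct model_pid_ns *ns)
{
	return !ns->dead;
}

/* Flusher, step 2: use proc_mnt. True means the mount was already released. */
static bool flusher_uses_stale_mnt(const struct model_pid_ns *ns)
{
	return ns->proc_mnt->freed;
}

/* Zap side: declare the namespace dead, then release its proc mount. */
static void zap_declare_dead_and_release(struct model_pid_ns *ns)
{
	ns->dead = true;
	model_mntput(ns->proc_mnt);
}

/* Order: flusher check, zap, flusher use. Returns true if a stale mount is used. */
bool replay_racy_interleaving(void)
{
	struct model_mnt mnt = { .refcount = 1, .freed = false };
	struct model_pid_ns ns = { .dead = false, .proc_mnt = &mnt };

	if (!flusher_check_passes(&ns))
		return false;			/* flush skipped */
	zap_declare_dead_and_release(&ns);	/* runs inside the window */
	return flusher_uses_stale_mnt(&ns);
}

/* Order: zap, flusher check. The check fails, so the mount is never touched. */
bool replay_safe_interleaving(void)
{
	struct model_mnt mnt = { .refcount = 1, .freed = false };
	struct model_pid_ns ns = { .dead = false, .proc_mnt = &mnt };

	zap_declare_dead_and_release(&ns);
	if (!flusher_check_passes(&ns))
		return false;			/* flush skipped */
	return flusher_uses_stale_mnt(&ns);
}
```

In this model replay_racy_interleaving() returns true and
replay_safe_interleaving() returns false: the flag test only helps when it
runs after the namespace has been declared dead, and nothing in the model
orders the two sides.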

Thanks,

Louis

>  		proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr,
>  					tgid->numbers[i].nr);
>  	}
> -
> -	upid = &pid->numbers[pid->level];
> -	if (upid->nr == 1)
> -		pid_ns_release_proc(upid->ns);
>  }
>  
>  static struct dentry *proc_pid_instantiate(struct inode *dir,
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index cfdf032..2298fdd 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -209,5 +209,8 @@ int pid_ns_prepare_proc(struct pid_namespace *ns)
>  
>  void pid_ns_release_proc(struct pid_namespace *ns)
>  {
> +	/* Flush any cached proc dentries for this pid namespace */
> +	shrink_dcache_parent(ns->proc_mnt->mnt_root);
> +
>  	mntput(ns->proc_mnt);
>  }
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index 92032d1..43dec5d 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -189,6 +189,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
>  		rc = sys_wait4(-1, NULL, __WALL, NULL);
>  	} while (rc != -ECHILD);
>  
> +	pid_ns_release_proc(pid_ns);
>  	acct_exit_ns(pid_ns);
>  	return;
>  }
> -- 
> 1.6.5.2.143.g8cc62

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes