* [PATCH v3 0/2] Expose task pid_ns_for_children to userspace
@ 2017-05-03 10:32 ` Kirill Tkhai
From: Kirill Tkhai @ 2017-05-03 10:32 UTC
To: agruenba, keescook, linux-api, oleg, viro, linux-kernel, paul,
ktkhai, ebiederm, avagin, linux-fsdevel, mtk.manpages, akpm,
luto, gorcunov, mingo, serge
pid_ns_for_children set by a task is known only to the task itself,
and it's impossible to identify it from outside.
It's a big problem for checkpoint/restore software like CRIU,
because it can't correctly handle tasks that call setns(CLONE_NEWPID)
in the course of their work. If they have a custom pid_ns_for_children
before dump, they must have the same ns after restore. Otherwise, the
restored task ends up in an environment it does not expect.
This patchset solves the problem. It exposes pid_ns_for_children
in the ns directory in the standard way, under the name "pid_for_children":
~# ls /proc/5531/ns -l | grep pid
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid_for_children -> pid:[4026532286]
v3: Check child_reaper without tasklist_lock, as we are only interested
in whether it is non-zero, not in its specific value.
v2: Do not allow taking a pid namespace if no child reaper has been
created yet. This prevents a race between creation of the child reaper
and other tasks.
---
Kirill Tkhai (2):
ns: Allow ns_entries to have custom symlink content
pidns: Expose task pid_ns_for_children to userspace
fs/nsfs.c | 4 +++-
fs/proc/namespaces.c | 1 +
include/linux/proc_ns.h | 2 ++
kernel/pid_namespace.c | 28 ++++++++++++++++++++++++++++
4 files changed, 34 insertions(+), 1 deletion(-)
--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
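For readers who want to consume the new entry from userspace, the links shown in the cover letter above can be read with readlink(2). The following is only a minimal sketch, not part of the patchset: the helper names read_ns_link() and parse_ns_inum() are hypothetical, and it assumes the link content keeps the "type:[inum]" form shown in the ls output.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read /proc/<pid>/ns/<name> into buf as a
 * NUL-terminated string. Returns 0 on success, -1 on failure.
 */
int read_ns_link(pid_t pid, const char *name, char *buf, size_t size)
{
	char path[64];
	ssize_t len;
	int n;

	assert(name != NULL && buf != NULL && size > 0);

	n = snprintf(path, sizeof(path), "/proc/%ld/ns/%s", (long)pid, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	/* readlink() does not NUL-terminate, so leave room for it. */
	len = readlink(path, buf, size - 1);
	if (len < 0)
		return -1;
	buf[len] = '\0';
	return 0;
}

/*
 * Hypothetical helper: extract the inode number from link content of
 * the form "pid:[4026531836]". Returns 0 on success, -1 otherwise.
 */
int parse_ns_inum(const char *link, unsigned int *inum)
{
	const char *colon = strchr(link, ':');
	char close;

	if (!colon || sscanf(colon, ":[%u%c", inum, &close) != 2 || close != ']')
		return -1;
	return 0;
}
```

A caller would read both "pid" and "pid_for_children" for the same task and compare the parsed inode numbers. On a kernel without this patchset the second read is expected to fail, so the failure path has to be handled rather than treated as fatal.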
* [PATCH v3 1/2] ns: Allow ns_entries to have custom symlink content
@ 2017-05-03 10:32 ` Kirill Tkhai
From: Kirill Tkhai @ 2017-05-03 10:32 UTC
To: agruenba, keescook, linux-api, oleg, viro, linux-kernel, paul,
ktkhai, ebiederm, avagin, linux-fsdevel, mtk.manpages, akpm,
luto, gorcunov, mingo, serge
Patch series "Expose task pid_ns_for_children to userspace".
pid_ns_for_children set by a task is known only to the task itself, and
it's impossible to identify it from outside.
It's a big problem for checkpoint/restore software like CRIU, because it
can't correctly handle tasks that call setns(CLONE_NEWPID) in the course of
their work. If they have a custom pid_ns_for_children before dump, they
must have the same ns after restore. Otherwise, the restored task ends up
in an environment it does not expect.
This patchset solves the problem. It exposes pid_ns_for_children in the ns
directory in the standard way, under the name "pid_for_children":
~# ls /proc/5531/ns -l | grep pid
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid_for_children -> pid:[4026532286]
This patch (of 2):
Make it possible for the link content prefix yyy to differ
from the link name xxx:
$ readlink /proc/[pid]/ns/xxx
yyy:[4026531838]
This will be used in the next patch.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
---
fs/nsfs.c | 4 +++-
include/linux/proc_ns.h | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/nsfs.c b/fs/nsfs.c
index 323f492e0822..f3db56e83dd2 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -196,9 +196,11 @@ int ns_get_name(char *buf, size_t size, struct task_struct *task,
{
struct ns_common *ns;
int res = -ENOENT;
+ const char *name;
ns = ns_ops->get(task);
if (ns) {
- res = snprintf(buf, size, "%s:[%u]", ns_ops->name, ns->inum);
+ name = ns_ops->real_ns_name ? : ns_ops->name;
+ res = snprintf(buf, size, "%s:[%u]", name, ns->inum);
ns_ops->put(ns);
}
return res;
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index 12cb8bd81d2d..88dba3b53375 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -14,6 +14,7 @@ struct inode;
struct proc_ns_operations {
const char *name;
+ const char *real_ns_name;
int type;
struct ns_common *(*get)(struct task_struct *task);
void (*put)(struct ns_common *ns);
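The effect of the ns_get_name() change above can be modelled outside the kernel. The sketch below is a userspace illustration only: struct ns_ops_model and format_ns_link() are hypothetical names standing in for struct proc_ns_operations and the patched formatting code, and the portable conditional is used instead of the GNU "?:" shorthand from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the two fields that decide the link text. */
struct ns_ops_model {
	const char *name;         /* entry name under /proc/<pid>/ns/ */
	const char *real_ns_name; /* optional prefix for the link content */
};

/*
 * Format the link content as "<prefix>:[<inum>]". The prefix is
 * real_ns_name when it is set and falls back to name otherwise,
 * which keeps the output of all existing entries unchanged.
 * Returns the snprintf() result.
 */
int format_ns_link(char *buf, size_t size,
		   const struct ns_ops_model *ops, unsigned int inum)
{
	const char *prefix;

	assert(buf != NULL && ops != NULL);

	prefix = ops->real_ns_name ? ops->real_ns_name : ops->name;
	return snprintf(buf, size, "%s:[%u]", prefix, inum);
}
```

With name set to "pid_for_children" and real_ns_name set to "pid", the result starts with "pid:", so tools that parse the link content see the namespace type rather than the entry name.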
* [PATCH v3 2/2] pidns: Expose task pid_ns_for_children to userspace
@ 2017-05-03 10:32 ` Kirill Tkhai
From: Kirill Tkhai @ 2017-05-03 10:32 UTC
To: agruenba, keescook, linux-api, oleg, viro, linux-kernel, paul,
ktkhai, ebiederm, avagin, linux-fsdevel, mtk.manpages, akpm,
luto, gorcunov, mingo, serge
pid_ns_for_children set by a task is known only to the task itself,
and it's impossible to identify it from outside.
It's a big problem for checkpoint/restore software like CRIU,
because it can't correctly handle tasks that call setns(CLONE_NEWPID)
in the course of their work.
This patch solves the problem by exposing pid_ns_for_children
in the ns directory in the standard way, under the name "pid_for_children":
~# ls /proc/5531/ns -l | grep pid
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid_for_children -> pid:[4026532286]
v3: Check child_reaper without tasklist_lock, as we are only interested
in whether it is non-zero, not in its specific value.
v2: Do not allow getting the namespace if no child reaper has been created yet,
as other tasks need the initialization it performs (e.g., pid_namespace::proc_mnt),
and we don't want them to race.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
fs/proc/namespaces.c | 1 +
include/linux/proc_ns.h | 1 +
kernel/pid_namespace.c | 28 ++++++++++++++++++++++++++++
3 files changed, 30 insertions(+)
diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index 766f0c637ad1..3803b24ca220 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -23,6 +23,7 @@ static const struct proc_ns_operations *ns_entries[] = {
#endif
#ifdef CONFIG_PID_NS
&pidns_operations,
+ &pidns_for_children_operations,
#endif
#ifdef CONFIG_USER_NS
&userns_operations,
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index 88dba3b53375..58ab28d81fc2 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -27,6 +27,7 @@ extern const struct proc_ns_operations netns_operations;
extern const struct proc_ns_operations utsns_operations;
extern const struct proc_ns_operations ipcns_operations;
extern const struct proc_ns_operations pidns_operations;
+extern const struct proc_ns_operations pidns_for_children_operations;
extern const struct proc_ns_operations userns_operations;
extern const struct proc_ns_operations mntns_operations;
extern const struct proc_ns_operations cgroupns_operations;
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index de461aa0bf9a..2a5ec0a00127 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -374,6 +374,23 @@ static struct ns_common *pidns_get(struct task_struct *task)
return ns ? &ns->ns : NULL;
}
+static struct ns_common *pidns_for_children_get(struct task_struct *task)
+{
+ struct pid_namespace *ns = NULL;
+
+ task_lock(task);
+ if (task->nsproxy) {
+ ns = task->nsproxy->pid_ns_for_children;
+ if (ns->child_reaper)
+ get_pid_ns(ns);
+ else
+ ns = NULL;
+ }
+ task_unlock(task);
+
+ return ns ? &ns->ns : NULL;
+}
+
static void pidns_put(struct ns_common *ns)
{
put_pid_ns(to_pid_ns(ns));
@@ -443,6 +460,17 @@ const struct proc_ns_operations pidns_operations = {
.get_parent = pidns_get_parent,
};
+const struct proc_ns_operations pidns_for_children_operations = {
+ .name = "pid_for_children",
+ .real_ns_name = "pid",
+ .type = CLONE_NEWPID,
+ .get = pidns_for_children_get,
+ .put = pidns_put,
+ .install = pidns_install,
+ .owner = pidns_owner,
+ .get_parent = pidns_get_parent,
+};
+
static __init int pid_namespaces_init(void)
{
pid_ns_cachep = KMEM_CACHE(pid_namespace, SLAB_PANIC);
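A checkpoint tool could use the new entry roughly as follows. This is a hedged sketch rather than CRIU's actual code: the enum, compare_pidns_links() and probe_pidns() are hypothetical names. It relies on one behaviour visible in the patches: when pidns_for_children_get() returns NULL (no child reaper yet), ns_get_name() keeps its -ENOENT default, so reading the link fails. The same failure is assumed on kernels that lack the entry, which is why both cases are reported as "unavailable".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum pidns_state {
	PIDNS_ERROR,       /* the "pid" link itself could not be read */
	PIDNS_UNAVAILABLE, /* "pid_for_children" could not be read */
	PIDNS_SAME,        /* children go to the task's own pid ns */
	PIDNS_CUSTOM,      /* children go to a different pid ns */
};

/* Classify a task from the content of its two links (NULL = unreadable). */
enum pidns_state compare_pidns_links(const char *pid_link,
				     const char *children_link)
{
	if (!pid_link)
		return PIDNS_ERROR;
	if (!children_link)
		return PIDNS_UNAVAILABLE;
	return strcmp(pid_link, children_link) == 0 ? PIDNS_SAME : PIDNS_CUSTOM;
}

/* Read /proc/<pid>/ns/<name>; returns buf on success, NULL on failure. */
static const char *read_ns_link(pid_t pid, const char *name,
				char *buf, size_t size)
{
	char path[64];
	ssize_t len;
	int n;

	assert(name != NULL && buf != NULL && size > 0);

	n = snprintf(path, sizeof(path), "/proc/%ld/ns/%s", (long)pid, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return NULL;
	len = readlink(path, buf, size - 1);
	if (len < 0)
		return NULL;
	buf[len] = '\0';
	return buf;
}

/* Probe a live task by reading both links and comparing them. */
enum pidns_state probe_pidns(pid_t pid)
{
	char self[64], children[64];

	return compare_pidns_links(
		read_ns_link(pid, "pid", self, sizeof(self)),
		read_ns_link(pid, "pid_for_children", children, sizeof(children)));
}
```

For the task in the ls output above, the two links differ (4026531836 versus 4026532286), so the comparison would report PIDNS_CUSTOM and the restorer would know it has to recreate that setup.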