* [PATCH 0/2] nsfs: Introduce ioctl to set vector of ns_last_pid's on pid ns hierarhy
@ 2017-04-17 17:34 Kirill Tkhai
  2017-04-17 17:36 ` [PATCH 1/2] nsfs: Add namespace-specific ioctl (NS_SPECIFIC_IOC) Kirill Tkhai
  2017-04-17 17:36 ` [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy Kirill Tkhai
  0 siblings, 2 replies; 23+ messages in thread
From: Kirill Tkhai @ 2017-04-17 17:34 UTC
  To: serge, ebiederm, agruenba, linux-api, oleg, linux-kernel, paul,
	ktkhai, viro, avagin, linux-fsdevel, mtk.manpages, akpm, luto,
	gorcunov, mingo, keescook

While implementing nested pid namespace support in CRIU
(the checkpoint-restore in userspace tool), we ran into
a situation where it is impossible to efficiently create
a task with a specific NSpid. After commit 49f4d8b93ccf
"pidns: Capture the user namespace and filter ns_last_pid"
it is impossible to set ns_last_pid on any pid namespace
except the task's active pid_ns (before the commit it was possible
to set it for pid_ns_for_children). Thus, if a restored task
in a container has more than one pid_ns level, the restorer
code must have a helper task for every pid namespace
in the task's pid_ns hierarchy.
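
For context, the single-level mechanism referred to above is commonly driven from userspace by writing a decimal value to the ns_last_pid sysctl file (/proc/sys/kernel/ns_last_pid) before creating the task. The sketch below is only an illustration under that assumption: write_last_pid() is a hypothetical helper name, the path is passed in by the caller, and locking against concurrent forks as well as the required privileges are left out.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write last_pid in decimal form to the file at
 * path (for example /proc/sys/kernel/ns_last_pid). This only affects
 * the writer's active pid namespace, which is the limitation the
 * cover letter describes.
 * Returns 0 on success, -1 on any error.
 */
static int write_last_pid(const char *path, int last_pid)
{
    char buf[16];
    int len, fd;
    ssize_t written;

    len = snprintf(buf, sizeof(buf), "%d", last_pid);
    if (len < 0 || (size_t)len >= sizeof(buf))
        return -1;

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    written = write(fd, buf, (size_t)len);
    close(fd);

    return written == len ? 0 : -1;
}
```

With nested namespaces, one process per level would have to make such a write from inside its own namespace, which is where the per-namespace helpers come from.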

This is a big problem, because communicating with
a helper for every pid_ns in the hierarchy is not cheap
and performs poorly. It implies many helper wakeups
to create a single task (regardless of how you communicate
with the helpers). This patchset tries to solve the problem.

It introduces namespace-specific ioctls and implements
one for pid_ns, which allows writing a vector of last
pids across the pid_ns hierarchy.

The vector is passed as a ":"-delimited string of pids,
written in reverse order (innermost namespace first). The first
number corresponds to the ns_last_pid of the opened namespace,
the second to that of its parent, etc.
If you have a pid namespace hierarchy like:

pid_ns1 (grandfather)
  |
  v
pid_ns2 (father)
  |
  v
pid_ns3 (child)

and the namespace of a task in pid_ns3 is open, then the corresponding
vector will be "last_ns_pid3:last_ns_pid2:last_ns_pid1". The
vector may be shorter and contain fewer levels, for example
"last_ns_pid3:last_ns_pid2" or even "last_ns_pid3", depending
on which levels you want to populate. The last_ns_pidX values are
plain numbers written in decimal form.
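
As an illustration of the string format described above, here is a minimal userspace sketch that builds such a vector. The helper name format_last_pid_vector() and its parameters are hypothetical and not part of the patchset; the ioctl call itself is deliberately left out, since its exact interface is defined in patch 2/2.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format last pids as a ":"-delimited decimal
 * string, innermost namespace first (last_pids[0] belongs to the
 * opened namespace, last_pids[1] to its parent, and so on).
 * Passing a smaller n_levels yields a shorter vector.
 * Returns the string length, or -1 if there is nothing to format
 * or buf is too small.
 */
static int format_last_pid_vector(const int *last_pids, size_t n_levels,
                                  char *buf, size_t buf_len)
{
    size_t off = 0;

    if (n_levels == 0 || buf_len == 0)
        return -1;

    for (size_t i = 0; i < n_levels; i++) {
        /* Prefix every element except the first with the delimiter. */
        int n = snprintf(buf + off, buf_len - off, "%s%d",
                         i ? ":" : "", last_pids[i]);

        if (n < 0 || (size_t)n >= buf_len - off)
            return -1; /* output error or truncation */
        off += (size_t)n;
    }

    return (int)off;
}
```

For the three-level hierarchy above, passing the values for pid_ns3, pid_ns2 and pid_ns1 in that order with n_levels set to 3 produces the full vector, while n_levels set to 1 produces only the entry for the opened namespace.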

---

Kirill Tkhai (2):
      nsfs: Add namespace-specific ioctl (NS_SPECIFIC_IOC)
      pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy


 fs/nsfs.c                 |    4 ++
 include/linux/proc_ns.h   |    1 +
 include/uapi/linux/nsfs.h |   11 ++++++
 kernel/pid_namespace.c    |   88 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 104 insertions(+)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>


Thread overview: 23+ messages
2017-04-17 17:34 [PATCH 0/2] nsfs: Introduce ioctl to set vector of ns_last_pid's on pid ns hierarhy Kirill Tkhai
2017-04-17 17:36 ` [PATCH 1/2] nsfs: Add namespace-specific ioctl (NS_SPECIFIC_IOC) Kirill Tkhai
2017-04-17 17:36 ` [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy Kirill Tkhai
2017-04-19 20:27   ` Serge E. Hallyn
2017-04-24 19:03   ` Cyrill Gorcunov
2017-04-26 15:53   ` Oleg Nesterov
2017-04-26 16:11     ` Kirill Tkhai
2017-04-26 16:33       ` Kirill Tkhai
2017-04-26 16:32         ` Eric W. Biederman
2017-04-26 16:43           ` Kirill Tkhai
2017-04-26 17:01             ` Eric W. Biederman
2017-04-27 16:12       ` Oleg Nesterov
2017-04-27 16:17         ` Kirill Tkhai
2017-04-27 16:22           ` Oleg Nesterov
2017-04-28  9:17             ` Kirill Tkhai
2017-05-02 16:33               ` Oleg Nesterov
2017-05-02 17:22                 ` Eric W. Biederman
2017-05-02 17:33                 ` Kirill Tkhai
2017-05-02 21:13                   ` Eric W. Biederman
2017-05-03 10:20                     ` Kirill Tkhai
2017-04-27 16:39           ` Eric W. Biederman
2017-04-28  9:22             ` Kirill Tkhai
2017-04-27 16:16       ` Oleg Nesterov
