From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: <serge@hallyn.com>, <ebiederm@xmission.com>,
	<agruenba@redhat.com>, <linux-api@vger.kernel.org>,
	<oleg@redhat.com>, <linux-kernel@vger.kernel.org>,
	<paul@paul-moore.com>, <ktkhai@virtuozzo.com>,
	<viro@zeniv.linux.org.uk>, <avagin@openvz.org>,
	<linux-fsdevel@vger.kernel.org>, <mtk.manpages@gmail.com>,
	<akpm@linux-foundation.org>, <luto@amacapital.net>,
	<gorcunov@openvz.org>, <mingo@kernel.org>,
	<keescook@chromium.org>
Subject: [PATCH 0/2] nsfs: Introduce ioctl to set vector of ns_last_pid's on pid ns hierarhy
Date: Mon, 17 Apr 2017 20:34:37 +0300
Message-ID: <149245014695.17600.12640895883798122726.stgit@localhost.localdomain>

While implementing support for nested pid namespaces in CRIU
(the checkpoint-restore in userspace tool), we ran into
a situation where it is impossible to efficiently create a task
with a specific NSpid. After commit 49f4d8b93ccf
"pidns: Capture the user namespace and filter ns_last_pid",
it is impossible to set ns_last_pid on any pid namespace
except the task's active pid_ns (before the commit, it was possible
to set it for pid_ns_for_children). Thus, if a restored task
in a container has more than one pid_ns level, the restorer
code must have a helper task for every pid namespace
in the task's pid_ns hierarchy.
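
For illustration, here is a minimal sketch of the step each helper has
to perform today: writing a decimal value to the ns_last_pid file of its
own active pid namespace before the next task is created. The function
name write_last_pid() is hypothetical, and the procfs path mentioned in
the comment is an assumption (the text above does not name it). The path
is a parameter so the sketch can be tried on an ordinary file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write `last_pid` in decimal to the file at `path`.
 *
 * On a real system `path` would be /proc/sys/kernel/ns_last_pid
 * (an assumption, not named in the cover letter). That file only
 * affects the caller's active pid namespace and needs sufficient
 * privileges, which is why one helper per namespace is required.
 *
 * Returns 0 on success, -1 on error.
 */
static int write_last_pid(const char *path, int last_pid)
{
	char buf[32];
	ssize_t written;
	int len, fd;

	assert(path != NULL);
	if (last_pid < 0)
		return -1;

	len = snprintf(buf, sizeof(buf), "%d", last_pid);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	/* The file must already exist; it is not created here. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, buf, (size_t)len);
	if (close(fd) != 0 || written != len)
		return -1;

	return 0;
}
```

With a single pid_ns level one such write is enough. With N levels the
same write has to be issued from N different tasks, one per namespace.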

This is a big problem, because communicating with
a helper for every pid_ns in the hierarchy is not cheap
and hurts performance. It implies many helper wakeups
to create a single task (regardless of how you communicate
with the helpers). This patchset tries to solve the problem.

It introduces namespace-specific ioctls and implements
one for pid_ns, which allows writing a vector of last
pids to the pid_ns hierarchy.

The vector is passed as a ":"-delimited string of pids,
written in reverse order. The first number corresponds to
the ns_last_pid of the opened namespace, the second to its parent, etc.
If you have a pid namespace hierarchy like:

pid_ns1 (grandfather)
  |
  v
pid_ns2 (father)
  |
  v
pid_ns3 (child)

and the ns of a task in pid_ns3 is open, then the corresponding
vector will be "last_ns_pid3:last_ns_pid2:last_ns_pid1". The
vector may be shorter and contain fewer levels, for example,
"last_ns_pid3:last_ns_pid2" or even "last_ns_pid3", depending
on which levels you want to populate. The last_ns_pidX values are
just numbers written in decimal form.
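
The vector format described above can be sketched in userspace as
follows. This is only an illustration of the string layout. The helper
name format_last_pid_vector() is hypothetical, and the ioctl call itself
is omitted because its request number and argument layout are defined
in the patches, not in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the ":"-delimited vector into `buf`.
 *
 * pids[0] is the value for the opened (deepest) namespace, pids[1]
 * for its parent, and so on up the hierarchy. `count` may be smaller
 * than the hierarchy depth to populate only the lowest levels.
 *
 * Returns the string length, or -1 if the input is invalid or the
 * result does not fit into `size` bytes (including the trailing NUL).
 */
static int format_last_pid_vector(const int *pids, size_t count,
				  char *buf, size_t size)
{
	size_t used = 0;
	size_t i;

	assert(pids != NULL && buf != NULL);
	if (count == 0 || size == 0)
		return -1;

	for (i = 0; i < count; i++) {
		int n;

		if (pids[i] < 0)
			return -1;

		/* Every element except the first is preceded by ':'. */
		n = snprintf(buf + used, size - used, "%s%d",
			     i ? ":" : "", pids[i]);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}

	return (int)used;
}
```

For example, with pid_ns3 opened and the values 30 (pid_ns3),
200 (pid_ns2) and 1000 (pid_ns1), the resulting string is
"30:200:1000". Passing only the first two values gives "30:200".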

---

Kirill Tkhai (2):
      nsfs: Add namespace-specific ioctl (NS_SPECIFIC_IOC)
      pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy


 fs/nsfs.c                 |    4 ++
 include/linux/proc_ns.h   |    1 +
 include/uapi/linux/nsfs.h |   11 ++++++
 kernel/pid_namespace.c    |   88 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 104 insertions(+)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
