Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Andrei Vagin <avagin@gmail.com>,
	adobriyan@gmail.com, viro@zeniv.linux.org.uk,
	davem@davemloft.net, akpm@linux-foundation.org,
	christian.brauner@ubuntu.com, areber@redhat.com,
	serge@hallyn.com, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Subject: Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary
Date: Mon, 17 Aug 2020 10:48:01 -0500
Message-ID: <87d03pb7f2.fsf@x220.int.ebiederm.org> (raw)
In-Reply-To: <56ed1fb9-4f1f-3528-3f09-78478b9dfcf2@virtuozzo.com> (Kirill Tkhai's message of "Mon, 17 Aug 2020 17:05:26 +0300")


Creating names in the kernel for namespaces is very difficult and
problematic.  I have not seen anything that looks like  all of the
problems have been solved with restoring these new names.

When your filter for your list of namespaces is user namespace creating
a new directory in proc is highly questionable.

As everyone uses proc placing this functionality in proc also amplifies
the problem of creating names.


Rather than proc having a way to mount a namespace filesystem filter by
the user namespace of the mounter likely to have many many fewer
problems.  Especially as we are limiting/not allow new non-process
things and ideally finding a way to remove the non-process things.


Kirill you have a good point that taking the case where a pid namespace
does not exist in a user namespace is likely quite unrealistic.

Kirill mentioned upthread that the list of namespaces are the list that
can appear in a container.  Except by discipline in creating containers
it is not possible to know which namespaces may appear in attached to a
process.  It is possible to be very creative with setns, and violate any
constraint you may have.  Which means your filtered list of namespaces
may not contain all of the namespaces used by a set of processes.  This
further argues that attaching the list of namespaces to proc does not
make sense.

Andrei has a good point that placing the names in a hierarchy by
user namespace has the potential to create more freedom when
assigning names to namespaces, as it means the names for namespaces
do not need to be globally unique, and while still allowing the names
to stay the same.


To recap the possibilities for names for namespaces that I have seen
mentioned in this thread are:
  - Names per mount
  - Names per user namespace

I personally suspect that names per mount are likely to be so flexibly
they are confusing, while names per user namespace are likely to be
rigid, possibly too rigid to use.

It all depends upon how everything is used.  I have yet to see a
complete story of how these names will be generated and used.  So I can
not really judge.


Let me add another take on this idea that might give this work a path
forward. If I were solving this I would explore giving nsfs directories
per user namespace, and a way to mount it that exposed the directory of
the mounters current user namespace (something like btrfs snapshots).

Hmm.  For the user namespace directory I think I would give it a file
"ns" that can be opened to get a file handle on the user namespace.
Plus a set of subdirectories "cgroup", "ipc", "mnt", "net", "pid",
"user", "uts") for each type of namespace.  In each directory I think
I would just have a 64bit counter and each new entry I would assign the
next number from that counter.

The restore could either have the ability to rename files or simply the
ability to bump the counter (like we do with pids) so the names of the
namespaces can be restored.

That winds up making a user namespace the namespace of namespaces, so
I am not 100% about the idea. 

Eric



  reply index

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-30 11:59 Kirill Tkhai
2020-07-30 11:59 ` [PATCH 01/23] ns: Add common refcount into ns_common add use it as counter for net_ns Kirill Tkhai
2020-07-30 13:35   ` Christian Brauner
2020-07-30 14:07     ` Kirill Tkhai
2020-07-30 15:59       ` Christian Brauner
2020-07-30 14:30   ` Christian Brauner
2020-07-30 14:34     ` Kirill Tkhai
2020-07-30 14:39       ` Christian Brauner
2020-07-30 11:59 ` [PATCH 02/23] uts: Use generic ns_common::count Kirill Tkhai
2020-07-30 14:30   ` Christian Brauner
2020-07-30 11:59 ` [PATCH 03/23] ipc: " Kirill Tkhai
2020-07-30 14:32   ` Christian Brauner
2020-07-30 11:59 ` [PATCH 04/23] pid: " Kirill Tkhai
2020-07-30 14:37   ` Christian Brauner
2020-07-30 11:59 ` [PATCH 05/23] user: " Kirill Tkhai
2020-07-30 14:46   ` Christian Brauner
2020-07-30 11:59 ` [PATCH 06/23] mnt: " Kirill Tkhai
2020-07-30 14:49   ` Christian Brauner
2020-07-30 11:59 ` [PATCH 07/23] cgroup: " Kirill Tkhai
2020-07-30 14:50   ` Christian Brauner
2020-07-30 12:00 ` [PATCH 08/23] time: " Kirill Tkhai
2020-07-30 14:52   ` Christian Brauner
2020-07-30 12:00 ` [PATCH 09/23] ns: Introduce ns_idr to be able to iterate all allocated namespaces in the system Kirill Tkhai
2020-07-30 12:23   ` Matthew Wilcox
2020-07-30 13:32     ` Kirill Tkhai
2020-07-30 13:56       ` Matthew Wilcox
2020-07-30 14:12         ` Kirill Tkhai
2020-07-30 14:15           ` Matthew Wilcox
2020-07-30 14:20             ` Kirill Tkhai
2020-07-30 12:00 ` [PATCH 10/23] fs: Rename fs/proc/namespaces.c into fs/proc/task_namespaces.c Kirill Tkhai
2020-07-30 12:00 ` [PATCH 11/23] fs: Add /proc/namespaces/ directory Kirill Tkhai
2020-07-30 12:18   ` Alexey Dobriyan
2020-07-30 13:22     ` Kirill Tkhai
2020-07-30 13:26   ` Christian Brauner
2020-07-30 14:30     ` Kirill Tkhai
2020-07-30 20:47   ` kernel test robot
2020-07-30 22:20   ` kernel test robot
2020-08-05  8:17   ` kernel test robot
2020-08-05  8:17   ` [RFC PATCH] fs: namespaces_dentry_operations can be static kernel test robot
2020-07-30 12:00 ` [PATCH 12/23] user: Free user_ns one RCU grace period after final counter put Kirill Tkhai
2020-07-30 12:00 ` [PATCH 13/23] user: Add user namespaces into ns_idr Kirill Tkhai
2020-07-30 12:00 ` [PATCH 14/23] net: Add net " Kirill Tkhai
2020-07-30 12:00 ` [PATCH 15/23] pid: Eextract child_reaper check from pidns_for_children_get() Kirill Tkhai
2020-07-30 12:00 ` [PATCH 16/23] proc_ns_operations: Add can_get method Kirill Tkhai
2020-07-30 12:00 ` [PATCH 17/23] pid: Add pid namespaces into ns_idr Kirill Tkhai
2020-07-30 12:00 ` [PATCH 18/23] uts: Free uts namespace one RCU grace period after final counter put Kirill Tkhai
2020-07-30 12:01 ` [PATCH 19/23] uts: Add uts namespaces into ns_idr Kirill Tkhai
2020-07-30 12:01 ` [PATCH 20/23] ipc: Add ipc " Kirill Tkhai
2020-07-30 12:01 ` [PATCH 21/23] mnt: Add mount " Kirill Tkhai
2020-07-30 12:01 ` [PATCH 22/23] cgroup: Add cgroup " Kirill Tkhai
2020-07-30 12:01 ` [PATCH 23/23] time: Add time " Kirill Tkhai
2020-07-30 13:08 ` [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary Christian Brauner
2020-07-30 13:38   ` Christian Brauner
2020-07-30 14:34 ` Eric W. Biederman
2020-07-30 14:42   ` Christian Brauner
2020-07-30 15:01   ` Kirill Tkhai
2020-07-30 22:13     ` Eric W. Biederman
2020-07-31  8:48       ` Pavel Tikhomirov
2020-08-03 10:03       ` Kirill Tkhai
2020-08-03 10:51         ` Alexey Dobriyan
2020-08-06  8:05         ` Andrei Vagin
2020-08-07  8:47           ` Kirill Tkhai
2020-08-10 17:34             ` Andrei Vagin
2020-08-11 10:23               ` Kirill Tkhai
2020-08-12 17:53                 ` Andrei Vagin
2020-08-13  8:12                   ` Kirill Tkhai
2020-08-14  1:16                     ` Andrei Vagin
2020-08-14 15:11                       ` Kirill Tkhai
2020-08-14 19:21                         ` Andrei Vagin
2020-08-17 14:05                           ` Kirill Tkhai
2020-08-17 15:48                             ` Eric W. Biederman [this message]
2020-08-17 17:47                               ` Christian Brauner
2020-08-17 18:53                                 ` Eric W. Biederman
2020-08-04  5:43     ` Andrei Vagin
2020-08-04 12:11       ` Pavel Tikhomirov
2020-08-04 14:47       ` Kirill Tkhai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87d03pb7f2.fsf@x220.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=areber@redhat.com \
    --cc=avagin@gmail.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=davem@davemloft.net \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ptikhomirov@virtuozzo.com \
    --cc=serge@hallyn.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git