From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Yordan Karadzhov <y.karadz@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	viro@zeniv.linux.org.uk, mingo@redhat.com, hagen@jauu.net,
	rppt@kernel.org, akpm@linux-foundation.org, vvs@virtuozzo.com,
	shakeelb@google.com, christian.brauner@ubuntu.com,
	mkoutny@suse.com, Linux Containers <containers@lists.linux.dev>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Subject: Re: [RFC PATCH 0/4] namespacefs: Proof-of-Concept
Date: Fri, 19 Nov 2021 18:22:55 -0500
Message-ID: <f141c401560d90a546968514c6cfc63d7fdb8e00.camel@HansenPartnership.com>
In-Reply-To: <de336e53-68e1-1d4b-7f71-e276b5363b7c@gmail.com>

On Fri, 2021-11-19 at 19:14 +0200, Yordan Karadzhov wrote:
> On 19.11.21 at 18:42, James Bottomley wrote:
[...]
> > Can we back up and ask what problem you're trying to solve before
> > we start introducing new objects like namespace name?  The problem
> > statement just seems to be "Being able to see the structure of the
> > namespaces can be very useful in the context of the containerized
> > workloads."  which you later expanded on as "trying to add more
> > visibility into the working of things like kubernetes".  If you
> > just want to see the namespace "tree" you can script that (as root)
> > by matching the process tree and the /proc/<pid>/ns changes without
> > actually needing to construct it in the kernel.  This can also be
> > done without introducing the concept of a namespace name.  However,
> > there is a subtlety of doing this matching in the way I described
> > in that you don't get proper parenting to the user namespace
> > ownership ... but that seems to be something you don't want anyway?
> > 
> 
> The major motivation is to be able to hook tracing to individual
> containers. We want to be able to quickly discover the PIDs of all
> containers running on a system. And when we say all, we mean not only
> Docker, but really all sorts of containers that exist now or may exist
> in the future. We also considered the solution of brute-forcing all
> processes in /proc/*/ns/ but we are afraid that such solution do not
> scale.

What do you mean by "does not scale"?  ps and top walk the /proc tree to
gather real-time data for every process; do they not "scale" either, and
should they therefore be reimplemented as in-kernel interfaces?

> As I stated in the Cover letter, the problem was discussed at Plumbers
> (links at the bottom of the Cover letter) and the conclusion was that
> the most distinct feature that anything that can be called 'Container'
> must have is a separate PID namespace.

Unfortunately, I think I was fighting matrix fires at the time so I
couldn't be there.  However, I'd have pushed back on the idea of
identifying containers by the pid namespace (mainly because most of the
unprivileged containers I set up don't have one).  Realistically, if
you're not running a system container (which needs a pid 1) and don't
have multiple untrusted tenants (where the global process tree would be
an information leak), you likely shouldn't be using the pid namespace
either ... it just adds isolation for no value.

> This is why the PoC starts with the implementation of this
> namespace. You can see in the example script that discovering the
> name and all PIDs of all containers gets quick and trivial with the
> help of this new filesystem. And you need to add just few more lines
> of code in order to make it start tracing a selected container.

But I could write a script or a tool to gather all the information
without this filesystem.  The namespace tree can be reconstructed by
anything that can view the process tree and the /proc/<pid>/ns
directory.
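
As a rough illustration of that kind of script (a minimal sketch, not
code from this thread), the following groups PIDs by the namespace
their /proc/<pid>/ns/<type> symlink points to.  The function name
group_by_namespace and its proc_root and ns parameters are hypothetical
names chosen for the example.  It only does the grouping step, not the
parent/child matching against the process tree, and it assumes the
caller has enough privilege to read the ns links of other users'
processes (typically root).

```python
import os
from collections import defaultdict


def group_by_namespace(proc_root="/proc", ns="pid"):
    """Map each namespace identifier to the sorted list of PIDs inside it.

    The identifier is the symlink target of <proc_root>/<pid>/ns/<ns>,
    which on Linux looks like 'pid:[4026531836]'.
    """
    groups = defaultdict(list)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            ident = os.readlink(os.path.join(proc_root, entry, "ns", ns))
        except OSError:
            continue  # process exited meanwhile, or we lack permission
        groups[ident].append(int(entry))
    return {ident: sorted(pids) for ident, pids in groups.items()}
```

Calling group_by_namespace() with no arguments scans the live /proc;
passing ns="user" or ns="mnt" groups by a different namespace type.
Every process that shares a namespace with pid 1 lands in the same
group as pid 1, so any other group is a candidate "container" under
whatever definition the caller prefers.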

James


