All of lore.kernel.org
 help / color / mirror / Atom feed
From: Serge Hallyn <serge.hallyn@ubuntu.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Michael H. Warfield" <mhw@WittsEnd.com>,
	linux-kernel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Arnd Bergmann <arnd@arndb.de>,
	Eric Biederman <ebiederm@xmission.com>,
	Serge Hallyn <serge.hallyn@canonical.com>,
	lxc-devel@lists.linuxcontainers.org,
	James Bottomley <James.Bottomley@HansenPartnership.com>
Subject: Re: [lxc-devel] [RFC PATCH 00/11] Add support for devtmpfs in user namespaces
Date: Fri, 16 May 2014 01:49:59 +0000	[thread overview]
Message-ID: <20140516014959.GD22591@ubuntumail> (raw)
In-Reply-To: <20140515221551.GB13306@kroah.com>

Quoting Greg Kroah-Hartman (gregkh@linuxfoundation.org):
> On Thu, May 15, 2014 at 05:42:54PM +0000, Serge Hallyn wrote:
> > What exactly defines '"normal" use case for a container'?
> 
> Well, I'd say "acting like a virtual machine" is a good start :)
> 
> > Not too long ago much of what we can now do with network namespaces
> > was not a normal container use case.  Neither "you can't do it now"
> > nor "I don't use it like that" should be grounds for a pre-emptive
> > nack.  "It will horribly break security assumptions" certainly would
> > be.
> 
> I agree, and maybe we will get there over time, but this patch is nto
> the way to do that.

Ok.  [ I/we may be asking for more details later, but think there is enough
below :), particularly the point about event forwarding ]  Thanks.

> > That's not to say there might not be good reasons why this in particular
> > is not appropriate, but ISTM if things are going to be nacked without
> > consideration of the patchset itself, we ought to be having a ksummit
> > session to come to a consensus [ or receive a decree, presumably by you :)
> > but after we have a chance to make our case ] on what things are going to
> > be un/acceptable.
> 
> I already stood up and publically said this last year at Plumbers, why
> is anything now different?

Well I've simply never had a chance to talk to you since then to find out
exactly what it is that is unacceptable, and why.  And, of course, code
makes it easier to discuss these things.

> And this patchset is proof of why it's not a good idea.  You really
> didn't do anything with all of the namespace stuff, except change loop.
> That's the only thing that cares, so, just do it there, like I said to
> do so, last August.

Sorry, just do it where?

> And you are ignoring the notifications to userspace and how namespaces
> here would deal with that.

Good point.  Addressing that is at the same time necessary, interesting,
and complicated.

> > > > Serge mentioned something to me about a loopdevfs (?) thing that someone
> > > > else is working on.  That would seem to be a better solution in this
> > > > particular case but I don't know much about it or where it's at.
> > > 
> > > Ok, let's see those patches then.
> > 
> > I think Seth has a git tree ready, but not sure which branch he'd want
> > us to look at.
> > 
> > Splitting a namespaced devtmpfs from loopdevfs discussion might be
> > sensible.  However, in defense of a namespaced devtmpfs I'd say
> > that for userspace to, at every container startup, bind-mount in
> > devices from the global devtmpfs into a private tmpfs (for systemd's
> > sake it can't just be on the container rootfs), seems like something
> > worth avoiding.
> 
> I think having to pick and choose what device nodes you want in a
> container is a good thing.  Becides, you would have to do the same thing
> in the kernel anyway, what's wrong with userspace making the decision
> here, especially as it knows exactly what it wants to do much more so
> than the kernel ever can.

For 'real' devices that sounds sensible.  The thing about loop devices
is that we simply want to allow a container to say "give me a loop
device to use" and have it receive a unique loop device (or 3), without
having to pre-assign them.  I think that would be cleaner to do using
a pseudofs and loop-control device, rather than having to have a
daemon in userspace on the host farming those out in response to
some, I don't know, dbus request?

> > PS - Apparently both parallels and Michael independently
> > project devices which are hot-plugged on the host into containers.
> > That also seems like something worth talking about (best practices,
> > shortcomings, use cases not met by it, any ways tha the kernel can
> > help out) at ksummit/linuxcon.
> 
> I was told that containers would never want devices hotplugged into
> them.  What use case has this happening / needed?

I'm pretty sure I didn't say that <looks around nervously>.  But I guess
we are combining two topics here, the loop psuedofs and the namespaced
devtmpfs.

The use case of loop-control device and loop pseudofs is to have
multiple chrooted/namespaced programs be able to grab a loop device
on demand which they can use for the obvious things (building a livecd,
extracting file contents, etc) without stepping on each other's toes.  The
namespaced devtmpfs is not required for this.

One advantage of a namespaced devtmpfs would be sane-looking devices
in unprivileged containers.  Currently we have to bind-mount the host's
/dev/{full,zero,etc} which, due to uid and guid mappings, then shows up
as:

crw-rw-rw- 1 nobody nogroup   1, 7 May 12 13:35 full

Also you mentioned uevent forwarding above.  Michael has talked several
times about having userspace on the host 'pass' devices into the
container.  One thing which I believe he and Eric have discussed
before was how to have userspace in the container be notified when
a device is passed in.  It seems to me that at least this is something
that would be simpler done from devtmpfs.  I could be wrong on this -
Michael do you have any updates or corrections?

Still I think we may be all agreed that we could wait a bit longer and
see how far we can get with userspace guidance (which we had
originally decided a year ago, and again a year or two before that
before user namespaces were complete).

thanks,
-serge

  parent reply	other threads:[~2014-05-16  1:50 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-14 21:34 [RFC PATCH 00/11] Add support for devtmpfs in user namespaces Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 01/11] driver core: Assign owning user namespace to devices Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 02/11] driver core: Add device_create_global() Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 03/11] tmpfs: Add sub-filesystem data pointer to shmem_sb_info Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 04/11] ramfs: Add sub-filesystem data pointer to ram_fs_info Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 05/11] devtmpfs: Add support for mounting in user namespaces Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 06/11] drivers/char/mem.c: Make null/zero/full/random/urandom available to " Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 07/11] block: Make partitions inherit namespace from whole disk device Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 08/11] block: Allow blkdev ioctls within user namespaces Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 09/11] misc: Make loop-control available to all " Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 10/11] loop: Assign devices to current_user_ns() Seth Forshee
2014-05-14 21:34 ` [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device Seth Forshee
2014-05-23  5:48   ` Marian Marinov
2014-05-26  9:16     ` Seth Forshee
2014-05-26 15:32       ` [lxc-devel] " Michael H. Warfield
2014-05-26 15:45         ` Seth Forshee
2014-05-27  1:36         ` Serge E. Hallyn
2014-05-27  2:39           ` Michael H. Warfield
2014-05-27  7:16             ` Seth Forshee
2014-05-27 13:16             ` Serge Hallyn
2014-05-15  1:32 ` [RFC PATCH 00/11] Add support for devtmpfs in user namespaces Greg Kroah-Hartman
2014-05-15  2:17   ` [lxc-devel] " Michael H. Warfield
2014-05-15  3:15     ` Seth Forshee
2014-05-15  4:00       ` Greg Kroah-Hartman
2014-05-15 13:42         ` Michael H. Warfield
2014-05-15 14:08           ` Greg Kroah-Hartman
2014-05-15 17:42             ` Serge Hallyn
2014-05-15 18:12               ` Seth Forshee
2014-05-15 22:15               ` Greg Kroah-Hartman
2014-05-16  1:42                 ` Michael H. Warfield
2014-05-16  7:56                   ` Richard Weinberger
2014-05-16 19:20                   ` James Bottomley
2014-05-16 19:42                     ` Michael H. Warfield
2014-05-16 19:52                       ` [lxc-devel] Mount and other notifiers, was: " James Bottomley
2014-05-16 20:04                         ` Michael H. Warfield
2014-05-16  1:49                 ` Serge Hallyn [this message]
2014-05-16  4:35                   ` [lxc-devel] " Greg Kroah-Hartman
2014-05-16 14:06                     ` Seth Forshee
2014-05-16 15:28                       ` Michael H. Warfield
2014-05-16 15:43                         ` Seth Forshee
2014-05-16 18:57                       ` Greg Kroah-Hartman
2014-05-16 19:28                         ` James Bottomley
2014-05-16 20:18                           ` Seth Forshee
2014-05-20  0:04                             ` Eric W. Biederman
2014-05-20  1:14                               ` Michael H. Warfield
2014-05-20 14:18                                 ` Serge Hallyn
2014-05-20 14:21                               ` Seth Forshee
2014-05-21 22:00                                 ` Eric W. Biederman
2014-05-21 22:33                                   ` Serge Hallyn
2014-05-23 22:23                                     ` Eric W. Biederman
2014-05-28  9:26                                       ` Seth Forshee
2014-05-28 13:12                                         ` Serge E. Hallyn
2014-05-28 20:33                                           ` Eric W. Biederman
2014-05-18  2:42                           ` Serge E. Hallyn
2014-05-17  4:31                     ` Eric W. Biederman
2014-05-17 16:01                       ` Seth Forshee
2014-05-18  2:44                         ` Serge E. Hallyn
2014-05-19 13:27                           ` Seth Forshee
2014-05-20 14:15                             ` Serge Hallyn
2014-05-20 14:26                               ` Serge Hallyn
2014-05-17 12:57                     ` Michael H. Warfield
2014-05-15 18:25             ` Richard Weinberger
2014-05-15 19:50               ` Serge Hallyn
2014-05-15 20:13                 ` Richard Weinberger
2014-05-15 20:26                   ` Serge E. Hallyn
2014-05-15 20:33                     ` Richard Weinberger
2014-05-19 20:22                     ` Andy Lutomirski
2014-05-20 14:19                       ` Serge Hallyn
2014-05-23  8:20                         ` Marian Marinov
2014-05-23 13:16                           ` James Bottomley
2014-05-23 16:39                             ` Andy Lutomirski
2014-05-24 22:25                             ` Serge Hallyn
2014-05-25  8:12                               ` James Bottomley
2014-05-25 22:24                                 ` Serge E. Hallyn
2014-05-28  7:02                                   ` James Bottomley
2014-05-28 13:49                                     ` Serge Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140516014959.GD22591@ubuntumail \
    --to=serge.hallyn@ubuntu.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=ebiederm@xmission.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lxc-devel@lists.linuxcontainers.org \
    --cc=mhw@WittsEnd.com \
    --cc=serge.hallyn@canonical.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.