From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>,
	Serge Hallyn <serge.hallyn@ubuntu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	LXC development mailing-list
	<lxc-devel@lists.linuxcontainers.org>
Subject: Re: [RFC] Per-user namespace process accounting
Date: Sat, 07 Jun 2014 14:39:04 -0700
Message-ID: <1402177144.2236.26.camel@dabdike.int.hansenpartnership.com>
In-Reply-To: <8738flkhf0.fsf@x220.int.ebiederm.org>

On Tue, 2014-06-03 at 10:54 -0700, Eric W. Biederman wrote:
> Serge Hallyn <serge.hallyn@ubuntu.com> writes:
> 
> > Quoting Pavel Emelyanov (xemul@parallels.com):
> >> On 05/29/2014 07:32 PM, Serge Hallyn wrote:
> >> > Quoting Marian Marinov (mm@1h.com):
> >> >> We are not using NFS. We are using a shared block storage that offers us snapshots. So provisioning new containers is
> >> >> extremely cheap and fast. Comparing that with untar is comparing a race car with a Smart. Yes it can be done and no, I
> >> >> do not believe we should go backwards.
> >> >>
> >> >> We do not share filesystems between containers, we offer them block devices.
> >> > 
> >> > Yes, this is a real nuisance for openstack style deployments.
> >> > 
> >> > One nice solution to this imo would be a very thin stackable filesystem
> >> > which does uid shifting, or, better yet, a non-stackable way of shifting
> >> > uids at mount.
> >> 
> >> I vote for non-stackable way too. Maybe on generic VFS level so that filesystems 
> >> don't bother with it. From what I've seen, even simple stacking is quite a challenge.
> >
> > Do you have any ideas for how to go about it?  It seems like we'd have
> > to have separate inodes per mapping for each file, which is why of
> > course stacking seems "natural" here.
> >
> > Trying to catch the uid/gid at every kernel-userspace crossing seems
> > like a design regression from the current userns approach.  I suppose we
> > could continue in the kuid theme and introduce an iuid/igid for the
> > in-kernel inode uid/gid owners.  Then allow a user privileged in some
> > ns to create a new mount associated with a different mapping for any
> > ids over which he is privileged.
> 
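To make the mapping idea above concrete, here is a minimal sketch of
uid shifting through a table of extents.  It assumes the three-field
"inside outside count" layout used by /proc/<pid>/uid_map; the names
Extent, parse_id_map, shift_to_outside and shift_to_inside are
hypothetical and only illustrate the arithmetic, they are not taken
from the kernel or from any proposed patch.

```python
from typing import List, NamedTuple, Optional


class Extent(NamedTuple):
    inside: int   # first id as seen inside the user namespace
    outside: int  # first id it corresponds to outside the namespace
    count: int    # number of consecutive ids covered


def parse_id_map(text: str) -> List[Extent]:
    """Parse lines of 'inside outside count' (uid_map-style) into extents."""
    extents = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, count = (int(field) for field in line.split())
            extents.append(Extent(inside, outside, count))
    return extents


def shift_to_outside(extents: List[Extent], inside_id: int) -> Optional[int]:
    """Translate an id seen inside the namespace to the outside id."""
    for e in extents:
        if e.inside <= inside_id < e.inside + e.count:
            return e.outside + (inside_id - e.inside)
    return None  # unmapped id


def shift_to_inside(extents: List[Extent], outside_id: int) -> Optional[int]:
    """Translate an outside id to the id seen inside the namespace."""
    for e in extents:
        if e.outside <= outside_id < e.outside + e.count:
            return e.inside + (outside_id - e.outside)
    return None  # unmapped id


if __name__ == "__main__":
    # Hypothetical container mapping: ids 0..65535 inside are
    # ids 100000..165535 outside.
    id_map = parse_id_map("0 100000 65536\n")
    print(shift_to_outside(id_map, 0))
    print(shift_to_inside(id_map, 101000))
```

A per-mount mapping of the kind suggested would apply a translation
like this between the ids stored in the inode and the ids the
namespace sees; ids that fall outside every extent stay unmapped.
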
> There is a simple solution.
> 
> We pick the filesystems we choose to support.
> We add privileged mounting in a user namespace.
> We create the user and mount namespace.
> Global root goes into the target mount namespace with setns and performs
> the mounts.
> 
> 90% of that work is already done.
> 
> As long as we don't plan to support XFS (as XFS likes to expose its
> implementation details to userspace) it should be quite
> straightforward.
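
The last step in that list (global root entering the target mount
namespace with setns and mounting there) might look roughly like the
sketch below.  It assumes Python 3.12 or later, where os.setns() and
os.CLONE_NEWNS are available, and it shells out to mount(8).  The
helper names ns_path, mount_argv and mount_inside are hypothetical,
and this shows only the setns-then-mount sequence, not the proposed
privileged mounting in a user namespace.

```python
import os
import subprocess
from typing import List


def ns_path(pid: int, kind: str) -> str:
    """Path of a namespace handle for a process, e.g. /proc/1234/ns/mnt."""
    return f"/proc/{pid}/ns/{kind}"


def mount_argv(fstype: str, source: str, target: str) -> List[str]:
    """Command line for mount(8); paths are resolved after entering the ns."""
    return ["mount", "-t", fstype, source, target]


def mount_inside(pid: int, fstype: str, source: str, target: str) -> None:
    """Fork, join the mount namespace of `pid`, and perform one mount there.

    Must be run by a sufficiently privileged user (global root in the
    scenario described above).  The work happens in a forked child so
    the caller keeps its own mount namespace.
    """
    child = os.fork()
    if child == 0:
        status = 1
        try:
            fd = os.open(ns_path(pid, "mnt"), os.O_RDONLY)
            os.setns(fd, os.CLONE_NEWNS)  # join the target mount namespace
            os.close(fd)
            status = subprocess.run(mount_argv(fstype, source, target)).returncode
        finally:
            os._exit(status)  # never return into the parent's code path
    _, wait_status = os.waitpid(child, 0)
    if os.waitstatus_to_exitcode(wait_status) != 0:
        raise RuntimeError(f"mount in namespace of pid {pid} failed")
```

Note that after setns into a mount namespace, the mount binary and
both paths are looked up inside the target namespace, so they must
exist there.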

Any implementation which doesn't support XFS is unviable from a distro
point of view.  The whole reason we're fighting to get USER_NS enabled
in distros goes back to lack of XFS support (they basically refused to
turn it on until it wasn't a choice between XFS and USER_NS).  If we put
them in a position where they must choose between a namespace feature
and XFS, they'll choose XFS.

XFS developers aren't unreasonable ... they'll help if we ask.  After
all, they were the ones who eventually helped us get USER_NS turned on
in the first place.

James
