From: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
To: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
Cc: Konstantin Khlebnikov
	<khlebnikov-XoJtRXgx1JseBXzfvpsJ4g@public.gmane.org>,
	Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Linux FS Devel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org>,
	Andreas Dilger <adilger-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>,
	Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>,
	Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	dmonakhov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
	"Eric W. Biederman"
	<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Subject: Re: [v8 4/5] ext4: adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support
Date: Tue, 27 Jan 2015 19:02:39 +1100
Message-ID: <20150127080239.GQ16552@dastard>
In-Reply-To: <CALCETrXPCrOTrkoAMuW2os=z6anaEfv4F4D2yDxo6VtCuEtRZw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Jan 23, 2015 at 03:59:04PM -0800, Andy Lutomirski wrote:
> On Fri, Jan 23, 2015 at 3:30 PM, Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org> wrote:
> > On Fri, Jan 23, 2015 at 02:58:09PM +0300, Konstantin Khlebnikov wrote:
> >> On 23.01.2015 04:53, Dave Chinner wrote:
> >> >On Thu, Jan 22, 2015 at 06:28:51PM +0300, Konstantin Khlebnikov wrote:
> >> >>>+  kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
> >> >>
> >> >>Maybe current_user_ns()?
> >> >>This code should be user-namespace aware from the beginning.
> >> >
> >> >No, the code is correct. Project quotas have nothing to do with
> >> >UIDs and so should never have been included in the uid/gid
> >> >namespace mapping infrastructure in the first place.
> >>
> >> Right, but user-namespace provides id mapping for project-id too.
> >> This infrastructure adds support for nested project quotas with
> >> virtualized ids in sub-containers. I wouldn't say that this is a
> >> must-have feature, but the implementation is trivial because the
> >> whole infrastructure is already here.
> >
> > This is an extremely common misunderstanding of project IDs. Project
> > IDs are completely separate to the UID/GID namespace.  Project
> > quotas were originally designed specifically for
> > accounting/enforcing quotas in situations where uid/gid
> > accounting/enforcing is not possible. This design intent goes back
> > 25 years - it predates XFS...
> >
> > IOWs, mapping prids via user namespaces defeats the purpose
> > for which prids were originally intended.
> >
> >> >Case in point: directory subtree quotas can be used as a resource
> >> >controller for limiting space usage within separate containers that
> >> >share the same underlying (large) filesystem via mount namespaces.
> >>
> >> That's exactly my use-case: 'sub-volumes' for containers with
> >> quota for space usage/inodes count.
> >
> > That doesn't require mapped project IDs. Hard container space limits
> > can only be controlled by the init namespace, and because inodes can
> > hold only one project ID the current ns cannot be allowed to change
> > the project ID on the inode because that allows them to escape the
> > resource limits set on the project ID associated with the sub-mount
> > set up by the init namespace...
> >
> > i.e.
> >
> > /mnt                    prid = 0, default for entire fs.
> > /mnt/container1/        prid = 1, inherit, 10GB space limit
> > /mnt/container2/        prid = 2, inherit, 50GB space limit
> > .....
> > /mnt/containerN/        prid = N, inherit, 20GB space limit
> >
> > And you clone the mount namespace for each container so the root is
> > at the appropriate /mnt/containerX/.  Now the containers have a
> > fixed amount of space they can use in the parent filesystem they
> > know nothing about, and it is enforced by directory subquotas
> > controlled by the init namespace.  This "fixed amount of space" is
> > reflected in the container namespace when "df" is run as it will
> > report the project quota space limits. Adding or removing space to a
> > container is as simple as changing the project quota limits from the
> > init namespace. i.e. an admin operation controlled by the host, not
> > the container....
> >
> > Allowing the container to modify the prid and/or the inherit bit of
> > inodes in its namespace then means the user can define their own
> > space usage limits, even turn them off. It's not a resource
> > container at that point because the user can define their own
> > limits.  Hence, only if the current_ns cannot change project quotas
> > will we have a hard fence on space usage that the container *cannot
> > exceed*.
> 
> I think I must be missing something simple here.  In a hypothetical
> world where the code used nsown_capable, if an admin wants to stick a
> container in /mnt/container1 with associated prid 1 and a userns,
> shouldn't it just map only prid 1 into the user ns?  Then a user in
> that userns can't try to change the prid of a file to 2 because the
> number "2" is unmapped for that user and translation will fail.

You've effectively said "yes, project quotas are enabled, but you
only have a single ID, it's always turned on and you can't change it
to anything else".
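
As a rough illustration of the single-ID mapping in Andy's hypothetical,
the userspace Python sketch below models an ID map that exposes only
project ID 1 to a container. It is not kernel code: `Extent`, `ProjidMap`
and `to_kernel` are hypothetical names, loosely modelled on the way
uid/gid map extents translate IDs, and the real kernel helpers such as
make_kprojid() are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Extent:
    """One mapping extent: `count` IDs starting at `inside` map to `outside`."""
    inside: int   # first ID as seen inside the namespace
    outside: int  # first ID as seen by the host (init namespace)
    count: int    # number of consecutive IDs covered


class ProjidMap:
    """Hypothetical model of a per-namespace project ID map."""

    def __init__(self, extents):
        self.extents = list(extents)

    def to_kernel(self, projid: int) -> Optional[int]:
        """Translate a namespace-local ID to a host ID, or None if unmapped."""
        for ext in self.extents:
            if ext.inside <= projid < ext.inside + ext.count:
                return ext.outside + (projid - ext.inside)
        return None


# Assumption: the admin maps exactly one ID (1 -> 1) into the container.
container_map = ProjidMap([Extent(inside=1, outside=1, count=1)])

print(container_map.to_kernel(1))  # the one mapped ID translates
print(container_map.to_kernel(2))  # any other ID fails to translate
```

In this model the only ID the container can name is the one it was
already given, which is the situation the paragraph above describes.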

So, why do they need to be mapped via user namespaces to enable
this? Think about it a little harder:

	- Project IDs are not user IDs.
	- Project IDs are not a security/permission mechanism.
	- Project quotas only provide a mechanism for
	  resource usage control.

Think about that last one some more. Perhaps, as a hint, I should
relate it to control groups? :) i.e:

	- Project quotas can be used as an effective mount ns space
	  usage controller.

But this can only be done safely and reliably by keeping the project IDs
inaccessible from the containers themselves. I don't see why a
mechanism that controls the amount of filesystem space used by a
container should be considered any differently to a memory control
group that limits the amount of memory the container can use.
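
The argument can be made concrete with a toy model. The Python sketch
below is an illustration under stated assumptions, not how XFS or ext4
implement quotas: `Filesystem`, `write`, `set_projid` and
`container_can_escape` are hypothetical names, limits are counted in
abstract blocks, and the `from_init_ns` flag stands in for whatever
permission check the kernel performs. It shows that a hard limit holds
only while the container cannot retag its own inodes.

```python
class QuotaExceeded(Exception):
    """Raised when a write would push a project over its hard limit."""


class Filesystem:
    """Toy filesystem with per-project hard block limits."""

    def __init__(self, limits, container_may_retag):
        self.limits = dict(limits)   # prid -> hard limit; absent = unlimited
        self.container_may_retag = container_may_retag
        self.usage = {}              # prid -> blocks currently charged
        self.inode_prid = {}         # inode name -> prid
        self.inode_blocks = {}       # inode name -> blocks held

    def create(self, name, prid):
        self.inode_prid[name] = prid
        self.inode_blocks[name] = 0

    def write(self, name, blocks):
        prid = self.inode_prid[name]
        used = self.usage.get(prid, 0)
        limit = self.limits.get(prid)
        if limit is not None and used + blocks > limit:
            raise QuotaExceeded(f"prid {prid} over limit {limit}")
        self.usage[prid] = used + blocks
        self.inode_blocks[name] += blocks

    def set_projid(self, name, new_prid, from_init_ns):
        # Simplification: usage is moved without re-checking the new limit.
        if not from_init_ns and not self.container_may_retag:
            raise PermissionError("only the init namespace may change prid")
        old = self.inode_prid[name]
        blocks = self.inode_blocks[name]
        self.usage[old] = self.usage.get(old, 0) - blocks
        self.usage[new_prid] = self.usage.get(new_prid, 0) + blocks
        self.inode_prid[name] = new_prid


def container_can_escape(container_may_retag):
    """Return True if a container can write past its 10-block limit."""
    fs = Filesystem(limits={1: 10}, container_may_retag=container_may_retag)
    fs.create("container1/data", prid=1)
    fs.write("container1/data", 10)          # fill the quota exactly
    try:
        # Container tries to move the file to the unlimited default prid 0.
        fs.set_projid("container1/data", 0, from_init_ns=False)
        fs.write("container1/data", 5)
    except (PermissionError, QuotaExceeded):
        return False
    return True


print(container_can_escape(container_may_retag=False))
print(container_can_escape(container_may_retag=True))
```

In the model, the limit on prid 1 is a hard fence only in the first
case, where changing the project ID is reserved for the host.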

However, nobody on the container side of things would answer any of
my questions about how project quotas were going to be used,
limited, managed, etc. So, back when we had to make a decision to
enable XFS user ns support, I did what was needed to support the
obvious container use case and close any possible loophole that
containers might be able to use to subvert that use case.

If we want to do anything different, then there are a *lot* of
userns-aware regression tests that need to be written for xfstests....

Cheers,

Dave.
-- 
Dave Chinner
david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org
