* MDS auth caps for cephfs
@ 2015-05-21  0:14 Sage Weil
  2015-05-22  9:28 ` John Spray
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-21  0:14 UTC (permalink / raw)
  To: ceph-devel; +Cc: nishtha3rai

Looking at the MDSAuthCaps again, I think there are a few things we might 
need to clean up first.  The way it is currently structured, the idea is 
that you have an array of grants (MDSCapGrant).  For any operation, you'd 
look at each grant until you find one that says what you're trying to do 
is okay.  If none match, you fail.  (i.e., they're additive only.)

Each MDSCapGrant has a 'spec' and a 'match'.  The 'match' is a check 
to see if the current grant applies to a given operation, and the 'spec' 
says what you're allowed to do.

Currently MDSCapMatch is just

  int uid;  // Require UID to be equal to this, if !=MDS_AUTH_UID_ANY
  std::string path;  // Require path to be child of this (may be "/" for any)

I think path is clearly right.  UID I'm not sure makes sense here... I'm 
inclined to ignore it (instead of removing it) until we decide 
how to restrict a mount to be a single user.

The spec is

  bool read;
  bool write;
  bool any;

I'm not quite sure what 'any' means, but read/write are pretty clear.
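
For illustration, the structure described above could be sketched roughly
as follows.  This is not the actual MDSAuthCaps code: the value of
MDS_AUTH_UID_ANY, the helper names path_is_under() and is_capable(), and
the choice to let 'any' imply both read and write are all assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed sentinel; the real value of MDS_AUTH_UID_ANY is not given above.
constexpr int MDS_AUTH_UID_ANY = -1;

struct MDSCapMatch {
  int uid;           // must equal the caller's uid unless MDS_AUTH_UID_ANY
  std::string path;  // target must be at or below this path ("/" = any)
};

struct MDSCapSpec {
  bool read;
  bool write;
  bool any;  // assumed here to imply both read and write
};

struct MDSCapGrant {
  MDSCapSpec spec;
  MDSCapMatch match;
};

// Hypothetical helper: true if target equals root or lies beneath it.
// Assumes normalized absolute paths with no trailing slash.
bool path_is_under(const std::string& root, const std::string& target) {
  assert(!root.empty() && root[0] == '/');
  if (root == "/") return true;
  if (target.compare(0, root.size(), root) != 0) return false;
  return target.size() == root.size() || target[root.size()] == '/';
}

// Additive check: the first grant that both matches and permits the
// operation wins; if none does, the operation is refused.
bool is_capable(const std::vector<MDSCapGrant>& grants,
                const std::string& path, int uid, bool want_write) {
  for (const MDSCapGrant& g : grants) {
    if (g.match.uid != MDS_AUTH_UID_ANY && g.match.uid != uid) continue;
    if (!path_is_under(g.match.path, path)) continue;
    if (g.spec.any) return true;
    if (want_write ? g.spec.write : g.spec.read) return true;
  }
  return false;
}
```

With a single read-only grant on /foo, a read of /foo/bar would pass,
while a write to /foo/bar or a read of /foobar would be refused.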

The root_squash option clearly belongs in spec, and Nishtha's first patch 
adds it there.  What about the other NFS options... should we mirror those 
too?

root_squash
 Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
 not apply to any other uids or gids that might be equally sensitive, such
 as user bin or group staff.
no_root_squash
 Turn off root squashing. This option is mainly useful for diskless
 clients.
all_squash
 Map all uids and gids to the anonymous user. Useful for NFS-exported
 public FTP directories, news spool directories, etc. The opposite option
 is no_all_squash, which is the default setting.
anonuid and anongid
 These options explicitly set the uid and gid of the anonymous account.
 This option is primarily useful for PC/NFS clients, where you might want
 all requests appear to be from one user. As an example, consider the
 export entry for /home/joe in the example section below, which maps all
 requests to uid 150 (which is supposedly that of user joe).

We could also do an all_squash bool at the same time (or a flags field for 
more efficient encoding), and anonuid/gid so that we don't hard-code 
65534.
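
A minimal sketch of what the flags-plus-anonymous-ids idea might look
like.  The names SquashFlags, SquashSpec, Cred and apply_squash() are
hypothetical, and the mapping simply follows the NFS descriptions quoted
above; it is not taken from any patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits: one flags field instead of separate bools.
enum SquashFlags : std::uint32_t {
  SQUASH_ROOT = 1u << 0,
  SQUASH_ALL = 1u << 1,
};

struct SquashSpec {
  std::uint32_t flags;
  std::uint32_t anonuid;  // configurable instead of a hard-coded 65534
  std::uint32_t anongid;
};

struct Cred {
  std::uint32_t uid;
  std::uint32_t gid;
};

// Map the caller's credentials according to the squash settings.
Cred apply_squash(const SquashSpec& spec, Cred c) {
  // Only the two bits above are defined in this sketch.
  assert((spec.flags & ~(SQUASH_ROOT | SQUASH_ALL)) == 0);
  if (spec.flags & SQUASH_ALL) return Cred{spec.anonuid, spec.anongid};
  if (spec.flags & SQUASH_ROOT) {
    if (c.uid == 0) c.uid = spec.anonuid;
    if (c.gid == 0) c.gid = spec.anongid;
  }
  return c;
}
```

Under root squashing with anonuid=123 and anongid=456, a request from
uid/gid 0 would be treated as 123/456, and other ids pass through.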

In order to add these to the grammar, I suspect we should go back to 
root_squash (not squash_root), and add an 'options' tag.  e.g.,

 allow path /foo rw options no_root_squash anonuid=123 anongid=123

(having them live next to rw was breaking the spirit parser, bah).  
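
To make the proposed syntax concrete, here is a small hand-rolled parser
for a single grant.  It is deliberately not a Boost.Spirit grammar and not
the real MDSAuthCaps parser; ParsedGrant, parse_id() and parse_grant() are
hypothetical names, and the defaults (no squashing, anonymous id 65534)
are assumptions.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct ParsedGrant {
  std::string path;
  bool read;
  bool write;
  bool root_squash;
  bool all_squash;
  unsigned anonuid;
  unsigned anongid;
};

// Parse a short decimal id; rejects empty, non-digit, or over-long input.
static bool parse_id(const std::string& s, unsigned* out) {
  if (s.empty() || s.size() > 9) return false;
  unsigned v = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return false;
    v = v * 10 + static_cast<unsigned>(c - '0');
  }
  *out = v;
  return true;
}

// Accepts: allow [path <p>] <r|w|rw> [options <opt>...]
bool parse_grant(const std::string& text, ParsedGrant* out) {
  assert(out != nullptr);
  // Defaults are assumptions: no squashing, anonymous id 65534.
  ParsedGrant g{"/", false, false, false, false, 65534, 65534};
  std::istringstream in(text);
  std::string tok;
  if (!(in >> tok) || tok != "allow") return false;
  if (!(in >> tok)) return false;
  if (tok == "path" && !(in >> g.path >> tok)) return false;
  for (char c : tok) {  // tok now holds the access letters, e.g. "rw"
    if (c == 'r') g.read = true;
    else if (c == 'w') g.write = true;
    else return false;
  }
  if (!(in >> tok)) {  // no options section
    *out = g;
    return true;
  }
  if (tok != "options") return false;
  while (in >> tok) {
    if (tok == "root_squash") g.root_squash = true;
    else if (tok == "no_root_squash") g.root_squash = false;
    else if (tok == "all_squash") g.all_squash = true;
    else if (tok == "no_all_squash") g.all_squash = false;
    else if (tok.compare(0, 8, "anonuid=") == 0) {
      if (!parse_id(tok.substr(8), &g.anonuid)) return false;
    } else if (tok.compare(0, 8, "anongid=") == 0) {
      if (!parse_id(tok.substr(8), &g.anongid)) return false;
    } else return false;
  }
  *out = g;
  return true;
}
```

Keeping the options after a distinct keyword means the access letters can
be parsed as one token without looking ahead.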

Any opinions?

sage


* Re: MDS auth caps for cephfs
  2015-05-21  0:14 MDS auth caps for cephfs Sage Weil
@ 2015-05-22  9:28 ` John Spray
  2015-05-22 21:35   ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: John Spray @ 2015-05-22  9:28 UTC (permalink / raw)
  To: Sage Weil, ceph-devel; +Cc: nishtha3rai



On 21/05/2015 01:14, Sage Weil wrote:
> Looking at the MDSAuthCaps again, I think there are a few things we might
> need to clean up first.  The way it is currently structured, the idea is
> that you have an array of grants (MDSCapGrant).  For any operation, you'd
> look at each grant until you find one that says what you're trying to do
> is okay.  If none match, you fail.  (i.e., they're additive only.)
>
> Each MDSCapGrant has a 'spec' and a 'match'.  The 'match' is a check
> to see if the current grant applies to a given operation, and the 'spec'
> says what you're allowed to do.
>
> Currently MDSCapMatch is just
>
>    int uid;  // Require UID to be equal to this, if !=MDS_AUTH_UID_ANY
>    std::string path;  // Require path to be child of this (may be "/" for any)
>
> I think path is clearly right.  UID I'm not sure makes sense here... I'm
> inclined to ignore it (instead of removing it) until we decide
> how to restrict a mount to be a single user.
>
> The spec is
>
>    bool read;
>    bool write;
>    bool any;
>
> I'm not quite sure what 'any' means, but read/write are pretty clear.

Ah, I added that when implementing 'tell' -- 'any' is checked when 
handling incoming MCommand in MDS, so it's effectively the admin permission.

> The root_squash option clearly belongs in spec, and Nishtha's first patch
> adds it there.  What about the other NFS options... should we mirror those
> too?
>
> root_squash
>   Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
>   not apply to any other uids or gids that might be equally sensitive, such
>   as user bin or group staff.
> no_root_squash
>   Turn off root squashing. This option is mainly useful for diskless
>   clients.
> all_squash
>   Map all uids and gids to the anonymous user. Useful for NFS-exported
>   public FTP directories, news spool directories, etc. The opposite option
>   is no_all_squash, which is the default setting.
> anonuid and anongid
>   These options explicitly set the uid and gid of the anonymous account.
>   This option is primarily useful for PC/NFS clients, where you might want
>   all requests appear to be from one user. As an example, consider the
>   export entry for /home/joe in the example section below, which maps all
>   requests to uid 150 (which is supposedly that of user joe).

Yes, I think we should.  Part of me wants to say that people who want 
NFS-like behaviour should be using NFS gateways.  However, these are all 
probably straightforward enough to implement that it's worth maintaining 
them in cephfs too.

We probably need to mirror these in our mount options too, so that e.g. 
someone with an admin key can still enable root_squash at will, rather 
than having to craft an authentication token with the desired behaviour.

> We could also do an all_squash bool at the same time (or a flags field for
> more efficient encoding), and anonuid/gid so that we don't hard-code
> 65534.
>
> In order to add these to the grammar, I suspect we should go back to
> root_squash (not squash_root), and add an 'options' tag.  e.g.,
>
>   allow path /foo rw options no_root_squash anonuid=123 anongid=123
>
> (having them live next to rw was breaking the spirit parser, bah).
Looks good to me.

John


* Re: MDS auth caps for cephfs
  2015-05-22  9:28 ` John Spray
@ 2015-05-22 21:35   ` Sage Weil
  2015-05-22 22:02     ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-22 21:35 UTC (permalink / raw)
  To: John Spray; +Cc: ceph-devel, nishtha3rai, jashank42

On Fri, 22 May 2015, John Spray wrote:
> On 21/05/2015 01:14, Sage Weil wrote:
> > Looking at the MDSAuthCaps again, I think there are a few things we might
> > need to clean up first.  The way it is currently structured, the idea is
> > that you have an array of grants (MDSCapGrant).  For any operation, you'd
> > look at each grant until you find one that says what you're trying to do
> > is okay.  If none match, you fail.  (i.e., they're additive only.)
> > 
> > Each MDSCapGrant has a 'spec' and a 'match'.  The 'match' is a check
> > to see if the current grant applies to a given operation, and the 'spec'
> > says what you're allowed to do.
> > 
> > Currently MDSCapMatch is just
> > 
> >    int uid;  // Require UID to be equal to this, if !=MDS_AUTH_UID_ANY
> >    std::string path;  // Require path to be child of this (may be "/" for any)
> > 
> > I think path is clearly right.  UID I'm not sure makes sense here... I'm
> > inclined to ignore it (instead of removing it) until we decide
> > how to restrict a mount to be a single user.
> > 
> > The spec is
> > 
> >    bool read;
> >    bool write;
> >    bool any;
> > 
> > I'm not quite sure what 'any' means, but read/write are pretty clear.
> 
> Ah, I added that when implementing 'tell' -- 'any' is checked when handling
> incoming MCommand in MDS, so it's effectively the admin permission.

Ok!

> > The root_squash option clearly belongs in spec, and Nishtha's first patch
> > adds it there.  What about the other NFS options... should we mirror those
> > too?
> > 
> > root_squash
> >   Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
> >   not apply to any other uids or gids that might be equally sensitive, such
> >   as user bin or group staff.
> > no_root_squash
> >   Turn off root squashing. This option is mainly useful for diskless
> >   clients.
> > all_squash
> >   Map all uids and gids to the anonymous user. Useful for NFS-exported
> >   public FTP directories, news spool directories, etc. The opposite option
> >   is no_all_squash, which is the default setting.
> > anonuid and anongid
> >   These options explicitly set the uid and gid of the anonymous account.
> >   This option is primarily useful for PC/NFS clients, where you might want
> >   all requests appear to be from one user. As an example, consider the
> >   export entry for /home/joe in the example section below, which maps all
> >   requests to uid 150 (which is supposedly that of user joe).
> 
> Yes, I think we should.  Part of me wants to say that people who want NFS-like
> behaviour should be using NFS gateways.  However, these are all probably
> straightforward enough to implement that it's worth maintaining them in cephfs
> too.
> 
> We probably need to mirror these in our mount options too, so that e.g.
> someone with an admin key can still enable root_squash at will, rather than
> having to craft an authentication token with the desired behaviour.

Yeah. So Greg and Josh and I sat down with Dan van der Ster yesterday and 
went over some of this.  I think we also concluded:

 - We should somehow tag requests with a uid and list<gid>.  This will 
make the request path permission checks sane WRT these sorts of checks.

 - We need something trickier for cap writeback.  We can simply tag the 
dirty cap on the client with the uid etc of whoever dirtied it, but if 
multiple users do that it can get messy.  I suggest forcing the client to 
flush before allowing a second dirty, although this will be slightly 
painful as we need to handle the case where the MDS fails or a subtree 
migrates, so it might mean actually blocking in that case.  (This will be 
semi-gross to code, but I don't think it will affect any real-world workload.)

 - For per-user kerberos, we'll need an extra exchange between client and 
MDS to establish user credentials (e.g., when a user does kinit, or a new 
user logs into the box, etc.).  Note that the kerberos credential has a 
group concept, but I'm not sure how that maps onto the Unix groups 
(perhaps that is a parallel PAM thing with the LDAP/AD server?).  In any 
case, if such an exchange will be needed there, and that session 
state is what we'll be checking against, should we create that structure 
now and use it to establish the gid list (instead of, say, including a 
potentially largish list<gid_t> in every MClientRequest)?
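
As a rough illustration of the "flush before a second dirty" idea from
the cap writeback point above: the client remembers who dirtied a cap and
refuses a second, different user until the flush completes.  DirtyCap,
DirtyResult, try_mark_dirty() and flush_complete() are hypothetical names,
and the MDS-failure and subtree-migration cases are not handled here.

```cpp
#include <cassert>
#include <cstdint>

struct DirtyCap {
  bool dirty;
  std::uint32_t uid;  // who dirtied it; meaningful only while dirty
  std::uint32_t gid;
};

enum class DirtyResult { Marked, MustFlushFirst };

// Tag the cap with the dirtying user, or ask the caller to flush first
// if a different user already has unflushed changes on it.
DirtyResult try_mark_dirty(DirtyCap& cap, std::uint32_t uid,
                           std::uint32_t gid) {
  if (cap.dirty && (cap.uid != uid || cap.gid != gid))
    return DirtyResult::MustFlushFirst;
  cap.dirty = true;
  cap.uid = uid;
  cap.gid = gid;
  return DirtyResult::Marked;
}

// Called once the flush has been acknowledged.
void flush_complete(DirtyCap& cap) {
  assert(cap.dirty);
  cap.dirty = false;
}
```

The same user can keep dirtying the cap without blocking; only a change
of identity forces the wait.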

sage


> > We could also do an all_squash bool at the same time (or a flags field for
> > more efficient encoding), and anonuid/gid so that we don't hard-code
> > 65534.
> > 
> > In order to add these to the grammar, I suspect we should go back to
> > root_squash (not squash_root), and add an 'options' tag.  e.g.,
> > 
> >   allow path /foo rw options no_root_squash anonuid=123 anongid=123
> > 
> > (having them live next to rw was breaking the spirit parser, bah).
>
> Looks good to me.


* Re: MDS auth caps for cephfs
  2015-05-22 21:35   ` Sage Weil
@ 2015-05-22 22:02     ` Gregory Farnum
  2015-05-22 22:18       ` Sage Weil
  2015-05-26 12:56       ` John Spray
  0 siblings, 2 replies; 35+ messages in thread
From: Gregory Farnum @ 2015-05-22 22:02 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@redhat.com> wrote:
> On Fri, 22 May 2015, John Spray wrote:
>> On 21/05/2015 01:14, Sage Weil wrote:
>> > Looking at the MDSAuthCaps again, I think there are a few things we might
>> > need to clean up first.  The way it is currently structured, the idea is
>> > that you have an array of grants (MDSCapGrant).  For any operation, you'd
>> > look at each grant until you find one that says what you're trying to do
>> > is okay.  If none match, you fail.  (i.e., they're additive only.)
>> >
>> > Each MDSCapGrant has a 'spec' and a 'match'.  The 'match' is a check
>> > to see if the current grant applies to a given operation, and the 'spec'
>> > says what you're allowed to do.
>> >
>> > Currently MDSCapMatch is just
>> >
>> >    int uid;  // Require UID to be equal to this, if !=MDS_AUTH_UID_ANY
>> >    std::string path;  // Require path to be child of this (may be "/" for any)
>> >
>> > I think path is clearly right.  UID I'm not sure makes sense here... I'm
>> > inclined to ignore it (instead of removing it) until we decide
>> > how to restrict a mount to be a single user.
>> >
>> > The spec is
>> >
>> >    bool read;
>> >    bool write;
>> >    bool any;
>> >
>> > I'm not quite sure what 'any' means, but read/write are pretty clear.
>>
>> Ah, I added that when implementing 'tell' -- 'any' is checked when handling
>> incoming MCommand in MDS, so it's effectively the admin permission.
>
> Ok!
>
>> > The root_squash option clearly belongs in spec, and Nishtha's first patch
>> > adds it there.  What about the other NFS options... should we mirror those
>> > too?
>> >
>> > root_squash
>> >   Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
>> >   not apply to any other uids or gids that might be equally sensitive, such
>> >   as user bin or group staff.
>> > no_root_squash
>> >   Turn off root squashing. This option is mainly useful for diskless
>> >   clients.
>> > all_squash
>> >   Map all uids and gids to the anonymous user. Useful for NFS-exported
>> >   public FTP directories, news spool directories, etc. The opposite option
>> >   is no_all_squash, which is the default setting.
>> > anonuid and anongid
>> >   These options explicitly set the uid and gid of the anonymous account.
>> >   This option is primarily useful for PC/NFS clients, where you might want
>> >   all requests appear to be from one user. As an example, consider the
>> >   export entry for /home/joe in the example section below, which maps all
>> >   requests to uid 150 (which is supposedly that of user joe).
>>
>> Yes, I think we should.  Part of me wants to say that people who want NFS-like
>> behaviour should be using NFS gateways.  However, these are all probably
>> straightforward enough to implement that it's worth maintaining them in cephfs
>> too.

Unfortunately not really — the NFS semantics are very different from
the way our CephX security caps work. We grant accesses with each
permission, rather than restricting them. We can accomplish similar
things, but they'll need to be in opposite directions:
allow anon_access
allow uid 123, allow gid 123[,456,789,...]
allow root
where each additional grant gives the session more access. (And I'm
not sure if these are best set up as specific things on their own or
just squashed in so that UID -1 is "anon", etc) These let you set up
access permissions like those of NFS, but it's a quite different model
than the various mounting and config file options NFS gives you. I
want to make sure we're clear about not trying to match those
precisely because otherwise our security capabilities are not going to
make any kind of sense. :(
What would it mean for a user who doesn't have no_root_squash to have
access to uid 0? Why should we allow random users to access any UID
*except* for root? Does a client who has no_root_squash and anon uid
123 get to access stuff as root, or else as 123? Can they access as
124?
I mean, I think it would have to mean they get access to everything as
anybody, and I'm not sure which requests would be considered
"anonymous" for the uid 123 bit to kick in. But I don't think that's
what the administrator would *mean* for them to have.
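
One way to picture the purely additive model above: the set of identities
a session may use is the union of its grants.  This is only a sketch;
IdentityGrant, may_use_uid() and may_use_gid() are hypothetical, the
anonymous uid of 65534 is an assumption, and 'allow root' is read narrowly
as "may act as uid 0" rather than "may do anything".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct IdentityGrant {
  enum Kind { ANON, UID, GID, ROOT } kind;
  std::uint32_t id;  // used only for UID and GID grants
};

// Assumed anonymous uid for this sketch.
constexpr std::uint32_t ANON_UID = 65534;

// A uid is usable if any single grant covers it; grants never subtract.
bool may_use_uid(const std::vector<IdentityGrant>& grants,
                 std::uint32_t uid) {
  for (const IdentityGrant& g : grants) {
    if (g.kind == IdentityGrant::ROOT && uid == 0) return true;
    if (g.kind == IdentityGrant::ANON && uid == ANON_UID) return true;
    if (g.kind == IdentityGrant::UID && g.id == uid) return true;
  }
  return false;
}

bool may_use_gid(const std::vector<IdentityGrant>& grants,
                 std::uint32_t gid) {
  for (const IdentityGrant& g : grants) {
    if (g.kind == IdentityGrant::GID && g.id == gid) return true;
  }
  return false;
}
```

With "allow uid 123, allow gid 456" the session could act as uid 123 but
not as 124 or 0; adding "allow root" would add uid 0 and remove nothing.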

As I think about this more I guess the point is that for multi tenancy
we want each client to be able to do anything inside of their own
particular directory namespace, since UIDs and GIDs may not be
synchronized across tenants? I'm not sure how to address that, but
either way I think it will require a wider/different set of primitives
than we've described here. :/

>>
>> We probably need to mirror these in our mount options too, so that e.g.
>> someone with an admin key can still enable root_squash at will, rather than
>> having to craft an authentication token with the desired behaviour.

Mmmm, given that clients normally can't see their capabilities at all
that's a bit tricky. We could maybe accomplish it by tying in with the
extra session exchange (that Sage referred to below); that will be
necessary for adding clients to an existing host session dynamically
and we could also let a user voluntarily drop certain permissions with
it...although dropping permissions requires a client to know that they
have them. Hrm.

On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@redhat.com> wrote:
> Yeah. So Greg and Josh and I sat down with Dan van der Ster yesterday and
> went over some of this.  I think we also concluded:
>
>  - We should somehow tag requests with a uid and list<gid>.  This will
> make the request path permission checks sane WRT these sorts of checks.

Well, hopefully we don't need to tag individual requests with a list
of GIDs because the group information will be in the session state?

>
>  - We need something trickier for cap writeback.  We can simply tag the
> dirty cap on the client with the uid etc of whoever dirtied it, but if
> multiple users do that it can get messy.  I suggest forcing the client to
> flush before allowing a second dirty, although this will be slightly
> painful as we need to handle the case where the MDS fails or a subtree
> migrates, so it might mean actually blocking in that case.  (This will be
> semi gross to code but I don't think will affect any realworld workload.)

Flushing *might* be the easiest solution to implement, but I actually
worry we'll run into it a non-trivial amount of the time. Consider a
client with multiple containerized applications running on the same
host, that need to share data...
I'd need to look through the writeback paths in the client pretty
carefully before I felt comfortable picking a path forward here. I'm
tempted to set up some kind of ordered flush thing similar to our
projected journal updates (but simpler!) — if the client allows
something the MDS doesn't then we've got a problem, but that basically
requires a user subverting the client so I'm not sure it's worth
worrying about?

>
>  - For per-user kerberos, we'll need an extra exchange between client and
> MDS to establish user credentials (e.g., when a user does kinit, or a new
> user logs into the box, etc.).  Note that the kerberos credential has a
> group concept, but I'm not sure how that maps onto the Unix groups
> (perhaps that is a parallel PAM thing with the LDAP/AD server?).  In any
> case, if such an exchange will be needed there, and that session
> state is what we'll be checking against, should we create that structure
> now and use it to establish the gid list (instead of, say, including a
> potentially largish list<gid_t> in every MClientRequest)?

Like I've said, the GID list that the MDS can care about needs to be
in the session list anyway, right? So we shouldn't need to add it to
MClientRequests.
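
A sketch of what keeping the group list in session state might look like,
so that a request only needs to carry a uid.  SessionCreds and its methods
are hypothetical names; the exchange that would call establish() is the
one discussed above and is not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical per-session state kept by the MDS.
class SessionCreds {
 public:
  // Called when the (hypothetical) credential exchange completes.
  void establish(std::uint32_t uid, const std::set<std::uint32_t>& gids) {
    gids_by_uid_[uid] = gids;
  }

  // A request carries only the uid; membership comes from the session.
  bool in_group(std::uint32_t uid, std::uint32_t gid) const {
    auto it = gids_by_uid_.find(uid);
    return it != gids_by_uid_.end() && it->second.count(gid) > 0;
  }

 private:
  std::map<std::uint32_t, std::set<std::uint32_t>> gids_by_uid_;
};
```

A uid the session has never established credentials for belongs to no
groups in this sketch, so group-based checks for it fail.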
-Greg


* Re: MDS auth caps for cephfs
  2015-05-22 22:02     ` Gregory Farnum
@ 2015-05-22 22:18       ` Sage Weil
  2015-05-22 22:38         ` Gregory Farnum
  2015-05-26 12:56       ` John Spray
  1 sibling, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-22 22:18 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Fri, 22 May 2015, Gregory Farnum wrote:
> >> > The root_squash option clearly belongs in spec, and Nishtha's first patch
> >> > adds it there.  What about the other NFS options... should we mirror those
> >> > too?
> >> >
> >> > root_squash
> >> >   Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
> >> >   not apply to any other uids or gids that might be equally sensitive, such
> >> >   as user bin or group staff.
> >> > no_root_squash
> >> >   Turn off root squashing. This option is mainly useful for diskless
> >> >   clients.
> >> > all_squash
> >> >   Map all uids and gids to the anonymous user. Useful for NFS-exported
> >> >   public FTP directories, news spool directories, etc. The opposite option
> >> >   is no_all_squash, which is the default setting.
> >> > anonuid and anongid
> >> >   These options explicitly set the uid and gid of the anonymous account.
> >> >   This option is primarily useful for PC/NFS clients, where you might want
> >> >   all requests appear to be from one user. As an example, consider the
> >> >   export entry for /home/joe in the example section below, which maps all
> >> >   requests to uid 150 (which is supposedly that of user joe).
> >>
> >> Yes, I think we should.  Part of me wants to say that people who want NFS-like
> >> behaviour should be using NFS gateways.  However, these are all probably
> >> straightforward enough to implement that it's worth maintaining them in cephfs
> >> too.
> 
> Unfortunately not really -- the NFS semantics are very different from
> the way our CephX security caps work. We grant accesses with each
> permission, rather than restricting them. We can accomplish similar
> things, but they'll need to be in opposite directions:
> allow anon_access
> allow uid 123, allow gid 123[,456,789,...]
> allow root
> where each additional grant gives the session more access. (And I'm
> not sure if these are best set up as specific things on their own or
> just squashed in so that UID -1 is "anon", etc) These let you set up
> access permissions like those of NFS, but it's a quite different model
> than the various mounting and config file options NFS gives you. I
> want to make sure we're clear about not trying to match those
> precisely because otherwise our security capabilities are not going to
> make any kind of sense. :(

I don't think this additive vs not additive thing is an issue.  Each 
"grant" exists in isolation.  It either grants access, or it doesn't.  If 
it doesn't, we check other grants (that may or may not grant something).  
How each grant decides whether it grants access can be based on 
anything--including a rule that says, e.g., "anything that doesn't require 
root".

The above example would be silly, since the final 'allow root' would 
presumably allow anything--the other grants needn't exist and won't 
have any effect on the result.

(Similarly, whether it defaults to root_squash or you have to explicitly 
mention it is just a UX issue... and maybe compatibility if we care about 
existing clusters with 'mds = allow rwx' caps out there.)

> What would it mean for a user who doesn't have no_root_squash to have
> access to uid 0? Why should we allow random users to access any UID
> *except* for root? Does a client who has no_root_squash and anon uid
> 123 get to access stuff as root, or else as 123? Can they access as
> 124?

I can't tell what you mean... :(

I'm guessing you're getting at root_squash being a weak tool, since it 
mostly only prevents you from doing something that only root could do (a 
compromised client can just claim to be any uid).  A weak tool is still a 
tool, though, and one that people can make use of.

> I mean, I think it would have to mean they get access to everything as
> anybody, and I'm not sure which requests would be considered
> "anonymous" for the uid 123 bit to kick in. But I don't think that's
> what the administrator would *mean* for them to have.
> 
> As I think about this more I guess the point is that for multi tenancy
> we want each client to be able to do anything inside of their own
> particular directory namespace, since UIDs and GIDs may not be
> synchronized across tenants? I'm not sure how to address that, but
> either way I think it will require a wider/different set of primitives
> than we've described here. :/

I agree that locking mounts inside directories is a much more useful 
paradigm, and likely the one that lots of people will use most of the 
time.  But we still need to deal with different users accessing shared 
storage.

> >> We probably need to mirror these in our mount options too, so that e.g.
> >> someone with an admin key can still enable root_squash at will, rather than
> >> having to craft an authentication token with the desired behaviour.
> 
> Mmmm, given that clients normally can't see their capabilities at all
> that's a bit tricky. We could maybe accomplish it by tying in with the
> extra session exchange (that Sage referred to below); that will be
> necessary for adding clients to an existing host session dynamically
> and we could also let a user voluntarily drop certain permissions with
> it...although dropping permissions requires a client to know that they
> have them. Hrm.

Hrm.  I'm inclined to leave it to the cap for now for simplicity?

> On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@redhat.com> wrote:
> > Yeah. So Greg and Josh and I sat down with Dan van der Ster yesterday and
> > went over some of this.  I think we also concluded:
> >
> >  - We should somehow tag requests with a uid and list<gid>.  This will
> > make the request path permission checks sane WRT these sorts of checks.
> 
> Well, hopefully we don't need to tag individual requests with a list
> of GIDs because the group information will be in the session state?
> 
> >
> >  - We need something trickier for cap writeback.  We can simply tag the
> > dirty cap on the client with the uid etc of whoever dirtied it, but if
> > multiple users do that it can get messy.  I suggest forcing the client to
> > flush before allowing a second dirty, although this will be slightly
> > painful as we need to handle the case where the MDS fails or a subtree
> > migrates, so it might mean actually blocking in that case.  (This will be
> > semi gross to code but I don't think will affect any realworld workload.)
> 
> Flushing *might* be the easiest solution to implement, but I actually
> worry we'll run into it a non-trivial amount of the time. Consider a
> client with multiple containerized applications running on the same
> host, that need to share data...
> I'd need to look through the writeback paths in the client pretty
> carefully before I felt comfortable picking a path forward here. I'm
> tempted to set up some kind of ordered flush thing similar to our
> projected journal updates (but simpler!) -- if the client allows
> something the MDS doesn't then we've got a problem, but that basically
> requires a user subverting the client so I'm not sure it's worth
> worrying about?

Yeah.  I agree, ordered flushes would be nicer.

> >  - For per-user kerberos, we'll need an extra exchange between client and
> > MDS to establish user credentials (e.g., when a user does kinit, or a new
> > user logs into the box, etc.).  Note that the kerberos credential has a
> > group concept, but I'm not sure how that maps onto the Unix groups
> > (perhaps that is a parallel PAM thing with the LDAP/AD server?).  In any
> > case, if such an exchange will be needed there, and that session
> > state is what we'll be checking against, should we create that structure
> > now and use it to establish the gid list (instead of, say, including a
> > potentially largish list<gid_t> in every MClientRequest)?
> 
> Like I've said, the GID list that the MDS can care about needs to be
> in the session list anyway, right? So we shouldn't need to add it to
> MClientRequests.

Okay, that means adding these new user auth messages we've been talking 
about now rather than later.  I'm okay with that, but it's more work, and 
comes with some risk that we'll get it wrong (since we're not knee-deep in 
per-user kerberos yet)...

sage


* Re: MDS auth caps for cephfs
  2015-05-22 22:18       ` Sage Weil
@ 2015-05-22 22:38         ` Gregory Farnum
  2015-05-22 22:52           ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-22 22:38 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Fri, May 22, 2015 at 3:18 PM, Sage Weil <sweil@redhat.com> wrote:
> On Fri, 22 May 2015, Gregory Farnum wrote:
>> >> > The root_squash option clearly belongs in spec, and Nishtha's first patch
>> >> > adds it there.  What about the other NFS options... should we mirror those
>> >> > too?
>> >> >
>> >> > root_squash
>> >> >   Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
>> >> >   not apply to any other uids or gids that might be equally sensitive, such
>> >> >   as user bin or group staff.
>> >> > no_root_squash
>> >> >   Turn off root squashing. This option is mainly useful for diskless
>> >> >   clients.
>> >> > all_squash
>> >> >   Map all uids and gids to the anonymous user. Useful for NFS-exported
>> >> >   public FTP directories, news spool directories, etc. The opposite option
>> >> >   is no_all_squash, which is the default setting.
>> >> > anonuid and anongid
>> >> >   These options explicitly set the uid and gid of the anonymous account.
>> >> >   This option is primarily useful for PC/NFS clients, where you might want
>> >> >   all requests appear to be from one user. As an example, consider the
>> >> >   export entry for /home/joe in the example section below, which maps all
>> >> >   requests to uid 150 (which is supposedly that of user joe).
>> >>
>> >> Yes, I think we should.  Part of me wants to say that people who want NFS-like
>> >> behaviour should be using NFS gateways.  However, these are all probably
>> >> straightforward enough to implement that it's worth maintaining them in cephfs
>> >> too.
>>
>> Unfortunately not really -- the NFS semantics are very different from
>> the way our CephX security caps work. We grant accesses with each
>> permission, rather than restricting them. We can accomplish similar
>> things, but they'll need to be in opposite directions:
>> allow anon_access
>> allow uid 123, allow gid 123[,456,789,...]
>> allow root
>> where each additional grant gives the session more access. (And I'm
>> not sure if these are best set up as specific things on their own or
>> just squashed in so that UID -1 is "anon", etc) These let you set up
>> access permissions like those of NFS, but it's a quite different model
>> than the various mounting and config file options NFS gives you. I
>> want to make sure we're clear about not trying to match those
>> precisely because otherwise our security capabilities are not going to
>> make any kind of sense. :(
>
> I don't think this additive vs not additive thing is an issue.  Each
> "grant" exists in isolation.  It either grants access, or it doesn't.  If
> it doesn't, we check other grants (that may or may not grant something).
> How each grant decides whether it grants access can be based on
> anything--including a rule that says e.g. "anything that couldn't only be
> done by root".

Right, so each grant allows it or it doesn't: each grant adds more
power to the user. There's no subtracting abilities from one grant,
because each grant acts in isolation.

>
> The above example would be silly, since the final 'allow root' would
> presumably allow anything--the other grants needn't exist and won't
> have any effect on the result.
>
> (Similarly, whether it defaults to root_squash or you have to explicitly
> mention it is just a UX issue... and maybe compatibility if we care about
> existing clusters with 'mds = allow rwx' caps out there.)
>
>> What would it mean for a user who doesn't have no_root_squash to have
>> access to uid 0? Why should we allow random users to access any UID
>> *except* for root? Does a client who has no_root_squash and anon uid
>> 123 get to access stuff as root, or else as 123? Can they access as
>> 124?
>
> I can't tell what you mean... :(
>
> I'm guessing you're getting at root_squash being a weak tool, since it
> mostly only prevents you from doing something that only root could do (a
> compromised client can just claim to be any uid).  A weak tool is still a
> tool, though, and one that people can make use of.

I guess maybe I am. But it's not just that: it's that I don't
understand how it interacts with specifying UIDs. I would *expect*
that we grant specific cephx keys access to specific UIDs, or to "any
except root", or to "any". But the usage of root_squash implies that
any client can try to act as any UID and we'll let them. So if a key
says "no_root_squash, allow uid=123, allow gid=123", does that key
allow the client to act as root? I think it *does*, but it
*shouldn't*.

>
>> I mean, I think it would have to mean they get access to everything as
>> anybody, and I'm not sure which requests would be considered
>> "anonymous" for the uid 123 bit to kick in. But I don't think that's
>> what the administrator would *mean* for them to have.
>>
>> As I think about this more I guess the point is that for multi tenancy
>> we want each client to be able to do anything inside of their own
>> particular directory namespace, since UIDs and GIDs may not be
>> synchronized across tenants? I'm not sure how to address that, but
>> either way I think it will require a wider/different set of primitives
>> than we've described here. :/
>
> I agree that locking mounts inside directories is a much more useful
> paradigm, and likely the one that lots of people will use most of the
> time.  But we still need to deal with different users accessing shared
> storage.
>
>> >> We probably need to mirror these in our mount options too, so that e.g.
>> >> someone with an admin key can still enable root_squash at will, rather than
>> >> having to craft an authentication token with the desired behaviour.
>>
>> Mmmm, given that clients normally can't see their capabilities at all
>> that's a bit tricky. We could maybe accomplish it by tying in with the
>> extra session exchange (that Sage referred to below); that will be
>> necessary for adding clients to an existing host session dynamically
>> and we could also let a user voluntarily drop certain permissions with
>> it...although dropping permissions requires a client to know that they
>> have them. Hrm.
>
> Hrm.  I'm inclined to leave it to the cap for now for simplicity?
>
>> On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@redhat.com> wrote:
>> > Yeah. So Greg and Josh and I sat down with Dan van der Ster yesterday and
>> > went over some of this.  I think we also concluded:
>> >
>> >  - We should somehow tag requests with a uid and list<gid>.  This will
>> > make the request path permission checks sane WRT these sorts of checks.
>>
>> Well, hopefully we don't need to tag individual requests with a list
>> of GIDs because the group information will be in the session state?
>>
>> >
>> >  - We need something trickier for cap writeback.  We can simply tag the
>> > dirty cap on the client with the uid etc of whoever dirtied it, but if
>> > multiple users do that it can get messy.  I suggest forcing the client to
>> > flush before allowing a second dirty, although this will be slightly
>> > painful as we need to handle the case where the MDS fails or a subtree
>> > migrates, so it might mean actually blocking in that case.  (This will be
>> > semi gross to code but I don't think will affect any real-world workload.)
>>
>> Flushing *might* be the easiest solution to implement, but I actually
>> worry we'll run into it a non-trivial amount of the time. Consider a
>> client with multiple containerized applications running on the same
>> host, that need to share data...
>> I'd need to look through the writeback paths in the client pretty
>> carefully before I felt comfortable picking a path forward here. I'm
>> tempted to set up some kind of ordered flush thing similar to our
>> projected journal updates (but simpler!) -- if the client allows
>> something the MDS doesn't then we've got a problem, but that basically
>> requires a user subverting the client so I'm not sure it's worth
>> worrying about?
>
> Yeah.  I agree, ordered flushes would be nicer.
>
>> >  - For per-user kerberos, we'll need an extra exchange between client and
>> > MDS to establish user credentials (e.g., when a user does kinit, or a new
>> > user logs into the box, etc.).  Note that the kerberos credential has a
>> > group concept, but I'm not sure how that maps onto the Unix groups
>> > (perhaps that is a parallel PAM thing with the LDAP/AD server?).  In any
>> > case, if such an exchange will be needed there, and that session
>> > state is what we'll be checking against, should we create that structure
>> > now and use it to establish the gid list (instead of, say, including a
>> > potentially largish list<gid_t> in every MClientRequest)?
>>
>> Like I've said, the GID list that the MDS can care about needs to be
>> in the session list anyway, right? So we shouldn't need to add it to
>> MClientRequests.
>
> Okay, that means adding these new user auth messages we've been talking
> about now rather than later.  I'm okay with that, but it's more work, and
> comes with some risk that we'll get it wrong (since we're not knee-deep in
> per-user kerberos yet)...

Does adding new ones right now get us anything? If we're just using
them to report what groups the user has on the local machine, we might
as well be verifying them on the client.


* Re: MDS auth caps for cephfs
  2015-05-22 22:38         ` Gregory Farnum
@ 2015-05-22 22:52           ` Sage Weil
  2015-05-26 14:26             ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-22 22:52 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Fri, 22 May 2015, Gregory Farnum wrote:
> On Fri, May 22, 2015 at 3:18 PM, Sage Weil <sweil@redhat.com> wrote:
> >> What would it mean for a user who doesn't have no_root_squash to have
> >> access to uid 0? Why should we allow random users to access any UID
> >> *except* for root? Does a client who has no_root_squash and anon uid
> >> 123 get to access stuff as root, or else as 123? Can they access as
> >> 124?
> >
> > I can't tell what you mean... :(
> >
> > I'm guessing you're getting at root_squash being a weak tool, since it
> > mostly only prevents you from doing something that only root could do (a
> > compromised client can just claim to be any uid).  A weak tool is still a
> > tool, though, and one that people can make use of.
> 
> I guess maybe I am. But it's not just that: it's that I don't
> understand how it interacts with specifying UIDs. I would *expect*
> that we grant specific cephx keys access to specific UIDs, 

 allow uid 1 rw
 allow uid 2 rw

> or to "any except root",

 allow rw options root_squash

> or to "any". 

 allow rw     # implied no_root_squash

> But the usage of root_squash implies that any client can try to act as 
> any UID and we'll let them.

Yes... any uid *except* uid 0.

> So if a key says "no_root_squash, allow uid=123, allow gid=123", does 
> that key allow the client to act as root? I think it *does*, but it 
> *shouldn't*.

Do you mean

 allow rw options  # implied no_root_squash
 allow uid 123 rw
 allow gid 123 rw

?  Because the second and third are no-ops, since the first will already let
123 and any other non-0 uid (or gid) do a write.

Or, do you mean

 allow uid 123 gid 123 options no_root_squash

In that case, I think no_root_squash is a meaningless option because we 
have already said 'uid 123' only please.


> >> >  - For per-user kerberos, we'll need an extra exchange between client and
> >> > MDS to establish user credentials (e.g., when a user does kinit, or a new
> >> > user logs into the box, etc.).  Note that the kerberos credential has a
> >> > group concept, but I'm not sure how that maps onto the Unix groups
> >> > (perhaps that is a parallel PAM thing with the LDAP/AD server?).  In any
> >> > case, if such an exchange will be needed there, and that session
> >> > state is what we'll be checking against, should we create that structure
> >> > now and use it to establish the gid list (instead of, say, including a
> >> > potentially largish list<gid_t> in every MClientRequest)?
> >>
> >> Like I've said, the GID list that the MDS can care about needs to be
> >> in the session list anyway, right? So we shouldn't need to add it to
> >> MClientRequests.
> >
> > Okay, that means adding these new user auth messages we've been talking
> > about now rather than later.  I'm okay with that, but it's more work, and
> > comes with some risk that we'll get it wrong (since we're not knee-deep in
> > per-user kerberos yet)...
> 
> Does adding new ones right now get us anything? If we're just using
> them to report what groups the user has on the local machine, we might
> as well be verifying them on the client.

Once we start talking about 'allow uid XXX' etc, we'll need to know the 
user's gid so we can apply the group rwx bits.
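
For illustration, the kind of check I mean looks roughly like this.
InodePerm and unix_access() are made-up names for this sketch, and it
ignores root and ACLs entirely; it only shows why the gid list has to
be known wherever the mode bits are evaluated.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical subset of inode metadata, for this sketch only.
struct InodePerm {
  std::uint32_t uid;
  std::uint32_t gid;
  std::uint32_t mode;  // permission bits, e.g. 0640
};

// Classic owner/group/other selection.
// `want` is a 3-bit rwx mask (4 = read, 2 = write, 1 = execute).
bool unix_access(const InodePerm& in, std::uint32_t uid,
                 const std::vector<std::uint32_t>& gids, std::uint32_t want) {
  std::uint32_t bits;
  if (uid == in.uid) {
    bits = (in.mode >> 6) & 7;   // owner bits
  } else if (std::find(gids.begin(), gids.end(), in.gid) != gids.end()) {
    bits = (in.mode >> 3) & 7;   // group bits: requires the caller's gid list
  } else {
    bits = in.mode & 7;          // other bits
  }
  return (bits & want) == want;
}
```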

s


* Re: MDS auth caps for cephfs
  2015-05-22 22:02     ` Gregory Farnum
  2015-05-22 22:18       ` Sage Weil
@ 2015-05-26 12:56       ` John Spray
  1 sibling, 0 replies; 35+ messages in thread
From: John Spray @ 2015-05-26 12:56 UTC (permalink / raw)
  To: Gregory Farnum, Sage Weil; +Cc: ceph-devel, Nishtha Rai, jashan kamboj



On 22/05/2015 23:02, Gregory Farnum wrote:
> On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@redhat.com> wrote:
>> On Fri, 22 May 2015, John Spray wrote:
>> Yes, I think we should.  Part of me wants to say that people who want NFS-like
>> behaviour should be using NFS gateways.  However, these are all probably
>> straightforward enough to implement that it's worth maintaining them in cephfs
>> too.
>
> Unfortunately not really — the NFS semantics are very different from
> the way our CephX security caps work. We grant accesses with each
> permission, rather than restricting them. We can accomplish similar
> things, but they'll need to be in opposite directions:
> allow anon_access
> allow uid 123, allow gid 123[,456,789,...]
> allow root
> where each additional grant gives the session more access. (And I'm
> not sure if these are best set up as specific things on their own or
> just squashed in so that UID -1 is "anon", etc) These let you set up
> access permissions like those of NFS, but it's a quite different model
> than the various mounting and config file options NFS gives you. I
> want to make sure we're clear about not trying to match those
> precisely because otherwise our security capabilities are not going to
> make any kind of sense. :(
>
> What would it mean for a user who doesn't have no_root_squash to have
> access to uid 0? Why should we allow random users to access any UID
> *except* for root? Does a client who has no_root_squash and anon uid
> 123 get to access stuff as root, or else as 123? Can they access as
> 124?
>
> I mean, I think it would have to mean they get access to everything as
> anybody, and I'm not sure which requests would be considered
> "anonymous" for the uid 123 bit to kick in. But I don't think that's
> what the administrator would *mean* for them to have.
This feels more like a syntax issue: as you say, these NFS-esque options 
don't make sense as part of a list of additive capabilities, they're 
more like a single structure with some fields.  The naming is 
potentially confusing too; we're really talking about a condition under 
which we squash a UID (never, when requester was root, or always), and 
then how we do the squashing (to which UID/GID).

A syntax that doesn't allow the nonsensical combinations of options 
might be something like:
squash: <none|all|root>
squash_to: <uid> <gid>
capabilities: [existing additive list style] [] []...

(or even a list of those structures, so that admin could define e.g. 
multiple path-limited capabilities, each with different squashing rules).
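
As a sketch of what that structure could look like in code (all names
here are hypothetical, and treating "root" as "uid == 0" only is a
simplifying assumption):

```cpp
#include <cstdint>

// When to squash: never, only when the requester is root, or always.
enum class SquashMode { None, Root, All };

// Hypothetical structure mirroring the "squash:" / "squash_to:" fields.
struct SquashRule {
  SquashMode mode;
  std::uint32_t to_uid;
  std::uint32_t to_gid;
};

struct Identity {
  std::uint32_t uid;
  std::uint32_t gid;
};

// Return the identity the request should be treated as.
Identity apply_squash(const SquashRule& rule, Identity who) {
  bool squash = rule.mode == SquashMode::All ||
                (rule.mode == SquashMode::Root && who.uid == 0);
  if (squash) return Identity{rule.to_uid, rule.to_gid};
  return who;  // left untouched
}
```

Because the mode is a single enum field, combinations such as
"root_squash plus no_root_squash" cannot be expressed at all.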

But maybe it is a good idea to avoid introducing this to users as "it's 
like NFS" to avoid confusion.

>
> As I think about this more I guess the point is that for multi tenancy
> we want each client to be able to do anything inside of their own
> particular directory namespace, since UIDs and GIDs may not be
> synchronized across tenants? I'm not sure how to address that, but
> either way I think it will require a wider/different set of primitives
> than we've described here. :/


The path limiting is the most important thing (protect clients accessing 
one subtree from clients accessing another subtree) but UID mangling is 
important too (protect multiple clients mounting same path from one 
another's local user DBs).  It's two different motivations.

I'm imagining that "access to path /foo/bar, all_squash to user 123" 
would be a very typical use case for multitenancy, for environments 
where each tenant is a single-UID application, but we don't really care 
what the local UIDs are.

>
>>> We probably need to mirror these in our mount options too, so that e.g.
>>> someone with an admin key can still enable root_squash at will, rather than
>>> having to craft an authentication token with the desired behaviour.
> Mmmm, given that clients normally can't see their capabilities at all
> that's a bit tricky. We could maybe accomplish it by tying in with the
> extra session exchange (that Sage referred to below); that will be
> necessary for adding clients to an existing host session dynamically
> and we could also let a user voluntarily drop certain permissions with
> it...although dropping permissions requires a client to know that they
> have them. Hrm.

The mount options could just be overrides that the client sends in the 
session open: it would be up to the MDS to perform the requested 
dropping of capabilities.  But now that I think about this more, the 
idea of defining a second syntax for clients to use to request 
subtractions from their capabilities becomes unappealing.  Maybe we 
should just say that what's in your MDSAuthCaps is what you get, and 
make sure that the associated tooling is simple enough for users to 
readily create these on a per-mount basis if they want different options.

John


* Re: MDS auth caps for cephfs
  2015-05-22 22:52           ` Sage Weil
@ 2015-05-26 14:26             ` Gregory Farnum
  2015-05-26 16:28               ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-26 14:26 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Fri, May 22, 2015 at 3:52 PM, Sage Weil <sweil@redhat.com> wrote:
> On Fri, 22 May 2015, Gregory Farnum wrote:
>> On Fri, May 22, 2015 at 3:18 PM, Sage Weil <sweil@redhat.com> wrote:
>> >> What would it mean for a user who doesn't have no_root_squash to have
>> >> access to uid 0? Why should we allow random users to access any UID
>> >> *except* for root? Does a client who has no_root_squash and anon uid
>> >> 123 get to access stuff as root, or else as 123? Can they access as
>> >> 124?
>> >
>> > I can't tell what you mean... :(
>> >
>> > I'm guessing you're getting at root_squash being a weak tool, since it
>> > mostly only prevents you from doing something that only root could do (a
>> > compromised client can just claim to be any uid).  A weak tool is still a
>> > tool, though, and one that people can make use of.
>>
>> I guess maybe I am. But it's not just that: it's that I don't
>> understand how it interacts with specifying UIDs. I would *expect*
>> that we grant specific cephx keys access to specific UIDs,
>
>  allow uid 1 rw
>  allow uid 2 rw
>
>> or to "any except root",
>
>  allow rw options root_squash
>
>> or to "any".
>
>  allow rw     # implied no_root_squash
>
>> But the usage of root_squash implies that any client can try to act as
>> any UID and we'll let them.
>
> Yes... any uid *except* uid 0.
>
>> So if a key says "no_root_squash, allow uid=123, allow gid=123", does
>> that key allow the client to act as root? I think it *does*, but it
>> *shouldn't*.
>
> Do you mean
>
>  allow rw options  # implied no_root_squash
>  allow uid 123 rw
>  allow gid 123 rw
>
> ?  Because the 2nd and third are no-ops since the first will already let
> 123 and any-other non-0 uid (or gid) do a write.
>
> Or, do you mean
>
>  allow uid 123 gid 123 options no_root_squash
>
> In that case, I think no_root_squash is a meaningless option because we
> have already said 'uid 123' only please.

So here you've said that no_root_squash is meaningless because of a
different set of granted permissions; that doesn't sound strictly
additive to me.
Or do you mean (as John implies below) that no_root_squash will not
squash the user and then it will fail because it doesn't have adequate
access permissions?

On Tue, May 26, 2015 at 5:56 AM, John Spray <john.spray@redhat.com> wrote:
> This feels more like a syntax issue: as you say, these NFS-esque options
> don't make sense as part of a list of additive capabilities, they're more
> like a single structure with some fields.  The naming is potentially
> confusing too; we're really talking about a condition under which we squash
> a UID (never, when requester was root, or always), and then how we do the
> squashing (to which UID/GID).
>
> A syntax that doesn't allow the nonsensical combinations of options might be
> something like:
> squash: <none|all|root>
> squash_to: <uid> <gid>
> capabilities: [existing additive list style] [] []...
>
> (or even a list of those structures, so that admin could define e.g.
> multiple path-limited capabilities, each with different squashing rules).
>
> But maybe it is a good idea to avoid introducing this to users as "it's like
> NFS" to avoid confusion.

Yes, something like this makes a lot more sense to me. Although I'd
almost expect/want it to be done client-side rather than on the
server: the server is responsible for making sure clients have
permission to do what they're asking; the clients are responsible for
forming their requests in a way that they have permission. Is that
nonsensical?
-Greg


* Re: MDS auth caps for cephfs
  2015-05-26 14:26             ` Gregory Farnum
@ 2015-05-26 16:28               ` Sage Weil
  2015-05-26 21:26                 ` Sage Weil
  2015-05-26 21:53                 ` Gregory Farnum
  0 siblings, 2 replies; 35+ messages in thread
From: Sage Weil @ 2015-05-26 16:28 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, 26 May 2015, Gregory Farnum wrote:
> On Fri, May 22, 2015 at 3:52 PM, Sage Weil <sweil@redhat.com> wrote:
> > On Fri, 22 May 2015, Gregory Farnum wrote:
> >> On Fri, May 22, 2015 at 3:18 PM, Sage Weil <sweil@redhat.com> wrote:
> >> >> What would it mean for a user who doesn't have no_root_squash to have
> >> >> access to uid 0? Why should we allow random users to access any UID
> >> >> *except* for root? Does a client who has no_root_squash and anon uid
> >> >> 123 get to access stuff as root, or else as 123? Can they access as
> >> >> 124?
> >> >
> >> > I can't tell what you mean... :(
> >> >
> >> > I'm guessing you're getting at root_squash being a weak tool, since it
> >> > mostly only prevents you from doing something that only root could do (a
> >> > compromised client can just claim to be any uid).  A weak tool is still a
> >> > tool, though, and one that people can make use of.
> >>
> >> I guess maybe I am. But it's not just that: it's that I don't
> >> understand how it interacts with specifying UIDs. I would *expect*
> >> that we grant specific cephx keys access to specific UIDs,
> >
> >  allow uid 1 rw
> >  allow uid 2 rw
> >
> >> or to "any except root",
> >
> >  allow rw options root_squash
> >
> >> or to "any".
> >
> >  allow rw     # implied no_root_squash
> >
> >> But the usage of root_squash implies that any client can try to act as
> >> any UID and we'll let them.
> >
> > Yes... any uid *except* uid 0.
> >
> >> So if a key says "no_root_squash, allow uid=123, allow gid=123", does
> >> that key allow the client to act as root? I think it *does*, but it
> >> *shouldn't*.
> >
> > Do you mean
> >
> >  allow rw options  # implied no_root_squash
> >  allow uid 123 rw
> >  allow gid 123 rw
> >
> > ?  Because the 2nd and third are no-ops since the first will already let
> > 123 and any-other non-0 uid (or gid) do a write.
> >
> > Or, do you mean
> >
> >  allow uid 123 gid 123 options no_root_squash
> >
> > In that case, I think no_root_squash is a meaningless option because we
> > have already said 'uid 123' only please.
> 
> So here you've said that no_root_squash is meaningless because of a
> different set of granted permissions; that doesn't sound strictly
> additive to me.
> Or do you mean (as John implies below) that no_root_squash will not
> squash the user and then it will fail because it doesn't have adequate
> access permissions?

'Additive' means that we have a set of 'allow ...' stanzas, and if
there exists one that says "yes" then we are happy; there is no such
thing as 'deny ...'.

I think what we make the 'allow ...' look like internally is a separate 
issue... John called it a syntax issue, which I think makes sense.  I'm 
not sure about the proposal to put squash stuff outside of the additive 
grant list, though, since then we lose the additive property.

I think the confusing part here is that "squash" is a verb.  I think we 
have 2 options:

1) Treat the caps purely as a conditional test, i.e. one that is const.  This
is (mostly) what I've been thinking.  Or,

2) Mix in actions, like "squash uids to X, and then see if it is allowed. 
If so, perform the (squashed) operation."

The latter muddies the water, I think, but I think it's what is needed to 
make this truly NFS-like.

> On Tue, May 26, 2015 at 5:56 AM, John Spray <john.spray@redhat.com> wrote:
> > This feels more like a syntax issue: as you say, these NFS-esque options
> > don't make sense as part of a list of additive capabilities, they're more
> > like a single structure with some fields.  The naming is potentially
> > confusing too; we're really talking about a condition under which we squash
> > a UID (never, when requester was root, or always), and then how we do the
> > squashing (to which UID/GID).
> >
> > A syntax that doesn't allow the nonsensical combinations of options might be
> > something like:
> > squash: <none|all|root>
> > squash_to: <uid> <gid>
> > capabilities: [existing additive list style] [] []...
> >
> > (or even a list of those structures, so that admin could define e.g.
> > multiple path-limited capabilities, each with different squashing rules).
> >
> > But maybe it is a good idea to avoid introducing this to users as "it's like
> > NFS" to avoid confusion.
> 
> Yes, something like this makes a lot more sense to me. Although I'd
> almost expect/want it to be done client-side rather than on the
> server: the server is responsible for making sure clients have
> permission to do what they're asking; the clients are responsible for
> forming their requests in a way that they have permission. Is that
> nonsensical?

That makes sense to me.  I suggest

- the MDS decides whether it is allowed.  If so, do the presented operation.
Preserve the const-ness of the current permission checks.

- the client may do some squashing before-hand.

- forget what I said before and make this un-NFS-like.  :)

Perhaps the illustrative example is this: an op to create a file, as uid 
0, in a 777 directory comes in, and the capability says root_squash.  In 
this case, our const check says "if I were to squash to 65534, it would
still be allowed" and so the mds goes ahead and creates a file *owned by 
root*.
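
In rough code, under heavy simplification (DirInfo, may_write() and
check_create() are invented for this sketch; group bits and the
directory execute bit are ignored):

```cpp
#include <cstdint>

constexpr std::uint32_t kAnonUid = 65534;  // anonymous uid from the example

// Hypothetical, minimal directory metadata: owner uid and mode bits only.
struct DirInfo {
  std::uint32_t owner_uid;
  std::uint32_t mode;  // e.g. 0777
};

// Plain Unix-style write check for a non-root uid (group bits ignored here).
bool may_write(const DirInfo& dir, std::uint32_t uid) {
  if (uid == dir.owner_uid) return (dir.mode & 0200) != 0;
  return (dir.mode & 0002) != 0;
}

// Const check: with root_squash, uid 0 is tested as if it were the
// anonymous uid.  The caller's uid is never rewritten; only the test changes.
bool check_create(const DirInfo& dir, std::uint32_t caller_uid,
                  bool root_squash) {
  if (caller_uid == 0) {
    if (!root_squash) return true;    // unsquashed root may do anything
    return may_write(dir, kAnonUid);  // "if I were to squash to 65534..."
  }
  return may_write(dir, caller_uid);
}
```

check_create() only answers yes or no.  If it says yes, the op then
proceeds with the uid the client presented, which is how the file in
the 777 directory ends up owned by root.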

Does that make sense?

sage



* Re: MDS auth caps for cephfs
  2015-05-26 16:28               ` Sage Weil
@ 2015-05-26 21:26                 ` Sage Weil
  2015-05-26 21:53                 ` Gregory Farnum
  1 sibling, 0 replies; 35+ messages in thread
From: Sage Weil @ 2015-05-26 21:26 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

I wrote some notes at

	http://pad.ceph.com/p/squash

that capture my current proposal.  Please take a look!

sage


* Re: MDS auth caps for cephfs
  2015-05-26 16:28               ` Sage Weil
  2015-05-26 21:26                 ` Sage Weil
@ 2015-05-26 21:53                 ` Gregory Farnum
  2015-05-26 22:17                   ` Sage Weil
  1 sibling, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-26 21:53 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, May 26, 2015 at 9:28 AM, Sage Weil <sweil@redhat.com> wrote:
> On Tue, 26 May 2015, Gregory Farnum wrote:
>> On Fri, May 22, 2015 at 3:52 PM, Sage Weil <sweil@redhat.com> wrote:
>> > On Fri, 22 May 2015, Gregory Farnum wrote:
>> >> On Fri, May 22, 2015 at 3:18 PM, Sage Weil <sweil@redhat.com> wrote:
>> >> >> What would it mean for a user who doesn't have no_root_squash to have
>> >> >> access to uid 0? Why should we allow random users to access any UID
>> >> >> *except* for root? Does a client who has no_root_squash and anon uid
>> >> >> 123 get to access stuff as root, or else as 123? Can they access as
>> >> >> 124?
>> >> >
>> >> > I can't tell what you mean... :(
>> >> >
>> >> > I'm guessing you're getting at root_squash being a weak tool, since it
>> >> > mostly only prevents you from doing something that only root could do (a
>> >> > compromised client can just claim to be any uid).  A weak tool is still a
>> >> > tool, though, and one that people can make use of.
>> >>
>> >> I guess maybe I am. But it's not just that: it's that I don't
>> >> understand how it interacts with specifying UIDs. I would *expect*
>> >> that we grant specific cephx keys access to specific UIDs,
>> >
>> >  allow uid 1 rw
>> >  allow uid 2 rw
>> >
>> >> or to "any except root",
>> >
>> >  allow rw options root_squash
>> >
>> >> or to "any".
>> >
>> >  allow rw     # implied no_root_squash
>> >
>> >> But the usage of root_squash implies that any client can try to act as
>> >> any UID and we'll let them.
>> >
>> > Yes... any uid *except* uid 0.
>> >
>> >> So if a key says "no_root_squash, allow uid=123, allow gid=123", does
>> >> that key allow the client to act as root? I think it *does*, but it
>> >> *shouldn't*.
>> >
>> > Do you mean
>> >
>> >  allow rw options  # implied no_root_squash
>> >  allow uid 123 rw
>> >  allow gid 123 rw
>> >
>> > ?  Because the 2nd and third are no-ops since the first will already let
>> > 123 and any-other non-0 uid (or gid) do a write.
>> >
>> > Or, do you mean
>> >
>> >  allow uid 123 gid 123 options no_root_squash
>> >
>> > In that case, I think no_root_squash is a meaningless option because we
>> > have already said 'uid 123' only please.
>>
>> So here you've said that no_root_squash is meaningless because of a
>> different set of granted permissions; that doesn't sound strictly
>> additive to me.
>> Or do you mean (as John implies below) that no_root_squash will not
>> squash the user and then it will fail because it doesn't have adequate
>> access permissions?
>
> 'Additive' means that we have a set of 'allow ...' stanzas, and if
> there exists one that says "yes" then we are happy; there is no such
> thing as 'deny ...'.
>
> I think what we make the 'allow ...' look like internally is a separate
> issue... John called it a syntax issue, which I think makes sense.  I'm
> not sure about the proposal to put squash stuff outside of the additive
> grant list, though, or else we lose the additive property.
>
> I think the confusing part here is that "squash" is a verb.  I think we
> have 2 options:
>
> 1) Treat the caps as a conditional test only that is e.g. const.  This
> is (mostly) what I've been thinking.  Or,
>
> 2) Mix in actions, like "squash uids to X, and then see if it is allowed.
> If so, perform the (squashed) operation."
>
> The latter muddies the water, I think, but I think it's what is needed to
> make this truly NFS-like.

Yes. If we're doing mixed actions I very much want stuff to be
single-pass. So maybe we do squashes, but that happens either on the
client or the server *before* we apply any permission checks.
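
Roughly what I have in mind, as a sketch only (UidGrant and the
function names are hypothetical, and it does not matter here whether
the squash step runs on the client or the server):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical "allow uid N rw"-style grant.
struct UidGrant {
  bool any_uid;        // true for a grant with no uid restriction
  std::uint32_t uid;   // only meaningful when any_uid is false
  bool write;
};

// Step 1: rewrite the identity once, up front.
std::uint32_t squash_root(std::uint32_t uid, std::uint32_t anon_uid) {
  return uid == 0 ? anon_uid : uid;
}

// Step 2: the ordinary additive check, unaware that squashing exists.
bool write_allowed(const std::vector<UidGrant>& grants, std::uint32_t uid) {
  for (const UidGrant& g : grants)
    if (g.write && (g.any_uid || g.uid == uid)) return true;
  return false;
}

// Single pass: squash first, then run the unchanged permission checks.
bool handle_write(const std::vector<UidGrant>& grants, std::uint32_t uid,
                  bool root_squash, std::uint32_t anon_uid) {
  std::uint32_t effective = root_squash ? squash_root(uid, anon_uid) : uid;
  return write_allowed(grants, effective);
}
```

The point is that the permission check never needs a second look at
the original uid once the squash has been applied.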

>
>> On Tue, May 26, 2015 at 5:56 AM, John Spray <john.spray@redhat.com> wrote:
>> > This feels more like a syntax issue: as you say, these NFS-esque options
>> > don't make sense as part of a list of additive capabilities, they're more
>> > like a single structure with some fields.  The naming is potentially
>> > confusing too; we're really talking about a condition under which we squash
>> > a UID (never, when requester was root, or always), and then how we do the
>> > squashing (to which UID/GID).
>> >
>> > A syntax that doesn't allow the nonsensical combinations of options might be
>> > something like:
>> > squash: <none|all|root>
>> > squash_to: <uid> <gid>
>> > capabilities: [existing additive list style] [] []...
>> >
>> > (or even a list of those structures, so that admin could define e.g.
>> > multiple path-limited capabilities, each with different squashing rules).
>> >
>> > But maybe it is a good idea to avoid introducing this to users as "it's like
>> > NFS" to avoid confusion.
>>
>> Yes, something like this makes a lot more sense to me. Although I'd
>> almost expect/want it to be done client-side rather than on the
>> server: the server is responsible for making sure clients have
>> permission to do what they're asking; the clients are responsible for
>> forming their requests in a way that they have permission. Is that
>> nonsensical?
>
> That makes sense to me.  I suggest
>
> - the MDS decides it is allowed.  If so, do the presented operation.
> Preserve the const-ness of the current permission checks.
>
> - the client may do some squashing before-hand.
>
> - forget what I said before and make this un-NFS-like.  :)

:)

>
> Perhaps the illustrative example is this: an op to create a file, as uid
> 0, in a 777 directory comes in, and the capability says root_squash.  In
> this case, our const check says "if I were to squash to 65534, it would
> still be allowed" and so the mds goes ahead and creates a file *owned by
> root*.
>
> Does that make sense?

Mmm, I'm having trouble making this one work out. If you can write a
file with UID 0 you have to be able to subsequently read it, and I
just don't see that happening if you aren't doing an operation as UID
0?

But this is getting thornier as we consider future multi-tenant (not
just multi-user) and subtree work. I've been imagining the subtree
restriction as *not additive*, in that it would be a restriction to
users. It looks like you've got similar ideas from the pad, so I don't
understand how the anonymous user would generally be allowed to do
things against the tree except read it?

When your pad says "if a UID is specified...", do you
mean specified within an op, or in the cephx caps? The first one makes
sense to me; the second won't work (not additive, etc).

Basically I'm still stuck on how any of this lets us lock a user into
a subtree while letting them do what they want within it. I'm not sure
how/if NFS solves that problem...
-Greg


* Re: MDS auth caps for cephfs
  2015-05-26 21:53                 ` Gregory Farnum
@ 2015-05-26 22:17                   ` Sage Weil
  2015-05-26 22:50                     ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-26 22:17 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, 26 May 2015, Gregory Farnum wrote:
> > That makes sense to me.  I suggest
> >
> > - the MDS decides it is allowed.  If so, do the presented operation.
> > Preserve the const-ness of the current permission checks.
> >
> > - the client may do some squashing before-hand.
> >
> > - forget what I said before and make this un-NFS-like.  :)
> 
> :)
> 
> >
> > Perhaps the illustrative example is this: an op to create a file, as uid
> > 0, in a 777 directory comes in, and the capability says root_squash.  In
> > this case, our const check says "if i were to squash to 65534, it would
> > still be allowed" and so the mds goes ahead and creates a file *owned by
> > root*.
> >
> > Does that make sense?
> 
> Mmm, I'm having trouble making this one work out. If you can write a
> file with UID 0 you have to be able to subsequently read it, and I
> just don't see that happening if you aren't doing an operation as UID
> 0?

The operation would be done as uid 0.

(Also, I'm completely ignoring read operations for the moment... :/ )

> But this is getting thornier as we consider future multi-tenant (not
> just multi-user) and subtree work. I've been imagining the subtree
> restriction as *not additive*, in that it would be a restriction to
> users. 

I think we're still talking past each other about 'additive'.  I'm 
suggesting caps are *always* additive.  That is, any operation permitted 
by

 allow A

will also permit the same operation if you have

 allow A ; allow B

for *any* B.  B cannot affect A, and only one of the allow's needs to say 
"yes".

...but what does that have to do with path restrictions?  If you have an 
allow A like

 allow rw path /foo

that will allow something in /foo regardless of what other allow B's are 
added to the list.

Right?
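
A minimal sketch of that additive rule, assuming hypothetical Op and Grant
types (neither is taken from the MDS code).  An op is allowed as soon as any
one grant says yes, so appending another grant can only widen what is
permitted:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical operation descriptor, for illustration only.
struct Op {
  std::string path;
  bool is_write;
};

// Hypothetical grant: a predicate that either says "yes" or stays silent.
// There is no way for a grant to express "deny".
using Grant = std::function<bool(const Op&)>;

// Additive evaluation: the op is allowed if at least one grant matches.
// With no grants at all, nothing is allowed.
bool is_allowed(const std::vector<Grant>& grants, const Op& op) {
  for (const auto& g : grants) {
    if (g(op)) {
      return true;
    }
  }
  return false;
}
```

Because the loop only ever returns true on a match, "allow A ; allow B"
permits everything "allow A" permits, whatever B is.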

> It looks like you've got similar ideas from the pad, so I don't
> understand how the anonymous user would generally be allowed to do
> things against the tree except read it?
> 
> When looking at your pad and you say "if a UID is specified..." do you
> mean specified within an op, or in the cephx caps? The first one makes
> sense to me; the second won't work (not additive, etc).

In the caps, where it is optional.  The uid on the op will be required 
(modulo whatever we have to do for compatibility).

> Basically I'm still stuck on how any of this lets us lock a user into
> a subtree while letting them do what they want within it. I'm not sure
> how/if NFS solves that problem...

That's easy:

 # lock client into a dir
 allow rw path /home/user  

Or, for a shared model:

 # allow access to a project dir, as project or user gid
 allow rw path /share/project uid 123 gids 123,1000

Ooh... are you thinking forward to when we might have to dynamically write 
a cap that implements changing kerberos auth information?  For that, I'd 
suggest something like

 allow rw path /foo kerberos

...where that then does an additional check against any kerberos tickets 
info we've gotten from the client session?

sage


* Re: MDS auth caps for cephfs
  2015-05-26 22:17                   ` Sage Weil
@ 2015-05-26 22:50                     ` Gregory Farnum
  2015-05-26 23:12                       ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-26 22:50 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, May 26, 2015 at 3:17 PM, Sage Weil <sweil@redhat.com> wrote:
> On Tue, 26 May 2015, Gregory Farnum wrote:
>> > That makes sense to me.  I suggest
>> >
>> > - the MDS decides it is allowed.  If so, do the presented operation.
>> > Preserve the const-ness of the current permission checks.
>> >
>> > - the client may do some squashing before-hand.
>> >
>> > - forget what I said before and make this un-NFS-like.  :)
>>
>> :)
>>
>> >
>> > Perhaps the illustrative example is this: an op to create a file, as uid
>> > 0, in a 777 directory comes in, and the capability says root_squash.  In
>> > this case, our const check says "if i were to squash to 65534, it would
>> > still be allowed" and so the mds goes ahead and creates a file *owned by
>> > root*.
>> >
>> > Does that make sense?
>>
>> Mmm, I'm having trouble making this one work out. If you can write a
>> file with UID 0 you have to be able to subsequently read it, and I
>> just don't see that happening if you aren't doing an operation as UID
>> 0?
>
> The operation would be done as uid 0.
>
> (Also, I'm completely ignoring read operations for the moment... :/ )

Okay, let's not ignore that. If a read comes in as anonymous for a
file owned by UID 0 and without world-readable caps (which that write
operation you described can do), it fails. Correct?
I see that as a bit of a problem. :/

>
>> But this is getting thornier as we consider future multi-tenant (not
>> just multi-user) and subtree work. I've been imagining the subtree
>> restriction as *not additive*, in that it would be a restriction to
>> users.
>
> I think we're still talking past each other about 'additive'.  I'm
> suggesting caps are *always* additive.  That is, any operation permitted
> by
>
>  allow A
>
> will also permit the same operation if you have
>
>  allow A ; allow B
>
> for *any* B.  B cannot affect A, and only one of the allow's needs to say
> "yes".
>
> ...but what does that have to do with path restrictions?  If you have an
> allow A like
>
>  allow rw path /foo
>
> that will allow something in /foo regardless of what other allow B's are
> added to the list.
>
> Right?

Okay, so that's one path we could go down. I'm not sure if it's the
right one or not.
In particular, I'd like us to be able to guarantee that users never
escape out of their given subtree. It seems like they could if it's
just a positive grant. Or maybe if the cephx caps don't include
anything else then no operations elsewhere will be allowed? I guess
that should work. But in any case, not what you wrote in the pad:

> If a path is specified in the cap, only ops within that subtree are allowed.

:/

>
>> It looks like you've got similar ideas from the pad, so I don't
>> understand how the anonymous user would generally be allowed to do
>> things against the tree except read it?
>>
>> When looking at your pad and you say "if a UID is specified..." do you
>> mean specified within an op, or in the cephx caps? The first one makes
>> sense to me; the second won't work (not additive, etc).
>
> In the caps, where it is optional.  The uid on the op will be required
> (modulo whatever we have to do for compatibility).

Okay, quoting from the pad:
> If the uid is specified in the cap, we only allow the operation if it is tagged with the same uid
> (and all other constraints are satisfied).

Maybe this is the right way for us to go, but that's definitely not
additive: it *disallows* the operation if its UID doesn't match the
one in this cephx cap, regardless of what anything else says.
UIDs like this I think can and should be strictly additive. We
probably want to include ranges or some other sort of aggregation, but
I don't see much point to making them optional restrictions the way I
see for the path-based stuff?

Or (looking below) perhaps you mean the uid for a specific grant, not
for all grants within the set of cephx caps. That does make a bit more
sense, and makes upgrades easier. Hrmm.

>
>> Basically I'm still stuck on how any of this lets us lock a user into
>> a subtree while letting them do what they want within it. I'm not sure
>> how/if NFS solves that problem...
>
> That's easy:
>
>  # lock client into a dir
>  allow rw path /home/user
>
> Or, for a shared model:
>
>  # allow access to a project dir, as project or user gid
>  allow rw path /share/project uid 123 gids 123,1000

I think I'm forgetting how the parsing works. Is that all one specific
allow stanza that is evaluated as a unit? Ie, to match it an operation
must fall within /share/project and be sent by a user with either UID
123 or GID{123,1000}? You're right, that works.

>
> Ooh... are you thinking forward to when we might have to dynamically write
> a cap that implements changing kerberos auth information?  For that, I'd
> suggest something like
>
>  allow rw path /foo kerberos
>
> ...where that then does an additional check against any kerberos tickets
> info we've gotten from the client session?
>
> sage


* Re: MDS auth caps for cephfs
  2015-05-26 22:50                     ` Gregory Farnum
@ 2015-05-26 23:12                       ` Sage Weil
  2015-05-26 23:32                         ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-26 23:12 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, 26 May 2015, Gregory Farnum wrote:
> On Tue, May 26, 2015 at 3:17 PM, Sage Weil <sweil@redhat.com> wrote:
> > On Tue, 26 May 2015, Gregory Farnum wrote:
> >> > That makes sense to me.  I suggest
> >> >
> >> > - the MDS decides it is allowed.  If so, do the presented operation.
> >> > Preserve the const-ness of the current permission checks.
> >> >
> >> > - the client may do some squashing before-hand.
> >> >
> >> > - forget what I said before and make this un-NFS-like.  :)
> >>
> >> :)
> >>
> >> >
> >> > Perhaps the illustrative example is this: an op to create a file, as uid
> >> > 0, in a 777 directory comes in, and the capability says root_squash.  In
> >> > this case, our const check says "if i were to squash to 65534, it would
> >> > still be allowed" and so the mds goes ahead and creates a file *owned by
> >> > root*.
> >> >
> >> > Does that make sense?
> >>
> >> Mmm, I'm having trouble making this one work out. If you can write a
> >> file with UID 0 you have to be able to subsequently read it, and I
> >> just don't see that happening if you aren't doing an operation as UID
> >> 0?
> >
> > The operation would be done as uid 0.
> >
> > (Also, I'm completely ignoring read operations for the moment... :/ )
> 
> Okay, let's not ignore that. If a read comes in as anonymous for a

"as anonymous"?

> file owned by UID 0 and without world-readable caps (which that write
> operation you described can do), it fails. Correct?
> I see that as a bit of a problem. :/

It's perfectly okay to create a file you can't read:

 echo asdf > foo
 chmod 0 foo
 cat foo

...and things get fishy when root squash kicks in, so I'm not sure it's 
a problem.

> >> But this is getting thornier as we consider future multi-tenant (not
> >> just multi-user) and subtree work. I've been imaging the subtree
> >> restriction as *not additive*, in that it would be a restriction to
> >> users.
> >
> > I think we're still talking past each other about 'additive'.  I'm
> > suggesting caps are *always* additive.  That is, any operation permitted
> > by
> >
> >  allow A
> >
> > will also permit the same operation if you have
> >
> >  allow A ; allow B
> >
> > for *any* B.  B cannot affect A, and only one of the allow's needs to say
> > "yes".
> >
> > ...but what does that have to do with path restrictions?  If you have an
> > allow A like
> >
> >  allow rw path /foo
> >
> > that will allow something in /foo regardless of what other allow B's are
> > added to the list.
> >
> > Right?
> 
> Okay, so that's one path we could go down. I'm not sure if it's the
> right one or not.

What other path is there?  In my mind this is what "additive" means... if 
that's not what you mean please clarify because I'm so confused right now!

> In particular, I'd like us to be able to guarantee that users never
> escape out of their given subtree.

That's a matter of implementing the checks properly...

> It seems like they could if it's
> just a positive grant. Or maybe if the cephx caps don't include
> anything else then no operations elsewhere will be allowed? I guess
> that should work. But in any case, not what you wrote in the pad:
> 
> > If a path is specified in the cap, only ops within that subtree are allowed.
> 
> :/

Sorry, this means *for that allow clause*.  I'll clarify in the pad.  That 
is, if you change the above cap to be

 allow rw path /foo ; allow rw

they would not be restricted to /foo.  The set of things you can do is a 
union, not an intersection.

> >> It looks like you've got similar ideas from the pad, so I don't
> >> understand how the anonymous user would generally be allowed to do
> >> things against the tree except read it?
> >>
> >> When looking at your pad and you say "if a UID is specified..." do you
> >> mean specified within an op, or in the cephx caps? The first one makes
> >> sense to me; the second won't work (not additive, etc).
> >
> > In the caps, where it is optional.  The uid on the op will be required
> > (modulo whatever we have to do for compatibility).
> 
> Okay, quoting from the pad:
> > If the uid is specified in the cap, we only allow the operation if it is tagged with the same uid
> > (and all other constraints are satisfied).
> 
> Maybe this is the right way for us to go, but that's definitely not
> additive: it *disallows* the operation if its UID doesn't match the
> one in this cephx cap, regardless of what anything else says.

No, there is no such thing as disallow.  It *grants* access *if* some set 
of conditions are true.

> UIDs like this I think can and should be strictly additive. We
> probably want to include ranges or some other sort of aggregation, but
> I don't see much point to making them optional restrictions the way I
> see for the path-based stuff?
> 
> Or (looking below) perhaps you mean the uid for a specific grant, not
> for all grants within the set of cephx caps. That does make a bit more
> sense, and makes upgrades easier. Hrmm.

Yes

> >> Basically I'm still stuck on how any of this lets us lock a user into
> >> a subtree while letting them do what they want within it. I'm not sure
> >> how/if NFS solves that problem...
> >
> > That's easy:
> >
> >  # lock client into a dir
> >  allow rw path /home/user
> >
> > Or, for a shared model:
> >
> >  # allow access to a project dir, as project or user gid
> >  allow rw path /share/project uid 123 gids 123,1000
> 
> I think I'm forgetting how the parsing works. Is that all one specific
> allow stanza that is evaluated as a unit? Ie, to match it an operation
> must fall within /share/project and be sent by a user with either UID
> 123 or GID{123,1000}? You're right, that works.

Right.  Each 'allow' stanza is a set of conditions.  If they are true, 
then we allow.
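
A rough sketch of one stanza evaluated as a unit.  Everything here is
hypothetical (AllowStanza, Op, UID_ANY, and the helper names), and it assumes
one particular reading of the syntax: the uid must match when one is given,
and the op's gid must appear in the gid list when the list is non-empty.  The
thread does not pin down the exact uid/gid semantics, so treat that part as a
guess:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical sentinel meaning "no uid condition in this stanza".
const int UID_ANY = -1;

// Hypothetical request descriptor.
struct Op {
  std::string path;
  int uid;
  int gid;
};

// One 'allow' stanza: a set of conditions that must all hold.
struct AllowStanza {
  std::string path;       // empty or "/" means any path
  int uid;                // UID_ANY means any uid
  std::vector<int> gids;  // empty means any gid
};

// True if p is root itself or lies below it.  "/foo" must not match "/foobar".
bool path_within(const std::string& root, const std::string& p) {
  if (root.empty() || root == "/") return true;
  if (p.compare(0, root.size(), root) != 0) return false;
  return p.size() == root.size() || root.back() == '/' || p[root.size()] == '/';
}

// The stanza grants access only if every condition it names is satisfied.
bool matches(const AllowStanza& s, const Op& op) {
  if (!path_within(s.path, op.path)) return false;
  if (s.uid != UID_ANY && s.uid != op.uid) return false;
  if (!s.gids.empty() &&
      std::find(s.gids.begin(), s.gids.end(), op.gid) == s.gids.end()) {
    return false;
  }
  return true;
}
```

A stanza that fails to match denies nothing by itself; it just does not
contribute a "yes".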

sage


> 
> >
> > Ooh... are you thinking forward to when we might have to dynamically write
> > a cap that implements changing kerberos auth information?  For that, I'd
> > suggest something like
> >
> >  allow rw path /foo kerberos
> >
> > ...where that then does an additional check against any kerberos tickets
> > info we've gotten from the client session?
> >
> > sage
> 
> 


* Re: MDS auth caps for cephfs
  2015-05-26 23:12                       ` Sage Weil
@ 2015-05-26 23:32                         ` Gregory Farnum
  2015-05-27 21:44                           ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-26 23:32 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, May 26, 2015 at 4:12 PM, Sage Weil <sweil@redhat.com> wrote:
> On Tue, 26 May 2015, Gregory Farnum wrote:
>> On Tue, May 26, 2015 at 3:17 PM, Sage Weil <sweil@redhat.com> wrote:
>> > On Tue, 26 May 2015, Gregory Farnum wrote:
>> >> > That makes sense to me.  I suggest
>> >> >
>> >> > - the MDS decides it is allowed.  If so, do the presented operation.
>> >> > Preserve the const-ness of the current permission checks.
>> >> >
>> >> > - the client may do some squashing before-hand.
>> >> >
>> >> > - forget what I said before and make this un-NFS-like.  :)
>> >>
>> >> :)
>> >>
>> >> >
>> >> > Perhaps the illustrative example is this: an op to create a file, as uid
>> >> > 0, in a 777 directory comes in, and the capability says root_squash.  In
>> >> > this case, our const check says "if i were to squash to 65534, it would
>> >> > still be allowed" and so the mds goes ahead and creates a file *owned by
>> >> > root*.
>> >> >
>> >> > Does that make sense?
>> >>
>> >> Mmm, I'm having trouble making this one work out. If you can write a
>> >> file with UID 0 you have to be able to subsequently read it, and I
>> >> just don't see that happening if you aren't doing an operation as UID
>> >> 0?
>> >
>> > The operation would be done as uid 0.
>> >
>> > (Also, I'm completely ignoring read operations for the moment... :/ )
>>
>> Okay, let's not ignore that. If a read comes in as anonymous for a
>
> "as anonymous"?
>
>> file owned by UID 0 and without world-readable caps (which that write
>> operation you described can do), it fails. Correct?
>> I see that as a bit of a problem. :/
>
> It's perfectly okay to create a file you can't read:
>
>  echo asdf > foo
>  chmod 0 foo
>  cat foo
>
> ...and things get fishy when root squash kicks in, so I'm not sure it's
> a problem.

Yes, but in a local filesystem *somebody* on that box will be able to
access that file. Whereas here it might be that only the Ceph admin
can do so, which is more troubling to me.

>
>> >> But this is getting thornier as we consider future multi-tenant (not
>> >> just multi-user) and subtree work. I've been imagining the subtree
>> >> restriction as *not additive*, in that it would be a restriction to
>> >> users.
>> >
>> > I think we're still talking past each other about 'additive'.  I'm
>> > suggesting caps are *always* additive.  That is, any operation permitted
>> > by
>> >
>> >  allow A
>> >
>> > will also permit the same operation if you have
>> >
>> >  allow A ; allow B
>> >
>> > for *any* B.  B cannot affect A, and only one of the allow's needs to say
>> > "yes".
>> >
>> > ...but what does that have to do with path restrictions?  If you have an
>> > allow A like
>> >
>> >  allow rw path /foo
>> >
>> > that will allow something in /foo regardless of what other allow B's are
>> > added to the list.
>> >
>> > Right?
>>
>> Okay, so that's one path we could go down. I'm not sure if it's the
>> right one or not.
>
> What other path is there?  In my mind this is what "additive" means... if
> that's not what you mean please clarify because I'm so confused right now!

You're absolutely right — sorry, I was taking this off on a bit of a
tangent as I put that together in my own head (we have mostly talked
about restricting to subtrees and I was wondering if we do want a big
fat restriction like that which is not additive. Can discuss later).

>
>> In particular, I'd like us to be able to guarantee that users never
>> escape out of their given subtree.
>
> That's a matter of implementing the checks properly...
>
>> It seems like they could if it's
>> just a positive grant. Or maybe if the cephx caps don't include
>> anything else then no operations elsewhere will be allowed? I guess
>> that should work. But in any case, not what you wrote in the pad:
>>
>> > If a path is specified in the cap, only ops within that subtree are allowed.
>>
>> :/
>
> Sorry, this means *for that allow clause*.  I'll clarify in the pad.  That
> is, if you change the above cap to be
>
>  allow rw path /foo ; allow rw
>
> they would not be restricted to /foo.  The set of things you can do is a
> union, not an intersection.
>
>> >> It looks like you've got similar ideas from the pad, so I don't
>> >> understand how the anonymous user would generally be allowed to do
>> >> things against the tree except read it?
>> >>
>> >> When looking at your pad and you say "if a UID is specified..." do you
>> >> mean specified within an op, or in the cephx caps? The first one makes
>> >> sense to me; the second won't work (not additive, etc).
>> >
>> > In the caps, where it is optional.  The uid on the op will be required
>> > (modulo whatever we have to do for compatibility).
>>
>> Okay, quoting from the pad:
>> > If the uid is specified in the cap, we only allow the operation if it is tagged with the same uid
>> > (and all other constraints are satisfied).
>>
>> Maybe this is the right way for us to go, but that's definitely not
>> additive: it *disallows* the operation if its UID doesn't match the
>> one in this cephx cap, regardless of what anything else says.
>
> No, there is no such thing as disallow.  It *grants* access *if* some set
> of conditions are true.
>
>> UIDs like this I think can and should be strictly additive. We
>> probably want to include ranges or some other sort of aggregation, but
>> I don't see much point to making them optional restrictions the way I
>> see for the path-based stuff?
>>
>> Or (looking below) perhaps you mean the uid for a specific grant, not
>> for all grants within the set of cephx caps. That does make a bit more
>> sense, and makes upgrades easier. Hrmm.
>
> Yes
>
>> >> Basically I'm still stuck on how any of this lets us lock a user into
>> >> a subtree while letting them do what they want within it. I'm not sure
>> >> how/if NFS solves that problem...
>> >
>> > That's easy:
>> >
>> >  # lock client into a dir
>> >  allow rw path /home/user
>> >
>> > Or, for a shared model:
>> >
>> >  # allow access to a project dir, as project or user gid
>> >  allow rw path /share/project uid 123 gids 123,1000
>>
>> I think I'm forgetting how the parsing works. Is that all one specific
>> allow stanza that is evaluated as a unit? Ie, to match it an operation
>> must fall within /share/project and be sent by a user with either UID
>> 123 or GID{123,1000}? You're right, that works.
>
> Right.  Each 'allow' stanza is a set of conditions.  If they are true,
> then we allow.

Okay, I think we're on the same page here.
-Greg


* Re: MDS auth caps for cephfs
  2015-05-26 23:32                         ` Gregory Farnum
@ 2015-05-27 21:44                           ` Sage Weil
  2015-05-27 22:03                             ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-27 21:44 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Tue, 26 May 2015, Gregory Farnum wrote:
> >> >> Basically I'm still stuck on how any of this lets us lock a user into
> >> >> a subtree while letting them do what they want within it. I'm not sure
> >> >> how/if NFS solves that problem...
> >> >
> >> > That's easy:
> >> >
> >> >  # lock client into a dir
> >> >  allow rw path /home/user
> >> >
> >> > Or, for a shared model:
> >> >
> >> >  # allow access to a project dir, as project or user gid
> >> >  allow rw path /share/project uid 123 gids 123,1000
> >>
> >> I think I'm forgetting how the parsing works. Is that all one specific
> >> allow stanza that is evaluated as a unit? Ie, to match it an operation
> >> must fall within /share/project and be sent by a user with either UID
> >> 123 or GID{123,1000}? You're right, that works.
> >
> > Right.  Each 'allow' stanza is a set of conditions.  If they are true,
> > then we allow.
> 
> Okay, I think we're on the same page here.

Yay!

Now, let's see if I can throw us off again...

I was just talking to Simo about the longer-term kerberos auth goals to 
make sure we don't do something stupid here that we regret later.  His 
feedback boils down to:

 1) Don't bother with root squash since it doesn't buy you much, and 
 2) Never let the client construct the credential--do it on the server.

I'm okay with skipping root_squash (although it's simple enough it might 
be worthwhile anyway), but #2 is a bit different than what I was thinking.  
Specifically, this is about tagging requests with the uid + gid list.  If 
you let the client provide the group membership you lose most of the 
security--this is what NFS did and it sucked.  (There were other problems 
too, like a limit of 16 gids, and/or problems when a windows admin in 4000 
groups comes along.)

The idea we ended up on was to have a plugin interface on the MDS to do 
the credential -> uid + gid list mapping.  For simplicity, our initial 
"credential id" can just be a uid.  And the plugin interface would be 
something like

 int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> *gidls);

with plugins that do various trivial things, like

 - cred = uid, assume we are in one group with gid == uid
 - cred = uid, resolve groups from local machine (where ceph-mds 
is running)
 - cred = uid, resolve groups from explicitly named passwd/group files

and later we'd add plugins to query LDAP, parse a kerberos 
credential, or parse the MS-PAC thing from kerberos.
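
As a sketch only, the first trivial plugin (cred = uid, one group with
gid == uid) might look like the following.  CredentialResolver and
UidIsGidResolver are made-up names, and a plain uint64_t stands in for the
bufferlist credential so the example compiles without any Ceph headers:

```cpp
#include <sys/types.h>

#include <cerrno>
#include <cstdint>
#include <vector>

// Hypothetical plugin interface: map an opaque credential to uid + gid list.
// Returns 0 on success or a negative errno value.
struct CredentialResolver {
  virtual ~CredentialResolver() = default;
  virtual int resolve(uint64_t cred, uid_t* uid,
                      std::vector<gid_t>* gids) const = 0;
};

// Trivial plugin: the credential is the uid, and the user is assumed to be
// in exactly one group whose gid equals the uid.
struct UidIsGidResolver : CredentialResolver {
  int resolve(uint64_t cred, uid_t* uid,
              std::vector<gid_t>* gids) const override {
    if (uid == nullptr || gids == nullptr) {
      return -EINVAL;
    }
    *uid = static_cast<uid_t>(cred);
    gids->assign(1, static_cast<gid_t>(cred));
    return 0;
  }
};
```

The other plugins (local group database, explicit passwd/group files, LDAP)
would implement the same interface and differ only in where the gid list
comes from.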

The target environments would be:

1) trusted, no auth, keep doing what we do now (trust the client and check 
nothing at the mds)

  allow any

2) semi-trusted client.  Use cap like

  allow rw

but check client requests at MDS by resolving credentials and verifying 
unix permissions/ACLs.  (This will use the above call-out to do the uid -> 
gid translation.)

3) per-client trust.  Use caps like

  allow rw uid 123 gids 123,1000

so that a given host is locked as a single user (or maybe a small list of 
users).  Or,

  allow rw path /foo uid 123 gids 123

etc.

4) untrusted client.  Use kerberos.  Use caps like

  allow rw kerberos_domain=FOO.COM

and do all the fancypants stuff to get per-user tickets from clients, 
resolve them to groups, and enforce things on the server.  This one is 
still hand-wavey since we haven't defined the protocol etc.

I think we can get 1-3 without too much trouble!  The main question for me 
right now is how we define the credential we tag requests and cap 
writeback with.  Maybe something simple like

struct ceph_cred_handle {
	enum { NONE, UID, OTHER } type;
	uint64_t id;
};

For now we just stuff the uid into id.  For kerberos, we'll put some 
cookie in there that came from a previous exchange where we passed the 
kerberos ticket to the MDS and got an id.  (The ticket may be big--we 
don't want to attach it to each request.)
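
To make the cookie idea concrete, here is a minimal sketch assuming a
hypothetical per-session table (CredTable) on the MDS side.  The ticket is
handed over once, and later requests carry only the small handle.  The
protocol for that exchange is not defined yet, so the names and the use of a
std::string for the ticket are placeholders:

```cpp
#include <cstdint>
#include <map>
#include <string>

struct ceph_cred_handle {
  enum { NONE, UID, OTHER } type;
  uint64_t id;
};

// For the simple case, the uid itself is the id.
inline ceph_cred_handle uid_handle(uint64_t uid) {
  return ceph_cred_handle{ceph_cred_handle::UID, uid};
}

// Hypothetical per-session table mapping cookies to previously sent tickets.
class CredTable {
 public:
  // Store the (possibly large) ticket once and hand back a small cookie.
  ceph_cred_handle register_ticket(const std::string& ticket) {
    const uint64_t id = next_id_++;
    tickets_[id] = ticket;
    return ceph_cred_handle{ceph_cred_handle::OTHER, id};
  }

  // Look up the ticket behind a cookie; nullptr if the handle is not a
  // cookie or is unknown to this session.
  const std::string* lookup(const ceph_cred_handle& h) const {
    if (h.type != ceph_cred_handle::OTHER) return nullptr;
    auto it = tickets_.find(h.id);
    return it == tickets_.end() ? nullptr : &it->second;
  }

 private:
  std::map<uint64_t, std::string> tickets_;
  uint64_t next_id_ = 1;
};
```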

sage


* Re: MDS auth caps for cephfs
  2015-05-27 21:44                           ` Sage Weil
@ 2015-05-27 22:03                             ` Gregory Farnum
  2015-05-27 22:21                               ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-27 22:03 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, May 27, 2015 at 2:44 PM, Sage Weil <sweil@redhat.com> wrote:
> On Tue, 26 May 2015, Gregory Farnum wrote:
>> >> >> Basically I'm still stuck on how any of this lets us lock a user into
>> >> >> a subtree while letting them do what they want within it. I'm not sure
>> >> >> how/if NFS solves that problem...
>> >> >
>> >> > That's easy:
>> >> >
>> >> >  # lock client into a dir
>> >> >  allow rw path /home/user
>> >> >
>> >> > Or, for a shared model:
>> >> >
>> >> >  # allow access to a project dir, as project or user gid
>> >> >  allow rw path /share/project uid 123 gids 123,1000
>> >>
>> >> I think I'm forgetting how the parsing works. Is that all one specific
>> >> allow stanza that is evaluated as a unit? Ie, to match it an operation
>> >> must fall within /share/project and be sent by a user with either UID
>> >> 123 or GID{123,1000}? You're right, that works.
>> >
>> > Right.  Each 'allow' stanza is a set of conditions.  If they are true,
>> > then we allow.
>>
>> Okay, I think we're on the same page here.
>
> Yay!
>
> Now, let's see if I can throw us off again...
>
> I was just talking to Simo about the longer-term kerberos auth goals to
> make sure we don't do something stupid here that we regret later.  His
> feedback boils down to:
>
>  1) Don't bother with root squash since it doesn't buy you much, and
>  2) Never let the client construct the credential--do it on the server.
>
> I'm okay with skipping root_squash (although it's simple enough it might
> be worthwhile anyway)

Oh, I like skipping it, given the syntax and usability problems we went over. ;)

> but #2 is a bit different than what I was thinking.
> Specifically, this is about tagging requests with the uid + gid list.  If
> you let the client provide the group membership you lose most of the
> security--this is what NFS did and it sucked.  (There were other problems
> too, like a limit of 16 gids, and/or problems when a windows admin in 4000
> groups comes along.)

I'm not sure I understand this bit. I thought we were planning to have
gids in the cephx caps, and then have the client construct the list it
thinks is appropriate for each given request?
Obviously that trusts the client *some*, but it sandboxes them in and
I'm not sure the trust is a useful extension as long as we make sure
the UID and GID sets go together from the cephx caps.

>
> The idea we ended up on was to have a plugin interface on the MDS to do
> the credential -> uid + gid list mapping.  For simplicity, our initial
> "credential id" can just be a uid.  And the plugin interface would be
> something like
>
>  int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> *gidls);
>
> with plugins that do various trivial things, like
>
>  - cred = uid, assume we are in one group with gid == uid
>  - cred = uid, resolve groups from local machine (where ceph-mds
> is running)
>  - cred = uid, resolve groups from explicitly named passwd/group files
>
> and later we'd add plugins to query LDAP, parse a kerberos
> credential, or parse the MS-PAC thing from kerberos.
>
> The target environments would be:
>
> 1) trusted, no auth, keep doing what we do now (trust the client and check
> nothing at the mds)
>
>   allow any
>
> 2) semi-trusted client.  Use cap like
>
>   allow rw
>
> but check client requests at MDS by resolving credentials and verifying
> unix permissions/ACLs.  (This will use the above call-out to do the uid ->
> gid translation.)
>
> 3) per-client trust.  Use caps like
>
>   allow rw uid 123 gids 123,1000
>
> so that a given host is locked as a single user (or maybe a small list of
> users).  Or,
>
>   allow rw path /foo uid 123 gids 123
>
> etc.
>
> 4) untrusted client.  Use kerberos.  Use caps like
>
>   allow rw kerberos_domain=FOO.COM
>
> and do all the fancypants stuff to get per-user tickets from clients,
> resolve them to groups, and enforce things on the server.  This one is
> still hand-wavey since we haven't defined the protocol etc.
>
> I think we can get 1-3 without too much trouble!  The main question for me
> right now is how we define the credential we tag requests and cap
> writeback with.  Maybe something simple like
>
> struct ceph_cred_handle {
>         enum { NONE, UID, OTHER } type;
>         uint64_t id;
> };
>
> For now we just stuff the uid into id.  For kerberos, we'll put some
> cookie in there that came from a previous exchange where we passed the
> kerberos ticket to the MDS and got an id.  (The ticket may be big--we
> don't want to attach it to each request.)

Okay, so we want to do a lot more than in-cephx uid and gid
permissions granting? These look depressingly
integration-intensive-difficult but not terribly complicated
internally. I'd kind of like the interface to not imply we're doing
external callouts on every MDS op, though!
-Greg


* Re: MDS auth caps for cephfs
  2015-05-27 22:03                             ` Gregory Farnum
@ 2015-05-27 22:21                               ` Sage Weil
  2015-05-27 22:40                                 ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-27 22:21 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, 27 May 2015, Gregory Farnum wrote:
> > I was just talking to Simo about the longer-term kerberos auth goals to
> > make sure we don't do something stupid here that we regret later.  His
> > feedback boils down to:
> >
> >  1) Don't bother with root squash since it doesn't buy you much, and
> >  2) Never let the client construct the credential--do it on the server.
> >
> > I'm okay with skipping squash_root (although it's simple enough it might
> > be worthwhile anyway)
> 
> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
> 
> > but #2 is a bit different than what I was thinking.
> > Specifically, this is about tagging requests with the uid + gid list.  If
> > you let the client provide the group membership you lose most of the
> > security--this is what NFS did and it sucked.  (There were other problems
> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
> > groups comes along.)
> 
> I'm not sure I understand this bit. I thought we were planning to have
> gids in the cephx caps, and then have the client construct the list it
> thinks is appropriate for each given request?
> Obviously that trusts the client *some*, but it sandboxes them in and
> I'm not sure the trust is a useful extension as long as we make sure
> the UID and GID sets go together from the cephx caps.

We went around in circles about this for a while, but in the end I think
we agreed there is minimal value in having the client construct anything
(the gid list in this case), and leaving the client out avoids taking any
step down what is ultimately a dead-end road.  For example, caps like

  allow rw gid 2000

are useless since the client can set gid=2000 but then make the request 
uid anything it wants (namely, the file owner).  Cutting the client out of 
the picture also avoids the many-gid issue.  The trade-off is that if you 
want stronger auth you need to teach the MDS how to do those mappings.

We need to make sure we can make this sane in a multi-namespace 
environment, e.g., where we have different cloud tenants in different 
paths.  Would we want to specify different uid->gid mappings for those?  
Maybe we actually want a cap like

 allow rw path=/foo uidgidns=foo

or something so that another tenant could have

 allow rw path=/foo uidgidns=bar

Or, we can just say that you get either

 - a global uid->gid mapping, server-side enforcement, and allow based on 
uid;
 - same as above, but also with a path restriction; or
 - path restriction, and no server-side uid/gid permission/acl checks
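
The path restriction in the options above comes down to a "is this
path a child of the cap's path" test.  A minimal sketch, assuming
normalized absolute paths with no trailing slash; path_is_under is a
hypothetical helper, not an existing MDS function:

```cpp
#include <string>

// True if `path` is `root` itself or lies below it.  "/" matches everything.
// Assumes both arguments are normalized absolute paths without a trailing
// slash (an assumption of this sketch, not something the thread specifies).
bool path_is_under(const std::string &path, const std::string &root) {
  if (root == "/")
    return true;
  // The first root.size() characters of path must equal root...
  if (path.compare(0, root.size(), root) != 0)
    return false;
  // ...and the match must end on a component boundary, so that
  // "/foobar" is not treated as a child of "/foo".
  return path.size() == root.size() || path[root.size()] == '/';
}
```

The component-boundary check is the part that is easy to get wrong
with a plain string-prefix comparison.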

> > The idea we ended up on was to have a plugin interface on the MDS to do
> > the credential -> uid + gid list mapping.  For simplicity, our initial
> > "credential id" can just be a uid.  And the plugin interface would be
> > something like
> >
> >  int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> *gidls);
> >
> > with plugins that do various trivial things, like
> >
> >  - cred = uid, assume we are in one group with gid == uid
> >  - cred = uid, resolve groups from local machine (where ceph-mds
> > is running)
> >  - cred = uid, resolve groups from explicitly named passwd/group files
> >
> > and later we'd add plugins to query LDAP, parse a kerberos
> > credential, or parse the MS-PAC thing from kerberos.
> >
> > The target environments would be:
> >
> > 1) trusted, no auth, keep doing what we do now (trust the client and check
> > nothing at the mds)
> >
> >   allow any
> >
> > 2) semi-trusted client.  Use cap like
> >
> >   allow rw
> >
> > but check client requests at MDS by resolving credentials and verifying
> > unix permissions/ACLs.  (This will use the above call-out to do the uid ->
> > gid translation.)
> >
> > 3) per-client trust.  Use caps like
> >
> >   allow rw uid 123 gids 123,1000
> >
> > so that a given host is locked as a single user (or maybe a small list of
> > users).  Or,
> >
> >   allow rw path /foo uid 123 gids 123
> >
> > etc.
> >
> > 4) untrusted client.  Use kerberos.  Use caps like
> >
> >   allow rw kerberos_domain=FOO.COM
> >
> > and do all the fancypants stuff to get per-user tickets from clients,
> > resolve them to groups, and enforce things on the server.  This one is
> > still hand-wavey since we haven't defined the protocol etc.
> >
> > I think we can get 1-3 without too much trouble!  The main question for me
> > right now is how we define the credential we tag requests and cap
> > writeback with.  Maybe something simple like
> >
> > struct ceph_cred_handle {
> >         enum { NONE, UID, OTHER } type;
> >         uint64_t id;
> > };
> >
> > For now we just stuff the uid into id.  For kerberos, we'll put some
> > cookie in there that came from a previous exchange where we passed the
> > kerberos ticket to the MDS and got an id.  (The ticket may be big--we
> > don't want to attach it to each request.)
> 
> Okay, so we want to do a lot more than in-cephx uid and gid
> permissions granting? These look depressingly
> integration-intensive-difficult but not terribly complicated
> internally. I'd kind of like the interface to not imply we're doing
> external callouts on every MDS op, though!

We'd probably need to allow it to be async (return EAGAIN) or something.  
Some cases will hit a cache or be trivial and non-blocking, but others 
will need to do an upcall to some slow network service.  Maybe

  int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> 
     *gidls, Context *onfinish);

where r == 0 means we did it, and r == -EAGAIN means we will call onfinish 
when the result is ready.  Or some similar construct that lets us avoid a
spurious Context alloc+free in the fast path.
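
A rough sketch of that return contract, under stated assumptions: a
plain uint64_t stands in for the bufferlist credential, std::function
stands in for Context, and CredResolver, Identity, Completion and
upcall_finished are hypothetical names, not Ceph code.  A cache hit
fills the outputs and returns 0 without touching the callback; a miss
queues the callback and returns -EAGAIN.

```cpp
#include <cerrno>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>
#include <sys/types.h>

// Hypothetical stand-in for Context: invoked with a result code when a
// slow lookup completes.
using Completion = std::function<void(int r)>;

struct Identity {
  uid_t uid;
  std::vector<gid_t> gids;
};

class CredResolver {
  std::map<uint64_t, Identity> cache;                    // resolved credentials
  std::vector<std::pair<uint64_t, Completion>> pending;  // slow-path waiters

public:
  // Returns 0 if *uid and *gids were filled from the cache (onfinish is
  // never called), or -EAGAIN if the lookup went async and onfinish will
  // be called once the answer is cached.
  int resolve_credential(uint64_t cred, uid_t *uid, std::vector<gid_t> *gids,
                         Completion onfinish) {
    auto it = cache.find(cred);
    if (it != cache.end()) {
      *uid = it->second.uid;
      *gids = it->second.gids;
      return 0;
    }
    pending.emplace_back(cred, std::move(onfinish));
    return -EAGAIN;
  }

  // Simulates the slow upcall delivering its answer: cache it, then wake
  // every request that was waiting on this credential.
  void upcall_finished(uint64_t cred, Identity id) {
    cache[cred] = std::move(id);
    std::vector<std::pair<uint64_t, Completion>> waiters;
    waiters.swap(pending);  // callbacks may safely re-enter resolve_credential
    for (auto &w : waiters) {
      if (w.first == cred)
        w.second(0);
      else
        pending.push_back(std::move(w));
    }
  }
};
```

A caller that gets -EAGAIN would retry from its callback, at which
point the lookup is a cache hit.  This sketch still constructs the
callback on every call, so it does not yet avoid the fast-path
allocation mentioned above.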

sage


* Re: MDS auth caps for cephfs
  2015-05-27 22:21                               ` Sage Weil
@ 2015-05-27 22:40                                 ` Gregory Farnum
  2015-05-27 23:07                                   ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-27 22:40 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
> On Wed, 27 May 2015, Gregory Farnum wrote:
>> > I was just talking to Simo about the longer-term kerberos auth goals to
>> > make sure we don't do something stupid here that we regret later.  His
>> > feedback boils down to:
>> >
>> >  1) Don't bother with root squash since it doesn't buy you much, and
>> >  2) Never let the client construct the credential--do it on the server.
>> >
>> > I'm okay with skipping squash_root (although it's simple enough it might
>> > be worthwhile anyway)
>>
>> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
>>
>> > but #2 is a bit different than what I was thinking.
>> > Specifically, this is about tagging requests with the uid + gid list.  If
>> > you let the client provide the group membership you lose most of the
>> > security--this is what NFS did and it sucked.  (There were other problems
>> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
>> > groups comes along.)
>>
>> I'm not sure I understand this bit. I thought we were planning to have
>> gids in the cephx caps, and then have the client construct the list it
>> thinks is appropriate for each given request?
>> Obviously that trusts the client *some*, but it sandboxes them in and
>> I'm not sure the trust is a useful extension as long as we make sure
>> the UID and GID sets go together from the cephx caps.
>
> We went around in circles about this for a while, but in the end I think
> we agreed there is minimal value from having the client construct anything
> (the gid list in this case), and it avoids taking any step down what is
> ultimately a dead-end road.  For example, caps like
>
>   allow rw gid 2000
>
> are useless since the client can set gid=2000 but then make the request
> uid anything it wants (namely, the file owner).  Cutting the client out of
> the picture also avoids the many-gid issue.

I don't think I understand the threat model we're worried about here.
(Granted a cap that sets gid but not uid sounds like a bad idea to
me.) But if the cephx caps include the GID then a client can only use
weaker ones than they're permitted, which could frequently be correct.
For instance if each tenant in a multitenant system has a single cephx
key, but they have both admin and non-admin users within their local
context?

>  The trade-off is that if you
> want stronger auth you need to teach the MDS how to do those mappings.
>
> We need to make sure we can make this sane in a multi-namespace
> environment, e.g., where we have different cloud tenants in different
> paths.  Would we want to specify different uid->gid mappings for those?
> Maybe we actually want a cap like
>
>  allow rw path=/foo uidgidns=foo
>
> or something so that another tenant could have
>
>  allow rw path=/foo uidgidns=bar
>
> Or, we can just say that you get either
>
>  - a global uid->gid mapping, server-side enforcement, and allow based on
> uid;
>  - same as above, but also with a path restriction; or
>  - path restriction, and no server-side uid/gid permission/acl checks

Yes, this multi-namespace environment was what I was touching on in
some of my more confusing asides earlier. I think we need to survey
more operators about what they'd want here before making any
decisions, because I just don't understand the tradeoffs from their
perspective. (Is that list of 3 choices going to be a problem for
anybody? It's certainly the *easiest* to implement, and UID namespaces
within a single hierarchy sound like a bit of a nightmare both to
implement and administer...at that point maybe we're better off with
just multiple separate hierarchies.)

>
>> > The idea we ended up on was to have a plugin interface on the MDS to do
>> > the credential -> uid + gid list mapping.  For simplicity, our initial
>> > "credential id" can just be a uid.  And the plugin interface would be
>> > something like
>> >
>> >  int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> *gidls);
>> >
>> > with plugins that do various trivial things, like
>> >
>> >  - cred = uid, assume we are in one group with gid == uid
>> >  - cred = uid, resolve groups from local machine (where ceph-mds
>> > is running)
>> >  - cred = uid, resolve groups from explicitly named passwd/group files
>> >
>> > and later we'd add plugins to query LDAP, parse a kerberos
>> > credential, or parse the MS-PAC thing from kerberos.
>> >
>> > The target environments would be:
>> >
>> > 1) trusted, no auth, keep doing what we do now (trust the client and check
>> > nothing at the mds)
>> >
>> >   allow any
>> >
>> > 2) semi-trusted client.  Use cap like
>> >
>> >   allow rw
>> >
>> > but check client requests at MDS by resolving credentials and verifying
>> > unix permissions/ACLs.  (This will use the above call-out to do the uid ->
>> > gid translation.)
>> >
>> > 3) per-client trust.  Use caps like
>> >
>> >   allow rw uid 123 gids 123,1000
>> >
>> > so that a given host is locked as a single user (or maybe a small list of
>> > users).  Or,
>> >
>> >   allow rw path /foo uid 123 gids 123
>> >
>> > etc.
>> >
>> > 4) untrusted client.  Use kerberos.  Use caps like
>> >
>> >   allow rw kerberos_domain=FOO.COM
>> >
>> > and do all the fancypants stuff to get per-user tickets from clients,
>> > resolve them to groups, and enforce things on the server.  This one is
>> > still hand-wavey since we haven't defined the protocol etc.
>> >
>> > I think we can get 1-3 without too much trouble!  The main question for me
>> > right now is how we define the credential we tag requests and cap
>> > writeback with.  Maybe something simple like
>> >
>> > struct ceph_cred_handle {
>> >         enum { NONE, UID, OTHER } type;
>> >         uint64_t id;
>> > };
>> >
>> > For now we just stuff the uid into id.  For kerberos, we'll put some
>> > cookie in there that came from a previous exchange where we passed the
>> > kerberos ticket to the MDS and got an id.  (The ticket may be big--we
>> > don't want to attach it to each request.)
>>
>> Okay, so we want to do a lot more than in-cephx uid and gid
>> permissions granting? These look depressingly
>> integration-intensive-difficult but not terribly complicated
>> internally. I'd kind of like the interface to not imply we're doing
>> external callouts on every MDS op, though!
>
> We'd probably need to allow it to be async (return EAGAIN) or something.
> Some cases will hit a cache or be trivial and non-blocking, but others
> will need to do an upcall to some slow network service.  Maybe
>
>   int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t>
>      *gidls, Context *onfinish);
>
> where r == 0 means we did it, and r == -EAGAIN means we will call onfinish
> when the result is ready.  Or some similar construct that lets us avoid a
> spurious Context alloc+free in the fast path.

Mmm. "slow network service" scares me. I presume you're thinking here
that this is a per-session request, not a per-operation one? If we're
going to include external security systems we probably need to let
them get a say on every request but it very much needs to be local
data only for those.
-Greg


* Re: MDS auth caps for cephfs
  2015-05-27 22:40                                 ` Gregory Farnum
@ 2015-05-27 23:07                                   ` Sage Weil
  2015-05-27 23:18                                     ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-27 23:07 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, 27 May 2015, Gregory Farnum wrote:
> On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
> > On Wed, 27 May 2015, Gregory Farnum wrote:
> >> > I was just talking to Simo about the longer-term kerberos auth goals to
> >> > make sure we don't do something stupid here that we regret later.  His
> >> > feedback boils down to:
> >> >
> >> >  1) Don't bother with root squash since it doesn't buy you much, and
> >> >  2) Never let the client construct the credential--do it on the server.
> >> >
> >> > I'm okay with skipping squash_root (although it's simple enough it might
> >> > be worthwhile anyway)
> >>
> >> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
> >>
> >> > but #2 is a bit different than what I was thinking.
> >> > Specifically, this is about tagging requests with the uid + gid list.  If
> >> > you let the client provide the group membership you lose most of the
> >> > security--this is what NFS did and it sucked.  (There were other problems
> >> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
> >> > groups comes along.)
> >>
> >> I'm not sure I understand this bit. I thought we were planning to have
> >> gids in the cephx caps, and then have the client construct the list it
> >> thinks is appropriate for each given request?
> >> Obviously that trusts the client *some*, but it sandboxes them in and
> >> I'm not sure the trust is a useful extension as long as we make sure
> >> the UID and GID sets go together from the cephx caps.
> >
> > We went around in circles about this for a while, but in the end I think
> > we agreed there is minimal value from having the client construct anything
> > (the gid list in this case), and it avoids taking any step down what is
> > ultimately a dead-end road.  For example, caps like
> >
> >   allow rw gid 2000
> >
> > are useless since the client can set gid=2000 but then make the request
> > uid anything it wants (namely, the file owner).  Cutting the client out of
> > the picture also avoids the many-gid issue.
> 
> I don't think I understand the threat model we're worried about here.
> (Granted a cap that sets gid but not uid sounds like a bad idea to
> me.) But if the cephx caps include the GID then a client can only use
> weaker ones than they're permitted, which could frequently be correct.
> For instance if each tenant in a multitenant system has a single cephx
> key, but they have both admin and non-admin users within their local
> context?

Not sure I understand the question.  The threat model is... a client that 
can send arbitrary requests and wants to modify files?

- Any cap that specifies gid only is useless, since you can choose a uid 
to match the file.

- Any cap that specifies uid only exposes any group-writeable files/dirs.

- Any cap that specifies uid and gid(s) is fine.

...but if we have a server-side mapping of uid -> gid(s), then any of 
those is fine (we can specify uid only, gid only, or both).
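
To make the three cases concrete, here is a toy model of a client
that constructs its own uid/gid list, bounded only by what the cap
pins down.  Everything here (File, Request, Cap, cap_allows,
may_write) is a hypothetical illustration, not MDS code, and the
permission check is the plain owner/group/other mode-bit test with no
ACLs.

```cpp
#include <algorithm>
#include <vector>
#include <sys/types.h>

struct File {
  uid_t owner;
  gid_t group;
  mode_t mode;
};

// The identity the client claims on a request (client-constructed here).
struct Request {
  uid_t uid;
  std::vector<gid_t> gids;
};

// Hypothetical cap: a field that is not set places no limit on the claim.
struct Cap {
  bool has_uid;
  uid_t uid;
  bool has_gids;
  std::vector<gid_t> gids;
};

// The claimed identity must stay inside whatever the cap pins down.
bool cap_allows(const Cap &cap, const Request &req) {
  if (cap.has_uid && req.uid != cap.uid)
    return false;
  if (cap.has_gids) {
    for (gid_t g : req.gids)
      if (std::find(cap.gids.begin(), cap.gids.end(), g) == cap.gids.end())
        return false;
  }
  return true;
}

// Classic unix write check: owner bits, else group bits, else other bits.
bool may_write(const File &f, const Request &req) {
  if (req.uid == f.owner)
    return (f.mode & 0200) != 0;
  if (std::find(req.gids.begin(), req.gids.end(), f.group) != req.gids.end())
    return (f.mode & 0020) != 0;
  return (f.mode & 0002) != 0;
}
```

With a gid-only cap the client just claims the owner's uid; with a
uid-only cap it claims the file's group; with both pinned, neither
forgery passes cap_allows.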

> >  The trade-off is that if you want stronger auth you need to teach the 
> > MDS how to do those mappings.
> >
> > We need to make sure we can make this sane in a multi-namespace
> > environment, e.g., where we have different cloud tenants in different
> > paths.  Would we want to specify different uid->gid mappings for those?
> > Maybe we actually want a cap like
> >
> >  allow rw path=/foo uidgidns=foo
> >
> > or something so that another tenant could have
> >
> >  allow rw path=/foo uidgidns=bar
> >
> > Or, we can just say that you get either
> >
> >  - a global uid->gid mapping, server-side enforcement, and allow based on
> > uid;
> >  - same as above, but also with a path restriction; or
> >  - path restriction, and no server-side uid/gid permission/acl checks
> 
> Yes, this multi-namespace environment was what I was touching on in
> some of my more confusing asides earlier. I think we need to survey
> more operators about what they'd want here before making any
> decisions, because I just don't understand the tradeoffs from their
> perspective. (Is that list of 3 choices going to be a problem for
> anybody? It's certainly the *easiest* to implement, and UID namespaces
> within a single hierarchy sound like a bit of a nightmare both to
> implement and administer...at that point maybe we're better off with
> just multiple separate hierarchies.)

Yeah.  Well, I think something like uidgidns=foo in the cap could let us 
do the fourth option (separate gid mappings for each path).  Or that could
be associated with the file system (directory layout property, maybe--not
a cap property).  I'm not sure it matters.  In any case, I'm less worried
that we'll box ourselves into a corner in that regard, especially since 
I suspect most users will want a global uid->gid mapping anyway.

> >> > The idea we ended up on was to have a plugin interface on the MDS to do
> >> > the credential -> uid + gid list mapping.  For simplicity, our initial
> >> > "credential id" can just be a uid.  And the plugin interface would be
> >> > something like
> >> >
> >> >  int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> *gidls);
> >> >
> >> > with plugins that do various trivial things, like
> >> >
> >> >  - cred = uid, assume we are in one group with gid == uid
> >> >  - cred = uid, resolve groups from local machine (where ceph-mds
> >> > is running)
> >> >  - cred = uid, resolve groups from explicitly named passwd/group files
> >> >
> >> > and later we'd add plugins to query LDAP, parse a kerberos
> >> > credential, or parse the MS-PAC thing from kerberos.
> >> >
> >> > The target environments would be:
> >> >
> >> > 1) trusted, no auth, keep doing what we do now (trust the client and check
> >> > nothing at the mds)
> >> >
> >> >   allow any
> >> >
> >> > 2) semi-trusted client.  Use cap like
> >> >
> >> >   allow rw
> >> >
> >> > but check client requests at MDS by resolving credentials and verifying
> >> > unix permissions/ACLs.  (This will use the above call-out to do the uid ->
> >> > gid translation.)
> >> >
> >> > 3) per-client trust.  Use caps like
> >> >
> >> >   allow rw uid 123 gids 123,1000
> >> >
> >> > so that a given host is locked as a single user (or maybe a small list of
> >> > users).  Or,
> >> >
> >> >   allow rw path /foo uid 123 gids 123
> >> >
> >> > etc.
> >> >
> >> > 4) untrusted client.  Use kerberos.  Use caps like
> >> >
> >> >   allow rw kerberos_domain=FOO.COM
> >> >
> >> > and do all the fancypants stuff to get per-user tickets from clients,
> >> > resolve them to groups, and enforce things on the server.  This one is
> >> > still hand-wavey since we haven't defined the protocol etc.
> >> >
> >> > I think we can get 1-3 without too much trouble!  The main question for me
> >> > right now is how we define the credential we tag requests and cap
> >> > writeback with.  Maybe something simple like
> >> >
> >> > struct ceph_cred_handle {
> >> >         enum { NONE, UID, OTHER } type;
> >> >         uint64_t id;
> >> > };
> >> >
> >> > For now we just stuff the uid into id.  For kerberos, we'll put some
> >> > cookie in there that came from a previous exchange where we passed the
> >> > kerberos ticket to the MDS and got an id.  (The ticket may be big--we
> >> > don't want to attach it to each request.)
> >>
> >> Okay, so we want to do a lot more than in-cephx uid and gid
> >> permissions granting? These look depressingly
> >> integration-intensive-difficult but not terribly complicated
> >> internally. I'd kind of like the interface to not imply we're doing
> >> external callouts on every MDS op, though!
> >
> > We'd probably need to allow it to be async (return EAGAIN) or something.
> > Some cases will hit a cache or be trivial and non-blocking, but others
> > will need to do an upcall to some slow network service.  Maybe
> >
> >   int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t>
> >      *gidls, Context *onfinish);
> >
> > where r == 0 means we did it, and r == -EAGAIN means we will call onfinish
> > when the result is ready.  Or some similar construct that lets us avoid a
> > spurious Context alloc+free in the fast path.
> 
> Mmm. "slow network service" scares me. I presume you're thinking here
> that this is a per-session request, not a per-operation one? If we're
> going to include external security systems we probably need to let
> them get a say on every request but it very much needs to be local
> data only for those.

The ceph_cred_handle would be per-request, but you would normally do 
upcalls infrequently.  Like in the kerberos case, we'd do that when the
credential was registered (before it was used).  Then the resolve step
would have no network hop.  Or we might call out to LDAP, in which case 
the plugin would go async, and then cache the result so it is fast the 
next time around.
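
Something like this, as a sketch only: ceph_cred_handle is the struct
proposed earlier in the thread, while CredRegistry, Identity,
register_credential and resolve are made-up names.  Ticket
verification is faked by passing in the already-resolved identity,
and the UID case uses the trivial "one group with gid == uid" rule.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>
#include <sys/types.h>

// Shape proposed earlier in the thread.
struct ceph_cred_handle {
  enum { NONE, UID, OTHER } type;
  uint64_t id;
};

struct Identity {
  uid_t uid;
  std::vector<gid_t> gids;
};

class CredRegistry {
  std::map<uint64_t, Identity> registered;
  uint64_t next_id = 1;

public:
  // One-time, possibly slow step.  A real plugin would verify the ticket
  // and derive the identity from it; this sketch just takes the answer.
  ceph_cred_handle register_credential(const std::string &ticket, Identity who) {
    (void)ticket;  // unused in this sketch
    uint64_t id = next_id++;
    registered[id] = std::move(who);
    return ceph_cred_handle{ceph_cred_handle::OTHER, id};
  }

  // Per-request step: local lookups only, no network hop.
  bool resolve(const ceph_cred_handle &h, Identity *out) const {
    if (h.type == ceph_cred_handle::UID) {
      // Trivial plugin: the id is the uid, in a single group gid == uid.
      out->uid = static_cast<uid_t>(h.id);
      out->gids = {static_cast<gid_t>(h.id)};
      return true;
    }
    if (h.type == ceph_cred_handle::OTHER) {
      auto it = registered.find(h.id);
      if (it == registered.end())
        return false;  // cookie was never registered
      *out = it->second;
      return true;
    }
    return false;  // NONE
  }
};
```

The big ticket crosses the wire once at registration; every request
after that carries only the small id.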

sage


* Re: MDS auth caps for cephfs
  2015-05-27 23:07                                   ` Sage Weil
@ 2015-05-27 23:18                                     ` Gregory Farnum
  2015-05-27 23:59                                       ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-27 23:18 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, May 27, 2015 at 4:07 PM, Sage Weil <sweil@redhat.com> wrote:
> On Wed, 27 May 2015, Gregory Farnum wrote:
>> On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
>> > On Wed, 27 May 2015, Gregory Farnum wrote:
>> >> > I was just talking to Simo about the longer-term kerberos auth goals to
>> >> > make sure we don't do something stupid here that we regret later.  His
>> >> > feedback boils down to:
>> >> >
>> >> >  1) Don't bother with root squash since it doesn't buy you much, and
>> >> >  2) Never let the client construct the credential--do it on the server.
>> >> >
>> >> > I'm okay with skipping squash_root (although it's simple enough it might
>> >> > be worthwhile anyway)
>> >>
>> >> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
>> >>
>> >> > but #2 is a bit different than what I was thinking.
>> >> > Specifically, this is about tagging requests with the uid + gid list.  If
>> >> > you let the client provide the group membership you lose most of the
>> >> > security--this is what NFS did and it sucked.  (There were other problems
>> >> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
>> >> > groups comes along.)
>> >>
>> >> I'm not sure I understand this bit. I thought we were planning to have
>> >> gids in the cephx caps, and then have the client construct the list it
>> >> thinks is appropriate for each given request?
>> >> Obviously that trusts the client *some*, but it sandboxes them in and
>> >> I'm not sure the trust is a useful extension as long as we make sure
>> >> the UID and GID sets go together from the cephx caps.
>> >
>> > We went around in circles about this for a while, but in the end I think
>> > we agreed there is minimal value from having the client construct anything
>> > (the gid list in this case), and it avoids taking any step down what is
>> > ultimately a dead-end road.  For example, caps like
>> >
>> >   allow rw gid 2000
>> >
>> > are useless since the client can set gid=2000 but then make the request
>> > uid anything it wants (namely, the file owner).  Cutting the client out of
>> > the picture also avoids the many-gid issue.
>>
>> I don't think I understand the threat model we're worried about here.
>> (Granted a cap that sets gid but not uid sounds like a bad idea to
>> me.) But if the cephx caps include the GID then a client can only use
>> weaker ones than they're permitted, which could frequently be correct.
>> For instance if each tenant in a multitenant system has a single cephx
>> key, but they have both admin and non-admin users within their local
>> context?
>
> Not sure I understand the question.  The threat model is... a client that
> can send arbitrary requests and wants to modify files?
>
> - Any cap that specifies gid only is useless, since you can choose a uid
> to match the file.
>
> - Any cap that specifies uid only exposes any group-writeable files/dirs.
>
> - Any cap that specifies uid and gid(s) is fine.
>
> ...but if we have a server-side mapping of uid -> gid(s), then any of
> those is fine (we can specify uid only, gid only, or both).

Okay, so it's just malformed cephx caps then. We could just make it
refuse to accept gid specs if there's not a uid one as well.

Not that I'm necessarily opposed to doing it server-side, but I'm not
sure where we'd store it in the minimal configuration (without
kerberos or some other server to do lookups in) and not including them
in the cephx caps just feels odd.
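
The "refuse gid specs without a uid" rule could be a check at
cap-parse time.  A minimal sketch, assuming the grant has already
been parsed into a struct; GrantSpec and validate_grant are
hypothetical names, not part of the existing MDSAuthCaps code:

```cpp
#include <string>
#include <vector>
#include <sys/types.h>

// Hypothetical parsed form of a grant such as
// "allow rw uid 123 gids 123,1000".
struct GrantSpec {
  bool has_uid = false;
  uid_t uid = 0;
  std::vector<gid_t> gids;
};

// Returns an empty string if the grant is acceptable, otherwise the
// reason it should be rejected.
std::string validate_grant(const GrantSpec &g) {
  if (!g.gids.empty() && !g.has_uid)
    return "gids given without a uid: the client could claim any uid";
  return "";
}
```

This only rejects the gid-only form; a uid-only grant still passes.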

>
>> >  The trade-off is that if you want stronger auth you need to teach the
>> > MDS how to do those mappings.
>> >
>> > We need to make sure we can make this sane in a multi-namespace
>> > environment, e.g., where we have different cloud tenants in different
>> > paths.  Would we want to specify different uid->gid mappings for those?
>> > Maybe we actually want a cap like
>> >
>> >  allow rw path=/foo uidgidns=foo
>> >
>> > or something so that another tenant could have
>> >
>> >  allow rw path=/foo uidgidns=bar
>> >
>> > Or, we can just say that you get either
>> >
>> >  - a global uid->gid mapping, server-side enforcement, and allow based on
>> > uid;
>> >  - same as above, but also with a path restriction; or
>> >  - path restriction, and no server-side uid/gid permission/acl checks
>>
>> Yes, this multi-namespace environment was what I was touching on in
>> some of my more confusing asides earlier. I think we need to survey
>> more operators about what they'd want here before making any
>> decisions, because I just don't understand the tradeoffs from their
>> perspective. (Is that list of 3 choices going to be a problem for
>> anybody? It's certainly the *easiest* to implement, and UID namespaces
>> within a single hierarchy sound like a bit of a nightmare both to
>> implement and administer...at that point maybe we're better off with
>> just multiple separate hierarchies.)
>
> Yeah.  Well, I think something like uidgidns=foo in the cap could let us
> do the fourth option (separate gid mappings for each path).  Or that could
> be associated with the file system (directory layout property, maybe--not
> a cap property).  I'm not sure it matters.  In any case, I'm less worried
> that we'll box ourselves into a corner in that regard, especially since
> I suspect most users will want a global uid->gid mapping anyway.
>
>> >> > The idea we ended up on was to have a plugin interface on the MDS to do
>> >> > the credential -> uid + gid list mapping.  For simplicity, our initial
>> >> > "credential id" can just be a uid.  And the plugin interface would be
>> >> > something like
>> >> >
>> >> >  int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t> *gidls);
>> >> >
>> >> > with plugins that do various trivial things, like
>> >> >
>> >> >  - cred = uid, assume we are in one group with gid == uid
>> >> >  - cred = uid, resolve groups from local machine (where ceph-mds
>> >> > is running)
>> >> >  - cred = uid, resolve groups from explicitly named passwd/group files
>> >> >
>> >> > and later we'd add plugins to query LDAP, parse a kerberos
>> >> > credential, or parse the MS-PAC thing from kerberos.
>> >> >
>> >> > The target environments would be:
>> >> >
>> >> > 1) trusted, no auth, keep doing what we do now (trust the client and check
>> >> > nothing at the mds)
>> >> >
>> >> >   allow any
>> >> >
>> >> > 2) semi-trusted client.  Use cap like
>> >> >
>> >> >   allow rw
>> >> >
>> >> > but check client requests at MDS by resolving credentials and verifying
>> >> > unix permissions/ACLs.  (This will use the above call-out to do the uid ->
>> >> > gid translation.)
>> >> >
>> >> > 3) per-client trust.  Use caps like
>> >> >
>> >> >   allow rw uid 123 gids 123,1000
>> >> >
>> >> > so that a given host is locked as a single user (or maybe a small list of
>> >> > users).  Or,
>> >> >
>> >> >   allow rw path /foo uid 123 gids 123
>> >> >
>> >> > etc.
>> >> >
>> >> > 4) untrusted client.  Use kerberos.  Use caps like
>> >> >
>> >> >   allow rw kerberos_domain=FOO.COM
>> >> >
>> >> > and do all the fancypants stuff to get per-user tickets from clients,
>> >> > resolve them to groups, and enforce things on the server.  This one is
>> >> > still hand-wavey since we haven't defined the protocol etc.
>> >> >
>> >> > I think we can get 1-3 without too much trouble!  The main question for me
>> >> > right now is how we define the credential we tag requests and cap
>> >> > writeback with.  Maybe something simple like
>> >> >
>> >> > struct ceph_cred_handle {
>> >> >         enum { NONE, UID, OTHER } type;
>> >> >         uint64_t id;
>> >> > };
>> >> >
>> >> > For now we just stuff the uid into id.  For kerberos, we'll put some
>> >> > cookie in there that came from a previous exchange where we passed the
>> >> > kerberos ticket to the MDS and got an id.  (The ticket may be big--we
>> >> > don't want to attach it to each request.)
>> >>
>> >> Okay, so we want to do a lot more than in-cephx uid and gid
>> >> permissions granting? These look depressingly
>> >> integration-intensive-difficult but not terribly complicated
>> >> internally. I'd kind of like the interface to not imply we're doing
>> >> external callouts on every MDS op, though!
>> >
>> > We'd probably need to allow it to be async (return EAGAIN) or something.
>> > Some cases will hit a cache or be trivial and non-blocking, but others
>> > will need to do an upcall to some slow network service.  Maybe
>> >
>> >   int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t>
>> >      *gidls, Context *onfinish);
>> >
>> > where r == 0 means we did it, and r == -EAGAIN means we will call onfinish
>> > when the result is ready.  Or some similar construct that lets us avoid a
>> > spurious Context alloc+free in the fast path.
>>
>> Mmm. "slow network service" scares me. I presume you're thinking here
>> that this is a per-session request, not a per-operation one? If we're
>> going to include external security systems we probably need to let
>> them get a say on every request but it very much needs to be local
>> data only for those.
>
> The ceph_cred_handle would be per-request, but you would normally do
> upcalls infrequently.  Like in the kerberos case, we'd do that when the
> credential was registered (before it was used).  Then the resolve step
> would have no network hop.  Or we might call out to LDAP, in which case
> the plugin would go async, and then cache the result so it is fast the
> next time around.

ceph_cred_handle as you defined it above is just an internal Ceph
structure though, right? I'm imagining more complicated systems (which
maybe don't exist) that couldn't be well-represented by a simple ID to
permissions mapping that we'll always understand. Or systems that
include timeouts and want us to renew the credential every N seconds.
So it'd be useful to let them define their own per-request security
check operating on static data, as well as the (async)
credential-identifying upcall.
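
A minimal sketch of that split, assuming a resolver that caches results: the slow, possibly asynchronous upcall happens once per credential, and a cheap per-request check then runs on cached local data only. All names here (ResolvedCred, CachingResolver, complete, still_valid) are hypothetical and not part of Ceph; complete() merely stands in for the moment a slow backend such as LDAP answers.

```cpp
#include <cerrno>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <sys/types.h>
#include <utility>
#include <vector>

// Hypothetical result of resolving a credential blob.
struct ResolvedCred {
  uid_t uid = 0;
  std::vector<gid_t> gids;
  uint64_t expires_sec = 0;  // 0 means "never expires"
};

using ResolveCallback = std::function<void(int r, const ResolvedCred&)>;

// Hypothetical resolver: synchronous on a cache hit, asynchronous on a miss.
class CachingResolver {
public:
  // Returns 0 and fills *out on a cache hit.  On a miss, returns -EAGAIN and
  // keeps on_finish so it can be called once the upcall completes.
  int resolve(const std::string& blob, ResolvedCred* out,
              ResolveCallback on_finish) {
    auto it = cache_.find(blob);
    if (it != cache_.end()) {
      *out = it->second;
      return 0;
    }
    pending_.emplace_back(blob, std::move(on_finish));
    return -EAGAIN;
  }

  // Stand-in for the slow backend answering: cache the result and wake
  // every waiter that asked about this blob.
  void complete(const std::string& blob, const ResolvedCred& result) {
    cache_[blob] = result;
    auto waiting = std::move(pending_);
    pending_.clear();
    for (auto& p : waiting) {
      if (p.first != blob)
        pending_.push_back(std::move(p));  // still waiting on another blob
      else if (p.second)
        p.second(0, result);
    }
  }

  // Per-request check that touches only local data (no network hop).
  static bool still_valid(const ResolvedCred& c, uint64_t now_sec) {
    return c.expires_sec == 0 || now_sec < c.expires_sec;
  }

private:
  std::map<std::string, ResolvedCred> cache_;
  std::vector<std::pair<std::string, ResolveCallback>> pending_;
};
```

In this sketch a plugin that needs periodic renewal would simply set expires_sec, and the caller would re-run resolve() once still_valid() reports false.
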
-Greg


* Re: MDS auth caps for cephfs
  2015-05-27 23:18                                     ` Gregory Farnum
@ 2015-05-27 23:59                                       ` Sage Weil
  2015-05-28  0:11                                         ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-27 23:59 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, 27 May 2015, Gregory Farnum wrote:
> On Wed, May 27, 2015 at 4:07 PM, Sage Weil <sweil@redhat.com> wrote:
> > On Wed, 27 May 2015, Gregory Farnum wrote:
> >> On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
> >> > On Wed, 27 May 2015, Gregory Farnum wrote:
> >> >> > I was just talking to Simo about the longer-term kerberos auth goals to
> >> >> > make sure we don't do something stupid here that we regret later.  His
> >> >> > feedback boils down to:
> >> >> >
> >> >> >  1) Don't bother with root squash since it doesn't buy you much, and
> >> >> >  2) Never let the client construct the credential--do it on the server.
> >> >> >
> >> >> > I'm okay with skipping squash_root (although it's simple enough it might
> >> >> > be worthwhile anyway)
> >> >>
> >> >> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
> >> >>
> >> >> > but #2 is a bit different than what I was thinking.
> >> >> > Specifically, this is about tagging requests with the uid + gid list.  If
> >> >> > you let the client provide the group membership you lose most of the
> >> >> > security--this is what NFS did and it sucked.  (There were other problems
> >> >> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
> >> >> > groups comes along.)
> >> >>
> >> >> I'm not sure I understand this bit. I thought we were planning to have
> >> >> gids in the cephx caps, and then have the client construct the list it
> >> >> thinks is appropriate for each given request?
> >> >> Obviously that trusts the client *some*, but it sandboxes them in and
> >> >> I'm not sure the trust is a useful extension as long as we make sure
> >> >> the UID and GID sets go together from the cephx caps.
> >> >
> >> > We went around in circles about this for a while, but in the end I think
> >> > we agreed there is minimal value from having the client construct anything
> >> > (the gid list in this case), and it avoids taking any step down what is
> >> > ultimately a dead-end road.  For example, caps like
> >> >
> >> >   allow rw gid 2000
> >> >
> >> > are useless since the client can set gid=2000 but then make the request
> >> > uid anything it wants (namely, the file owner).  Cutting the client out of
> >> > the picture also avoids the many-gid issue.
> >>
> >> I don't think I understand the threat model we're worried about here.
> >> (Granted a cap that sets gid but not uid sounds like a bad idea to
> >> me.) But if the cephx caps include the GID then a client can only use
> >> weaker ones than they're permitted, which could frequently be correct.
> >> For instance if each tenant in a multitenant system has a single cephx
> >> key, but they have both admin and non-admin users within their local
> >> context?
> >
> > Not sure I understand the question.  The threat model is... a client that
> > can send arbitrary requests and wants to modify files?
> >
> > - Any cap that specifies gid only is useless, since you can choose a uid
> > to match the file.
> >
> > - Any cap that specifies uid only exposes any group-writeable files/dirs.
> >
> > - Any cap that specifies uid and gid(s) is fine.
> >
> > ...but if we have a server-side mapping of uid -> gid(s), then any of
> > those is fine (we can specify uid only, gid only, or both).
> 
> Okay, so it's just malformed cephx caps then. We could just make it
> refuse to accept gid specs if there's not a uid one as well.

Well... it's meaningless if the client gets to choose the gid set.  If we 
don't do that, then it depends on what the server-side does.  If kerberos 
is used (i.e., the user doesn't get to choose an arbitrary uid) then it's 
okay.  But yeah, I guess we should disallow it for now until that becomes 
available, since in the meantime even with server-side uid->gid mapping 
they can pick any uid.

> Not that I'm necessarily opposed to doing it server-side, but I'm not
> sure where we'd store it in the minimal configuration (without
> kerberos or some other server to do lookups in) and not including them
> in the cephx caps just feels odd.

Yeah.  In fact, if we do have a server-side uid->gid map, and a cap like

 allow rw uid 100 gid 100

does the gid part actually accomplish anything?  I'm thinking it doesn't, 
and we can just forget gid in the caps entirely for the time being?  I 
mean, maybe the user is in groups 100, 200, and 300, but we only want 
them to act as though they're in 100 for this mount.. but who would even want 
to do that, and do we care at this point?

FWIW, my inclination would be to make the default mapping be a trivial 
mapping where the gid list == the uid.  Or, maybe, no gids at all.

> >> >> > I think we can get 1-3 without too much trouble!  The main question for me
> >> >> > right now is how we define the credential we tag requests and cap
> >> >> > writeback with.  Maybe something simple like
> >> >> >
> >> >> > struct ceph_cred_handle {
> >> >> >         enum { NONE, UID, OTHER } type;
> >> >> >         uint64_t id;
> >> >> > };
> >> >> >
> >> >> > For now we just stuff the uid into id.  For kerberos, we'll put some
> >> >> > cookie in there that came from a previous exchange where we passed the
> >> >> > kerberos ticket to the MDS and got an id.  (The ticket may be big--we
> >> >> > don't want to attach it to each request.)
> >> >>
> >> >> Okay, so we want to do a lot more than in-cephx uid and gid
> >> >> permissions granting? These look depressingly
> >> >> integration-intensive-difficult but not terribly complicated
> >> >> internally. I'd kind of like the interface to not imply we're doing
> >> >> external callouts on every MDS op, though!
> >> >
> >> > We'd probably need to allow it to be async (return EAGAIN) or something.
> >> > Some cases will hit a cache or be trivial and non-blocking, but others
> >> > will need to do an upcall to some slow network service.  Maybe
> >> >
> >> >   int resolve_credential(bufferlist cred, uid_t *uid, vector<gid_t>
> >> >      *gidls, Context *onfinish);
> >> >
> >> > where r == 0 means we did it, and r == -EAGAIN means we will call onfinish
> >> > when the result is ready.  Or some similar construct that lets us avoid a
> >> > spurious Context alloc+free in the fast path.
> >>
> >> Mmm. "slow network service" scares me. I presume you're thinking here
> >> that this is a per-session request, not a per-operation one? If we're
> >> going to include external security systems we probably need to let
> >> them get a say on every request but it very much needs to be local
> >> data only for those.
> >
> > The ceph_cred_handle would be per-request, but you would normally do
> > upcalls infrequently.  Like in the kerberos case, we'd do that when the
> > credential was registered (before it was used).  Then the resolve step
> > would have no network hop.  Or we might call out to LDAP, in which case
> > the plugin would go async, and then cache the result so it is fast the
> > next time around.
> 
> ceph_cred_handle as you defined it above is just an internal Ceph
> structure though, right? I'm imagining more complicated systems (which
> maybe don't exist) that couldn't be well-represented by a simple ID to
> permissions mapping that we'll always understand. Or systems that
> include timeouts and want us to renew the credential every N seconds.
> So it'd be useful to let them define their own per-request security
> check operating on static data, as well as the (async)
> credential-identifying upcall.

I'm thinking it can either be a simple int (like uid) for trivial schemes, 
or an id referencing a previous exchange that set up the complicated 
thing (like a kerberos ticket). e.g.,

 client -> mds : register_credential(<blob>)
 mds -> client : register_credential_reply(cred_handle.id=123, expires=...)
 client -> mds : request(mkdir foo, cred_handle.id=123)
 client -> mds : request(mkdir bar, cred_handle.id=123)
 client -> mds : request(mkdir baz, cred_handle.id=123)
 ...

I was just trying to keep it small and fixed-size.  But we could also just 
make it a bufferlist/blob in case there is a larger, non-fixed size thing 
that we want to include with every request (that's actually what Simo 
originally suggested).  I think that's only helpful though if we expect to 
have big blobs that aren't reused...
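
The exchange above could be sketched on the MDS side roughly as follows. This is only an illustration under stated assumptions: CredHandle mirrors the ceph_cred_handle struct proposed earlier in the thread, while Identity, CredRegistry and their methods are hypothetical names. A UID-type handle is trusted as-is and yields no groups here; an OTHER-type handle is looked up in a table filled at registration time and rejected once it has expired.

```cpp
#include <cstdint>
#include <sys/types.h>
#include <unordered_map>
#include <utility>
#include <vector>

// Small, fixed-size tag attached to every request.
struct CredHandle {
  enum Type { NONE, UID, OTHER } type;
  uint64_t id;
};

// Hypothetical resolved identity.
struct Identity {
  uid_t uid = 0;
  std::vector<gid_t> gids;
};

// Hypothetical per-session table of registered credentials.
class CredRegistry {
public:
  // Corresponds to register_credential / register_credential_reply: the big
  // blob has already been resolved to `who`; only the small id goes back.
  CredHandle register_credential(Identity who, uint64_t expires_sec) {
    uint64_t id = next_id_++;
    table_[id] = Entry{std::move(who), expires_sec};
    return CredHandle{CredHandle::OTHER, id};
  }

  // Called for each request; uses only local data.
  bool lookup(const CredHandle& h, uint64_t now_sec, Identity* out) const {
    switch (h.type) {
    case CredHandle::UID:
      // Trivial scheme: the id is the uid itself, with no group list.
      out->uid = static_cast<uid_t>(h.id);
      out->gids.clear();
      return true;
    case CredHandle::OTHER: {
      auto it = table_.find(h.id);
      if (it == table_.end() || now_sec >= it->second.expires_sec)
        return false;  // unknown or expired handle
      *out = it->second.who;
      return true;
    }
    default:
      return false;  // NONE carries no identity
    }
  }

private:
  struct Entry {
    Identity who;
    uint64_t expires_sec;
  };
  std::unordered_map<uint64_t, Entry> table_;
  uint64_t next_id_ = 1;
};
```

Keeping the handle at a fixed 16 bytes is the point of the design: the ticket is parsed once, and each mkdir only pays for a hash lookup.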

sage


* Re: MDS auth caps for cephfs
  2015-05-27 23:59                                       ` Sage Weil
@ 2015-05-28  0:11                                         ` Gregory Farnum
  2015-05-28  0:37                                           ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-28  0:11 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, May 27, 2015 at 4:59 PM, Sage Weil <sweil@redhat.com> wrote:
> On Wed, 27 May 2015, Gregory Farnum wrote:
>> On Wed, May 27, 2015 at 4:07 PM, Sage Weil <sweil@redhat.com> wrote:
>> > On Wed, 27 May 2015, Gregory Farnum wrote:
>> >> On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
>> >> > On Wed, 27 May 2015, Gregory Farnum wrote:
>> >> >> > I was just talking to Simo about the longer-term kerberos auth goals to
>> >> >> > make sure we don't do something stupid here that we regret later.  His
>> >> >> > feedback boils down to:
>> >> >> >
>> >> >> >  1) Don't bother with root squash since it doesn't buy you much, and
>> >> >> >  2) Never let the client construct the credential--do it on the server.
>> >> >> >
>> >> >> > I'm okay with skipping squash_root (although it's simple enough it might
>> >> >> > be worthwhile anyway)
>> >> >>
>> >> >> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
>> >> >>
>> >> >> > but #2 is a bit different than what I was thinking.
>> >> >> > Specifically, this is about tagging requests with the uid + gid list.  If
>> >> >> > you let the client provide the group membership you lose most of the
>> >> >> > security--this is what NFS did and it sucked.  (There were other problems
>> >> >> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
>> >> >> > groups comes along.)
>> >> >>
>> >> >> I'm not sure I understand this bit. I thought we were planning to have
>> >> >> gids in the cephx caps, and then have the client construct the list it
>> >> >> thinks is appropriate for each given request?
>> >> >> Obviously that trusts the client *some*, but it sandboxes them in and
>> >> >> I'm not sure the trust is a useful extension as long as we make sure
>> >> >> the UID and GID sets go together from the cephx caps.
>> >> >
>> >> > We went around in circles about this for a while, but in the end I think
>> >> > we agreed there is minimal value from having the client construct anything
>> >> > (the gid list in this case), and it avoids taking any step down what is
>> >> > ultimately a dead-end road.  For example, caps like
>> >> >
>> >> >   allow rw gid 2000
>> >> >
>> >> > are useless since the client can set gid=2000 but then make the request
>> >> > uid anything it wants (namely, the file owner).  Cutting the client out of
>> >> > the picture also avoids the many-gid issue.
>> >>
>> >> I don't think I understand the threat model we're worried about here.
>> >> (Granted a cap that sets gid but not uid sounds like a bad idea to
>> >> me.) But if the cephx caps include the GID then a client can only use
>> >> weaker ones than they're permitted, which could frequently be correct.
>> >> For instance if each tenant in a multitenant system has a single cephx
>> >> key, but they have both admin and non-admin users within their local
>> >> context?
>> >
>> > Not sure I understand the question.  The threat model is... a client that
>> > can send arbitrary requests and wants to modify files?
>> >
>> > - Any cap that specifies gid only is useless, since you can choose a uid
>> > to match the file.
>> >
>> > - Any cap that specifies uid only exposes any group-writeable files/dirs.
>> >
>> > - Any cap that specifies uid and gid(s) is fine.
>> >
>> > ...but if we have a server-side mapping of uid -> gid(s), then any of
>> > those is fine (we can specify uid only, gid only, or both).
>>
>> Okay, so it's just malformed cephx caps then. We could just make it
>> refuse to accept gid specs if there's not a uid one as well.
>
> Well... it's meaningless if the client gets to choose the gid set.  If we
> don't do that, then it depends on what the server-side does.  If kerberos
> is used (i.e., the user doesn't get to choose an arbitrary uid) then it's
> okay.  But yeah, I guess we should disallow it for now until that becomes
> available, since in the meantime even with server-side uid->gid mapping
> they can pick any uid.
>
>> Not that I'm necessarily opposed to doing it server-side, but I'm not
>> sure where we'd store it in the minimal configuration (without
>> kerberos or some other server to do lookups in) and not including them
>> in the cephx caps just feels odd.
>
> Yeah.  In fact, if we do have a server-side uid->gid map, and a cap like
>
>  allow rw uid 100 gid 100
>
> does the gid part actually accomplish anything?  I'm thinking it doesn't,
> and we can just forget gid in the caps entirely for the time being?  I
> mean, maybe the user is in groups 100, 200, and 300, but we only want
> them to act as though they're in 100 for this mount.. but who would even want
> to do that, and do we care at this point?

I don't understand. In the base case where there is no other user
authentication/authorization system in place, we need to either
support group allows in the cephx caps, or we disallow the use of
groups, or each individual user/mount can claim to be a member of
whatever group they want.

So that makes me think we need to support group allows in the cephx
caps, of form something like

allow rw uid 100 gid 100,200,300

That would let the client act as user 100 and as a member of groups
100, 200, and 300 (*only* with uid 100!) if they so desire. That
enables lots of important use cases with sharing, right? And it's not
the client choosing the allowed set, it's the Ceph administrator. Is
there something about this that we don't want to enable that I'm just
missing, or are you ignoring the non-Kerberos case, or is there some
conflict between this and the Kerberos case, or....? I feel like our
meanings are sliding past each other again here. :(
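
To make the intended semantics concrete, here is a small sketch of the check such a cap implies: the request's uid must equal the granted uid, and every gid the client claims must be in the administrator's list (so the client may claim fewer groups, never more). UidGidGrant and grant_allows are hypothetical names, not existing Ceph code.

```cpp
#include <algorithm>
#include <sys/types.h>
#include <vector>

// Hypothetical in-memory form of "allow rw uid 100 gid 100,200,300".
struct UidGidGrant {
  uid_t uid;
  std::vector<gid_t> gids;
};

// True if the request's claimed identity fits inside the grant: same uid,
// and the claimed gid list is a subset of the granted gids.
inline bool grant_allows(const UidGidGrant& g, uid_t req_uid,
                         const std::vector<gid_t>& req_gids) {
  if (req_uid != g.uid)
    return false;
  return std::all_of(req_gids.begin(), req_gids.end(), [&g](gid_t gid) {
    return std::find(g.gids.begin(), g.gids.end(), gid) != g.gids.end();
  });
}
```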

-Greg


* Re: MDS auth caps for cephfs
  2015-05-28  0:11                                         ` Gregory Farnum
@ 2015-05-28  0:37                                           ` Sage Weil
  2015-05-28  0:42                                             ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-28  0:37 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, 27 May 2015, Gregory Farnum wrote:
> On Wed, May 27, 2015 at 4:59 PM, Sage Weil <sweil@redhat.com> wrote:
> > On Wed, 27 May 2015, Gregory Farnum wrote:
> >> On Wed, May 27, 2015 at 4:07 PM, Sage Weil <sweil@redhat.com> wrote:
> >> > On Wed, 27 May 2015, Gregory Farnum wrote:
> >> >> On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
> >> >> > On Wed, 27 May 2015, Gregory Farnum wrote:
> >> >> >> > I was just talking to Simo about the longer-term kerberos auth goals to
> >> >> >> > make sure we don't do something stupid here that we regret later.  His
> >> >> >> > feedback boils down to:
> >> >> >> >
> >> >> >> >  1) Don't bother with root squash since it doesn't buy you much, and
> >> >> >> >  2) Never let the client construct the credential--do it on the server.
> >> >> >> >
> >> >> >> > I'm okay with skipping squash_root (although it's simple enough it might
> >> >> >> > be worthwhile anyway)
> >> >> >>
> >> >> >> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
> >> >> >>
> >> >> >> > but #2 is a bit different than what I was thinking.
> >> >> >> > Specifically, this is about tagging requests with the uid + gid list.  If
> >> >> >> > you let the client provide the group membership you lose most of the
> >> >> >> > security--this is what NFS did and it sucked.  (There were other problems
> >> >> >> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
> >> >> >> > groups comes along.)
> >> >> >>
> >> >> >> I'm not sure I understand this bit. I thought we were planning to have
> >> >> >> gids in the cephx caps, and then have the client construct the list it
> >> >> >> thinks is appropriate for each given request?
> >> >> >> Obviously that trusts the client *some*, but it sandboxes them in and
> >> >> >> I'm not sure the trust is a useful extension as long as we make sure
> >> >> >> the UID and GID sets go together from the cephx caps.
> >> >> >
> >> >> > We went around in circles about this for a while, but in the end I think
> >> >> > we agreed there is minimal value from having the client construct anything
> >> >> > (the gid list in this case), and it avoids taking any step down what is
> >> >> > ultimately a dead-end road.  For example, caps like
> >> >> >
> >> >> >   allow rw gid 2000
> >> >> >
> >> >> > are useless since the client can set gid=2000 but then make the request
> >> >> > uid anything it wants (namely, the file owner).  Cutting the client out of
> >> >> > the picture also avoids the many-gid issue.
> >> >>
> >> >> I don't think I understand the threat model we're worried about here.
> >> >> (Granted a cap that sets gid but not uid sounds like a bad idea to
> >> >> me.) But if the cephx caps include the GID then a client can only use
> >> >> weaker ones than they're permitted, which could frequently be correct.
> >> >> For instance if each tenant in a multitenant system has a single cephx
> >> >> key, but they have both admin and non-admin users within their local
> >> >> context?
> >> >
> >> > Not sure I understand the question.  The threat model is... a client that
> >> > can send arbitrary requests and wants to modify files?
> >> >
> >> > - Any cap that specifies gid only is useless, since you can choose a uid
> >> > to match the file.
> >> >
> >> > - Any cap that specifies uid only exposes any group-writeable files/dirs.
> >> >
> >> > - Any cap that specifies uid and gid(s) is fine.
> >> >
> >> > ...but if we have a server-side mapping of uid -> gid(s), then any of
> >> > those is fine (we can specify uid only, gid only, or both).
> >>
> >> Okay, so it's just malformed cephx caps then. We could just make it
> >> refuse to accept gid specs if there's not a uid one as well.
> >
> > Well... it's meaningless if the client gets to choose the gid set.  If we
> > don't do that, then it depends on what the server-side does.  If kerberos
> > is used (i.e., the user doesn't get to choose an arbitrary uid) then it's
> > okay.  But yeah, I guess we should disallow it for now until that becomes
> > available, since in the meantime even with server-side uid->gid mapping
> > they can pick any uid.
> >
> >> Not that I'm necessarily opposed to doing it server-side, but I'm not
> >> sure where we'd store it in the minimal configuration (without
> >> kerberos or some other server to do lookups in) and not including them
> >> in the cephx caps just feels odd.
> >
> > Yeah.  In fact, if we do have a server-side uid->gid map, and a cap like
> >
> >  allow rw uid 100 gid 100
> >
> > does the gid part actually accomplish anything?  I'm thinking it doesn't,
> > and we can just forget gid in the caps entirely for the time being?  I
> > mean, maybe the user is in groups 100, 200, and 300, but we only want
> > them to act as though they're in 100 for this mount.. but who would even want
> > to do that, and do we care at this point?
> 
> I don't understand. In the base case where there is no other user
> authentication/authorization system in place, we need to either
> support group allows in the cephx caps, or we disallow the use of
> groups, or each individual user/mount can claim to be a member of
> whatever group they want.

Oh, right.  I was assuming the MDS is configured with some uid->gid 
backend.  But many won't have or want that, and

> So that makes me think we need to support group allows in the cephx
> caps, of form something like
> 
>  allow rw uid 100 gid 100,200,300
> 
> That would let the client act as user 100 and as a member of groups
> 100, 200, and 300 (*only* with uid 100!) if they so desire. That
> enables lots of important use cases with sharing, right? And it's not
> the client choosing the allowed set, it's the Ceph administrator. Is
> there something about this that we don't want to enable that I'm just
> missing, or are you ignoring the non-Kerberos case, or is there some
> conflict between this and the Kerberos case, or....?

I think this makes perfect sense.  :)

So, the use-cases are now:

1) No authentication: 'allow any'.  What we have now.

2) Subtree restriction: 'allow rw path /foo'.

3) Uid and group restriction: 'allow rw uid 123 gids 123,1000,1001'.

4) Uid restriction + some backend: 'allow rw uid 123'.  MDS will do some 
call-out to map each uid to a gid list.

5) Kerberos: 'allow rw kerberos blah blah'.  Client presents user tickets 
to MDS, and MDS will do the call-out to map that to a uid + gid list.

6) 2 + 3

7) 2 + 4

8) 2 + 5

And then later, maybe,

9) 2 + a different backend for each subtree.
...
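
For cases 1-3 and 6, a toy parser for the grant strings might look like the sketch below. The grammar is an assumption taken from the examples in this thread ('allow', then 'any', 'r' or 'rw', then optional 'path', 'uid' and 'gids' clauses); it is not the real MDSAuthCaps parser, and ParsedGrant and parse_grant are hypothetical names. It also rejects a gid list without a uid, as discussed earlier.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parsed form of one grant.
struct ParsedGrant {
  bool any = false, read = false, write = false;
  std::string path = "/";  // "/" means no subtree restriction
  bool has_uid = false;
  uint32_t uid = 0;
  std::vector<uint32_t> gids;
};

// Parses e.g. "allow rw path /foo uid 123 gids 123,1000,1001".
// Returns false on malformed input or on a gid list without a uid.
inline bool parse_grant(const std::string& text, ParsedGrant* out) {
  std::istringstream in(text);
  std::string tok;
  ParsedGrant g;
  if (!(in >> tok) || tok != "allow")
    return false;
  if (!(in >> tok))
    return false;
  if (tok == "any")
    g.any = true;
  else if (tok == "r")
    g.read = true;
  else if (tok == "rw")
    g.read = g.write = true;
  else
    return false;
  while (in >> tok) {
    std::string val;
    if (!(in >> val))
      return false;  // every clause needs a value
    try {
      // std::stoul throws if the text does not start with a number.
      if (tok == "path") {
        g.path = val;
      } else if (tok == "uid") {
        g.uid = static_cast<uint32_t>(std::stoul(val));
        g.has_uid = true;
      } else if (tok == "gids") {
        std::istringstream list(val);
        std::string item;
        while (std::getline(list, item, ','))
          g.gids.push_back(static_cast<uint32_t>(std::stoul(item)));
      } else {
        return false;  // unknown clause
      }
    } catch (const std::exception&) {
      return false;
    }
  }
  if (!g.gids.empty() && !g.has_uid)
    return false;  // gid-only caps are meaningless without a uid
  *out = g;
  return true;
}
```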

> I feel like our meanings are sliding past each other again here. :(

Hopefully not?  :)
sage


* Re: MDS auth caps for cephfs
  2015-05-28  0:37                                           ` Sage Weil
@ 2015-05-28  0:42                                             ` Gregory Farnum
  2015-05-28 16:20                                               ` Robert LeBlanc
  0 siblings, 1 reply; 35+ messages in thread
From: Gregory Farnum @ 2015-05-28  0:42 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Wed, May 27, 2015 at 5:37 PM, Sage Weil <sweil@redhat.com> wrote:
> On Wed, 27 May 2015, Gregory Farnum wrote:
>> On Wed, May 27, 2015 at 4:59 PM, Sage Weil <sweil@redhat.com> wrote:
>> > On Wed, 27 May 2015, Gregory Farnum wrote:
>> >> On Wed, May 27, 2015 at 4:07 PM, Sage Weil <sweil@redhat.com> wrote:
>> >> > On Wed, 27 May 2015, Gregory Farnum wrote:
>> >> >> On Wed, May 27, 2015 at 3:21 PM, Sage Weil <sweil@redhat.com> wrote:
>> >> >> > On Wed, 27 May 2015, Gregory Farnum wrote:
>> >> >> >> > I was just talking to Simo about the longer-term kerberos auth goals to
>> >> >> >> > make sure we don't do something stupid here that we regret later.  His
>> >> >> >> > feedback boils down to:
>> >> >> >> >
>> >> >> >> >  1) Don't bother with root squash since it doesn't buy you much, and
>> >> >> >> >  2) Never let the client construct the credential--do it on the server.
>> >> >> >> >
>> >> >> >> > I'm okay with skipping squash_root (although it's simple enough it might
>> >> >> >> > be worthwhile anyway)
>> >> >> >>
>> >> >> >> Oh, I like skipping it, given the syntax and usability problems we went over. ;)
>> >> >> >>
>> >> >> >> > but #2 is a bit different than what I was thinking.
>> >> >> >> > Specifically, this is about tagging requests with the uid + gid list.  If
>> >> >> >> > you let the client provide the group membership you lose most of the
>> >> >> >> > security--this is what NFS did and it sucked.  (There were other problems
>> >> >> >> > too, like a limit of 16 gids, and/or problems when a windows admin in 4000
>> >> >> >> > groups comes along.)
>> >> >> >>
>> >> >> >> I'm not sure I understand this bit. I thought we were planning to have
>> >> >> >> gids in the cephx caps, and then have the client construct the list it
>> >> >> >> thinks is appropriate for each given request?
>> >> >> >> Obviously that trusts the client *some*, but it sandboxes them in and
>> >> >> >> I'm not sure the trust is a useful extension as long as we make sure
>> >> >> >> the UID and GID sets go together from the cephx caps.
>> >> >> >
>> >> >> > We went around in circles about this for a while, but in the end I think
>> >> >> > we agreed there is minimal value from having the client construct anything
>> >> >> > (the gid list in this case), and it avoids taking any step down what is
>> >> >> > ultimately a dead-end road.  For example, caps like
>> >> >> >
>> >> >> >   allow rw gid 2000
>> >> >> >
>> >> >> > are useless since the client can set gid=2000 but then make the request
>> >> >> > uid anything it wants (namely, the file owner).  Cutting the client out of
>> >> >> > the picture also avoids the many-gid issue.
>> >> >>
>> >> >> I don't think I understand the threat model we're worried about here.
>> >> >> (Granted a cap that sets gid but not uid sounds like a bad idea to
>> >> >> me.) But if the cephx caps include the GID then a client can only use
>> >> >> weaker ones than they're permitted, which could frequently be correct.
>> >> >> For instance if each tenant in a multitenant system has a single cephx
>> >> >> key, but they have both admin and non-admin users within their local
>> >> >> context?
>> >> >
>> >> > Not sure I understand the question.  The threat model is... a client that
>> >> > can send arbitrary requests and wants to modify files?
>> >> >
>> >> > - Any cap that specifies gid only is useless, since you can choose a uid
>> >> > to match the file.
>> >> >
>> >> > - Any cap that specifies uid only exposes any group-writeable files/dirs.
>> >> >
>> >> > - Any cap that specifies uid and gid(s) is fine.
>> >> >
>> >> > ...but if we have a server-side mapping of uid -> gid(s), then any of
>> >> > those is fine (we can specify uid only, gid only, or both).
>> >>
>> >> Okay, so it's just malformed cephx caps then. We could just make it
>> >> refuse to accept gid specs if there's not a uid one as well.
>> >
>> > Well... it's meaningless if the client gets to choose the gid set.  If we
>> > don't do that, then it depends on what the server-side does.  If kerberos
>> > is used (i.e., the user doesn't get to choose an arbitrary uid) then it's
>> > okay.  But yeah, I guess we should disallow it for now until that becomes
>> > available, since in the meantime even with server-side uid->gid mapping
>> > they can pick any uid.
>> >
>> >> Not that I'm necessarily opposed to doing it server-side, but I'm not
>> >> sure where we'd store it in the minimal configuration (without
>> >> kerberos or some other server to do lookups in) and not including them
>> >> in the cephx caps just feels odd.
>> >
>> > Yeah.  In fact, if we do have a server-side uid->gid map, and a cap like
>> >
>> >  allow rw uid 100 gid 100
>> >
>> > does the gid part actually accomplish anything?  I'm thinking it doesn't,
>> > and we can just forget gid in the caps entirely for the time being?  I
>> > mean, maybe the user is in groups 100, 200, and 300, but we only want
>> > them to act as though they're in 100 for this mount.. but who would even want
>> > to do that, and do we care at this point?
>>
>> I don't understand. In the base case where there is no other user
>> authentication/authorization system in place, we need to either
>> support group allows in the cephx caps, or we disallow the use of
>> groups, or each individual user/mount can claim to be a member of
>> whatever group they want.
>
> Oh, right.  I was assuming the MDS is configured with some uid->gid
> backend.  But many won't have or want that, and
>
>> So that makes me think we need to support group allows in the cephx
>> caps, of form something like
>>
>>  allow rw uid 100 gid 100,200,300
>>
>> That would let the client act as user 100 and as a member of groups
>> 100, 200, and 300 (*only* with uid 100!) if they so desire. That
>> enables lots of important use cases with sharing, right? And it's not
>> the client choosing the allowed set, it's the Ceph administrator. Is
>> there something about this that we don't want to enable that I'm just
>> missing, or are you ignoring the non-Kerberos case, or is there some
>> conflict between this and the Kerberos case, or....?
>
> I think this makes perfect sense.  :)
>
> So, the use-cases are now:
>
> 1) No authentication: 'allow any'.  What we have now.
>
> 2) Subtree restriction: 'allow rw path /foo'.
>
> 3) Uid and group restriction: 'allow rw uid 123 gids 123,1000,1001'.
>
> 4) Uid restriction + some backend: 'allow rw uid 123'.  MDS will do some
> call-out to map each uid to a gid list.
>
> 5) Kerberos: 'allow rw kerberos blah blah'.  Client presents user tickets
> to MDS, and MDS will do the call-out to map that to a uid + gid list.
>
> 6) 2 + 3
>
> 7) 2 + 4
>
> 8) 2 + 5
>
> And then later, maybe,
>
> 9) 2 + a different backend for each subtree.
> ...
>
>> I feel like our meanings are sliding past each other again here. :(
>
> Hopefully not?  :)
> sage

Hurray!


* Re: MDS auth caps for cephfs
  2015-05-28  0:42                                             ` Gregory Farnum
@ 2015-05-28 16:20                                               ` Robert LeBlanc
  2015-05-28 16:42                                                 ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Robert LeBlanc @ 2015-05-28 16:20 UTC (permalink / raw)
  To: Gregory Farnum
  Cc: Sage Weil, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

I've been trying to follow this and I've been lost many times, but I'd
like to put in my $0.02.  In my mind any multi-tenant system that
relies on the client to specify UID/GID as authoritative is
fundamentally flawed. The server needs to be authoritative with access
or I would not trust it in a multi-tenant environment.

My take is to have the User key (generated by the Ceph admin) specify the
CephFS directory|directories the key can access and the rwx
permissions for the directory|directories and then leave it up to the
tenant to handle the UID/GID allocation and the synchronization
between their hosts. Some tenants may want just local UID/GID
management, others may want LDAP, Kerberos, etc. I believe Ceph should
only be worried about "share" permissions and leave "file" permissions
to the tenant. Ceph just needs the ability to store UID/GID and POSIX
ACLs.

The MDS could combine a tenant ID and a UID/GID to store unique
UID/GIDs on the back end and just strip off the tenant ID when
presented to the client so there are no collisions of UID/GIDs between
tenants in the MDS.

Please excuse me if I'm off the rails here, but I think this is one
thing SMB got right and why I prefer Samba over NFS for multi-tenant
environments.
- ----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1



-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJVZ0BbCRDmVDuy+mK58QAAI9IQAMZKPcVrrR3TOjH0SQZ5
Col1MIhiGcz1aeUC5ApPkAvwsGSQtAswoOc4GtKMpxc/1HNPRIeJ+qetme2K
/czP2O6L1wlk+i50oS9sWBF2yU1ZrIaBcuYhMPrf90vr2Sp0Y2dqZdvhBbzT
mrtvCNDyjPGNYYB4CjfmtiUNzNzyNPN8dleG87UpF8jWJWhrmlTVAY+jwpaM
Y5jFlAOZIzhhR2hX9lEsmMVvZALo4Dqu/6auWObOFeb/elROaHMLFH7ovV+z
zjrExONKv77zI0BwXYu9wkOUzTNeCCzhBMwgDkqoXekiWVOmcxcHru2Rmjyf
iEyhO39EV9z6fhYGPX3vt+sEV+Bboisk+6xZf5hU7PHvdwZ43lVMGeKQc4Tx
Jowk7sjnNzx4uFqPtc+MwOPwoOCc58QLO2xOKIg8fatexlL2jiQRY6mtUtRe
ZCSnRvr5c+jmq3cj19S6NLhhN7FaiNJ5wL8nylkAUzXLI2QO3EoqCZYiEyln
/pRBPJcgOoZ7qmIkqXuyQtd9ixqLh9QZ/RLihWlLdHRBCGi/0abUyhlRojRs
GF0X3LH6t4Fg+WRe5hIbUb+wcVdda8SppXx1zdsRKFZTm11qcmYpR1K2wIXg
CUqRb2OFrlPyQx553BYksmEoPKCcZiEulcGZ33NsVJxneOSN40gOcm7wDBlm
GOsp
=D88o
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: MDS auth caps for cephfs
  2015-05-28 16:20                                               ` Robert LeBlanc
@ 2015-05-28 16:42                                                 ` Gregory Farnum
  2015-05-28 17:02                                                   ` Sage Weil
  2015-05-28 17:06                                                   ` Robert LeBlanc
  0 siblings, 2 replies; 35+ messages in thread
From: Gregory Farnum @ 2015-05-28 16:42 UTC (permalink / raw)
  To: Robert LeBlanc
  Cc: Sage Weil, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Thu, May 28, 2015 at 9:20 AM, Robert LeBlanc <robert@leblancnet.us> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> I've been trying to follow this and I've been lost many times, but I'd
> like to put in my $0.02.  In my mind any multi-tenant system that
> relies on the client to specify UID/GID as authoritative is
> fundamentally flawed. The server needs to be authoritative with access
> or I would not trust it in a multi-tenant environment.
>
> My take is to have the User key (generated by the Ceph admin) specify the
> CephFS directory|directories the key can access and the rwx
> permissions for the directory|directories and then leave it up to the
> tenant to handle the UID/GID allocation and the synchronization
> between their hosts.

Right, this is basically what we're planning. The sticky bits are about
1) dealing with clients that have access to multiple UIDs/GIDs
(because different end users are on the same host, for instance). :)
2) dealing with "public cloud"-like scenarios, where you have a bunch
of tenants who are all root on their own machines and thus control
their UID space. (Right now we can't put multiple CephFS instances in
a single RADOS cluster, so the only obvious way to support this is by
giving each client their own subspace within the unified hierarchy.)

> Some tenants may want just local UID/GID
> management, others may want LDAP, Kerberos, etc. I believe Ceph should
> only be worried about "share" permissions and leave "file" permissions
> to the tenant. Ceph just needs the ability to store UID/GID and POSIX
> ACLs.

Well that doesn't quite work — it's entirely possible you want to
share read-only files with a bunch of people that shouldn't be allowed
to write them; that lack of write ability needs to be enforced by Ceph
at the server layer!

>
> The MDS could combine a tenant ID and a UID/GID to store unique
> UID/GIDs on the back end and just strip off the tenant ID when
> presented to the client so there are no collisions of UID/GIDs between
> tenants in the MDS.

Hmm, that is another thought...
-Greg

>
> Please excuse me if I'm off the rails here, but I think this is one
> thing SMB got right and why I prefer Samba over NFS for multi-tenant
> environments.
> - ----------------
> Robert LeBlanc
> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: Mailvelope v0.13.1
> Comment: https://www.mailvelope.com
>
> wsFcBAEBCAAQBQJVZ0BbCRDmVDuy+mK58QAAI9IQAMZKPcVrrR3TOjH0SQZ5
> Col1MIhiGcz1aeUC5ApPkAvwsGSQtAswoOc4GtKMpxc/1HNPRIeJ+qetme2K
> /czP2O6L1wlk+i50oS9sWBF2yU1ZrIaBcuYhMPrf90vr2Sp0Y2dqZdvhBbzT
> mrtvCNDyjPGNYYB4CjfmtiUNzNzyNPN8dleG87UpF8jWJWhrmlTVAY+jwpaM
> Y5jFlAOZIzhhR2hX9lEsmMVvZALo4Dqu/6auWObOFeb/elROaHMLFH7ovV+z
> zjrExONKv77zI0BwXYu9wkOUzTNeCCzhBMwgDkqoXekiWVOmcxcHru2Rmjyf
> iEyhO39EV9z6fhYGPX3vt+sEV+Bboisk+6xZf5hU7PHvdwZ43lVMGeKQc4Tx
> Jowk7sjnNzx4uFqPtc+MwOPwoOCc58QLO2xOKIg8fatexlL2jiQRY6mtUtRe
> ZCSnRvr5c+jmq3cj19S6NLhhN7FaiNJ5wL8nylkAUzXLI2QO3EoqCZYiEyln
> /pRBPJcgOoZ7qmIkqXuyQtd9ixqLh9QZ/RLihWlLdHRBCGi/0abUyhlRojRs
> GF0X3LH6t4Fg+WRe5hIbUb+wcVdda8SppXx1zdsRKFZTm11qcmYpR1K2wIXg
> CUqRb2OFrlPyQx553BYksmEoPKCcZiEulcGZ33NsVJxneOSN40gOcm7wDBlm
> GOsp
> =D88o
> -----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: MDS auth caps for cephfs
  2015-05-28 16:42                                                 ` Gregory Farnum
@ 2015-05-28 17:02                                                   ` Sage Weil
  2015-05-28 17:21                                                     ` Robert LeBlanc
  2015-05-28 17:06                                                   ` Robert LeBlanc
  1 sibling, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-28 17:02 UTC (permalink / raw)
  To: Gregory Farnum
  Cc: Robert LeBlanc, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Thu, 28 May 2015, Gregory Farnum wrote:
> On Thu, May 28, 2015 at 9:20 AM, Robert LeBlanc <robert@leblancnet.us> wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA256
> >
> > I've been trying to follow this and I've been lost many times, but I'd
> > like to put in my $0.02.  In my mind any multi-tenant system that
> > relies on the client to specify UID/GID as authoritative is
> > fundamentally flawed. The server needs to be authoritative with access
> > or I would not trust it in a multi-tenant environment.
> >
> > My take is to have the User key (generated by the Ceph admin) specify the
> > CephFS directory|directories the key can access and the rwx
> > permissions for the directory|directories and then leave it up to the
> > tenant to handle the UID/GID allocation and the synchronization
> > between their hosts.
> 
> Right, this is basically what we're planning. The sticky bits are about
> 1) dealing with clients that have access to multiple UIDs/GIDs
> (because different end users are on the same host, for instance). :)
> 2) dealing with "public cloud"-like scenarios, where you have a bunch
> of tenants who are all root on their own machines and thus control
> their UID space. (Right now we can't put multiple CephFS instances in
> a single RADOS cluster, so the only obvious way to support this is by
> giving each client their own subspace within the unified hierarchy.)

Yep!

> > Some tenants may want just local UID/GID
> > management, others may want LDAP, Kerberos, etc. I believe Ceph should
> > only be worried about "share" permissions and leave "file" permissions
> > to the tenant. Ceph just needs the ability to store UID/GID and POSIX
> > ACLs.
> 
> Well that doesn't quite work -- it's entirely possible you want to
> share read-only files with a bunch of people that shouldn't be allowed
> to write them; that lack of write ability needs to be enforced by Ceph
> at the server layer!

I think with what we're proposing you can still do this.  You'd use ceph 
capabilities that lock mounts into subtrees and do nothing else.  Ceph 
can continue to store uid/gids/acls but not interpret them.

The extra complexity we're talking about would kick in if you *do* want to 
share the same subtrees across users and want CephFS to enforce unix 
permissions server-side.  That's important for some users.  And it's great 
to hear that it's not important for lots of others!
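
As a rough illustration of "lock mounts into subtrees", a sketch of the path check a cap like 'allow rw path /foo' implies. path_in_subtree is a hypothetical helper, not an existing Ceph function, and it assumes both paths are absolute and already normalized.

```cpp
#include <string>

// True if `path` is `root` itself or lies beneath it. Both are assumed to be
// absolute, normalized paths (no "..", no trailing slash except for "/").
bool path_in_subtree(const std::string& path, const std::string& root) {
  // A root of "/" matches any absolute path.
  if (root == "/") return !path.empty() && path[0] == '/';
  // `path` must start with `root`...
  if (path.compare(0, root.size(), root) != 0) return false;
  // ...and end there or continue at a component boundary, so that
  // "/foobar" does not match root "/foo".
  return path.size() == root.size() || path[root.size()] == '/';
}
```

The component-boundary test is the part that is easy to get wrong with a plain prefix comparison.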

> > The MDS could combine a tenant ID and a UID/GID to store unique
> > UID/GIDs on the back end and just strip off the tenant ID when
> > presented to the client so there are no collisions of UID/GIDs between
> > tenants in the MDS.
> 
> Hmm, that is another thought...

Unless you ask Ceph to enforce the unix permissions server side, the 
uid/gid are stored but not interpreted.  I don't think the tenant ID is 
needed since there is no impact if the same uids are used in different 
subtrees.  It's just up to the admin to divvy up non-overlapping subtrees 
to the tenants...

sage

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: MDS auth caps for cephfs
  2015-05-28 16:42                                                 ` Gregory Farnum
  2015-05-28 17:02                                                   ` Sage Weil
@ 2015-05-28 17:06                                                   ` Robert LeBlanc
  1 sibling, 0 replies; 35+ messages in thread
From: Robert LeBlanc @ 2015-05-28 17:06 UTC (permalink / raw)
  To: Gregory Farnum
  Cc: Sage Weil, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

I think if there is a way to store the tenant ID with the UID/GID,
then a lot of the challenges could be resolved.

On Thu, May 28, 2015 at 10:42 AM, Gregory Farnum  wrote:

> Right, this is basically what we're planning. The sticky bits are about
> 1) dealing with clients that have access to multiple UIDs/GIDs
> (because different end users are on the same host, for instance). :)

I'm having trouble visualizing this. This seems to be the best option
because there is no collision of UID/GIDs. If the problem is that
users on multiple boxes don't have the ability to set their UID/GID to
the same thing, then that is where the client can specify which
UID/GID to use. But by doing so, they can't access anything from other
tenants because the tenant ID won't match on the MDS.

> 2) dealing with "public cloud"-like scenarios, where you have a bunch
> of tenants who are all root on their own machines and thus control
> their UID space. (Right now we can't put multiple CephFS instances in
> a single RADOS cluster, so the only obvious way to support this is by
> giving each client their own subspace within the unified hierarchy.)

Yes, the tenant ID would let each tenant do their own thing
regardless of root access to their local boxes. It creates a separate
space that the MDS can enforce.

>> Some tenants may want just local UID/GID
>> management, others may want LDAP, Kerberos, etc. I believe Ceph should
>> only be worried about "share" permissions and leave "file" permissions
>> to the tenant. Ceph just needs the ability to store UID/GID and POSIX
>> ACLs.
>
> Well that doesn't quite work — it's entirely possible you want to
> share read-only files with a bunch of people that shouldn't be allowed
> to write them; that lack of write ability needs to be enforced by Ceph
> at the server layer!

The MDS would add the tenant ID to the perms. If it doesn't match,
then they can't write, only read, because that is what is specified in
the User caps; only the "owner" of the directory would have write
access.

>> The MDS could combine a tenant ID and a UID/GID to store unique
>> UID/GIDs on the back end and just strip off the tenant ID when
>> presented to the client so there are no collisions of UID/GIDs between
>> tenants in the MDS.
>
> Hmm, that is another thought...
> -Greg

- ----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJVZ0sFCRDmVDuy+mK58QAAOqAQAJYvWDoCK6bv6YN49ccx
Ifu9P+1utGAOa9P+pE+BOGqdGWuMo++W371Iy5MkjsYU9Rli4LMwGxLjL8cs
s1wNyZ0KZej8q4ibkz1wH0btdv+vcIB79NICZE1CzKCIsMX5tWb6U2T0RG0g
KHhPQXFtYteRHimIUskG7VVvXyti8N0ysOUoSWDz/CXcFZEw2en9+M4jyMlr
9wqbzJzftfBpGq7FMXUaQC5t4tRTTg3mN8UPK2/SSINawwyM4QeCLHD+1+5b
1ksTAkD/weOX+IBnvRlaA9s+GsMHxi5pl7Ns2Oxb+TV3Kyx8mTI9ly28Euy+
JwKD3IyVS3ByIOfuzlaSsRwKnisaq3w1oaVeN9rAmDBK6N/CGsnGQ7ZdhcaO
wlFJcW9k1RMYGA4/dc08MIDs0k2Xo3qjuJNvtIT0et97l7qtylRYno7D1Uh4
cd4v5kzH7FgmuUIFn8x5GOhnHeDR9essL7PbidqyBYh6KEgoAQPzw99OyHje
rP3dlO4CmQDmB5lNj45CNFRe65kyMCbP5kMlHNWMu8NUiZo65yeXp2MX3+oB
9Vkq51Nznb3KTYqIcDTk3SCK5hKFcv55zAFvlnPLoapUeya3tsUmZn5Ui4U7
JZtzMebvtcHaITf9gZjmrST34B6D4Eqw5fydN0qsVXlVBmbbU86CkOPWVLS+
Fwa+
=AZFj
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: MDS auth caps for cephfs
  2015-05-28 17:02                                                   ` Sage Weil
@ 2015-05-28 17:21                                                     ` Robert LeBlanc
  2015-05-28 17:32                                                       ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Robert LeBlanc @ 2015-05-28 17:21 UTC (permalink / raw)
  To: Sage Weil
  Cc: Gregory Farnum, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Thu, May 28, 2015 at 11:02 AM, Sage Weil <sage@newdream.net> wrote:

>> > The MDS could combine a tenant ID and a UID/GID to store unique
>> > UID/GIDs on the back end and just strip off the tenant ID when
>> > presented to the client so there are no collisions of UID/GIDs between
>> > tenants in the MDS.
>>
>> Hmm, that is another thought...
>
> Unless you ask Ceph to enforce the unix permissions server side, the
> uid/gid are stored but not interpreted.  I don't think the tenant ID is
> needed since there is no impact if the same uids are used in different
> subtrees.  It's just up to the admin to divvy up non-overlapping subtrees
> to the tenants...

I don't think I would expect Ceph to enforce file permissions, only
share permissions. The file permissions should be enforced by the host
mounting the FS.

If Ceph prepends 8 bits as the Tenant ID to the 16 bit UID/GID for
example, then Ceph only has to match and act on the first 8 bits and pass
the next 16 bits to the client. It only has to store the UID/GID and
perms/ACLs. Then you don't have to worry about not crossing the
streams in subtrees. If an admin later decides to provide access to
two clients to the same subtree nothing bad will happen. The two
clients will have to figure out between themselves what to do with
UID/GIDs/perms/ACLs to provide the appropriate access.
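
To make the bit layout concrete, a sketch using the widths from the example (8-bit tenant ID, 16-bit UID/GID), with the tenant ID in the high bits of the stored value. The function names are hypothetical, and nothing like this exists in the MDS today.

```cpp
#include <cstdint>

// Width of the client-visible UID/GID in this example.
constexpr unsigned kIdBits = 16;

// Stored form: tenant ID in the high bits, client-visible ID in the low 16.
constexpr uint32_t pack_id(uint8_t tenant, uint16_t id) {
  return (static_cast<uint32_t>(tenant) << kIdBits) | id;
}

// The part the MDS would match on.
constexpr uint8_t tenant_of(uint32_t stored) {
  return static_cast<uint8_t>(stored >> kIdBits);
}

// The part handed back to the client, with the tenant ID stripped off.
constexpr uint16_t client_id(uint32_t stored) {
  return static_cast<uint16_t>(stored & 0xFFFFu);
}
```

Two tenants that both use uid 1000 end up with different stored values, while each still sees 1000 on its own hosts.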

If for instance a directory is shared between tenant A and B, and A
can write and B can't, then when B tries to write (because the perms
are correct for the UID/GID on the client side), the MDS will prevent
the write because that tenant doesn't have "share" write access on
that directory.
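
A sketch of that "share" check for the tenant A / tenant B example, assuming the MDS keeps a per-directory table of what each tenant's key may do. ShareAccess, ShareTable and the two functions are hypothetical names for illustration.

```cpp
#include <map>
#include <string>

enum class ShareAccess { None, ReadOnly, ReadWrite };

// Hypothetical table: what each tenant's key may do on one shared directory.
using ShareTable = std::map<std::string, ShareAccess>;

// Decision the MDS would make for a write. The outcome of the client-side
// UID/GID permission check is deliberately not an input here.
bool mds_allows_write(const ShareTable& shares, const std::string& tenant) {
  auto it = shares.find(tenant);
  return it != shares.end() && it->second == ShareAccess::ReadWrite;
}

// Reads pass for any tenant listed with at least read-only access.
bool mds_allows_read(const ShareTable& shares, const std::string& tenant) {
  auto it = shares.find(tenant);
  return it != shares.end() && it->second != ShareAccess::None;
}
```

With A listed as ReadWrite and B as ReadOnly, B's write is refused even when the file perms on B's host would permit it.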

If a tenant wants to allow write access to part of a directory, then
there has to be some level of trust that they will act responsibly. I
can't see getting around that without implementing Kerberos and
preventing the client from mapping to other UID/GIDs, but that really
takes the flexibility out of the system.

----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: MDS auth caps for cephfs
  2015-05-28 17:21                                                     ` Robert LeBlanc
@ 2015-05-28 17:32                                                       ` Sage Weil
  2015-05-28 18:29                                                         ` Robert LeBlanc
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-05-28 17:32 UTC (permalink / raw)
  To: Robert LeBlanc
  Cc: Gregory Farnum, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Thu, 28 May 2015, Robert LeBlanc wrote:
> On Thu, May 28, 2015 at 11:02 AM, Sage Weil <sage@newdream.net> wrote:
> 
> >> > The MDS could combine a tenant ID and a UID/GID to store unique
> >> > UID/GIDs on the back end and just strip off the tenant ID when
> >> > presented to the client so there are no collisions of UID/GIDs between
> >> > tenants in the MDS.
> >>
> >> Hmm, that is another thought...
> >
> > Unless you ask Ceph to enforce the unix permissions server side, the
> > uid/gid are stored but not interpreted.  I don't think the tenant ID is
> > needed since there is no impact if the same uids are used in different
> > subtrees.  It's just up to the admin to divvy up non-overlapping subtrees
> > to the tenants...
> 
> I don't think I would expect Ceph to enforce file permissions, only
> share permissions. The file permissions should be enforced by the host
> mounting the FS.
> 
> If Ceph appends 8 bits as the Tenant ID to the 16 bit UID/GID for
> example, then Ceph only has to match and act on the first 8 bits and pass
> the next 16 bits to the client. It only has to store the UID/GID and
> perms/ACLs. Then you don't have to worry about not crossing the
> streams in subtrees. If an admin later decides to provide access to
> two clients to the same subtree nothing bad will happen. The two
> clients will have to figure out between themselves what to do with
> UID/GIDs/perms/ACLs to provide the appropriate access.
> 
> If for instance a directory is shared between tenant A and B, and A
> can write and B can't, then when B tries to write because the perms
> are correct for the UID/GID on the client side, the MDS will prevent
> the write because that tenant doesn't have "share" write access on
> that directory.

This feels like it's just adding some protection for an admin that 
accidentally gives tenant A access to tenant B's subtree.  Assuming the 
subtree streams aren't crossed, it doesn't add anything, right?

> If a tenant wants to allow write access to part of a directory, then
> there has to be some level of trust that they will act responsibly. I
> can't see getting around that without implementing Kerberos and
> preventing the client from mapping to other UID/GIDs, but that really
> takes the flexibility out of the system.

There are three scenarios:

1) Each tenant in their own subtree, clients do enforcement.  They can do 
whatever they want but Ceph doesn't care because the tenant is confined.

2) Kerberos, as you say, allows the MDS to enforce permissions on shared 
directories.  You're right that you need that infrastructure before we can 
sanely do that, except when

3) untrusted clients are given access to a shared directory but restricted 
to act as a single user.  In this case, the MDS still enforces 
permissions, and kerberos isn't needed.  Imagine your workstation being 
allowed to mount the department file server.

In any case, doing any enforcement on the MDS is opt-in.  It *expands* 
your options by making it possible to share the same subtrees to different 
untrusted clients and still enforce permissions.  If you have a 
multi-tenant environment where data isn't shared, you can still do what 
you're suggesting and leave it to the clients...  or even run in a 
completely trusted mode like we have now where clients all mount / and can 
do whatever they want.

Does that make sense?

sage



* Re: MDS auth caps for cephfs
  2015-05-28 17:32                                                       ` Sage Weil
@ 2015-05-28 18:29                                                         ` Robert LeBlanc
  2015-06-01 19:39                                                           ` Sage Weil
  0 siblings, 1 reply; 35+ messages in thread
From: Robert LeBlanc @ 2015-05-28 18:29 UTC (permalink / raw)
  To: Sage Weil
  Cc: Gregory Farnum, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Thu, May 28, 2015 at 11:32 AM, Sage Weil wrote:
>> If for instance a directory is shared between tenant A and B, and A
>> can write and B can't, then when B tries to write because the perms
>> are correct for the UID/GID on the client side, the MDS will prevent
>> the write because that tenant doesn't have "share" write access on
>> that directory.
>
> This feels like it's just adding some protection for an admin that
> accidentally gives tenant A access to tenant B's subtree.  Assuming the
> subtree streams aren't crossed, it doesn't add anything, right?

The example I was thinking about was that A builds and provides some
RPMs in a directory. They want to allow tenant B to access those in a
read-only fashion. Tenant B's key is given share read access to the RPM
directory (A also makes sure the files are world readable).

>> If a tenant wants to allow write access to part of a directory, then
>> there has to be some level of trust that they will act responsibly. I
>> can't see getting around that without implementing Kerberos and
>> preventing the client from mapping to other UID/GIDs, but that really
>> takes the flexibility out of the system.
>
> There are three scenarios:
>
> 1) Each tenant in their own subtree, clients do enforcement.  They can do
> whatever they want but Ceph doesn't care because the tenant is confined.
>
> 2) Kerberos, as you say, allows the MDS to enforce permissions on shared
> directories.  You're right that you need that infrastructure before we can
> sanely do that, except when
>
> 3) untrusted clients are given access to a shared directory but restricted
> to act as a single user.  In this case, the MDS still enforces
> permissions, and kerberos isn't needed.  Imagine your workstation being
> allowed to mount the department file server.
>
> In any case, doing any enforcement on the MDS is opt-in.  It *expands*
> your options by making it possible to share the same subtrees to different
> untrusted clients and still enforce permissions.  If you have a
> multi-tenant environment where data isn't shared, you can still do what
> you're suggesting and leave it to the clients...  or even run in a
> completely trusted mode like we have now where clients all mount / and can
> do whatever they want.

I certainly would like the shared aspect of CephFS to work well, be
secure and flexible. I think this is getting into the points I really
dislike about NFS. My opinion of a Network file system is that access
should be controlled by the server. The server should only allow a
client to operate as the UID that they authenticate as (none of this
'I get to choose which UID I get to be' stuff). The server is
authoritative for UID to GID mapping and has final say over file
access.

This type of structure favors users: each user has to map the
directory. It really sucks at system mounts and you have to rely on
groups a ton. Add that you either have to tie into a user directory or
integrate one to make it really useful, and it can get really complex
real fast. However, the NFS model is really good at system mounts
because it is easy to twiddle the owner/perms etc, but a lot is left to
the client, which makes it less secure. At the same time, it could
probably be implemented with little to no change of cephx.

I guess it boils down to which situation are we trying to target/solve?

- ----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJVZ154CRDmVDuy+mK58QAAXTcP/1bJDaE2KU1BKVk2O+o1
7VKuy7qfXhOrthVimBIH12G0F9KOmG4jWMWjW3PVXOHLRuQK8kM96W4qXG5o
7AFQnu3i7JDva3EUBK/PTvGSIxkVCmHLTmpI9FapVPE3fil6LB5VEmHLgNDt
IgPuLbkSwiRw5bjCBdfbVQMSjfhs+Q1HhrrXJy3S7Inp+AcBVetyY17JLHQm
6OX4Ae2l0sSEhqHzse3wB0WF2dgvcq/Z7IxPL80+V5Gh0yZJ9F5oCuhXKOlO
q9dOJOWdR6peEactpXBAoD9Yqr3Na2euO9TCkk+1nTx7Efas8IfzEwdh3Ge7
O9i+shOG7e0Za9wF2q1UL1rORRseJ1A8ezA9DaQxZyT+1vR8IbxJda0AUhRP
e8E+Yw553MyxKqSE33A0mPnyZ4VVXBucQtYtjCEqJV1ijF/OS7SlAi3e730X
VaxWaCYb7XshGsvndyr2F2W6CZY9naXIyjPmDoqGzxMa4DDWrmYvZb4/wtQi
lkq88ZtOJLaGZotshlwvdqh46XKmkUicQJnyUSrGFh47LVVfBm7PzuUCu1T4
8WlWS8QRHEcX0j8SY8eIvPZGgVYxsR6qm4cobv9jrdnRaHHwN/CQVBx9Xcci
W3w8yO0MSKUukdekIlsOcWgP3jqjjG/cyrOy2k+OhjDRrhbQlmnr2lzjNSKP
WHXV
=U1EU
-----END PGP SIGNATURE-----


* Re: MDS auth caps for cephfs
  2015-05-28 18:29                                                         ` Robert LeBlanc
@ 2015-06-01 19:39                                                           ` Sage Weil
  2015-06-02 23:40                                                             ` Gregory Farnum
  0 siblings, 1 reply; 35+ messages in thread
From: Sage Weil @ 2015-06-01 19:39 UTC (permalink / raw)
  To: Robert LeBlanc
  Cc: Gregory Farnum, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

I have a pull request posted at

	https://github.com/ceph/ceph/pull/4809

that updates the mds cap parser and defines a check method.  Please take 
a look and see if this makes sense.

For the path restrictions, I think the next steps are something like

 - generate a path string and pass it in to that method
 - write some simple tests
 - make sure the hook is called from everywhere it needs to be (all of the 
other request handlers in Server.cc)
 - call the hook from the cap writeback path (tricky)
 - figure out how to handle files in the stray dir (tricky)
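
As a rough illustration of the first two items, here is a sketch of a path-prefix match that a check method could apply to a generated path string. It assumes both paths are already normalized absolute paths with no trailing slash (other than "/" itself); path_is_under is a hypothetical helper, not the method in the pull request.

```cpp
#include <string>

// Returns true if `path` equals `root` or lies somewhere beneath it.
// The match is on whole path components, so "/home/ab" is not treated
// as being under "/home/a".
bool path_is_under(const std::string& root, const std::string& path) {
  if (root == "/") {
    return !path.empty() && path[0] == '/';  // "/" covers any absolute path
  }
  if (path.compare(0, root.size(), root) != 0) {
    return false;  // root is not a string prefix of path
  }
  // Prefix matched: require an exact match or a '/' right after the prefix.
  return path.size() == root.size() || path[root.size()] == '/';
}
```

The simple tests would then cover the exact match, a child path, a sibling that shares a string prefix, and the "/" case.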

For the user-based restrictions,

 - I think we need to expand the allows() method so that it has a couple 
output arguments (uid and gid list) that subsequent permissions should be 
validated against.  Then we can change the function in Server.cc so 
that when those are populated it does an actual unix permissions check.  
This seems like the simplest thing to me, although it does have the 
slightly-odd property that if you are doing an operation that requires 
permission on two inodes (directory and file, say) you might have two 
different 'allow ...' lines granting access to each.  (I think this is 
both painful to avoid and also harmless?)
 - same items as above to call into the check method in the appropriate 
places.
 - extend client/mds protocol to pass credential struct.  This should 
piggyback on Goldwyn's work to fix up these structures for namespaces.
 - mark caps with credentials on clients, fix writeback order, etc.

In any case, the first item on both those lists seems like the place 
to start.
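
A sketch of the output-argument idea for the user-based case, under stated assumptions: the Grant and Inode structs, the helper names, and the first-match-wins rule are all hypothetical stand-ins, not the code in the pull request. It needs C++17 for std::optional.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one 'allow ...' line.
struct Grant {
  std::string path;                  // subtree the grant covers ("/" for any)
  bool write = false;                // whether the grant permits writes
  std::optional<std::uint32_t> uid;  // set when the grant pins the client to one user
  std::vector<std::uint32_t> gids;   // groups that user belongs to
};

struct Inode {
  std::uint32_t uid;
  std::uint32_t gid;
  unsigned mode;  // unix permission bits, e.g. 0644
};

static bool under(const std::string& root, const std::string& path) {
  if (root == "/") return true;
  return path.compare(0, root.size(), root) == 0 &&
         (path.size() == root.size() || path[root.size()] == '/');
}

// First matching grant wins (an assumption). On success the out arguments
// carry the identity that later checks should be validated against;
// out_uid stays empty when the grant does not restrict the user.
bool allows(const std::vector<Grant>& grants, const std::string& path,
            bool want_write, std::optional<std::uint32_t>& out_uid,
            std::vector<std::uint32_t>& out_gids) {
  for (const Grant& g : grants) {
    if (!under(g.path, path)) continue;
    if (want_write && !g.write) continue;
    out_uid = g.uid;
    out_gids = g.gids;
    return true;
  }
  return false;
}

// Classic owner/group/other write-bit check.
bool unix_may_write(const Inode& in, std::uint32_t uid,
                    const std::vector<std::uint32_t>& gids) {
  if (in.uid == uid) return (in.mode & 0200) != 0;
  if (std::find(gids.begin(), gids.end(), in.gid) != gids.end())
    return (in.mode & 0020) != 0;
  return (in.mode & 0002) != 0;
}

// What the caller in the request handler might do with the outputs.
bool may_write(const std::vector<Grant>& grants, const std::string& path,
               const Inode& in) {
  std::optional<std::uint32_t> uid;
  std::vector<std::uint32_t> gids;
  if (!allows(grants, path, true, uid, gids)) return false;
  if (!uid) return true;  // no user restriction: enforcement is left to the client
  return unix_may_write(in, *uid, gids);
}
```

Because allows() is called once per path, an operation touching two inodes may be satisfied by two different grants, which is the slightly-odd property mentioned above.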

sage


* Re: MDS auth caps for cephfs
  2015-06-01 19:39                                                           ` Sage Weil
@ 2015-06-02 23:40                                                             ` Gregory Farnum
  0 siblings, 0 replies; 35+ messages in thread
From: Gregory Farnum @ 2015-06-02 23:40 UTC (permalink / raw)
  To: Sage Weil
  Cc: Robert LeBlanc, John Spray, ceph-devel, Nishtha Rai, jashan kamboj

On Mon, Jun 1, 2015 at 12:39 PM, Sage Weil <sage@newdream.net> wrote:
> I have a pull request posted at
>
>         https://github.com/ceph/ceph/pull/4809
>
> that updates the mds cap parser and defines a check method.  Please take
> a look and see if this makes sense.

Couple comments on the internal interfaces but the syntax seems good
and so do the MDSCap changes.

>
> For the path restrictions, I think the next steps are something like
>
>  - generate a path string and pass it in to that method
>  - write some simple tests
>  - make sure the hook is called from everywhere it needs to be (all of the
> other request handlers in Server.cc)
>  - call the hook from the cap writeback path (tricky)
>  - figure out how to handle files in the stray dir (tricky)

This path access function was my concern (we need to push that down
inside of the MDSCaps rather than exposing their internal state);
making that change will probably impact how we do these. But this
general set of tasks seems right to me. Probably we do unit tests
first, which we can do once it's all MDSCap internal. ;)

I'm not sure if including the inode is actually going to be helpful to
us? We can't use it instead of the path, for instance. Users might
have access via one path but not another and we need to check both
that they can reach the inode and that they came in the right way.

> For the user-based restrictions,
>
>  - I think we need to expand the allows() method so that it has a couple
> output arguments (uid and gid list) that subsequent permissions should be
> validated against.  Then we can change the function in Server.cc so
> that when those are populated it does an actual unix permissions check.
> This seems like the simplest thing to me, although it does have the
> slightly-odd property that if you are doing an operation that requires
> permission on two inodes (directory and file, say) you might have two
> different 'allow ...' lines granting access to each.  (I think this is
> both painful to avoid and also harmless?)

Why don't we just pass in the UID and GID the request is being made
as? Then it's either allowed or denied.
-Greg
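
For comparison, a sketch of the pass-in variant suggested here, where the caller's UID and GID are inputs and the answer is a plain yes or no. The Grant struct and request_allowed are hypothetical, C++17 is assumed for std::optional, and the comparison against inode mode bits is deliberately left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one 'allow ...' line.
struct Grant {
  std::string path;                  // subtree the grant covers ("/" for any)
  bool write = false;                // whether the grant permits writes
  std::optional<std::uint32_t> uid;  // set when the grant is tied to one user
  std::vector<std::uint32_t> gids;   // groups accepted for that user
};

static bool under(const std::string& root, const std::string& path) {
  if (root == "/") return true;
  return path.compare(0, root.size(), root) == 0 &&
         (path.size() == root.size() || path[root.size()] == '/');
}

// The request carries the UID/GID it is being made as; a grant either
// covers that identity on that path or it does not.
bool request_allowed(const std::vector<Grant>& grants, const std::string& path,
                     std::uint32_t caller_uid, std::uint32_t caller_gid,
                     bool want_write) {
  for (const Grant& g : grants) {
    if (!under(g.path, path)) continue;
    if (want_write && !g.write) continue;
    if (g.uid) {
      if (*g.uid != caller_uid) continue;
      if (std::find(g.gids.begin(), g.gids.end(), caller_gid) == g.gids.end())
        continue;
    }
    return true;  // some grant covers this identity and path
  }
  return false;  // grants are additive only: no match means denied
}
```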


Thread overview: 35+ messages
2015-05-21  0:14 MDS auth caps for cephfs Sage Weil
2015-05-22  9:28 ` John Spray
2015-05-22 21:35   ` Sage Weil
2015-05-22 22:02     ` Gregory Farnum
2015-05-22 22:18       ` Sage Weil
2015-05-22 22:38         ` Gregory Farnum
2015-05-22 22:52           ` Sage Weil
2015-05-26 14:26             ` Gregory Farnum
2015-05-26 16:28               ` Sage Weil
2015-05-26 21:26                 ` Sage Weil
2015-05-26 21:53                 ` Gregory Farnum
2015-05-26 22:17                   ` Sage Weil
2015-05-26 22:50                     ` Gregory Farnum
2015-05-26 23:12                       ` Sage Weil
2015-05-26 23:32                         ` Gregory Farnum
2015-05-27 21:44                           ` Sage Weil
2015-05-27 22:03                             ` Gregory Farnum
2015-05-27 22:21                               ` Sage Weil
2015-05-27 22:40                                 ` Gregory Farnum
2015-05-27 23:07                                   ` Sage Weil
2015-05-27 23:18                                     ` Gregory Farnum
2015-05-27 23:59                                       ` Sage Weil
2015-05-28  0:11                                         ` Gregory Farnum
2015-05-28  0:37                                           ` Sage Weil
2015-05-28  0:42                                             ` Gregory Farnum
2015-05-28 16:20                                               ` Robert LeBlanc
2015-05-28 16:42                                                 ` Gregory Farnum
2015-05-28 17:02                                                   ` Sage Weil
2015-05-28 17:21                                                     ` Robert LeBlanc
2015-05-28 17:32                                                       ` Sage Weil
2015-05-28 18:29                                                         ` Robert LeBlanc
2015-06-01 19:39                                                           ` Sage Weil
2015-06-02 23:40                                                             ` Gregory Farnum
2015-05-28 17:06                                                   ` Robert LeBlanc
2015-05-26 12:56       ` John Spray
