All of lore.kernel.org
 help / color / mirror / Atom feed
* [Virtio-fs] Securing file handles
@ 2021-03-05 16:22 Max Reitz
  2021-03-08  9:06 ` Sergio Lopez
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Max Reitz @ 2021-03-05 16:22 UTC (permalink / raw)
  To: virtio-fs-list

Hi,

We want virtio-fs to support persistent file handles.  There’s a
problem, though: If the guest can give arbitrary file handles for
virtiofsd to open, it may be able to open host files that are not inside
of the shared directory.

I’m aware of two general ways that can solve this problem:

(1) We can ensure that the guest can only use file handles that
     virtiofsd has generated, or

(2) When receiving a file handle from the guest, verify that it resides
     within the exported directory.

Implementation-wise, (1) can be done by adding a cryptographic signature
or rather a message authentication code (MAC) with a secret (persistent)
key, which is checked before a file handle is used to open a file.

(2) can be done either by attempting to reconstruct the path that leads
to the file (and thus checking whether it is under the exported
directory or not – this is what NFS’s subtree_check option does); or by
having the exported directory be the root of a mount, so that we can
check whether file handles (which are bound to a specific mount ID)
refer to a mount under that directory.  The only practical way to have
such mounts are bind mounts.

Let’s look at the various implementations in more detail.


== Adding a MAC ==

We can keep a private key in virtiofsd (needs to be persistent, though,
i.e. stored on the host somewhere), and then for every file handle we
pass on to the guest, add something like an 8-byte MAC.  When we receive
a file handle from the guest to open the corresponding file, we can
verify that the MAC is valid and so that this file handle must have been
created by a virtiofsd instance in possession of the private key.

The advantage is that it'll work without any special hacks.

The disadvantages I can think of are:
- If a file is moved, it retains its handle.  If it is moved outside of
   the exported directory, it will remain accessible through the handle.

- It increases the handle size.  Probably not a real problem, we will
   most likely have to store some metadata in the handle anyway, and
   spending eight bytes on the MAC shouldn’t be too bad.

- It isn’t very elegant.  It is still possible to access file handles
   outside of the shared directory, you “just” have to guess the MAC.
   (64 bits of entropy should be enough, but still.)


== Reconstructing the file path ==

This is what NFS does with subtree_check.

There is no way to get a file’s parent.  (One practical reason is that
files may be hard-linked, so they can have multiple parents.)  So to
reconstruct a file’s path, you need to store its parent when the handle
is created (when you usually know its full path).

For comparison, an NFS file handle (with subtree_check) looks something
like this:

00 00 00 00 85 00 00 00 – NFS file ID (the inode ID), split into low u32
                           and high u32: 0x85
00 80 00 00             – NFS file type: 0x8000 = S_IFREG
34 00                   – Length of the embedded file handle: 0x34 bytes
01                      – File handle version: 1 (this is always 1)
00                      – Auth type: 0 (this is always 0)
07                      – FS ID type: 7 = u64 parent inode ID +
                           16-byte FS UUID
82                      – Host FS file handle type: 0x82
                           = u64 inode ID, u32 generation, u64 parent
                             inode ID, u32 parent generation
84 00 00 00 00 00 00 00 – Parent inode ID: 0x84
xx xx xx xx xx xx xx xx
xx xx xx xx xx xx xx xx – Host FS UUID
[The actual XFS file handle begins here]
85 00 00 00 00 00 00 00 – Inode ID: 0x85
48 95 e1 8c             – Generation: 0x8ce19548
84 00 00 00 00 00 00 00 – Parent inode ID: 0x84
b4 6a 36 92             – Parent generation: 0x92366ab4

(The NFS file ID is equal to the inode ID, and the parent inode ID
before the XFS handle is equal to the parent inode ID within.)

When opening this file handle, the kernel tries to make a connected
path, i.e. it tries to obtain the file’s path back to a VFS entry that
is already connected (at the worst back to the FS root).  For
directories, you can just obtain the parent, so this is easy.  For
files, not so much.

What happens for files is this: The parent is opened based on the parent
information in the handle, and then the kernel (generally) iterates
through that directory to find the file.  If it doesn’t find it, that’s
an error[1].  If it’s found, the parent is indeed a parent of the file,
and we can go up the path to the FS root just like normal.

[1] This is contrary to what I said in the virtio-fs call on Wednesday.
     I said you could just replace the inode ID and generation by some
     other file on the same file system (regardless of where it is
     located), and NFS would open it for you.  Indeed, I can see that
     behavior on my normal system; but in a guest, where I could debug
     it, such file handles are not accepted.

I think we could emulate the same behavior in virtiofsd in user-space.
While there are some special functions only available in the kernel to
generate file handles that contain parent information, and to get a
directory’s parent, we should be able to emulate the same in user-space:

- File handles that include parent information are generally just two
   file handles squished together.  File handles of type 0x81 are
   “u64 inode ID + u32 generation”, and file handles of type 0x82 are two
   0x81 file handles concatenated (one for the file, one for the parent).
   We can just generate a file handle for the parent and glue it together
   with the actual file handle, we don’t need special filesystem support
   for that.

- To go to a directory’s parent, you can statat(dirfd, "..") or
   openat(dirfd, "..").

The advantage of this approach is that I feel it’s kind of elegant,
because it really checks the path when the file handle is to be opened.
This allows us to safely handled files moved around, because the parent
information will become noticeably stale.  (Though if you move a file
into some other directory under the shared directory, the file handle
will stop to work, which may be considered a problem.)

I see two disadvantages:
- Iterating the parent to find the file to verify that it is indeed the
   current parent of the file is kind of not nice.  But it is what NFS
   does with subtree_check, so...

- NFS no longer has subtree_check as the default.  The man page says
   “subtree_checking tends to cause more problems than it is worth.”  As
   far as I could find out, that’s precisely because renaming/moving a
   file will invalidate its handle.


== Constricting files to bind mounts ==

With no_subtree_check being the default for NFS nowadays, users are
encouraged to use a bind mount as the root of an NFS export.
Incidentally, AFAIU, virtiofsd and virtiofsd-rs do the same on their own
already, so this might seem like the natural solution for us, too.
(Spoiler: I don’t think it is, though.)

File handles are characterized by the following (meta)data:
(1) The mount to which they belong[2],
(2) An FS-specific file handle type,
(3) The file handle itself.

[2] When you generate a handle (name_to_handle_at()), you receive the
     respective mount ID.  When you open a file handle
     (open_by_handle_at()), you have to specify any FD to any entry in
     the respective mount.

So it may seem like a file handle would be bound to a specific mount: If
you open a file handle on a bind mount, it will only allow you to open
files that are inside of that bind mount.

But that’s not what happens.  A bind mount is not actually different
from the original file system, so all handle operations operate on that
original FS.  That means (if you have the respective handle) you can
open all files that reside on the original FS.  Bind mounts do not
restrict that.

So, from a security standpoint, I don’t see how bind mounts would
restrict accessing file handles.  (And, in fact, when I let virtiofsd-rs
basically just pass through file handles, the guest can open any file
handle on the file system the exported directory is on.)

So while at the first glance perhaps the obvious solution, I don’t think
they can help us for file handles.


== Summary ==

So, my current position is:

- Bind mounts don’t help with restricting file handles to the exported
   directory.

- A MAC is not very elegant, and we might encounter problems where a
   file may be moved outside of the shared directory, but remains
   accessible (because moving a file doesn’t change its handle).
   (If we consider that a problem.  NFS evidently doesn’t, because
   without subtree_check, it has absolutely no protection against
   arbitrary file handles being opened (on the FS where the export
   resides), so valid file handles always remain valid.)

- A solution such as NFS’s subtree_check (i.e., storing the file’s
   parent’s handle in addition to the file’s handle itself, then
   verifying that the file does still reside in that directory when the
   handle is opened, and then going up the tree to see whether we can
   trace it back to the shared directory) is interesting and can perhaps
   be considered elegant, but it requires iterating the directory the
   file resides in when it is opened, and it will result in file handles
   being invalidated whenever a file is moved (outside of its directory).
   Perhaps also other issues.  In any case, there are reasons why NFS has
   basically deprecated this.

Opinions? :)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-05 16:22 [Virtio-fs] Securing file handles Max Reitz
@ 2021-03-08  9:06 ` Sergio Lopez
  2021-03-08 10:52   ` Max Reitz
  2021-03-08 15:01   ` Stefan Hajnoczi
  2021-03-08  9:54 ` Miklos Szeredi
  2021-03-08 11:44 ` Dr. David Alan Gilbert
  2 siblings, 2 replies; 15+ messages in thread
From: Sergio Lopez @ 2021-03-08  9:06 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]

On Fri, Mar 05, 2021 at 05:22:56PM +0100, Max Reitz wrote:
> == Summary ==
> 
> So, my current position is:
> 
> - Bind mounts don’t help with restricting file handles to the exported
>   directory.
> 
> - A MAC is not very elegant, and we might encounter problems where a
>   file may be moved outside of the shared directory, but remains
>   accessible (because moving a file doesn’t change its handle).
>   (If we consider that a problem.  NFS evidently doesn’t, because
>   without subtree_check, it has absolutely no protection against
>   arbitrary file handles being opened (on the FS where the export
>   resides), so valid file handles always remain valid.)
> 
> - A solution such as NFS’s subtree_check (i.e., storing the file’s
>   parent’s handle in addition to the file’s handle itself, then
>   verifying that the file does still reside in that directory when the
>   handle is opened, and then going up the tree to see whether we can
>   trace it back to the shared directory) is interesting and can perhaps
>   be considered elegant, but it requires iterating the directory the
>   file resides in when it is opened, and it will result in file handles
>   being invalidated whenever a file is moved (outside of its directory).
>   Perhaps also other issues.  In any case, there are reasons why NFS has
>   basically deprecated this.
> 
> Opinions? :)

While the MAC option doesn't look too bad to me, I can't help but feel
that we're working around a kernel (mis)feature, which is something
that's risky and tends to backfire. It also worries me the fact that
we'd need to run virtiofsd with CAP_DAC_READ_SEARCH.

IIUC, we need this to avoid the need to keep an FD open for each entry
that's in the Guest's lookup cache, which is something that's probably
going to become a problem once we have dozens of virtiofsd instances
servicing VMs on the same Host (BTW, this is already a problem on
macOS, where the default *system-wide* NOFILE limit is a little over
10,000).

Perhaps we should try to aim higher and propose some kernel
extensions that would fit better our needs?

Thanks,
Sergio.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-05 16:22 [Virtio-fs] Securing file handles Max Reitz
  2021-03-08  9:06 ` Sergio Lopez
@ 2021-03-08  9:54 ` Miklos Szeredi
  2021-03-08 11:29   ` Max Reitz
  2021-03-08 22:03   ` Vivek Goyal
  2021-03-08 11:44 ` Dr. David Alan Gilbert
  2 siblings, 2 replies; 15+ messages in thread
From: Miklos Szeredi @ 2021-03-08  9:54 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

Hi,

Thanks for the good summary.

Another aspect is what the file handle will be used for:

 a) allowing server to close O_PATH descriptors any time because they
can be reconstructed using the file handle

 b) allowing NFS export on client, or just name_to_handle_at(2)
open_by_handle_at(2).

The requirements are slightly different, since file handles used for
(a) do not have to persist after a guest reboot (since the VFS cache
referencing those handles is gone).  While (b) requires persistence
after a reboot.

Yet another issue is global CAP_DAC_READ_SEARCH required by the server
for file handle decode.

Taking this into account, I think the final solution has to be in the
host kernel.   E.g. it seems okay to allow user namespace owner to
decode file handles on filesystems it actually owns.   That would not
generally help us, though, since virtiofs will want to export root
owned fs as well.

Addition of a MAC header to the file handle by name_to_handle_at(2)
could solve some or all of the above problems.  The question is where
the key comes from and what the security implications are.

A per-process (e.g. associated with task->files, generated by the
kernel on demand and discarded on process exit) key would suffice to
replace O_PATH descriptors.  In this case the only difference between
keeping the O_PATH fd open and

  name_to_handle_at(opathfd, &handle);
  close(opathfd);
  opathfd=open_by_handle_at(&handle);

would be that the resulting fd might point to a disconnected dentry
and hence would result in incomplete path information under
/proc/self/fd/.  Need to think hard about whether this has any
security implications for unprivileged users.

Adding key management would solve the other aspects, but would also
possibly open up holes for accessing arbitrary files, so this would
need to be done carefully.

Adding subtree checking to the kernel is also a possibility (i.e.
limit the opened fd to the subtree of the bind mount).   It would have
the advantage of not resulting in disconnected dentries, but
disadvantage of not working if the file was moved across directories.

Thanks,
Miklos


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08  9:06 ` Sergio Lopez
@ 2021-03-08 10:52   ` Max Reitz
  2021-03-08 14:15     ` Sergio Lopez
  2021-03-08 15:01   ` Stefan Hajnoczi
  1 sibling, 1 reply; 15+ messages in thread
From: Max Reitz @ 2021-03-08 10:52 UTC (permalink / raw)
  To: Sergio Lopez; +Cc: virtio-fs-list

On 08.03.21 10:06, Sergio Lopez wrote:
> On Fri, Mar 05, 2021 at 05:22:56PM +0100, Max Reitz wrote:
>> == Summary ==
>>
>> So, my current position is:
>>
>> - Bind mounts don’t help with restricting file handles to the exported
>>    directory.
>>
>> - A MAC is not very elegant, and we might encounter problems where a
>>    file may be moved outside of the shared directory, but remains
>>    accessible (because moving a file doesn’t change its handle).
>>    (If we consider that a problem.  NFS evidently doesn’t, because
>>    without subtree_check, it has absolutely no protection against
>>    arbitrary file handles being opened (on the FS where the export
>>    resides), so valid file handles always remain valid.)
>>
>> - A solution such as NFS’s subtree_check (i.e., storing the file’s
>>    parent’s handle in addition to the file’s handle itself, then
>>    verifying that the file does still reside in that directory when the
>>    handle is opened, and then going up the tree to see whether we can
>>    trace it back to the shared directory) is interesting and can perhaps
>>    be considered elegant, but it requires iterating the directory the
>>    file resides in when it is opened, and it will result in file handles
>>    being invalidated whenever a file is moved (outside of its directory).
>>    Perhaps also other issues.  In any case, there are reasons why NFS has
>>    basically deprecated this.
>>
>> Opinions? :)
> 
> While the MAC option doesn't look too bad to me, I can't help but feel
> that we're working around a kernel (mis)feature, which is something
> that's risky and tends to backfire.

Which misfeature do you mean exactly?  That you can open arbitrary files 
by specifying the right magic number (i.e. its handle)?

That in itself is nothing we’re really working around, but rather 
something that we actively want to pass through to the guest.

If you mean the part of that feature where you can’t really find out the 
path to a file opened this way, well...  I’d say it’s kind of the point, 
really, because the handle is supposed to be independent of the path, as 
evidenced by the trouble NFS seemed to have with subtree_check.

> It also worries me the fact that
> we'd need to run virtiofsd with CAP_DAC_READ_SEARCH.

Well, to me it’s a decision the user needs to make, whether they want 
persistent file handles in the guest or not.

We could think about having an extra process that has 
CAP_DAC_READ_SEARCH and opens FDs for virtiofsd (doing the respective 
security checks), but I don’t know whether that would really be worth 
the effort.

> IIUC, we need this to avoid the need to keep an FD open for each entry
> that's in the Guest's lookup cache, which is something that's probably
> going to become a problem once we have dozens of virtiofsd instances
> servicing VMs on the same Host (BTW, this is already a problem on
> macOS, where the default *system-wide* NOFILE limit is a little over
> 10,000).

Regarding the use of file handles, I consider that a lower priority 
(because like you, I’m not sure file handles are the ideal solution for 
this).  The main priority right now would be to just get persistent file 
handles working in the guest.

Maybe having them working also helps with migration, but it’s entirely 
possible that there is no advantage in having file handles in the guest 
for migration and it’ll be sufficient to have them in virtiofsd.  We’ll see.

> Perhaps we should try to aim higher and propose some kernel
> extensions that would fit better our needs?

I believe we can try other ways than file handles to reduce the FD 
usage, and I believe we want to try other ways precisely because of the 
capability.  But I also think we want working file handles anyway, so 
solving the FD issue some other way won’t mean we don’t need working 
file handles.

Max


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08  9:54 ` Miklos Szeredi
@ 2021-03-08 11:29   ` Max Reitz
  2021-03-08 12:30     ` Miklos Szeredi
  2021-03-08 22:03   ` Vivek Goyal
  1 sibling, 1 reply; 15+ messages in thread
From: Max Reitz @ 2021-03-08 11:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: virtio-fs-list

On 08.03.21 10:54, Miklos Szeredi wrote:
> Hi,
> 
> Thanks for the good summary.
> 
> Another aspect is what the file handle will be used for:
> 
>   a) allowing server to close O_PATH descriptors any time because they
> can be reconstructed using the file handle
> 
>   b) allowing NFS export on client, or just name_to_handle_at(2)
> open_by_handle_at(2).
> 
> The requirements are slightly different, since file handles used for
> (a) do not have to persist after a guest reboot (since the VFS cache
> referencing those handles is gone).  While (b) requires persistence
> after a reboot.

I’m not even sure we need file handles in the guest for (a).  For 
example, we could just store the file handle for each node in the table 
virtiofsd keeps.  Or perhaps we don’t even need that, because we can 
always reopen nodes with something like

   openat(self.parent.fd.get(), self.name)

(which may recurse quite a bit).  The problem with that would be “what 
happens when a directory is moved”, but well, what does happen then?  Is 
that change propagated to the guest today?  Or will things break?  If 
so, how will they break?

> Yet another issue is global CAP_DAC_READ_SEARCH required by the server
> for file handle decode.
> 
> Taking this into account, I think the final solution has to be in the
> host kernel.   E.g. it seems okay to allow user namespace owner to
> decode file handles on filesystems it actually owns.

What do you mean by “owns”?  That the process’s user is the owner of the 
FS root?

> That would not
> generally help us, though, since virtiofs will want to export root
> owned fs as well.
> 
> Addition of a MAC header to the file handle by name_to_handle_at(2)
> could solve some or all of the above problems.  The question is where
> the key comes from and what the security implications are.
> 
> A per-process (e.g. associated with task->files, generated by the
> kernel on demand and discarded on process exit) key would suffice to
> replace O_PATH descriptors.  In this case the only difference between
> keeping the O_PATH fd open and
> 
>    name_to_handle_at(opathfd, &handle);
>    close(opathfd);
>    opathfd=open_by_handle_at(&handle);
> 
> would be that the resulting fd might point to a disconnected dentry
> and hence would result in incomplete path information under
> /proc/self/fd/.  Need to think hard about whether this has any
> security implications for unprivileged users.
> 
> Adding key management would solve the other aspects, but would also
> possibly open up holes for accessing arbitrary files, so this would
> need to be done carefully.

I have a bad feeling in my stomach when thinking about adding such 
things to the kernel.

I’d feel better about doing something like I noted in my reply to 
Sergio, i.e. add a helper process that does this for other processes. 
This process could do key management and return file handles with MACs, 
and then allow opening them later, passing FDs back.  Only this process 
would need the capability then.  Would that be worse than adding such a 
thing to the kernel?

> Adding subtree checking to the kernel is also a possibility (i.e.
> limit the opened fd to the subtree of the bind mount).   It would have
> the advantage of not resulting in disconnected dentries, but
> disadvantage of not working if the file was moved across directories.

To me it looks like NFS now considers that route to have been a mistake, 
so I’d have a bad feeling about it...

Max


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-05 16:22 [Virtio-fs] Securing file handles Max Reitz
  2021-03-08  9:06 ` Sergio Lopez
  2021-03-08  9:54 ` Miklos Szeredi
@ 2021-03-08 11:44 ` Dr. David Alan Gilbert
  2 siblings, 0 replies; 15+ messages in thread
From: Dr. David Alan Gilbert @ 2021-03-08 11:44 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

* Max Reitz (mreitz@redhat.com) wrote:
> Hi,
> 
> We want virtio-fs to support persistent file handles.  There’s a
> problem, though: If the guest can give arbitrary file handles for
> virtiofsd to open, it may be able to open host files that are not inside
> of the shared directory.
> 
> I’m aware of two general ways that can solve this problem:
> 
> (1) We can ensure that the guest can only use file handles that
>     virtiofsd has generated, or
> 
> (2) When receiving a file handle from the guest, verify that it resides
>     within the exported directory.

Can we cc in some NFS people?  I'd like to benefit from their pain of
implementing this in the past; especially in their old userspace cases.

Dave

> Implementation-wise, (1) can be done by adding a cryptographic signature
> or rather a message authentication code (MAC) with a secret (persistent)
> key, which is checked before a file handle is used to open a file.
> 
> (2) can be done either by attempting to reconstruct the path that leads
> to the file (and thus checking whether it is under the exported
> directory or not – this is what NFS’s subtree_check option does); or by
> having the exported directory be the root of a mount, so that we can
> check whether file handles (which are bound to a specific mount ID)
> refer to a mount under that directory.  The only practical way to have
> such mounts are bind mounts.
> 
> Let’s look at the various implementations in more detail.
> 
> 
> == Adding a MAC ==
> 
> We can keep a private key in virtiofsd (needs to be persistent, though,
> i.e. stored on the host somewhere), and then for every file handle we
> pass on to the guest, add something like an 8-byte MAC.  When we receive
> a file handle from the guest to open the corresponding file, we can
> verify that the MAC is valid and so that this file handle must have been
> created by a virtiofsd instance in possession of the private key.
> 
> The advantage is that it'll work without any special hacks.
> 
> The disadvantages I can think of are:
> - If a file is moved, it retains its handle.  If it is moved outside of
>   the exported directory, it will remain accessible through the handle.
> 
> - It increases the handle size.  Probably not a real problem, we will
>   most likely have to store some metadata in the handle anyway, and
>   spending eight bytes on the MAC shouldn’t be too bad.
> 
> - It isn’t very elegant.  It is still possible to access file handles
>   outside of the shared directory, you “just” have to guess the MAC.
>   (64 bits of entropy should be enough, but still.)
> 
> 
> == Reconstructing the file path ==
> 
> This is what NFS does with subtree_check.
> 
> There is no way to get a file’s parent.  (One practical reason is that
> files may be hard-linked, so they can have multiple parents.)  So to
> reconstruct a file’s path, you need to store its parent when the handle
> is created (when you usually know its full path).
> 
> For comparison, an NFS file handle (with subtree_check) looks something
> like this:
> 
> 00 00 00 00 85 00 00 00 – NFS file ID (the inode ID), split into low u32
>                           and high u32: 0x85
> 00 80 00 00             – NFS file type: 0x8000 = S_IFREG
> 34 00                   – Length of the embedded file handle: 0x34 bytes
> 01                      – File handle version: 1 (this is always 1)
> 00                      – Auth type: 0 (this is always 0)
> 07                      – FS ID type: 7 = u64 parent inode ID +
>                           16-byte FS UUID
> 82                      – Host FS file handle type: 0x82
>                           = u64 inode ID, u32 generation, u64 parent
>                             inode ID, u32 parent generation
> 84 00 00 00 00 00 00 00 – Parent inode ID: 0x84
> xx xx xx xx xx xx xx xx
> xx xx xx xx xx xx xx xx – Host FS UUID
> [The actual XFS file handle begins here]
> 85 00 00 00 00 00 00 00 – Inode ID: 0x85
> 48 95 e1 8c             – Generation: 0x8ce19548
> 84 00 00 00 00 00 00 00 – Parent inode ID: 0x84
> b4 6a 36 92             – Parent generation: 0x92366ab4
> 
> (The NFS file ID is equal to the inode ID, and the parent inode ID
> before the XFS handle is equal to the parent inode ID within.)
> 
> When opening this file handle, the kernel tries to make a connected
> path, i.e. it tries to obtain the file’s path back to a VFS entry that
> is already connected (at the worst back to the FS root).  For
> directories, you can just obtain the parent, so this is easy.  For
> files, not so much.
> 
> What happens for files is this: The parent is opened based on the parent
> information in the handle, and then the kernel (generally) iterates
> through that directory to find the file.  If it doesn’t find it, that’s
> an error[1].  If it’s found, the parent is indeed a parent of the file,
> and we can go up the path to the FS root just like normal.
> 
> [1] This is contrary to what I said in the virtio-fs call on Wednesday.
>     I said you could just replace the inode ID and generation by some
>     other file on the same file system (regardless of where it is
>     located), and NFS would open it for you.  Indeed, I can see that
>     behavior on my normal system; but in a guest, where I could debug
>     it, such file handles are not accepted.
> 
> I think we could emulate the same behavior in virtiofsd in user-space.
> While there are some special functions only available in the kernel to
> generate file handles that contain parent information, and to get a
> directory’s parent, we should be able to emulate the same in user-space:
> 
> - File handles that include parent information are generally just two
>   file handles squished together.  File handles of type 0x81 are
>   “u64 inode ID + u32 generation”, and file handles of type 0x82 are two
>   0x81 file handles concatenated (one for the file, one for the parent).
>   We can just generate a file handle for the parent and glue it together
>   with the actual file handle, we don’t need special filesystem support
>   for that.
> 
> - To go to a directory’s parent, you can statat(dirfd, "..") or
>   openat(dirfd, "..").
> 
> The advantage of this approach is that I feel it’s kind of elegant,
> because it really checks the path when the file handle is to be opened.
> This allows us to safely handled files moved around, because the parent
> information will become noticeably stale.  (Though if you move a file
> into some other directory under the shared directory, the file handle
> will stop to work, which may be considered a problem.)
> 
> I see two disadvantages:
> - Iterating the parent to find the file to verify that it is indeed the
>   current parent of the file is kind of not nice.  But it is what NFS
>   does with subtree_check, so...
> 
> - NFS no longer has subtree_check as the default.  The man page says
>   “subtree_checking tends to cause more problems than it is worth.”  As
>   far as I could find out, that’s precisely because renaming/moving a
>   file will invalidate its handle.
> 
> 
> == Constricting files to bind mounts ==
> 
> With no_subtree_check being the default for NFS nowadays, users are
> encouraged to use a bind mount as the root of an NFS export.
> Incidentally, AFAIU, virtiofsd and virtiofsd-rs do the same on their own
> already, so this might seem like the natural solution for us, too.
> (Spoiler: I don’t think it is, though.)
> 
> File handles are characterized by the following (meta)data:
> (1) The mount to which they belong[2],
> (2) An FS-specific file handle type,
> (3) The file handle itself.
> 
> [2] When you generate a handle (name_to_handle_at()), you receive the
>     respective mount ID.  When you open a file handle
>     (open_by_handle_at()), you have to specify any FD to any entry in
>     the respective mount.
> 
> So it may seem like a file handle would be bound to a specific mount: If
> you open a file handle on a bind mount, it will only allow you to open
> files that are inside of that bind mount.
> 
> But that’s not what happens.  A bind mount is not actually different
> from the original file system, so all handle operations operate on that
> original FS.  That means (if you have the respective handle) you can
> open all files that reside on the original FS.  Bind mounts do not
> restrict that.
> 
> So, from a security standpoint, I don’t see how bind mounts would
> restrict accessing file handles.  (And, in fact, when I let virtiofsd-rs
> basically just pass through file handles, the guest can open any file
> handle on the file system the exported directory is on.)
> 
> So while at the first glance perhaps the obvious solution, I don’t think
> they can help us for file handles.
> 
> 
> == Summary ==
> 
> So, my current position is:
> 
> - Bind mounts don’t help with restricting file handles to the exported
>   directory.
> 
> - A MAC is not very elegant, and we might encounter problems where a
>   file may be moved outside of the shared directory, but remains
>   accessible (because moving a file doesn’t change its handle).
>   (If we consider that a problem.  NFS evidently doesn’t, because
>   without subtree_check, it has absolutely no protection against
>   arbitrary file handles being opened (on the FS where the export
>   resides), so valid file handles always remain valid.)
> 
> - A solution such as NFS’s subtree_check (i.e., storing the file’s
>   parent’s handle in addition to the file’s handle itself, then
>   verifying that the file does still reside in that directory when the
>   handle is opened, and then going up the tree to see whether we can
>   trace it back to the shared directory) is interesting and can perhaps
>   be considered elegant, but it requires iterating the directory the
>   file resides in when it is opened, and it will result in file handles
>   being invalidated whenever a file is moved (outside of its directory).
>   Perhaps also other issues.  In any case, there are reasons why NFS has
>   basically deprecated this.
> 
> Opinions? :)
> 
> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://listman.redhat.com/mailman/listinfo/virtio-fs
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08 11:29   ` Max Reitz
@ 2021-03-08 12:30     ` Miklos Szeredi
  2021-03-08 13:39       ` Max Reitz
  0 siblings, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2021-03-08 12:30 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

On Mon, Mar 8, 2021 at 12:29 PM Max Reitz <mreitz@redhat.com> wrote:
>
> On 08.03.21 10:54, Miklos Szeredi wrote:
> > Hi,
> >
> > Thanks for the good summary.
> >
> > Another aspect is what the file handle will be used for:
> >
> >   a) allowing server to close O_PATH descriptors any time because they
> > can be reconstructed using the file handle
> >
> >   b) allowing NFS export on client, or just name_to_handle_at(2)
> > open_by_handle_at(2).
> >
> > The requirements are slightly different, since file handles used for
> > (a) do not have to persist after a guest reboot (since the VFS cache
> > referencing those handles is gone).  While (b) requires persistence
> > after a reboot.
>
> I’m not even sure we need file handles in the guest for (a).  For
> example, we could just store the file handle for each node in the table
> virtiofsd keeps.

Right.


>  Or perhaps we don’t even need that, because we can
> always reopen nodes with something like
>
>    openat(self.parent.fd.get(), self.name)
>
> (which may recurse quite a bit).  The problem with that would be “what
> happens when a directory is moved”, but well, what does happen then?  Is
> that change propagated to the guest today?  Or will things break?  If
> so, how will they break?

Today this should work because of the O_PATH descriptors that are kept
open for all cached objects.  Those descriptors track any local
movement of objects and possibly even remote (depends on the
underlying filesystem).

> > Yet another issue is global CAP_DAC_READ_SEARCH required by the server
> > for file handle decode.
> >
> > Taking this into account, I think the final solution has to be in the
> > host kernel.   E.g. it seems okay to allow user namespace owner to
> > decode file handles on filesystems it actually owns.
>
> What do you mean by “owns”?  That the process’s user is the owner of the
> FS root?

User namespaces each have an owner (the user which has additional
capabilities in that user namespace) and each filesystem has an owning
user namespace (s_user_ns).   E.g. the following will create a tmpfs
instance owned by $USER:

$ unshare -rUm
# mount -t tmpfs tmpfs /mnt

On this mount $USER has superuser privileges on all files and
directories, which includes CAP_DAC_READ_SEARCH.

> > That would not
> > generally help us, though, since virtiofs will want to export root
> > owned fs as well.
> >
> > Addition of a MAC header to the file handle by name_to_handle_at(2)
> > could solve some or all of the above problems.  The question is where
> > the key comes from and what the security implications are.
> >
> > A per-process (e.g. associated with task->files, generated by the
> > kernel on demand and discarded on process exit) key would suffice to
> > replace O_PATH descriptors.  In this case the only difference between
> > keeping the O_PATH fd open and
> >
> >    name_to_handle_at(opathfd, &handle);
> >    close(opathfd);
> >    opathfd=open_by_handle_at(&handle);
> >
> > would be that the resulting fd might point to a disconnected dentry
> > and hence would result in incomplete path information under
> > /proc/self/fd/.  Need to think hard about whether this has any
> > security implications for unprivileged users.
> >
> > Adding key management would solve the other aspects, but would also
> > possibly open up holes for accessing arbitrary files, so this would
> > need to be done carefully.
>
> I have a bad feeling in my stomach when thinking about adding such
> things to the kernel.

Yet it could be more generally useful, not just for virtiofsd.   An
RFC patch should be simple enough and feedback from VFS and security
folks could help decide whether to pursue this route or not.

I've not yet used the crypto API in the kernel, but I guess generating
a key and creating a MAC are not such a big deal.  Do you want to have
a go at this, or should I?

Thanks,
Miklos



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08 12:30     ` Miklos Szeredi
@ 2021-03-08 13:39       ` Max Reitz
  2021-03-08 14:50         ` Miklos Szeredi
  0 siblings, 1 reply; 15+ messages in thread
From: Max Reitz @ 2021-03-08 13:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: virtio-fs-list

On 08.03.21 13:30, Miklos Szeredi wrote:
> On Mon, Mar 8, 2021 at 12:29 PM Max Reitz <mreitz@redhat.com> wrote:
>>
>> On 08.03.21 10:54, Miklos Szeredi wrote:
>>> Hi,
>>>
>>> Thanks for the good summary.
>>>
>>> Another aspect is what the file handle will be used for:
>>>
>>>    a) allowing server to close O_PATH descriptors any time because they
>>> can be reconstructed using the file handle
>>>
>>>    b) allowing NFS export on client, or just name_to_handle_at(2)
>>> open_by_handle_at(2).
>>>
>>> The requirements are slightly different, since file handles used for
>>> (a) do not have to persist after a guest reboot (since the VFS cache
>>> referencing those handles is gone).  While (b) requires persistence
>>> after a reboot.
>>
>> I’m not even sure we need file handles in the guest for (a).  For
>> example, we could just store the file handle for each node in the table
>> virtiofsd keeps.
> 
> Right.
> 
> 
>>   Or perhaps we don’t even need that, because we can
>> always reopen nodes with something like
>>
>>     openat(self.parent.fd.get(), self.name)
>>
>> (which may recurse quite a bit).  The problem with that would be “what
>> happens when a directory is moved”, but well, what does happen then?  Is
>> that change propagated to the guest today?  Or will things break?  If
>> so, how will they break?
> 
> Today this should work because of the O_PATH descriptors that are kept
> open for all cached objects.  Those descriptors track any local
> movement of objects and possibly even remote (depends on the
> underlying filesystem).
> 
>>> Yet another issue is global CAP_DAC_READ_SEARCH required by the server
>>> for file handle decode.
>>>
>>> Taking this into account, I think the final solution has to be in the
>>> host kernel.   E.g. it seems okay to allow user namespace owner to
>>> decode file handles on filesystems it actually owns.
>>
>> What do you mean by “owns”?  That the process’s user is the owner of the
>> FS root?
> 
> User namespaces each have an owner (the user which has additional
> capabilities in that user namespace) and each filesystem has an owning
> user namespace (s_user_ns).   E.g. the following will create a tmpfs
> instance owned by $USER:
> 
> $ unshare -rUm
> # mount -t tmpfs tmpfs /mnt
> 
> On this mount $USER has superuser privileges on all files and
> directories, which includes CAP_DAC_READ_SEARCH.
> 
>>> That would not
>>> generally help us, though, since virtiofs will want to export root
>>> owned fs as well.
>>>
>>> Addition of a MAC header to the file handle by name_to_handle_at(2)
>>> could solve some or all of the above problems.  The question is where
>>> the key comes from and what the security implications are.
>>>
>>> A per-process (e.g. associated with task->files, generated by the
>>> kernel on demand and discarded on process exit) key would suffice to
>>> replace O_PATH descriptors.  In this case the only difference between
>>> keeping the O_PATH fd open and
>>>
>>>     name_to_handle_at(opathfd, &handle);
>>>     close(opathfd);
>>>     opathfd=open_by_handle_at(&handle);
>>>
>>> would be that the resulting fd might point to a disconnected dentry
>>> and hence would result in incomplete path information under
>>> /proc/self/fd/.  Need to think hard about whether this has any
>>> security implications for unprivileged users.
>>>
>>> Adding key management would solve the other aspects, but would also
>>> possibly open up holes for accessing arbitrary files, so this would
>>> need to be done carefully.
>>
>> I have a bad feeling in my stomach when thinking about adding such
>> things to the kernel.
> 
> Yet it could be more generally useful, not just for virtiofsd.   An
> RFC patch should be simple enough and feedback from VFS and security
> folks could help decide whether to pursue this route or not.

Well, I imagine the helper process could also be used by other users 
than virtiofsd.

> I've not yet used the crypto API in the kernel, but I guess generating
> a key and creating a MAC are not such a big deal.  Do you want to have
> a go at this, or should I?

Admittedly I’m not yet at the point where I feel comfortable doing 
changes to the kernel at all, so if you have the time, I’d appreciate 
it.  (If you don’t really have the time, I could try my hand first and 
then we’d see.)

So AFAIU you want to put this in the kernel so we can get rid of needing 
the capability, because when you can only open handles that were 
previously generated for you, there wouldn’t be a security problem, right?

But what about cases where a file is made inaccessible to some process 
between generating the handle and later opening it?  E.g. in
/path/to/file, the “to” directory is changed to go-x (and the current 
user is not the owner), so opening /path/to/file wouldn’t be possible by 
path anymore.  Sure, if the FD remained open, you could still open the 
file anyway; but I consider it different in semantics.  (E.g. you could 
check that there are no processes that have “file” open anymore, and so 
you could assume that it’s now inaccessible.)

(I’m asking, because if this is kind of problematic, then it seems to me 
like a capability would still be needed.)

Max


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08 10:52   ` Max Reitz
@ 2021-03-08 14:15     ` Sergio Lopez
  0 siblings, 0 replies; 15+ messages in thread
From: Sergio Lopez @ 2021-03-08 14:15 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]

On Mon, Mar 08, 2021 at 11:52:58AM +0100, Max Reitz wrote:
> On 08.03.21 10:06, Sergio Lopez wrote:
> > On Fri, Mar 05, 2021 at 05:22:56PM +0100, Max Reitz wrote:
> > > == Summary ==
> > > 
> > > So, my current position is:
> > > 
> > > - Bind mounts don’t help with restricting file handles to the exported
> > >    directory.
> > > 
> > > - A MAC is not very elegant, and we might encounter problems where a
> > >    file may be moved outside of the shared directory, but remains
> > >    accessible (because moving a file doesn’t change its handle).
> > >    (If we consider that a problem.  NFS evidently doesn’t, because
> > >    without subtree_check, it has absolutely no protection against
> > >    arbitrary file handles being opened (on the FS where the export
> > >    resides), so valid file handles always remain valid.)
> > > 
> > > - A solution such as NFS’s subtree_check (i.e., storing the file’s
> > >    parent’s handle in addition to the file’s handle itself, then
> > >    verifying that the file does still reside in that directory when the
> > >    handle is opened, and then going up the tree to see whether we can
> > >    trace it back to the shared directory) is interesting and can perhaps
> > >    be considered elegant, but it requires iterating the directory the
> > >    file resides in when it is opened, and it will result in file handles
> > >    being invalidated whenever a file is moved (outside of its directory).
> > >    Perhaps also other issues.  In any case, there are reasons why NFS has
> > >    basically deprecated this.
> > > 
> > > Opinions? :)
> > 
> > While the MAC option doesn't look too bad to me, I can't help but feel
> > that we're working around a kernel (mis)feature, which is something
> > that's risky and tends to backfire.
> 
> Which misfeature do you mean exactly?  That you can open arbitrary files by
> specifying the right magic number (i.e. its handle)?
> 
> That in itself is nothing we’re really working around, but rather something
> that we actively want to pass through to the guest.

But I think those file handles should be constrained to some context
other than the backing file system. In any case, I see Miklos has
highlighted the same issue with more detail in his response, so let's
follow up this conversation there to avoid dispersion.

Sergio.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08 13:39       ` Max Reitz
@ 2021-03-08 14:50         ` Miklos Szeredi
  2021-03-16 17:28           ` Max Reitz
  0 siblings, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2021-03-08 14:50 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

[-- Attachment #1: Type: text/plain, Size: 1568 bytes --]

On Mon, Mar 8, 2021 at 2:39 PM Max Reitz <mreitz@redhat.com> wrote:

> Admittedly I’m not yet at the point where I feel comfortable doing
> changes to the kernel at all, so if you have the time, I’d appreciate
> it.  (If you don’t really have the time, I could try my hand first and
> then we’d see.)

I'd hate to context switch away from the fuse leases to the kernel
crypto, so it would have to wait some time...

But I've attached an incomplete patch that just missing the crypto
bits and testing.

Would you mind having a go at it?

>
> So AFAIU you want to put this in the kernel so we can get rid of needing
> the capability, because when you can only open handles that were
> previously generated for you, there wouldn’t be a security problem, right?

Something like that.

> But what about cases where a file is made inaccessible to some process
> between generating the handle and later opening it?  E.g. in
> /path/to/file, the “to” directory is changed to go-x (and the current
> user is not the owner), so opening /path/to/file wouldn’t be possible by
> path anymore.  Sure, if the FD remained open, you could still open the
> file anyway; but I consider it different in semantics.  (E.g. you could
> check that there are no processes that have “file” open anymore, and so
> you could assume that it’s now inaccessible.)

That could be a concern, yes.   Requiring CAP_DAC_READ_SEARCH in the
current user namespace, as my template patch does, might mitigate
those worries somewhat.

Thanks,
Miklos

[-- Attachment #2: fhandle-mac.patch --]
[-- Type: text/x-patch, Size: 4800 bytes --]

diff --git a/fs/fhandle.c b/fs/fhandle.c
index ec6feeccc276..17c066be0a5d 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -14,9 +14,40 @@
 #include "internal.h"
 #include "mount.h"
 
+static int fhandle_add_mac(struct file_handle *handle, size_t handle_alloc)
+{
+	struct files_struct *files = current->files;
+	struct fhandle_mac_key *key = READ_ONCE(files->key), *old;
+	size_t mac_size = 8; /* or whatever */
+
+	handle->handle_bytes += mac_size;
+
+	if (handle->handle_type == FILEID_INVALID ||
+	    handle->handle_bytes > handle_alloc)
+		return FILEID_INVALID;
+
+	if (!key) {
+		key = /* generate_key */;
+		if (!key)
+			return -ENOMEM;
+		old = cmpxchg(&files->key, NULL, key);
+		if (old) {
+			/* race */
+			kfree(key);
+			key = old;
+		}
+	}
+
+	/* add MAC to the end of the current handle using key */
+
+	handle->handle_type |= FILEID_MAC:
+
+	return 0;
+}
+
 static long do_sys_name_to_handle(struct path *path,
 				  struct file_handle __user *ufh,
-				  int __user *mnt_id)
+				  int __user *mnt_id, bool mac)
 {
 	long retval;
 	struct file_handle f_handle;
@@ -49,12 +80,20 @@ static long do_sys_name_to_handle(struct path *path,
 	retval = exportfs_encode_fh(path->dentry,
 				    (struct fid *)handle->f_handle,
 				    &handle_dwords,  0);
+	if (retval == -ENOSPC)
+		retval = FILEID_INVALID;
 	handle->handle_type = retval;
 	/* convert handle size to bytes */
 	handle_bytes = handle_dwords * sizeof(u32);
 	handle->handle_bytes = handle_bytes;
+	if (mac) {
+		retval = fhandle_add_mac(handle, f_handle.handle_bytes);
+		if (retval < 0)
+			goto out;
+		handle_bytes = handle->handle_bytes;
+	}
 	if ((handle->handle_bytes > f_handle.handle_bytes) ||
-	    (retval == FILEID_INVALID) || (retval == -ENOSPC)) {
+	    retval == FILEID_INVALID)
 		/* As per old exportfs_encode_fh documentation
 		 * we could return ENOSPC to indicate overflow
 		 * But file system returned 255 always. So handle
@@ -73,6 +112,7 @@ static long do_sys_name_to_handle(struct path *path,
 	    copy_to_user(ufh, handle,
 			 sizeof(struct file_handle) + handle_bytes))
 		retval = -EFAULT;
+out:
 	kfree(handle);
 	return retval;
 }
@@ -98,7 +138,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
 	int lookup_flags;
 	int err;
 
-	if ((flag & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0)
+	if ((flag & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH | AT_HANDLE_MAC)) != 0)
 		return -EINVAL;
 
 	lookup_flags = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
@@ -106,7 +146,8 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
 		lookup_flags |= LOOKUP_EMPTY;
 	err = user_path_at(dfd, name, lookup_flags, &path);
 	if (!err) {
-		err = do_sys_name_to_handle(&path, handle, mnt_id);
+		err = do_sys_name_to_handle(&path, handle, mnt_id,
+			flag & AT_HANDLE_MAC);
 		path_put(&path);
 	}
 	return err;
@@ -147,6 +188,14 @@ static int do_handle_to_path(int mountdirfd, struct file_handle *handle,
 		retval = PTR_ERR(path->mnt);
 		goto out_err;
 	}
+
+	if (handle->handle_type & FILEID_MAC) {
+		/* verify mac using current->files->key */
+		handle->handle_bytes -= 8;
+	} else if(!ns_capable(path->mnt->mnt_sb->s_user_ns, CAP_DAC_READ_SEARCH)) {
+		return -EPERM;
+	}
+
 	/* change the handle size to multiple of sizeof(u32) */
 	handle_dwords = handle->handle_bytes >> 2;
 	path->dentry = exportfs_decode_fh(path->mnt,
@@ -176,7 +225,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
 	 * directory. Ideally we would like CAP_DAC_SEARCH.
 	 * But we don't have that
 	 */
-	if (!capable(CAP_DAC_READ_SEARCH)) {
+	if (!ns_capable(current_user_ns(), CAP_DAC_READ_SEARCH)) {
 		retval = -EPERM;
 		goto out_err;
 	}
diff --git a/fs/file.c b/fs/file.c
index f3a4bac2cbe9..9e41b8beea52 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -420,6 +420,8 @@ void put_files_struct(struct files_struct *files)
 		/* free the arrays if they are not embedded */
 		if (fdt != &files->fdtab)
 			__free_fdtable(fdt);
+
+		fhandle_key_free(files->key);
 		kmem_cache_free(files_cachep, files);
 	}
 }
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index fe848901fcc3..a6fdd9bbe98a 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -113,6 +113,9 @@ enum fid_type {
 	 * Filesystems must not use 0xff file ID.
 	 */
 	FILEID_INVALID = 0xff,
+
+	/* OR-ed with the actual ID; used by the fhandle API. */
+	FILEID_MAC = 0x10000000;
 };
 
 struct fid {
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index d0e78174874a..1895d21435ac 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -56,6 +56,8 @@ struct files_struct {
 
 	struct fdtable __rcu *fdt;
 	struct fdtable fdtab;
+
+	struct fhandle_mac_key *key;
   /*
    * written part on a separate cache line in SMP
    */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08  9:06 ` Sergio Lopez
  2021-03-08 10:52   ` Max Reitz
@ 2021-03-08 15:01   ` Stefan Hajnoczi
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2021-03-08 15:01 UTC (permalink / raw)
  To: Sergio Lopez; +Cc: virtio-fs-list, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 3041 bytes --]

On Mon, Mar 08, 2021 at 10:06:50AM +0100, Sergio Lopez wrote:
> On Fri, Mar 05, 2021 at 05:22:56PM +0100, Max Reitz wrote:
> > == Summary ==
> > 
> > So, my current position is:
> > 
> > - Bind mounts don’t help with restricting file handles to the exported
> >   directory.
> > 
> > - A MAC is not very elegant, and we might encounter problems where a
> >   file may be moved outside of the shared directory, but remains
> >   accessible (because moving a file doesn’t change its handle).
> >   (If we consider that a problem.  NFS evidently doesn’t, because
> >   without subtree_check, it has absolutely no protection against
> >   arbitrary file handles being opened (on the FS where the export
> >   resides), so valid file handles always remain valid.)
> > 
> > - A solution such as NFS’s subtree_check (i.e., storing the file’s
> >   parent’s handle in addition to the file’s handle itself, then
> >   verifying that the file does still reside in that directory when the
> >   handle is opened, and then going up the tree to see whether we can
> >   trace it back to the shared directory) is interesting and can perhaps
> >   be considered elegant, but it requires iterating the directory the
> >   file resides in when it is opened, and it will result in file handles
> >   being invalidated whenever a file is moved (outside of its directory).
> >   Perhaps also other issues.  In any case, there are reasons why NFS has
> >   basically deprecated this.
> > 
> > Opinions? :)
> 
> While the MAC option doesn't look too bad to me, I can't help but feel
> that we're working around a kernel (mis)feature, which is something
> that's risky and tends to backfire. It also worries me the fact that
> we'd need to run virtiofsd with CAP_DAC_READ_SEARCH.
> 
> IIUC, we need this to avoid the need to keep an FD open for each entry
> that's in the Guest's lookup cache, which is something that's probably
> going to become a problem once we have dozens of virtiofsd instances
> servicing VMs on the same Host (BTW, this is already a problem on
> macOS, where the default *system-wide* NOFILE limit is a little over
> 10,000).
> 
> Perhaps we should try to aim higher and propose some kernel
> extensions that would fit better our needs?

Another option is to aim lower than everything that has been discussed
here:

Only support handle operations when the shared directory is the root of
a file system. That way the security issues are mostly eliminated.

The rationale is that most applications don't need file handles. If you
require file handles then you'll need to go through the effort of
creating a dedicated file system for the shared directory. For all other
cases, we'll disable file handle operations and not worry about it.

There is one security issue I can think of: if the host administrator
creates a mount inside the shared directory intended to "hide" access to
the files beneath, then I guess file handles could still access those
files.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08  9:54 ` Miklos Szeredi
  2021-03-08 11:29   ` Max Reitz
@ 2021-03-08 22:03   ` Vivek Goyal
  1 sibling, 0 replies; 15+ messages in thread
From: Vivek Goyal @ 2021-03-08 22:03 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: virtio-fs-list, Max Reitz

On Mon, Mar 08, 2021 at 10:54:58AM +0100, Miklos Szeredi wrote:
> Hi,
> 
> Thanks for the good summary.
> 
> Another aspect is what the file handle will be used for:
> 
>  a) allowing server to close O_PATH descriptors any time because they
> can be reconstructed using the file handle

I am probably missing something, so I will ask. To solve problem a)
can we simply store the file handle in lo_inode. (say inode->fh).
and close inode->fd.

And when we need to, do open_by_handle_at(inode->fh). 

Given we already have lo->ino_map redirection, guest can't pass
an arbitrary file handle. So it seems to solve the issue of
arbitrary file handles from guest without MAC stuff and we don't
keep O_PATH fd around.

What am I missing?

Vivek

> 
>  b) allowing NFS export on client, or just name_to_handle_at(2)
> open_by_handle_at(2).
> 
> The requirements are slightly different, since file handles used for
> (a) do not have to persist after a guest reboot (since the VFS cache
> referencing those handles is gone).  While (b) requires persistence
> after a reboot.
> 
> Yet another issue is global CAP_DAC_READ_SEARCH required by the server
> for file handle decode.
> 
> Taking this into account, I think the final solution has to be in the
> host kernel.   E.g. it seems okay to allow user namespace owner to
> decode file handles on filesystems it actually owns.   That would not
> generally help us, though, since virtiofs will want to export root
> owned fs as well.
> 
> Addition of a MAC header to the file handle by name_to_handle_at(2)
> could solve some or all of the above problems.  The question is where
> the key comes from and what the security implications are.
> 
> A per-process (e.g. associated with task->files, generated by the
> kernel on demand and discarded on process exit) key would suffice to
> replace O_PATH descriptors.  In this case the only difference between
> keeping the O_PATH fd open and
> 
>   name_to_handle_at(opathfd, &handle);
>   close(opathfd);
>   opathfd=open_by_handle_at(&handle);
> 
> would be that the resulting fd might point to a disconnected dentry
> and hence would result in incomplete path information under
> /proc/self/fd/.  Need to think hard about whether this has any
> security implications for unprivileged users.
> 
> Adding key management would solve the other aspects, but would also
> possibly open up holes for accessing arbitrary files, so this would
> need to be done carefully.
> 
> Adding subtree checking to the kernel is also a possibility (i.e.
> limit the opened fd to the subtree of the bind mount).   It would have
> the advantage of not resulting in disconnected dentries, but
> disadvantage of not working if the file was moved across directories.
> 
> Thanks,
> Miklos
> 
> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://listman.redhat.com/mailman/listinfo/virtio-fs


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-08 14:50         ` Miklos Szeredi
@ 2021-03-16 17:28           ` Max Reitz
  2021-03-17 13:19             ` Vivek Goyal
  0 siblings, 1 reply; 15+ messages in thread
From: Max Reitz @ 2021-03-16 17:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: virtio-fs-list

On 08.03.21 15:50, Miklos Szeredi wrote:
> On Mon, Mar 8, 2021 at 2:39 PM Max Reitz <mreitz@redhat.com> wrote:
> 
>> Admittedly I’m not yet at the point where I feel comfortable doing
>> changes to the kernel at all, so if you have the time, I’d appreciate
>> it.  (If you don’t really have the time, I could try my hand first and
>> then we’d see.)
> 
> I'd hate to context switch away from the fuse leases to the kernel
> crypto, so it would have to wait some time...
> 
> But I've attached an incomplete patch that just missing the crypto
> bits and testing.
> 
> Would you mind having a go at it?

Thanks, I’ll look into the crypto stuff and have a go.  (Sorry for the 
delay...)

I’d still prefer a service process instead of putting this in the 
kernel, but let’s see how complicated it would be.  I suppose one 
problem with putting it into a service process is that doing so wouldn’t 
help with containers: If containers don’t allow CAP_DAC_READ_SEARCH, we 
I suppose it’ll be difficult to give it even to such a service process.

One thing that also needs to be solved is how to specify a persistent 
key.  I suppose the idea in your patch is to generate a random key for 
every new process, but we would need a persistent key.  With a service 
process, it could be configured by the user to use a specific key, or 
perhaps it has kind of small database and virtiofsd selects its 
persistent key by a hash of it or some other ID that it has received 
from the service process.

I don’t know how you’d go making the kernel store persistent keys, though.

Max

>> So AFAIU you want to put this in the kernel so we can get rid of needing
>> the capability, because when you can only open handles that were
>> previously generated for you, there wouldn’t be a security problem, right?
> 
> Something like that.
> 
>> But what about cases where a file is made inaccessible to some process
>> between generating the handle and later opening it?  E.g. in
>> /path/to/file, the “to” directory is changed to go-x (and the current
>> user is not the owner), so opening /path/to/file wouldn’t be possible by
>> path anymore.  Sure, if the FD remained open, you could still open the
>> file anyway; but I consider it different in semantics.  (E.g. you could
>> check that there are no processes that have “file” open anymore, and so
>> you could assume that it’s now inaccessible.)
> 
> That could be a concern, yes.   Requiring CAP_DAC_READ_SEARCH in the
> current user namespace, as my template patch does, might mitigate
> those worries somewhat.
> 
> Thanks,
> Miklos
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-16 17:28           ` Max Reitz
@ 2021-03-17 13:19             ` Vivek Goyal
  2021-03-17 15:13               ` Miklos Szeredi
  0 siblings, 1 reply; 15+ messages in thread
From: Vivek Goyal @ 2021-03-17 13:19 UTC (permalink / raw)
  To: Max Reitz; +Cc: virtio-fs-list

On Tue, Mar 16, 2021 at 06:28:24PM +0100, Max Reitz wrote:
> On 08.03.21 15:50, Miklos Szeredi wrote:
> > On Mon, Mar 8, 2021 at 2:39 PM Max Reitz <mreitz@redhat.com> wrote:
> > 
> > > Admittedly I’m not yet at the point where I feel comfortable doing
> > > changes to the kernel at all, so if you have the time, I’d appreciate
> > > it.  (If you don’t really have the time, I could try my hand first and
> > > then we’d see.)
> > 
> > I'd hate to context switch away from the fuse leases to the kernel
> > crypto, so it would have to wait some time...
> > 
> > But I've attached an incomplete patch that just missing the crypto
> > bits and testing.
> > 
> > Would you mind having a go at it?
> 
> Thanks, I’ll look into the crypto stuff and have a go.  (Sorry for the
> delay...)
> 
> I’d still prefer a service process instead of putting this in the kernel,
> but let’s see how complicated it would be.  I suppose one problem with
> putting it into a service process is that doing so wouldn’t help with
> containers: If containers don’t allow CAP_DAC_READ_SEARCH, we I suppose
> it’ll be difficult to give it even to such a service process.
> 
> One thing that also needs to be solved is how to specify a persistent key.
> I suppose the idea in your patch is to generate a random key for every new
> process, but we would need a persistent key.  With a service process, it
> could be configured by the user to use a specific key, or perhaps it has
> kind of small database and virtiofsd selects its persistent key by a hash of
> it or some other ID that it has received from the service process.
> 
> I don’t know how you’d go making the kernel store persistent keys, though.

Is it possible to load persistent key from user space into a keyring
using keyctl. 

Vivek

> 
> Max
> 
> > > So AFAIU you want to put this in the kernel so we can get rid of needing
> > > the capability, because when you can only open handles that were
> > > previously generated for you, there wouldn’t be a security problem, right?
> > 
> > Something like that.
> > 
> > > But what about cases where a file is made inaccessible to some process
> > > between generating the handle and later opening it?  E.g. in
> > > /path/to/file, the “to” directory is changed to go-x (and the current
> > > user is not the owner), so opening /path/to/file wouldn’t be possible by
> > > path anymore.  Sure, if the FD remained open, you could still open the
> > > file anyway; but I consider it different in semantics.  (E.g. you could
> > > check that there are no processes that have “file” open anymore, and so
> > > you could assume that it’s now inaccessible.)
> > 
> > That could be a concern, yes.   Requiring CAP_DAC_READ_SEARCH in the
> > current user namespace, as my template patch does, might mitigate
> > those worries somewhat.
> > 
> > Thanks,
> > Miklos
> > 
> 
> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://listman.redhat.com/mailman/listinfo/virtio-fs


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Virtio-fs] Securing file handles
  2021-03-17 13:19             ` Vivek Goyal
@ 2021-03-17 15:13               ` Miklos Szeredi
  0 siblings, 0 replies; 15+ messages in thread
From: Miklos Szeredi @ 2021-03-17 15:13 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs-list, Howells, David, Max Reitz

[CC] David Howells.

On Wed, Mar 17, 2021 at 2:19 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Tue, Mar 16, 2021 at 06:28:24PM +0100, Max Reitz wrote:

> > One thing that also needs to be solved is how to specify a persistent key.
> > I suppose the idea in your patch is to generate a random key for every new
> > process, but we would need a persistent key.  With a service process, it
> > could be configured by the user to use a specific key, or perhaps it has
> > kind of small database and virtiofsd selects its persistent key by a hash of
> > it or some other ID that it has received from the service process.
> >
> > I don’t know how you’d go making the kernel store persistent keys, though.
>
> Is it possible to load persistent key from user space into a keyring
> using keyctl.

Context for David:

We'd like unprivileged open_by_handle_at(2).   One idea is for the
kernel to authenticate file handles (add an authentication header)
using a secret key, so that unprivileged open_by_handle_at() only
works on handles obtained through file_to_handle_at(), and will reject
any maliciously crafted file handles.

So the question is how the authentication keys should be managed.

The unprivileged process must not have access to the key, obviously,
but it should be possible to save the key across restarts.

Any ideas?

Thanks,
Miklos



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-03-17 15:13 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-05 16:22 [Virtio-fs] Securing file handles Max Reitz
2021-03-08  9:06 ` Sergio Lopez
2021-03-08 10:52   ` Max Reitz
2021-03-08 14:15     ` Sergio Lopez
2021-03-08 15:01   ` Stefan Hajnoczi
2021-03-08  9:54 ` Miklos Szeredi
2021-03-08 11:29   ` Max Reitz
2021-03-08 12:30     ` Miklos Szeredi
2021-03-08 13:39       ` Max Reitz
2021-03-08 14:50         ` Miklos Szeredi
2021-03-16 17:28           ` Max Reitz
2021-03-17 13:19             ` Vivek Goyal
2021-03-17 15:13               ` Miklos Szeredi
2021-03-08 22:03   ` Vivek Goyal
2021-03-08 11:44 ` Dr. David Alan Gilbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.