Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
@ 2005-04-29 21:09 Steve French
  2005-04-29 21:31 ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Steve French @ 2005-04-29 21:09 UTC (permalink / raw)
  To: linux-kernel

 > does and who revied that?  Things like that don't have a business in the
 > kernel, and certainly not as ioctl.

Other filesystems such as smbfs had an ioctl that returned the uid of 
the mounter which they used (in the smbfs case in smbumount).  This was 
required by the unmount helper to determine if the unmount would allow a 
user to unmount a particular mount that they mounted.   Unlike in the 
case of mount, for unmount  you can not use the owner uid of the mount 
target to tell who mounted that mount.   I had not received any better 
suggestions as to how to address it.   I had proposed various 
alternatives - exporting in in /proc/mounts e.g.   

As we try to gradually obsolete smbfs, this came up with various users 
(there was even a bugzilla bug opened for adding it) who said that they 
need the ability to unmount their own mounts for network filesystems 
without using /etc/fstab.    Unfortunately for network filesytsems, 
unlike local filesystems, it is impractical to put every possible mount 
target in /etc/fstab since servers get renamed and the universe of 
possible cifs mount targets for a user is large.

There seemed only three alternatives -
1) mimic the smbfs ioctl -   as can be seen from smbfs and smbumount 
source this has portability problems because apparently there is no 
guarantee that uid_t is the same size in kernel and in userspace - smbfs 
actually has two ioctls for different sizes of uid field - this seemed 
like a bad idea
2) export the uid in /proc/mounts - same problem as above
3) call into the kernel to see if current matches the uid of the mounter 
- this has no 16/32/64 bit uid portability issues since the check is 
made in kernel

If there is a better way to achieve these goals I would like to know - I 
had not gotten any feedback on a better way.   Although I am not a fan 
of ioctls, this is as simple as they get and I checked for overlaps in 
the ioctl numbers and the utility checks to make sure it is only invoked 
if the filesystem magic number matches CIFS's magic number and no parms 
are passed or returned so it is quite safe.

Of course I would have preferred that this facility were built into the 
kernel via a syscall so the same approach could be put in umount itself 
instead of in a umount helper.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-29 21:09 [PATCH] cifs: handle termination of cifs oplockd kernel thread Steve French
@ 2005-04-29 21:31 ` Christoph Hellwig
  2005-04-29 22:20   ` Steve French
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2005-04-29 21:31 UTC (permalink / raw)
  To: Steve French; +Cc: linux-kernel

On Fri, Apr 29, 2005 at 04:09:09PM -0500, Steve French wrote:
> > does and who revied that?  Things like that don't have a business in the
> > kernel, and certainly not as ioctl.
> 
> Other filesystems such as smbfs had an ioctl that returned the uid of 
> the mounter which they used (in the smbfs case in smbumount).  This was 
> required by the unmount helper to determine if the unmount would allow a 
> user to unmount a particular mount that they mounted.   Unlike in the 
> case of mount, for unmount  you can not use the owner uid of the mount 
> target to tell who mounted that mount.   I had not received any better 
> suggestions as to how to address it.   I had proposed various 
> alternatives - exporting in in /proc/mounts e.g.   

exporting the uid using the show_options superblock methods sounds like
a much better option.

> As we try to gradually obsolete smbfs, this came up with various users 
> (there was even a bugzilla bug opened for adding it) who said that they 
> need the ability to unmount their own mounts for network filesystems 
> without using /etc/fstab.    Unfortunately for network filesytsems, 
> unlike local filesystems, it is impractical to put every possible mount 
> target in /etc/fstab since servers get renamed and the universe of 
> possible cifs mount targets for a user is large.

Do you use the same suid wrapper hack for mounts as fuse?  Maybe you
should chime in on that thread so we can find a proper solution.

> 
> There seemed only three alternatives -
> 1) mimic the smbfs ioctl -   as can be seen from smbfs and smbumount 
> source this has portability problems because apparently there is no 
> guarantee that uid_t is the same size in kernel and in userspace - smbfs 
> actually has two ioctls for different sizes of uid field - this seemed 
> like a bad idea
> 2) export the uid in /proc/mounts - same problem as above

No.  /proc/self/mounts is an ASCII format, so there's no problem with
differemt sizes.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-29 21:31 ` Christoph Hellwig
@ 2005-04-29 22:20   ` Steve French
  2005-05-11  8:56     ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Steve French @ 2005-04-29 22:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel

Christoph Hellwig wrote:

>On Fri, Apr 29, 2005 at 04:09:09PM -0500, Steve French wrote:
>  
>
>>>does and who revied that?  Things like that don't have a business in the
>>>kernel, and certainly not as ioctl.
>>>      
>>>
>>Other filesystems such as smbfs had an ioctl that returned the uid of 
>>the mounter which they used (in the smbfs case in smbumount).  This was 
>>required by the unmount helper to determine if the unmount would allow a 
>>user to unmount a particular mount that they mounted.   Unlike in the 
>>case of mount, for unmount  you can not use the owner uid of the mount 
>>target to tell who mounted that mount.   I had not received any better 
>>suggestions as to how to address it.   I had proposed various 
>>alternatives - exporting in in /proc/mounts e.g.   
>>    
>>
>
>exporting the uid using the show_options superblock methods sounds like
>a much better option.
>
>  
>

>No.  /proc/self/mounts is an ASCII format, so there's no problem with
>differemt sizes.
>
>
>  
>

I agree that it would work for most cases [today, in 2.6 Linux] - but I 
really feel uncomfortable introducing a user space / kernel space 
dependency on the size of a field where none is needed - I am especially 
nervous because I can see from the Samba change logs that:
1) historically the size of this field changed
2) other operating systems also have either increased the size of uid 
(as MacOS e.g.) or mapped it (as Windows does) - to accomodate the 
needed concept of UUID (in large networks the current uid is too small)

Ideally I would have liked a general kernel call to be able to answer 
the question "Does the current security context match that of the 
mounter?"  because with SELinux, and the need to increase the size of 
uid or somehow work around it for distributed systems, it is hard for 
user space code to take something opaque (the thing that showmounts 
returns) and map it to what libc returns as the uid for current - 
otherwise it would be impossible for user space code to guarantee that 
it will match the kernel code, but this is so trivial, and has no 
sideeffects so in the interim this seems safer.



   

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-29 22:20   ` Steve French
@ 2005-05-11  8:56     ` Christoph Hellwig
  2005-05-11 18:19       ` Steve French
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2005-05-11  8:56 UTC (permalink / raw)
  To: Steve French; +Cc: Christoph Hellwig, linux-kernel

On Fri, Apr 29, 2005 at 05:20:37PM -0500, Steve French wrote:
> I agree that it would work for most cases [today, in 2.6 Linux] - but I 
> really feel uncomfortable introducing a user space / kernel space 
> dependency on the size of a field where none is needed - I am especially 
> nervous because I can see from the Samba change logs that:

Please listen, I said you should export it in /proc/<pid>/mounts, which is
an ASCII interface and any half-sane parser does not depend on the width
of the field in the kernel.

Can we please get rid of the broken ioctl now so it doesn't become part
of the ABI and you'll add the trivial output to /proc/<pid>/mounts?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-05-11  8:56     ` Christoph Hellwig
@ 2005-05-11 18:19       ` Steve French
  2005-05-16  9:34         ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Steve French @ 2005-05-11 18:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel

Christoph Hellwig wrote:

>
>should export it in /proc/<pid>/mounts, which is
>an ASCII interface and any half-sane parser does not depend on the width
>of the field in the kernel.
>
>Can we please get rid of the broken ioctl now so it doesn't become part
>of the ABI and you'll add the trivial output to /proc/<pid>/mounts?
>
>
>  
>
OK - why don't we just add this (ie the ioctl removal) to the patch

[PATCH] unprivileged mount/umount

of Miklos et al, since that removes the need to modify showmounts (and 
avoids any name collision/confusion
with the existing meaning of the mount option "uid" ie as the default 
uid to use for files on the system when
mounting to servers which can not return inode owners as uids).

On another topic relating to ioctls, various people have suggested 
adding an ioctl to add a table to optionally map file owner (uid / gid 
mapping tables) on remote filesystems. Although this is easy enough to 
do for the case of CIFS, this seems like a function (loading the table) 
that could be done via /proc or perhaps even sysfs. Is there are 
precedent for doing this on Linux? Googling I see various examples where 
NFS client on other platforms (not Linux) have done something vaguely 
similar. NFSv4 uses an upcall for this (although they are mapping 
slightly differently since they now receive a fully qualified username 
and have to map this to a loca uid, rather than getting a remote uid to 
local uid as earlier nfs did). The general issue is that when mounting 
to multiple Unix/Linux servers (especially in different domains), unlike 
in Windows (or perhaps MacOS), similar users are defined with different 
uids, and there are cases where mapping uids/gids or ranges of uids/gids
from that returned from the server would be helpful. The mapping table 
would have to hang off the tree connection or the SMB session for the 
case of CIFS but I would rather not use an ioctl to load it, yet if the 
table ever got big, I would prefer not to use /proc either. Is there a 
recommended approach.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-05-11 18:19       ` Steve French
@ 2005-05-16  9:34         ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2005-05-16  9:34 UTC (permalink / raw)
  To: Steve French; +Cc: linux-kernel

On Wed, May 11, 2005 at 01:19:19PM -0500, Steve French wrote:
> OK - why don't we just add this (ie the ioctl removal) to the patch
> 
> [PATCH] unprivileged mount/umount
> 
> of Miklos et al, since that removes the need to modify showmounts (and 
> avoids any name collision/confusion
> with the existing meaning of the mount option "uid" ie as the default 
> uid to use for files on the system when
> mounting to servers which can not return inode owners as uids).

I think that would be best.  It still needs a little work first.

> On another topic relating to ioctls, various people have suggested 
> adding an ioctl to add a table to optionally map file owner (uid / gid 
> mapping tables) on remote filesystems. Although this is easy enough to 
> do for the case of CIFS, this seems like a function (loading the table) 
> that could be done via /proc or perhaps even sysfs. Is there are 
> precedent for doing this on Linux?

I don't think that mapping should happen in kernelspace.  It would
be nice if you could share that with nfs, maybe even generalizing the
nfsv4 one.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30  9:22         ` Miklos Szeredi
  2005-04-30 10:57           ` Miklos Szeredi
@ 2005-05-11  8:59           ` Christoph Hellwig
  1 sibling, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2005-05-11  8:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: hch, 7eggert, smfrench, linux-kernel

On Sat, Apr 30, 2005 at 11:22:48AM +0200, Miklos Szeredi wrote:
> >  - virtual filesystems exporting kernel data are obviously safe as
> >    they enforce permissions no matter who mounted them.  (actually we'd
> >    need to check for some odd mount options)
> 
> Maybe sysadmin doesn't want to let users see /sys for example.  How
> would you disable it if users can mount it themselfes?  OK, you can
> disable user mounts completely, but that's probably not fine grained
> enough for some.

the sysadmin shouldn't do that.  sysfs is needed for proper operation
in a modern system and there's nothing to hid in there.

> >  - block-based filesystems should be safe as long as the mounter has
> >    access to the underlying block device
> 
> Not true if user controls the baking device (e.g. loopback).

I didn't say we should make using the loopback driver a non-privilegued
operation.

> Most
> block based filesystems are probably unsafe at the moment, because not
> enough consistency checking is done at runtime.  Are things like
> non-cyclic directory graphs ensured by all filesystems?  My guess is
> not.  See also Linus' comment about the state of the iso9660
> filesystem:
> 
>   http://lwn.net/Articles/128744/

It just mean that a) the system admin should be careful what drivers are
loaded and b) we need to audit block based filesystems better.

Note that in many current distributions users can mount iso9660 filesystems
through user mount hacks already.  Accessible to everyone and in the global
namespace.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 13:28             ` Steve French
  2005-04-30 14:53               ` J. Bruce Fields
  2005-04-30 16:16               ` Miklos Szeredi
@ 2005-05-01  0:10               ` Bodo Eggert
  2 siblings, 0 replies; 21+ messages in thread
From: Bodo Eggert @ 2005-05-01  0:10 UTC (permalink / raw)
  To: Steve French; +Cc: Miklos Szeredi, hch, 7eggert, linux-kernel

On Sat, 30 Apr 2005, Steve French wrote:

> There is one remaining issue with mount and umount - the user space 
> utilities.   By the way who maintains
> them these days?   Although the mount utilities allow filesystem 
> specific mount and umount helpers
> to be placed in /sbin and automatically executed for the matching 
> filesystem type, there are a few functions
> that belong in common code - in a system library which today have to be 
> implemented in every helper
> function and of course are implemented in different ways in different 
> distros and
> different tools with possibility of corruption of the /etc/mtab.

For user mounts, there should be no practical way of maintaining mtab,
especially if the users are using private namespaces (as suggested in 
another thread) or if they are supposed to be able to unmount using
a non-suid generic umount.

>   It 
> may be that the file /etc/mtab
> does not matter and that it needs to go away and everyone needs to look 
> at /proc/mounts instead, but
> in the meantime /etc/mtab can easily get out of sync with the actual 
> list of mounts, although that
> usuallly does not prevent unmount from working it may be confusing.

The drawback of /proc/mounts is not showing the -oloop information.
Either it's easy to implement showing that extra information, or you'll
need a ~/.etc/mtab

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 14:50                 ` Steve French
@ 2005-04-30 17:23                   ` Miklos Szeredi
  0 siblings, 0 replies; 21+ messages in thread
From: Miklos Szeredi @ 2005-04-30 17:23 UTC (permalink / raw)
  To: smfrench; +Cc: bfields, miklos, hch, 7eggert, linux-kernel

> But you bring up an interesting point about security policy.  For
> the case of evil user trying to mount to evil server (e.g. one under
> evil user's control), in one sense it is no different than allowing
> a user to mount an evil cdrom or usb storage device with evil
> contens - a device which may contain specially crafted data (file
> and directory contents and metadata) designed to crash the system,
> but there is a difference - for network filesystems the server also
> can delay the responses, throw away the responses or corrupt the
> frame headers (this can just as often happen due to buggy network
> hardware and routers too).

There's another difference.  Mounting a cdrom or usb stick needs
_physical_ access to the machine in question.  If you have physical
access you don't need to craft special filesystems to crash the
machine, just pull out the plug from the wall.

So network/userspace filesystems which allow the user to mount an
arbitrary server should be _extra_ careful to verify data from the
server.  Otherwise they can remotely crash the machine or gain
elevated privileges.

Miklos

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 13:28             ` Steve French
  2005-04-30 14:53               ` J. Bruce Fields
@ 2005-04-30 16:16               ` Miklos Szeredi
  2005-04-30 15:27                 ` Steve French
  2005-05-01  0:10               ` Bodo Eggert
  2 siblings, 1 reply; 21+ messages in thread
From: Miklos Szeredi @ 2005-04-30 16:16 UTC (permalink / raw)
  To: smfrench; +Cc: miklos, hch, 7eggert, linux-kernel

> Don't see how FUSE is that much safer, if you allocate kernel memory
> at all you eventually can create DoS, and you can not do a
> filesystem without allocating some kernel memory, but it does not
> seem that easy to do intentionally.

Allocating kernel memory is usually not a problem, when it's
associated with some object, whose number is already limited by the
kernel.  These are: cache entries (inode, dentry), file pointers
(limited in various ways), or super blocks (should be limited in case
of user mounts).

The big problem is the page cache, because that is not limited.  The
user can mmap huge amounts of memory, dirty them, and then when the
machine runs out of memory, and writeback kicks in, it may already be
too late.

This problem can be demonstrated with _any_ network filesystem that
supports shared writable mapping, and is mounted from the local
machine.  One exception is CODA, because it uses disk files as file
backing, and so does not have problems with writeback.

FUSE solves the problem by simply not allowing shared writable
mapping.  It's a _very_ hard thing to solve otherwise.  CIFS, smbfs,
etc, can do the same for unprivileged mounts, or untrusted servers.

> At least for the CIFS case you can turn off the page cache for
> inode data on a per mount basis (with the forcedirectio mount flag)
> if you worry about the server intentionally holding up writes.

That's sounds like a solution to this problem.

> Unless the write is past end of file, writes are timed out
> reasonably quickly anyway, and end up killing the session, which
> depending on the setting of the hard/soft flag probably should
> result in a page fault.

A timeout is also OK, but you should be careful, that the page under
writeback does get freed after the timeout.

Miklos

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 16:16               ` Miklos Szeredi
@ 2005-04-30 15:27                 ` Steve French
  0 siblings, 0 replies; 21+ messages in thread
From: Steve French @ 2005-04-30 15:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: hch, 7eggert, linux-kernel

On Sat, 2005-04-30 at 11:16, Miklos Szeredi wrote:

> The big problem is the page cache, because that is not limited.  The
> user can mmap huge amounts of memory, dirty them, and then when the
> machine runs out of memory, and writeback kicks in, it may already be
> too late.

Yes - we have seen the inode caching get too aggressive in testcases
with paired machines each mounting two clients and also running two
servers - in particular running an NFS and CIFS mounts from the same
client to a server running nfsd and Samba, and then doing the reverse
and launching large file copy testcases going both directions at the
same time.  To make it nastier they add exports for Samba and NFS of
more than one local filesystem type.  Fortunately most of the issues
there have been fixed, but it is an incredibly hard thing to guarantee
that there is enough memory in those cases because so much is taken up
by the page manager doing inode data caching and multimachine deadlock
could occur if the server is blocked on a client doing writepage which
is blocked on the other server which is blocked on the other client ...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 13:28             ` Steve French
@ 2005-04-30 14:53               ` J. Bruce Fields
  2005-04-30 14:50                 ` Steve French
  2005-04-30 16:16               ` Miklos Szeredi
  2005-05-01  0:10               ` Bodo Eggert
  2 siblings, 1 reply; 21+ messages in thread
From: J. Bruce Fields @ 2005-04-30 14:53 UTC (permalink / raw)
  To: Steve French; +Cc: Miklos Szeredi, hch, 7eggert, linux-kernel

On Sat, Apr 30, 2005 at 08:28:27AM -0500, Steve French wrote:
> Miklos Szeredi wrote:
> 
> >>>- network/userspace filesystems should be fine aswell
> >>>     
> >>>
> >>They should, but again I wonder if NFS with all it's complexity is
> >>being careful enough with what it accepts from the server.
> >>   
> >>
> That is the fun of trying to get our network filesystems up to the
> 20th century.  There is at lot more work that has to be done here, but
> it is gradually improving.  At least for cifs but probably for NFSv4
> (and possibly AFS) it is possible for the client to validate that the
> server is who it says it is, and both NFSv4 (actually the newer NFS
> RPC) and CIFS of course allow packet signing which helps, not sure if
> AFS allows packet signing.

None of this helps in the situation Miklos is considering, where the
attacker is a user on the client doing the mount.  So presumably the
user gets to choose a server under his/her control, and all the
authentication does is prove to the user that s/he got the right server,
which doesn't protect the kernel at all.

The only defense is auditing the client code's handling of data it
receives from the server.

--b.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 14:53               ` J. Bruce Fields
@ 2005-04-30 14:50                 ` Steve French
  2005-04-30 17:23                   ` Miklos Szeredi
  0 siblings, 1 reply; 21+ messages in thread
From: Steve French @ 2005-04-30 14:50 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Miklos Szeredi, hch, 7eggert, linux-kernel

J. Bruce Fields wrote:

>On Sat, Apr 30, 2005 at 08:28:27AM -0500, Steve French wrote:
>  
>
>>Miklos Szeredi wrote:
>>
>>    
>>
>>>>>- network/userspace filesystems should be fine aswell
>>>>>    
>>>>>
>>>>>          
>>>>>
>>>>They should, but again I wonder if NFS with all it's complexity is
>>>>being careful enough with what it accepts from the server.
>>>>  
>>>>        
>>>>
>>it is possible for the client to validate that the
>>server is who it says it is, and both NFSv4 (actually the newer NFS
>>RPC) and CIFS of course allow packet signing which helps, not sure if
>>AFS allows packet signing.
>>    
>>
>
>None of this helps in the situation Miklos is considering, where the
>attacker is a user on the client doing the mount.  So presumably the
>user gets to choose a server under his/her control, and all the
>authentication does is prove to the user that s/he got the right server,
>which doesn't protect the kernel at all.
>
>The only defense is auditing the client code's handling of data it
>receives from the server
>  
>
I agree that periodic auditing of returned data, and testing with 
intentionally corrupted server responses
is necessary and should continue.

But you bring up an interesting point about security policy.    For the 
case of evil user trying to mount
to evil server (e.g. one under evil user's control), in one sense it is 
no different than allowing a user to
mount an evil cdrom or usb storage device with evil contens - a device 
which may contain specially
crafted data (file and directory contents and metadata) designed to 
crash the system, but there is
a difference - for network filesystems the server also can delay the 
responses, throw away
the responses or corrupt the frame headers (this can just as often 
happen due to buggy network
hardware and routers too).  Obviously we need to continue to audit for 
both cases - but it would
not hurt to optionally verify that the server can prove its identity and 
prove that it has been properly
added to a security domain that you trust - ie allow an NFSv4 mount and 
the CIFS mount helper
to be configured/built so that users could only mount to servers that are:
    1) In the clients security domain (ie kerberos realm)
    2) In a trusted domain (realm)
IIRC this is already done for the case of some services such as winbind, 
and I vaguely remember
IBM OS/2 doing this (verifying the server's identity during mount) when 
using Kerberized SMB back in
the early 1990s in the Directory and Security server product.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30 10:57           ` Miklos Szeredi
@ 2005-04-30 13:28             ` Steve French
  2005-04-30 14:53               ` J. Bruce Fields
                                 ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Steve French @ 2005-04-30 13:28 UTC (permalink / raw)
  To: Miklos Szeredi, hch, 7eggert, linux-kernel

Miklos Szeredi wrote:

>>> - network/userspace filesystems should be fine aswell
>>>      
>>>
>>They should, but again I wonder if NFS with all it's complexity is
>>being careful enough with what it accepts from the server.
>>    
>>
That is the fun of trying to get our network filesystems up to the 20th 
century.  There is
at lot more work that has to be done here, but it is gradually 
improving.  At least for cifs
but probably for NFSv4 (and possibly AFS) it is possible for the client 
to validate that the
server is who it says it is, and both NFSv4 (actually the newer NFS RPC) 
and CIFS of course
allow packet signing which helps, not sure if AFS allows packet 
signing.   There does
need to be even more testing in one area though - selective packet 
corruption testing
(which can be done by temporarily modifying the server to inject random 
invalid packet
information on a subset of fields every thousand packets or so) since 
the biggest danger
in network filesystems is the huge variety of servers with different 
server bugs that you have
to be able to workaround.  Working around server bugs is a huge problem with
the client side of networking code.

>>I take that back.  Any filesystem using page cache and allowing shared
>>writable mapping is currently unsafe because of OOM deadlock if
>>mounted from local machine, or simply DoS against client by delaying
>>writeback.
>>
>>So other than FUSE, what's left as safe?
>>
>>Miklos
>>
Don't see how FUSE is that much safer, if you allocate kernel memory at 
all you eventually can create DoS,
and you can not do a filesystem without allocating some kernel memory, 
but it does not seem that easy to
do intentionally.   At least for the CIFS case you can turn off the page 
cache for inode data on a per mount
basis (with the forcedirectio mount flag) if you worry about the server 
intentionally holding up writes. 
Unless the write is past end of file, writes are timed out reasonably 
quickly anyway, and end up
killing the session, which depending on the setting of the hard/soft 
flag probably should result in a page fault.

There is one remaining issue with mount and umount - the user space 
utilities.   By the way who maintains
them these days?   Although the mount utilities allow filesystem 
specific mount and umount helpers
to be placed in /sbin and automatically executed for the matching 
filesystem type, there are a few functions
that belong in common code - in a system library which today have to be 
implemented in every helper
function and of course are implemented in different ways in different 
distros and
different tools with possibility of corruption of the /etc/mtab.   It 
may be that the file /etc/mtab
does not matter and that it needs to go away and everyone needs to look 
at /proc/mounts instead, but
in the meantime /etc/mtab can easily get out of sync with the actual 
list of mounts, although that
usuallly does not prevent unmount from working it may be confusing.   
The basic problem is that the
"lock the mtab / add a new line to it / unlock" is not available in an 
exported function (alternatively if
lock and unlock mtab functions were exported that would help) and 
similarly with umount.there
is no "safely remove the matching line from mtab" -  Looking at the 
mount utils and various mount helper
functions it looks like you can not make any assumptions about the name 
of the lock file used to protect
mtab (I am not even sure that you can guarantee that /etc/mtab is a 
file, or even a symlink).  
the lock file used to lock the mtab

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30  8:29       ` Christoph Hellwig
  2005-04-30  9:22         ` Miklos Szeredi
@ 2005-04-30 12:52         ` Steve French
  1 sibling, 0 replies; 21+ messages in thread
From: Steve French @ 2005-04-30 12:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Miklos Szeredi, 7eggert, linux-kernel

On Sat, 2005-04-30 at 03:29, Christoph Hellwig wrote:
> > 
> > Having a mount owner is not a problem.
Perhaps some day "mount owner" might be more complex than simply the
uid_t of the owner (I don't know if there will be future cases in which
you might want to check the gid_t at mount time or some SELinux specific
security context), but I would prefer that mnt_uid be stored in the
superblock so I could get rid of those few lines of code in cifs, and
that is a fairly non-controversial start.   Coming up with the policy
as Miklos and Christoph were suggesting may be doable in small stages.

>   Having a good policy for
> > accepting mounts is rather more so, according to some:
> > 
> >    http://marc.theaimsgroup.com/?l=linux-kernel&m=107705608603071&w=2
> > 
> > Just a little taste of what that policy would involve:
> > 
> >   - global limit on user mounts
> 
> I don't think we need that one.

agreed

> 
> >   - possibly per user limit on mounts
> 
> Makes sense as an ulimit, that way the sysadmin can easily disable the
> user mount feature aswell.
> 

agreed.

> >   - acceptable mountpoints (unlimited writablity is probably a good minimum)
> 
> Yupp.
Yes, although not sure what unlimited means here since the filesystem
you are mounting will often forbid writes (at the server)

> 
> >   - acceptable mount options (nosuid, nodev are obviously not)
> 
> noexecis a bit too much, so the above look good.

There are cases in which adding noexec might make sense as a system
policy for user mounts, but the typical case in which user mounts are
needed are for home directories over the network or equivalent in which
noexec makes it tough for them to be very useful.  nosuid and nodev on
the other hand should be restricted and users are used to this already
since they are the two flags that are added by mount.cifs if a non-root
user mounts and the admin has configured mount.cifs to allow user
mounts, so that would be consistent.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30  9:22         ` Miklos Szeredi
@ 2005-04-30 10:57           ` Miklos Szeredi
  2005-04-30 13:28             ` Steve French
  2005-05-11  8:59           ` Christoph Hellwig
  1 sibling, 1 reply; 21+ messages in thread
From: Miklos Szeredi @ 2005-04-30 10:57 UTC (permalink / raw)
  To: miklos; +Cc: hch, 7eggert, smfrench, linux-kernel

> >  - network/userspace filesystems should be fine aswell
> 
> They should, but again I wonder if NFS with all it's complexity is
> being careful enough with what it accepts from the server.
> 
> Smbfs might be close to safe, since it's already available for users
> to mount from an arbitrary server.

I take that back.  Any filesystem using page cache and allowing shared
writable mapping is currently unsafe because of OOM deadlock if
mounted from local machine, or simply DoS against client by delaying
writeback.

So other than FUSE, what's left as safe?

Miklos

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30  8:29       ` Christoph Hellwig
@ 2005-04-30  9:22         ` Miklos Szeredi
  2005-04-30 10:57           ` Miklos Szeredi
  2005-05-11  8:59           ` Christoph Hellwig
  2005-04-30 12:52         ` Steve French
  1 sibling, 2 replies; 21+ messages in thread
From: Miklos Szeredi @ 2005-04-30  9:22 UTC (permalink / raw)
  To: hch; +Cc: 7eggert, smfrench, linux-kernel

> > 
> >   - global limit on user mounts
> 
> I don't think we need that one.

We have that for open files '/proc/sys/fs/file-max'.  It limits the
total memory usage of the thing.  Which otherwise is hard if you have
a system with lots of users.

> >   - possibly per user limit on mounts
> 
> Makes sense as an ulimit, that way the sysadmin can easily disable the
> user mount feature aswell.
> 
> >   - acceptable mountpoints (unlimited writablity is probably a good minimum)
> 
> Yupp.
> 
> >   - acceptable mount options (nosuid, nodev are obviously not)
> 
> noexecis a bit too much, so the above look good.
> 
> >   - filesystems "safe" to mount by users
> 
> what filesystem do you think is unsafe?
> 
>  - virtual filesystems exporting kernel data are obviously safe as
>    they enforce permissions no matter who mounted them.  (actually we'd
>    need to check for some odd mount options)

Maybe sysadmin doesn't want to let users see /sys for example.  How
would you disable it if users can mount it themselfes?  OK, you can
disable user mounts completely, but that's probably not fine grained
enough for some.

>  - block-based filesystems should be safe as long as the mounter has
>    access to the underlying block device

Not true if user controls the baking device (e.g. loopback).  Most
block based filesystems are probably unsafe at the moment, because not
enough consistency checking is done at runtime.  Are things like
non-cyclic directory graphs ensured by all filesystems?  My guess is
not.  See also Linus' comment about the state of the iso9660
filesystem:

  http://lwn.net/Articles/128744/

>  - network/userspace filesystems should be fine aswell

They should, but again I wonder if NFS with all it's complexity is
being careful enough with what it accepts from the server.

Smbfs might be close to safe, since it's already available for users
to mount from an arbitrary server.

So safeness is a per-filesystem property, determined, how well it
checks input.

Miklos

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30  8:14     ` Miklos Szeredi
@ 2005-04-30  8:29       ` Christoph Hellwig
  2005-04-30  9:22         ` Miklos Szeredi
  2005-04-30 12:52         ` Steve French
  0 siblings, 2 replies; 21+ messages in thread
From: Christoph Hellwig @ 2005-04-30  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: hch, 7eggert, smfrench, linux-kernel

On Sat, Apr 30, 2005 at 10:14:07AM +0200, Miklos Szeredi wrote:
> > Except that we don't have the concept of a mount owner at the VFS level
> > right now, because everyone is adding stupid suid wrapper hacks instead
> > of trying to fix the problems for real.
> 
> Having a mount owner is not a problem.  Having a good policy for
> accepting mounts is rather more so, according to some:
> 
>    http://marc.theaimsgroup.com/?l=linux-kernel&m=107705608603071&w=2
> 
> Just a little taste of what that policy would involve:
> 
>   - global limit on user mounts

I don't think we need that one.

>   - possibly per user limit on mounts

Makes sense as an ulimit, that way the sysadmin can easily disable the
user mount feature aswell.

>   - acceptable mountpoints (unlimited writablity is probably a good minimum)

Yupp.

>   - acceptable mount options (nosuid, nodev are obviously not)

noexecis a bit too much, so the above look good.

>   - filesystems "safe" to mount by users

what filesystem do you think is unsafe?

 - virtual filesystems exporting kernel data are obviously safe as
   they enforce permissions no matter who mounted them.  (actually we'd
   need to check for some odd mount options)

 - block-based filesystems should be safe as long as the mounter has
   access to the underlying block device

 - network/userspace filesystems should be fine aswell


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-30  7:32   ` Christoph Hellwig
@ 2005-04-30  8:14     ` Miklos Szeredi
  2005-04-30  8:29       ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Miklos Szeredi @ 2005-04-30  8:14 UTC (permalink / raw)
  To: hch; +Cc: 7eggert, smfrench, linux-kernel

> Except that we don't have the concept of a mount owner at the VFS level
> right now, because everyone is adding stupid suid wrapper hacks instead
> of trying to fix the problems for real.

Having a mount owner is not a problem.  Having a good policy for
accepting mounts is rather more so, according to some:

   http://marc.theaimsgroup.com/?l=linux-kernel&m=107705608603071&w=2

Just a little taste of what that policy would involve:

  - global limit on user mounts
  - possibly per user limit on mounts
  - acceptable mountpoints (unlimited writablity is probably a good minimum)
  - acceptable mount options (nosuid, nodev are obviously not)
  - filesystems "safe" to mount by users

I'm not against something like that, but I'd like to hear other
people's opinion before trying to push a solution to mainline.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
  2005-04-29 23:18 ` Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
@ 2005-04-30  7:32   ` Christoph Hellwig
  2005-04-30  8:14     ` Miklos Szeredi
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2005-04-30  7:32 UTC (permalink / raw)
  To: Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
  Cc: Steve French, linux-kernel

On Sat, Apr 30, 2005 at 01:18:19AM +0200, Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org> wrote:
> 4) Turn umounting own mounts into a non-privileged operation.
> 
> As (AFAI see) the only thing that needs suid privileges is the umount
> operation, and it is granted if the user mounted it himself, you can as
> well do this simple check in the kernel.

Except that we don't have the concept of a mount owner at the VFS level
right now, because everyone is adding stupid suid wrapper hacks instead
of trying to fix the problems for real.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] cifs: handle termination of cifs oplockd kernel thread
       [not found] <3YLdQ-4vS-15@gated-at.bofh.it>
@ 2005-04-29 23:18 ` Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
  2005-04-30  7:32   ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org> @ 2005-04-29 23:18 UTC (permalink / raw)
  To: Steve French, linux-kernel

Steve French <smfrench@austin.rr.com> wrote:

> As we try to gradually obsolete smbfs, this came up with various users
> (there was even a bugzilla bug opened for adding it) who said that they
> need the ability to unmount their own mounts for network filesystems
> without using /etc/fstab.    Unfortunately for network filesytsems,
> unlike local filesystems, it is impractical to put every possible mount
> target in /etc/fstab since servers get renamed and the universe of
> possible cifs mount targets for a user is large.
> 
> There seemed only three alternatives -
> 1) mimic the smbfs ioctl -   as can be seen from smbfs and smbumount
> source this has portability problems because apparently there is no
> guarantee that uid_t is the same size in kernel and in userspace - smbfs
> actually has two ioctls for different sizes of uid field - this seemed
> like a bad idea
> 2) export the uid in /proc/mounts - same problem as above
> 3) call into the kernel to see if current matches the uid of the mounter
> - this has no 16/32/64 bit uid portability issues since the check is
> made in kernel

4) Turn umounting own mounts into a non-privileged operation.

As (AFAI see) the only thing that needs suid privileges is the umount
operation, and it is granted if the user mounted it himself, you can as
well do this simple check in the kernel.
-- 
Funny quotes:
40. Isn't making a smoking section in a restaurant like making a peeing
    section in a swimming pool?


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2005-05-16  9:34 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-04-29 21:09 [PATCH] cifs: handle termination of cifs oplockd kernel thread Steve French
2005-04-29 21:31 ` Christoph Hellwig
2005-04-29 22:20   ` Steve French
2005-05-11  8:56     ` Christoph Hellwig
2005-05-11 18:19       ` Steve French
2005-05-16  9:34         ` Christoph Hellwig
     [not found] <3YLdQ-4vS-15@gated-at.bofh.it>
2005-04-29 23:18 ` Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
2005-04-30  7:32   ` Christoph Hellwig
2005-04-30  8:14     ` Miklos Szeredi
2005-04-30  8:29       ` Christoph Hellwig
2005-04-30  9:22         ` Miklos Szeredi
2005-04-30 10:57           ` Miklos Szeredi
2005-04-30 13:28             ` Steve French
2005-04-30 14:53               ` J. Bruce Fields
2005-04-30 14:50                 ` Steve French
2005-04-30 17:23                   ` Miklos Szeredi
2005-04-30 16:16               ` Miklos Szeredi
2005-04-30 15:27                 ` Steve French
2005-05-01  0:10               ` Bodo Eggert
2005-05-11  8:59           ` Christoph Hellwig
2005-04-30 12:52         ` Steve French

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).