linux-kernel.vger.kernel.org archive mirror
* RE: [autofs] [RFC] Towards a Modern Autofs
@ 2004-01-06 22:28 Ogden, Aaron A.
  2004-01-06 22:41 ` Mike Fedyk
                   ` (3 more replies)
  0 siblings, 4 replies; 82+ messages in thread
From: Ogden, Aaron A. @ 2004-01-06 22:28 UTC (permalink / raw)
  To: thockin, H. Peter Anvin
  Cc: autofs mailing list, Mike Waychison, Kernel Mailing List



> -----Original Message-----
> From: autofs-bounces@linux.kernel.org 
> [mailto:autofs-bounces@linux.kernel.org] On Behalf Of Tim Hockin
> Sent: Tuesday, January 06, 2004 3:50 PM
> To: H. Peter Anvin
> Cc: autofs mailing list; Mike Waychison; Kernel Mailing List
> Subject: Re: [autofs] [RFC] Towards a Modern Autofs
> 
<...snip...>
>
> > Pardon me for sounding harsh, but I'm seriously sick of the oft-repeated
> > idiocy that effectively boils down to "the daemon can die and would lose
> > its state, so let's put it all in the kernel."  A dead daemon is a
> > painful recovery, admitted.  It is also a THIS SHOULD NOT HAPPEN
> 
> But it *does* happen.
> 
> > condition.  By cramming it into the kernel, you're in fact
> > making the system less stable, not more, because the kernel being
> > tainted with faulty code is a total system malfunction; a crashed
> > userspace daemon is
> 
> I don't think this design crams anything into the kernel.  It
> doesn't put a whole lot more into the kernel than is currently in there
> (expiry and new mount stuff, aside).  All the work still happens in
> userland.
> 
> The daemon as it stands does NOT handle namespaces, does NOT handle expiry
> well, and is a pretty sad copy of an old design.
> 
> > "merely" a messy cleanup.  In practice, the autofs daemon does not die
> > unless a careless system administrator kills it.  It is a non-problem.
> 
> I have some customers I'd love to send to you, if you really 
> think that's true.

Speaking as a sysadmin with 300+ machines (some Linux, some Solaris)
using autofs, I can say that the Linux autofs daemon does die on
occasion, or at least some of its children become hung or unresponsive.
This happened to us with autofs3 and autofs4, leading me to contact Ian
Kent and become involved in testing new versions of autofs4.  I don't
have any problems with the newest versions (4.1.0+) but with previous
code, 4.0.0pre10 for example, I found the ability to restart the daemon
invaluable.  On those occasions where the autofs daemon got confused
(lost track of mountpoints, got corruption in its internal
representation of NIS maps, etc.) we could shut down the autofs daemon,
kill any remaining processes, and restart it from scratch.  In most
cases restarting the daemon fixed the problem.  It's worth noting that I
have seen this happen on Solaris 2.6 as well, but it is extremely rare.
On the Solaris machine there was no automount daemon to restart so I was
forced to reboot it to regain access to the 'missing' mountpoint.

If you've read this far, what I'm trying to say is that having
userspace-related code remain in userland is a good thing since you can
restart the daemon if something goes wrong.  If you move all of this to
kernel-space you can't do anything about it if there is a problem.  In
Solaris there is a command called 'automount' that tells the kernel to
re-read the automount maps; perhaps it resets the autofs subsystem in
the kernel as well.  If Linux autofs had the same capability we might
not need the daemon, but until then, having the daemon in userland is a
good thing.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 22:28 [autofs] [RFC] Towards a Modern Autofs Ogden, Aaron A.
@ 2004-01-06 22:41 ` Mike Fedyk
  2004-01-06 22:47 ` Tim Hockin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 82+ messages in thread
From: Mike Fedyk @ 2004-01-06 22:41 UTC (permalink / raw)
  To: Ogden, Aaron A.
  Cc: thockin, H. Peter Anvin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Tue, Jan 06, 2004 at 04:28:59PM -0600, Ogden, Aaron A. wrote:
[snip]
> having the daemon in userland is a
> good thing.

You and hpa are agreeing...

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 22:28 [autofs] [RFC] Towards a Modern Autofs Ogden, Aaron A.
  2004-01-06 22:41 ` Mike Fedyk
@ 2004-01-06 22:47 ` Tim Hockin
  2004-01-06 22:53 ` Paul Raines
  2004-01-07 23:14 ` Jim Carter
  3 siblings, 0 replies; 82+ messages in thread
From: Tim Hockin @ 2004-01-06 22:47 UTC (permalink / raw)
  To: Ogden, Aaron A.
  Cc: thockin, H. Peter Anvin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Tue, Jan 06, 2004 at 04:28:59PM -0600, Ogden, Aaron A. wrote:
> Solaris there is a command called 'automount' that tells the kernel to
> re-read the automount maps, perhaps it resets the autofs subsystem in
> the kernel as well.  If linux autofs had the same capability we might
> not need the daemon, but until then, having the daemon in userland is a
> good thing.

That's more or less exactly what is proposed.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* RE: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 22:28 [autofs] [RFC] Towards a Modern Autofs Ogden, Aaron A.
  2004-01-06 22:41 ` Mike Fedyk
  2004-01-06 22:47 ` Tim Hockin
@ 2004-01-06 22:53 ` Paul Raines
  2004-01-07 23:14 ` Jim Carter
  3 siblings, 0 replies; 82+ messages in thread
From: Paul Raines @ 2004-01-06 22:53 UTC (permalink / raw)
  To: Ogden, Aaron A.
  Cc: thockin, H. Peter Anvin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

As another sysadmin with 300+ Linux and Solaris boxes, I second
your sentiments exactly.  As my previous post today states, I am
having exactly the problem you describe with automount daemons
becoming hung or unresponsive.  Guess I should give 4.1.0 a try.

Of course the same argument applies to the NFS server, but they went
ahead and moved most of that into the kernel anyway for the
performance gain.

-- 
---------------------------------------------------------------
Paul Raines                   email: raines@nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street        Charlestown, MA 02129	USA   





^ permalink raw reply	[flat|nested] 82+ messages in thread

* RE: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 22:28 [autofs] [RFC] Towards a Modern Autofs Ogden, Aaron A.
                   ` (2 preceding siblings ...)
  2004-01-06 22:53 ` Paul Raines
@ 2004-01-07 23:14 ` Jim Carter
  2004-01-07 23:32   ` H. Peter Anvin
  3 siblings, 1 reply; 82+ messages in thread
From: Jim Carter @ 2004-01-07 23:14 UTC (permalink / raw)
  To: Ogden, Aaron A.
  Cc: thockin, H. Peter Anvin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Tue, 6 Jan 2004, Ogden, Aaron A. wrote:
> If you've read this far, what I'm trying to say is that having userspace
> related code remain in userland is a good thing since you can restart
> the daemon if something goes wrong.

Hear, hear.  But...

> If you move all of this to
> kernel-space you can't do anything about it if there is a problem.  In
> Solaris there is a command called 'automount' that tells the kernel to
> re-read the automount maps, perhaps it resets the autofs subsystem in
> the kernel as well.  If linux autofs had the same capability we might
> not need the daemon, but until then, having the daemon in userland is a
> good thing.

To my mind the ideal design goes something like this:

1.  you can mount a synthetic autofs filesystem on lots of directories,
including subdirs of other autofs filesystems.

2.  Whenever anything tries to access one of those directories (for a
direct map) or one of its subdirs whether visible or not (indirect map), if
nothing is mounted on it [and it hasn't been told by a special flag that
it's non-mountable, see the /home/user/server{A,B} example], the autofs
kernel module runs a script in user space (in the namespace context of the
originally requesting process).  Upon exit, if something is now mounted on
the subdir, fine.  Otherwise, ENOENT.  The module is not required to know
anything about autofs maps that the userspace helper may or may not
consult.

3.  Periodically the module should check if mounted filesystems are
potentially unmountable (this seems to be inexpensive), and if so it should
run the userspace helper to unmount them.  If the unmount fails, the helper
(not the kernel) should try to distinguish a race condition from a dead NFS
server, and whether the mount will be viable once the server comes back. If
not, it should be more aggressive than the present daemon in unmounting. At
present the module carefully keeps up-to-date a last_used field and a
timeout potentially different for each mount, but it's probably sufficient
to merely poll all the mount points periodically all at once, perhaps with
a one-time exemption when something is first mounted.

And that's *all* the complexity that should be in the kernel.  That's quite
complex enough in my opinion.  If the userspace helper needs state, it can
lock and read/write a file.  I don't really see the need for the autofs
system to have state beyond "it's mounted".
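
A minimal sketch of the three points above, written as a toy Python
model rather than kernel code.  Everything named here (ToyAutofs,
helper_mount, helper_umount, is_busy) is hypothetical and only
illustrates the split: the "module" knows nothing beyond what is
mounted, and all map knowledge lives in the userspace helper.

```python
class ToyAutofs:
    """Toy stand-in for the kernel module: its only state is 'it's mounted'."""

    def __init__(self, helper_mount, helper_umount):
        # helper_mount(path) -> source string, or None if nothing was mounted.
        # helper_umount(path) -> True if the unmount succeeded.
        self.helper_mount = helper_mount
        self.helper_umount = helper_umount
        self.mounted = {}  # path -> source

    def access(self, path):
        """Point 2: run the helper on first access; ENOENT if nothing appears."""
        if path not in self.mounted:
            source = self.helper_mount(path)
            if source is None:
                raise FileNotFoundError(path)
            self.mounted[path] = source
        return self.mounted[path]

    def expire_pass(self, is_busy):
        """Point 3: poll every mount point at once, unmount the idle ones."""
        expired = []
        for path in list(self.mounted):
            if not is_busy(path) and self.helper_umount(path):
                del self.mounted[path]
                expired.append(path)
        return expired
```

The helper is free to consult NIS, files, or anything else; in this
sketch a plain dict lookup such as {"/home/alice": "server:/export/alice"}.get
would serve as helper_mount.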

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 23:14 ` Jim Carter
@ 2004-01-07 23:32   ` H. Peter Anvin
  2004-01-08 12:52     ` Ian Kent
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-07 23:32 UTC (permalink / raw)
  To: Jim Carter
  Cc: Ogden, Aaron A.,
	thockin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

Jim Carter wrote:
> 
> To my mind the ideal design goes something like this:
> 
> 1.  you can mount a synthetic autofs filesystem on lots of directories,
> including subdirs of other autofs filesystems.
> 
> 2.  Whenever anything tries to access one of those directories (for a
> direct map) or one of its subdirs whether visible or not (indirect map), if
> nothing is mounted on it [and it hasn't been told by a special flag that
> it's non-mountable, see the /home/user/server{A,B} example], the autofs
> kernel module runs a script in user space (in the namespace context of the
> originally requesting process).  Upon exit, if something is now mounted on
> the subdir, fine.  Otherwise, ENOENT.  The module is not required to know
> anything about autofs maps that the userspace helper may or may not
> consult.
> 
> 3.  Periodically the module should check if mounted filesystems are
> potentially unmountable (this seems to be inexpensive), and if so it should
> run the userspace helper to unmount them.  If the unmount fails, the helper
> (not the kernel) should try to distinguish a race condition from a dead NFS
> server, and whether the mount will be viable once the server comes back. If
> not, it should be more aggressive than the present daemon in unmounting. At
> present the module carefully keeps up-to-date a last_used field and a
> timeout potentially different for each mount, but it's probably sufficient
> to merely poll all the mount points periodically all at once, perhaps with
> a one-time exemption when something is first mounted.
> 
> And that's *all* the complexity that should be in the kernel.  That's quite
> complex enough in my opinion.  If the userspace helper needs state, it can
> lock and read/write a file.  I don't really see the need for the autofs
> system to have state beyond "it's mounted".
> 

What you've described above is more or less the autofs v3 design.  There
are reasons why you really want to have a simple-minded timeout in the
kernel, mostly because attempting umount is more expensive than it
should be on some filesystems.  It only needs to be statistically
accurate, though, and thus it does not introduce a race.
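
As a rough illustration of why a cheap timeout helps (a toy Python
model, not the autofs code; TimeoutTable, expire and try_umount are
hypothetical names): the timestamp check filters out recently used
mounts so that the expensive umount attempt is made only on likely
candidates.  The umount attempt itself stays authoritative, which is
why the timestamp only needs to be approximately right.  Refreshing the
timestamp after a failed attempt is an assumption of this sketch.

```python
class TimeoutTable:
    """Cheap per-mount 'last used' bookkeeping."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_used = {}  # path -> time of last access

    def touch(self, path, now):
        self.last_used[path] = now

    def candidates(self, now):
        # Only mounts idle for at least `timeout` are worth a umount attempt.
        return sorted(p for p, t in self.last_used.items()
                      if now - t >= self.timeout)


def expire(table, now, try_umount):
    """Attempt umount only on idle candidates; the attempt decides the outcome."""
    removed = []
    for path in table.candidates(now):
        if try_umount(path):
            del table.last_used[path]
            removed.append(path)
        else:
            # Still busy despite a stale timestamp: back off until next timeout.
            table.touch(path, now)
    return removed
```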

Once you have to deal with mount trees (multiple filesystems on the same
mount point which you want to have appear to userspace as a unit),
things get significantly more complex, unfortunately.  Mounting is not a
problem, since the nonprivileged processes are simply held, but
umounting is, since in order to make sure there are no race conditions
userspace needs to be locked out from filesystem "a" while umounting
filesystem "a/b", *or* the equivalent of a direct mount autofs point has
to be imposed on node "a/b" of filesystem "a" which can be atomically
deleted together with the umounting of filesystem "a".

These are the mount traps Al Viro has been architecting.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 23:32   ` H. Peter Anvin
@ 2004-01-08 12:52     ` Ian Kent
  2004-01-08 18:31       ` viro
  0 siblings, 1 reply; 82+ messages in thread
From: Ian Kent @ 2004-01-08 12:52 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jim Carter, Ogden, Aaron A.,
	thockin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Wed, 7 Jan 2004, H. Peter Anvin wrote:

>
> These are the mount traps Al Viro has been architecting.
>

Please tell me about these.

I haven't seen any discussion on the implementation.

Just a few sentences ....

Ian



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 12:52     ` Ian Kent
@ 2004-01-08 18:31       ` viro
  2004-01-09 18:43         ` Ian Kent
  2004-01-09 19:41         ` Mike Waychison
  0 siblings, 2 replies; 82+ messages in thread
From: viro @ 2004-01-08 18:31 UTC (permalink / raw)
  To: Ian Kent
  Cc: H. Peter Anvin, Jim Carter, Ogden, Aaron A.,
	thockin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Thu, Jan 08, 2004 at 08:52:31PM +0800, Ian Kent wrote:
> On Wed, 7 Jan 2004, H. Peter Anvin wrote:
> 
> >
> > These are the mount traps Al Viro has been architecting.
> >
> 
> Please tell me about these.
> 
> I haven't seen any discussion on the implementation.
> 
> Just a few sentences ....

Special vfsmount mounted somewhere; has no superblock associated with it;
attempt to step on it triggers event; normal result of that event is to
get a normal mount on top of it, at which point usual chaining logics
will make sure that we don't see the trap until it's uncovered by removal
of covering filesystem.  Trap (and everything mounted on it, etc.) can
be removed by normal lazy umount.

Basically, it's a single-point analog of autofs done entirely in VFS.
The job of automounter is to maintain the traps and react to events.
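
A toy Python model of the behaviour described above, assuming a simple
per-path stack of mounts (TrapPoint, TRAP and on_trigger are
hypothetical names; nothing here is VFS code).  The trap sits at the
bottom, stepping on it fires one event, and the resulting mount covers
it until that mount is removed.  Lazy umount of the whole stack is not
modelled.

```python
TRAP = object()  # sentinel standing in for the superblock-less vfsmount


class TrapPoint:
    """One mount point with a trap at the bottom of its mount stack."""

    def __init__(self, on_trigger):
        # on_trigger() plays the automounter: it returns the filesystem
        # to mount on top of the trap, or None if it mounts nothing.
        self.on_trigger = on_trigger
        self.stack = [TRAP]
        self.events = 0

    def step_on(self):
        """Walk into the mount point; fire the trap only if it is uncovered."""
        if self.stack[-1] is TRAP:
            self.events += 1
            covering = self.on_trigger()
            if covering is None:
                raise FileNotFoundError("automounter mounted nothing")
            self.stack.append(covering)
        return self.stack[-1]

    def umount_top(self):
        """Remove the covering filesystem, exposing the trap again."""
        if len(self.stack) > 1:
            self.stack.pop()
```

While the covering filesystem is mounted, repeated calls to step_on()
never reach the trap, which is the chaining behaviour described above.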

And yes, I should've done that months ago.  Waaaaay too long backlog -
bdev work, dev_t stuff, netdev, yadda, yadda.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 18:31       ` viro
@ 2004-01-09 18:43         ` Ian Kent
  2004-01-09 19:41         ` Mike Waychison
  1 sibling, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-09 18:43 UTC (permalink / raw)
  To: viro
  Cc: H. Peter Anvin, Jim Carter, Ogden, Aaron A.,
	thockin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Thu, 8 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:

> Basically, it's a single-point analog of autofs done entirely in VFS.
> The job of automounter is to maintain the traps and react to events.
>
> And yes, I should've done that months ago.  Waaaaay too long backlog -
> bdev work, dev_t stuff, netdev, yadda, yadda.
>

So that's why Peter appears to have not made progress.

Yes. Tell me about the 24-hour days that feel like an hour, and in which
it feels like only an hour's progress has been made.

Ian




^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 18:31       ` viro
  2004-01-09 18:43         ` Ian Kent
@ 2004-01-09 19:41         ` Mike Waychison
  2004-01-09 19:57           ` H. Peter Anvin
  1 sibling, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 19:41 UTC (permalink / raw)
  To: viro
  Cc: Ian Kent, autofs mailing list, H. Peter Anvin, Ogden, Aaron A.,
	Kernel Mailing List


viro@parcelfarce.linux.theplanet.co.uk wrote:

>On Thu, Jan 08, 2004 at 08:52:31PM +0800, Ian Kent wrote:
>
>>On Wed, 7 Jan 2004, H. Peter Anvin wrote:
>>
>>>These are the mount traps Al Viro has been architecting.
>>>
>>Please tell me about these.
>>
>>I haven't seen any discussion on the implementation.
>>
>>Just a few sentences ....
>>
>
>Special vfsmount mounted somewhere; has no superblock associated with it;
>attempt to step on it triggers event; normal result of that event is to
>get a normal mount on top of it, at which point usual chaining logics
>will make sure that we don't see the trap until it's uncovered by removal
>of covering filesystem.  Trap (and everything mounted on it, etc.) can
>be removed by normal lazy umount.
>
>Basically, it's a single-point analog of autofs done entirely in VFS.
>The job of automounter is to maintain the traps and react to events.
>
Is there any clear advantage to doing this in the VFS other than saving 
a superblock and a dentry/inode pair or two?

I remember talking to you about this, and I seem to recall that these 
mount traps would probably communicate using a struct file, so a 
trap-user would somehow receive events about when the trap was set 
off.   Will this communication model continue to work within a cloned 
namespace?  What happens if the trap-client closes the file?

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 19:41         ` Mike Waychison
@ 2004-01-09 19:57           ` H. Peter Anvin
  2004-01-09 21:31             ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-09 19:57 UTC (permalink / raw)
  To: Mike Waychison
  Cc: viro, Ian Kent, autofs mailing list, Ogden, Aaron A.,
	Kernel Mailing List

Mike Waychison wrote:
>>
>> Special vfsmount mounted somewhere; has no superblock associated with it;
>> attempt to step on it triggers event; normal result of that event is to
>> get a normal mount on top of it, at which point usual chaining logics
>> will make sure that we don't see the trap until it's uncovered by removal
>> of covering filesystem.  Trap (and everything mounted on it, etc.) can
>> be removed by normal lazy umount.
>>
>> Basically, it's a single-point analog of autofs done entirely in VFS.
>> The job of automounter is to maintain the traps and react to events.
>>
> Is there any clear advantage to doing this in the VFS other than saving
> a superblock and a dentry/inode pair or two?
> 
> I remember talking to you about this, and I seem to recall that these
> mount traps would probably communicate using a struct file, so a
> trap-user would somehow receive events about when the trap was set
> off.   Will this communication model continue to work within a cloned
> namespace?  What happens if the trap-client closes the file?
> 

The biggest issue is to ensure that the appropriate atomicity guarantees
can be maintained.  In particular, it must be possible to umount the
underlying filesystem and all mount traps on top of it atomically.
Anything less will result in race conditions.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 19:57           ` H. Peter Anvin
@ 2004-01-09 21:31             ` Mike Waychison
  2004-01-09 21:36               ` H. Peter Anvin
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 21:31 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: viro, Ian Kent, autofs mailing list, Ogden, Aaron A.,
	Kernel Mailing List


H. Peter Anvin wrote:

>Mike Waychison wrote:
>
>>>Special vfsmount mounted somewhere; has no superblock associated with it;
>>>attempt to step on it triggers event; normal result of that event is to
>>>get a normal mount on top of it, at which point usual chaining logics
>>>will make sure that we don't see the trap until it's uncovered by removal
>>>of covering filesystem.  Trap (and everything mounted on it, etc.) can
>>>be removed by normal lazy umount.
>>>
>>>Basically, it's a single-point analog of autofs done entirely in VFS.
>>>The job of automounter is to maintain the traps and react to events.
>>>
>>Is there any clear advantage to doing this in the VFS other than saving
>>a superblock and a dentry/inode pair or two?
>>
>>I remember talking to you about this, and I seem to recall that these
>>mount traps would probably communicate using a struct file, so a
>>trap-user would somehow receive events about when the trap was set
>>off.   Will this communication model continue to work within a cloned
>>namespace?  What happens if the trap-client closes the file?
>>
>
>The biggest issue is to ensure that the appropriate atomicity guarantees
>can be maintained.  In particular, it must be possible to umount the
>underlying filesystem and all mount traps on top of it atomically.
>Anything less will result in race conditions.
>
>	-hpa
>
Unless I'm missing something, implementing this as a separate filesystem 
type still has the appropriate atomicity guarantees as long as the VFS 
supports complex expiry, whereby userspace would tag submounts as being 
part of the overall expiry for a base mountpoint.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 21:31             ` Mike Waychison
@ 2004-01-09 21:36               ` H. Peter Anvin
  0 siblings, 0 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-09 21:36 UTC (permalink / raw)
  To: Mike Waychison
  Cc: viro, Ian Kent, autofs mailing list, Ogden, Aaron A.,
	Kernel Mailing List

Mike Waychison wrote:
>
> Unless I'm missing something, implementing this as a separate filesystem
> type still has the appropriate atomicity guarantees as long as the VFS
> supports complex expiry, whereby userspace would tag submounts as being
> part of the overall expiry for a base mountpoint.
> 

It would, but it seems like a vastly more invasive change to the VFS
than ought to be necessary.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-13 19:01                         ` Mike Waychison
@ 2004-01-14 15:58                           ` raven
  0 siblings, 0 replies; 82+ messages in thread
From: raven @ 2004-01-14 15:58 UTC (permalink / raw)
  To: Mike Waychison; +Cc: autofs mailing list, Kernel Mailing List

On Tue, 13 Jan 2004, Mike Waychison wrote:

> >
> My proposal uses filesystems for all automount mechanism *except* 
> expiry. I see expiry as a VFS service, and strongly believe that this is 
> where it belongs.
> 

I'm certainly thinking a lot about this and have made quite a bit of 
progress thanks to the patience of all.

Now I think it may be time to ponder the expire mechanism.

I was thinking it might be good for me to write up a specification based 
on the discussion so far to make sure that we all have the same 
understanding of what has been discussed. Perhaps this could allow for a 
specification to follow.

Good idea or not?

Ian


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-13  1:54                       ` Ian Kent
@ 2004-01-13 19:01                         ` Mike Waychison
  2004-01-14 15:58                           ` raven
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-13 19:01 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs mailing list, Kernel Mailing List


Ian Kent wrote:

>On Mon, 12 Jan 2004, Mike Waychison wrote:
>
>>>And I have another question concerning namespaces.
>>>
>>>Given that there may be several namespaces, each of which may or may not
>>>have a trigger on this dentry, is there some sort of list of triggers?
>>>
>>>How do the triggers know who owns them?
>>>
>>This is the reason I went with using distinct filesystems to perform the
>>triggers.  If we use follow_link logic, we will have a reference to the
>>respective vfsmount.  Dentry's themselves know nothing about the
>>triggers, as the triggers just look like a mounted filesystem.   The
>>vfsmount information has enough information for autofs to call a
>>userspace agent through hotplug and have userspace handle the mount.  In
>>effect, there is no daemon so nobody 'owns' a trigger in the same sense
>>as with autofs3/4.
>>
>
>I'm not familiar with the follow_link mechanism (no prob. I'll pick it up
>as I go).
>
>Correct me if I'm wrong but, the only thing that I can see that is
>duplicated in cloning a namespace is the root dentry. The rest of the
>dentries on the system remain the same. The increase in complexity to the
>VFS to change this would be prohibitive.
>  
>
No.  Dentries are *never* duplicated.  This goes back to Viro's work on 
allowing a filesystem to be mounted in multiple locations.  See 
http://kt.zork.net/kernel-traffic/kt20000424_64.html#9 .

What is duplicated is the current->namespace tree of vfsmounts.  After 
this is done, current->fs vfsmount members are updated to point to their 
cloned counterparts.
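
To make the distinction concrete, here is a toy Python model of that
cloning step (hypothetical classes, not the kernel structures; the
"rootmnt"/"pwdmnt" keys merely stand in for the vfsmount members of
current->fs).  The vfsmount tree is copied node by node, every copy
keeps pointing at the same dentry, and the per-process references are
then redirected to the copies.

```python
class Dentry:
    """Stand-in for a dentry; never copied when a namespace is cloned."""

    def __init__(self, name):
        self.name = name


class VfsMount:
    """Stand-in for a vfsmount: a root dentry plus child mounts."""

    def __init__(self, root, children=None):
        self.root = root
        self.children = children or []


def clone_tree(mnt, mapping):
    """Copy the vfsmount tree, recording original -> clone in `mapping`."""
    copy = VfsMount(mnt.root)  # same Dentry object, new VfsMount
    mapping[mnt] = copy
    for child in mnt.children:
        copy.children.append(clone_tree(child, mapping))
    return copy


def clone_namespace(ns_root, fs_refs):
    """Clone the tree, then point the process's mount references at the clones."""
    mapping = {}
    new_root = clone_tree(ns_root, mapping)
    new_refs = {name: mapping[mnt] for name, mnt in fs_refs.items()}
    return new_root, new_refs
```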

>I see we want the triggers in the vfsmount struct. Is this a good idea?
>The vfsmount struct has always been difficult to get hold of during lookup
>and revalidate for me (someone like to help here).
>
>  
>
If triggers in the vfsmount struct are done, then there will be no need 
to handle lookups or revalidates.  In fact, triggers in the vfsmount 
struct will not help at all for indirect maps.

>
>Also, something needs to be done about mount table noise. Several hundred
>entries is very bad from an administration viewpoint.
>  
>
I don't see what you want here.  If you have hundreds of users logged 
into the same machine, you *will* have hundreds of entries in the 
mount-table.

>Except for the cross namespace issues, which I'm still digesting, I can't
>see why your design can't be done entirely as a filesystem using dentries
>instead of vfsmount, including expiry. Perhaps you could reiterate a few
>of the reasons for this.
>
My proposal uses filesystems for all automount mechanism *except* 
expiry. I see expiry as a VFS service, and strongly believe that this is 
where it belongs.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:01                 ` raven
  2004-01-12 16:26                   ` Mike Waychison
  2004-01-12 16:28                   ` raven
@ 2004-01-13 18:46                   ` Mike Waychison
  2 siblings, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-13 18:46 UTC (permalink / raw)
  To: raven; +Cc: autofs mailing list, Kernel Mailing List


raven@themaw.net wrote:

>On Mon, 12 Jan 2004, Mike Waychison wrote:
>
>>>Transparency of an autofs filesystem (as I'm calling it) is the situation
>>>where, given a map
>>>
>>>/usr	/man1	server:/usr/man1
>>>	/man2	server:/usr/man2
>>>
>>>where the filesystem /usr contains, say a directory lib, that needs to be
>>>available while also seeing the automounted directories.
>>>
>>I see.  This requires direct mount triggers to do properly.  Trying to 
>>do it with some sort of passthrough to the underlying filesystem is a 
>>nightmare waiting to happen..
>>
>
>So what are we saying here?
>
>We install triggers at /usr/man1 and /usr/man2.
>Then suppose the map had a nobrowse option.
>  
>
This is a direct map.  The browse / nobrowse options do not apply to 
direct maps.

>Does the trigger also take care of hiding man1 and man2?
>
>  
>
No.  man1 and man2 appear as directories to anyone doing an lstat on 
them.  Traversing *into* them will cause filesystems to be mounted on 
them.  This appears to be similar to browsing of an indirect map at 
first, however it is a different beast.  With indirect maps, we are 
given the right to cover up /usr to help us detect stats and traversals 
into its sub-directories.  With direct entries, we don't have this 
leisure.  Everything in /usr must be accessible at all times. 

Your need for 'transparency' comes from the fact that you convert direct 
maps into indirect maps, which require the covering of /usr.

>Is there some definition of these triggers?
>
>  
>
This question is up in the air. 

I propose using a magic filesystem, whose root dentry has a follow_link 
callback defined.  When somebody walks into the filesystem, the 
follow_link is called, which does the mount onto a different dentry, and 
then forwards the original caller to the new vfsmount/dentry pair. 

HPA and Viro believe this is better done in the VFS layer directly by 
using special vfsmounts without super_blocks.  The path walking code 
would be modified to know of these 'traps' or 'triggers' natively.

Which solution is best is left as an exercise.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:57               ` Mike Waychison
@ 2004-01-13  7:39                 ` Ian Kent
  0 siblings, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-13  7:39 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jeff Garzik, H. Peter Anvin, linux-kernel

On Mon, 12 Jan 2004, Mike Waychison wrote:

> Jeff Garzik wrote:
>
> >
> >
> > You're still using arguments -against- putting software in the kernel.
> > You don't decrease software's chances of "being broken" by putting it
> > in the kernel, the opposite occurs -- you increase the likelihood of
> > making the entire system unstable.  This is one point that Solaris and
> > Win32 have both missed :)
> >
> >     Jeff
> >
> I get what you're saying. :)
>
> However, doing so achieves two goals:
>     - I want kernelspace to provide mechanism, and let userspace define
> policy. In this case, the policy is even finer grained than what we had
> before and can be set at trigger time, rather than at initscript start time.
>     - I want to get rid of the old ioctl poll interface that didn't work
> in namespaces.
>
> The namespace problem effectively limits what we can do in userspace to
> simply prodding the kernel to tell _it_ to unmount stuff.  A daemon
> alone cannot unmount across namespaces.

Another important consideration is whether implementing this functionality
can be significantly simplified by doing it within the kernel, or whether
the functionality cannot be done in userspace at all. I believe
that these points were made in the original proposal.

Ian






* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:58                     ` Mike Waychison
@ 2004-01-13  1:54                       ` Ian Kent
  2004-01-13 19:01                         ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: Ian Kent @ 2004-01-13  1:54 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Mon, 12 Jan 2004, Mike Waychison wrote:

> >
> >And I have another question concerning namespaces.
> >
> >Given that there may be several namespaces, each of which may or may not
> >have a trigger on this dentry, is there some sort of list of triggers?
> >
> >How do the triggers know who owns them?
> >
> >
> >
> >
> This is the reason I went with using distinct filesystems to perform the
> triggers.  If we use follow_link logic, we will have a reference to the
> respective vfsmount.  Dentries themselves know nothing about the
> triggers, as the triggers just look like a mounted filesystem.   The
> vfsmount information has enough information for autofs to call a
> userspace agent through hotplug and have userspace handle the mount.  In
> effect, there is no daemon so nobody 'owns' a trigger in the same sense
> as with autofs3/4.

I'm not familiar with the follow_link mechanism (no prob. I'll pick it up
as I go).

Correct me if I'm wrong, but the only thing I can see that is
duplicated in cloning a namespace is the root dentry. The rest of the
dentries on the system remain the same. The increase in complexity to the
VFS to change this would be prohibitive.

I see we want the triggers in the vfsmount struct. Is this a good idea?
The vfsmount struct has always been difficult for me to get hold of during
lookup and revalidate (would someone like to help here?).

>
> As far as userspace is concerned, an autofs filesystem is mounted as is
> any other filesystem.  All that is required is a proper set of mount
> options.  For example, mounting auto_home on /home is:
>
> mount -t autofs -o maptype=indirect,mapname=auto_home auto_home /home
>
> Whenever somebody traverses into a subdir in /home within any namespace
> this autofs filesystem has been inherited, userspace is invoked (in that
> namespace) to perform the mount.  Again, there is no 'ownership' other
> than maybe calling the namespace it resides in the 'owner', as you would
> for any other mountpoint.

The "mount all automount entries" approach has always been the simpler
option but, as you know, the kernel still allows only 255 anonymous mounts.
This would have to be the first order of business. Ohh, I was supposed to
be working on a sysctl interface for that. I'll just be quiet now.

Also, something needs to be done about mount table noise. Several hundred
entries is very bad from an administration viewpoint.

Except for the cross namespace issues, which I'm still digesting, I can't
see why your design can't be done entirely as a filesystem using dentries
instead of vfsmount, including expiry. Perhaps you could reiterate a few
of the reasons for this.

Ian




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 22:50                     ` Tim Hockin
  2004-01-12 23:28                       ` Mike Waychison
@ 2004-01-13  1:30                       ` Ian Kent
  1 sibling, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-13  1:30 UTC (permalink / raw)
  To: Tim Hockin
  Cc: Mike Waychison, Jim Carter, autofs mailing list, Kernel Mailing List

On Mon, 12 Jan 2004, Tim Hockin wrote:

> On Mon, Jan 12, 2004 at 11:26:30AM -0500, Mike Waychison wrote:
> > /usr   /man1   server:/usr/man1   \
> >          /man2   server:/usr/man2
> >
> > is the same as the two distinct entries:
> >
> > /usr/man1   server:/usr/man1
> > /usr/man2   server:/usr/man2
> >
> > Now that I think about it, the discussion in my proposal paper about
> > multimounts with no root offsets probably isn't required.
>
> The latter requires /usr/man1 and /usr/man2 to exist.  The former only
> requires /usr to exist, right?
>

That's one possibility, but man1 and man2 could simply not call filler in
the readdir call.

Ian




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 22:50                     ` Tim Hockin
@ 2004-01-12 23:28                       ` Mike Waychison
  2004-01-13  1:30                       ` Ian Kent
  1 sibling, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-12 23:28 UTC (permalink / raw)
  To: Tim Hockin; +Cc: raven, Jim Carter, autofs mailing list, Kernel Mailing List


Tim Hockin wrote:

>On Mon, Jan 12, 2004 at 11:26:30AM -0500, Mike Waychison wrote:
>  
>
>>/usr   /man1   server:/usr/man1   \
>>         /man2   server:/usr/man2
>>
>>is the same as the two distinct entries:
>>
>>/usr/man1   server:/usr/man1
>>/usr/man2   server:/usr/man2
>>
>>Now that I think about it, the discussion in my proposal paper about 
>>multimounts with no root offsets probably isn't required.
>>    
>>
>
>The latter requires /usr/man1 and /usr/man2 to exist.  The former only
>requires /usr to exist, right?
>
>  
>
Traditionally, the automount system is allowed to create directories as 
needed.


-- 
Mike Waychison

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:26                   ` Mike Waychison
@ 2004-01-12 22:50                     ` Tim Hockin
  2004-01-12 23:28                       ` Mike Waychison
  2004-01-13  1:30                       ` Ian Kent
  0 siblings, 2 replies; 82+ messages in thread
From: Tim Hockin @ 2004-01-12 22:50 UTC (permalink / raw)
  To: Mike Waychison
  Cc: raven, Jim Carter, autofs mailing list, Kernel Mailing List

On Mon, Jan 12, 2004 at 11:26:30AM -0500, Mike Waychison wrote:
> /usr   /man1   server:/usr/man1   \
>          /man2   server:/usr/man2
> 
> is the same as the two distinct entries:
> 
> /usr/man1   server:/usr/man1
> /usr/man2   server:/usr/man2
> 
> Now that I think about it, the discussion in my proposal paper about 
> multimounts with no root offsets probably isn't required.

The latter requires /usr/man1 and /usr/man2 to exist.  The former only
requires /usr to exist, right?



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:28                   ` raven
@ 2004-01-12 16:58                     ` Mike Waychison
  2004-01-13  1:54                       ` Ian Kent
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-12 16:58 UTC (permalink / raw)
  To: raven; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List


raven@themaw.net wrote:

>On Tue, 13 Jan 2004 raven@themaw.net wrote:
>
>  
>
>>On Mon, 12 Jan 2004, Mike Waychison wrote:
>>
>>    
>>
>>>>Transparency of an autofs filesystem (as I'm calling it) is the situation
>>>>where, given a map
>>>>
>>>>/usr	/man1	server:/usr/man1
>>>>	/man2	server:/usr/man2
>>>>
>>>>where the filesystem /usr contains, say a directory lib, that needs to be
>>>>available while also seeing the automounted directories.
>>>>
>>>> 
>>>>
>>>>        
>>>>
>>>I see.  This requires direct mount triggers to do properly.  Trying to 
>>>do it with some sort of passthrough to the underlying filesystem is a 
>>>nightmare waiting to happen..
>>>
>>>      
>>>
>>So what are we saying here?
>>
>>We install triggers at /usr/man1 and /usr/man2.
>>Then suppose the map had a nobrowse option.
>>Does the trigger also take care of hiding man1 and man2?
>>
>>Is there some definition of these triggers?
>>
>>    
>>
>
>And I have another question concerning namespaces.
>
>Given that there may be several namespaces, each of which may or may not 
>have a trigger on this dentry, is there some sort of list of triggers?
>
>How do the triggers know who owns them?
>
>
>  
>
This is the reason I went with using distinct filesystems to perform the 
triggers.  If we use follow_link logic, we will have a reference to the 
respective vfsmount.  Dentries themselves know nothing about the 
triggers, as the triggers just look like a mounted filesystem.   The 
vfsmount information has enough information for autofs to call a 
userspace agent through hotplug and have userspace handle the mount.  In 
effect, there is no daemon so nobody 'owns' a trigger in the same sense 
as with autofs3/4.

As far as userspace is concerned, an autofs filesystem is mounted as is 
any other filesystem.  All that is required is a proper set of mount 
options.  For example, mounting auto_home on /home is:

mount -t autofs -o maptype=indirect,mapname=auto_home auto_home /home

Whenever somebody traverses into a subdir in /home within any namespace 
into which this autofs filesystem has been inherited, userspace is invoked 
(in that namespace) to perform the mount.  Again, there is no 'ownership' 
other than maybe calling the namespace it resides in the 'owner', as you 
would for any other mountpoint.
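
To illustrate how a userspace agent might read such an option string, here
is a small Python sketch.  The option names maptype and mapname are the
ones in the mount command above; the function parse_autofs_options and the
rule that bare flags map to True are assumptions made for the example, not
part of any existing tool.

```python
def parse_autofs_options(optstring):
    """Parse a comma-separated mount option string into a dict.

    Options of the form key=value map to their value; bare flags map to True.
    """
    options = {}
    for item in optstring.split(","):
        if not item:
            continue  # tolerate empty fields such as a trailing comma
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


opts = parse_autofs_options("maptype=indirect,mapname=auto_home")
```

With the string from the example mount, opts holds maptype "indirect" and
mapname "auto_home".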

-- 
Mike Waychison

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 23:56             ` Jeff Garzik
@ 2004-01-12 16:57               ` Mike Waychison
  2004-01-13  7:39                 ` Ian Kent
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-12 16:57 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: H. Peter Anvin, linux-kernel


Jeff Garzik wrote:

>
>
> You're still using arguments -against- putting software in the kernel. 
> You don't decrease software's chances of "being broken" by putting it 
> in the kernel, the opposite occurs -- you increase the likelihood of 
> making the entire system unstable.  This is one point that Solaris and 
> Win32 have both missed :)
>
>     Jeff
>
I get what you're saying. :)

However, doing so achieves two goals:
    - I want kernelspace to provide mechanism, and let userspace define 
policy. In this case, the policy is even finer grained than what we had 
before and can be set at trigger time, rather than at initscript start time.
    - I want to get rid of the old ioctl poll interface that didn't work 
in namespaces.

The namespace problem effectively limits what we can do in userspace to 
simply prodding the kernel to tell _it_ to unmount stuff.  A daemon 
alone cannot unmount across namespaces. 

I hope this clarifies where I stand :)

-- 
Mike Waychison

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:01                 ` raven
  2004-01-12 16:26                   ` Mike Waychison
@ 2004-01-12 16:28                   ` raven
  2004-01-12 16:58                     ` Mike Waychison
  2004-01-13 18:46                   ` Mike Waychison
  2 siblings, 1 reply; 82+ messages in thread
From: raven @ 2004-01-12 16:28 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Tue, 13 Jan 2004 raven@themaw.net wrote:

> On Mon, 12 Jan 2004, Mike Waychison wrote:
> 
> > >
> > >Transparency of an autofs filesystem (as I'm calling it) is the situation
> > >where, given a map
> > >
> > >/usr	/man1	server:/usr/man1
> > >	/man2	server:/usr/man2
> > >
> > >where the filesystem /usr contains, say a directory lib, that needs to be
> > >available while also seeing the automounted directories.
> > >
> > >  
> > >
> > I see.  This requires direct mount triggers to do properly.  Trying to 
> > do it with some sort of passthrough to the underlying filesystem is a 
> > nightmare waiting to happen..
> > 
> 
> So what are we saying here?
> 
> We install triggers at /usr/man1 and /usr/man2.
> Then suppose the map had a nobrowse option.
> Does the trigger also take care of hiding man1 and man2?
> 
> Is there some definition of these triggers?
> 

And I have another question concerning namespaces.

Given that there may be several namespaces, each of which may or may not 
have a trigger on this dentry, is there some sort of list of triggers?

How do the triggers know who owns them?

Ian



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 16:01                 ` raven
@ 2004-01-12 16:26                   ` Mike Waychison
  2004-01-12 22:50                     ` Tim Hockin
  2004-01-12 16:28                   ` raven
  2004-01-13 18:46                   ` Mike Waychison
  2 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-12 16:26 UTC (permalink / raw)
  To: raven; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List


raven@themaw.net wrote:

>On Mon, 12 Jan 2004, Mike Waychison wrote:
>
>  
>
>>>Transparency of an autofs filesystem (as I'm calling it) is the situation
>>>where, given a map
>>>
>>>/usr	/man1	server:/usr/man1
>>>	/man2	server:/usr/man2
>>>
>>>where the filesystem /usr contains, say a directory lib, that needs to be
>>>available while also seeing the automounted directories.
>>>
>>> 
>>>
>>>      
>>>
>>I see.  This requires direct mount triggers to do properly.  Trying to 
>>do it with some sort of passthrough to the underlying filesystem is a 
>>nightmare waiting to happen..
>>
>>    
>>
>
>So what are we saying here?
>
>We install triggers at /usr/man1 and /usr/man2.
>Then suppose the map had a nobrowse option.
>Does the trigger also take care of hiding man1 and man2?
>
>Is there some definition of these triggers?
>  
>
The example above is a direct map entry with no root offset.  The 
semantics are different than if it were an indirect map with browsing 
enabled. 

I tested this out against other automount implementations and discovered 
that direct map entries with no root offsets should be broken down into 
several direct map entries with root offsets, so:

/usr   /man1   server:/usr/man1   \
          /man2   server:/usr/man2

is the same as the two distinct entries:

/usr/man1   server:/usr/man1
/usr/man2   server:/usr/man2

Now that I think about it, the discussion in my proposal paper about 
multimounts with no root offsets probably isn't required.
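
The breakdown described above can be sketched in a few lines of Python.
This is only an illustration of the equivalence; expand_multimount and its
argument layout are hypothetical and not taken from any automounter.

```python
def expand_multimount(key, offsets):
    """Split a direct map entry with no root offset into one entry per offset.

    key     -- the map key, e.g. "/usr"
    offsets -- list of (offset, location) pairs,
               e.g. ("/man1", "server:/usr/man1")
    """
    entries = []
    for offset, location in offsets:
        # Join key and offset without doubling the slash between them.
        path = key.rstrip("/") + "/" + offset.lstrip("/")
        entries.append((path, location))
    return entries


entries = expand_multimount("/usr", [("/man1", "server:/usr/man1"),
                                     ("/man2", "server:/usr/man2")])
```

For the /usr example, entries contains the two distinct direct entries
/usr/man1 and /usr/man2 with their original locations.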

-- 
Mike Waychison

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-12 13:07               ` Mike Waychison
@ 2004-01-12 16:01                 ` raven
  2004-01-12 16:26                   ` Mike Waychison
                                     ` (2 more replies)
  0 siblings, 3 replies; 82+ messages in thread
From: raven @ 2004-01-12 16:01 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Mon, 12 Jan 2004, Mike Waychison wrote:

> >
> >Transparency of an autofs filesystem (as I'm calling it) is the situation
> >where, given a map
> >
> >/usr	/man1	server:/usr/man1
> >	/man2	server:/usr/man2
> >
> >where the filesystem /usr contains, say a directory lib, that needs to be
> >available while also seeing the automounted directories.
> >
> >  
> >
> I see.  This requires direct mount triggers to do properly.  Trying to 
> do it with some sort of passthrough to the underlying filesystem is a 
> nightmare waiting to happen..
> 

So what are we saying here?

We install triggers at /usr/man1 and /usr/man2.
Then suppose the map had a nobrowse option.
Does the trigger also take care of hiding man1 and man2?

Is there some definition of these triggers?

Ian



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-10  5:43             ` Ian Kent
@ 2004-01-12 13:07               ` Mike Waychison
  2004-01-12 16:01                 ` raven
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-12 13:07 UTC (permalink / raw)
  To: Ian Kent; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List


Ian Kent wrote:

>On Fri, 9 Jan 2004, Mike Waychison wrote:
>
>  
>
>>>Indeed, I
>>>haven't solved my requirement of a transparent autofs filesystem aka.
>>>Solaris automounter again. A difficult problem that will require
>>>considerable effort.
>>>
>>>
>>>
>>>      
>>>
>>What do you mean by this?  Something that doesn't show up in
>>/proc/mounts?  I don't see this as much of an issue..  On any decently
>>large machine, there are so many entries anyway that /etc/mtab and
>>/proc/mounts become humanly unparseable anyhow.
>>    
>>
>
>Transparency of an autofs filesystem (as I'm calling it) is the situation
>where, given a map
>
>/usr	/man1	server:/usr/man1
>	/man2	server:/usr/man2
>
>where the filesystem /usr contains, say a directory lib, that needs to be
>available while also seeing the automounted directories.
>
>  
>
I see.  This requires direct mount triggers to do properly.  Trying to 
do it with some sort of passthrough to the underlying filesystem is a 
nightmare waiting to happen.

-- 
Mike Waychison

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 20:52           ` Mike Waychison
@ 2004-01-10  6:05             ` Ian Kent
  0 siblings, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-10  6:05 UTC (permalink / raw)
  To: Mike Waychison; +Cc: H. Peter Anvin, autofs mailing list, Kernel Mailing List

On Fri, 9 Jan 2004, Mike Waychison wrote:

> >
> >This may sound a little silly but it may be able to be done using
> >stackable filesystem methods (aka. Zadok et. al.). I'm thinking of an
> >autofs filesystem stacked on a host filesystem. The dentrys corresponding
> >to mount points marked in some way and the mount occuring under it, on top
> >of the host filesystem. Yes I know it sounds ugly but maybe it's not.
> >Maybe it's actually quite simple. I can't give an opinion yet as I'm still
> >thinking it through and haven't done any feasibility. However, this
> >approach would lend itself to providing autofs filesystem transparency. A
> >requirement as yet not discussed.
> >
> >Ian
> >
> >
> >
> Doing stackable filesystems is still an area of OS research.  It turns
> out to be a very hard problem to solve (if it's possible at all).
> Although there are systems in the wild that appear to work, they are
> usually sub-optimal because there remain a lot of issues such as
> maintaining coherent caches, as well as just staying coherent given that
> one filesystem may be directly accessible while also accessed from
> another overlayed filesystem.

Yes I see that in what I've read.

But I'm thinking of a very tightly controlled autofs layer controlled
only by automount. Once owned by automount, that part of the underlying fs
could only be accessed via automount. The boundary cases are obviously
a sensitive area.

>
> Not really something you'd want to waste a lot of time on unless you're
> looking for a PhD thesis. ;)

A masters one day might be good.

Ian





* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 20:51           ` Jim Carter
@ 2004-01-10  5:56             ` Ian Kent
  0 siblings, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-10  5:56 UTC (permalink / raw)
  To: Jim Carter; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

On Fri, 9 Jan 2004, Jim Carter wrote:

> On Sat, 10 Jan 2004, Ian Kent wrote:
> > On Thu, 8 Jan 2004, Mike Waychison wrote:
> > > This module will have its own new autofs module (hopefully named
> > > something other than autofs to avoid confusion/mishaps).  The VFS will
>
> autofs v3 -> autofs.o
> autofs v4 -> autofs4.o
> May I suggest autofs5.o?  It should still be named "autofs-something",
> after all.
>

Nope. I will continue to develop under the v4 banner. As far as I'm
concerned Peter Anvin has claimed v5 and I don't want to challenge that.
Mike Waychison's initiative may possibly be called v6???

In any case the module works fine with v3 and v4 (I haven't tested
4.0.0pre10 for a while though). The 4.1 daemon detects the enhanced module
if present. It is currently dubbed 4.04. 'Plays well with others' is a
self-imposed design requirement.

Ian




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 20:06           ` Mike Waychison
@ 2004-01-10  5:43             ` Ian Kent
  2004-01-12 13:07               ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: Ian Kent @ 2004-01-10  5:43 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Fri, 9 Jan 2004, Mike Waychison wrote:

>
> > Indeed, I
> >haven't solved my requirement of a transparent autofs filesystem aka.
> >Solaris automounter again. A difficult problem that will require
> >considerable effort.
> >
> >
> >
> What do you mean by this?  Something that doesn't show up in
> /proc/mounts?  I don't see this as much of an issue..  On any decently
> large machine, there are so many entries anyway that /etc/mtab and
> /proc/mounts become humanly unparseable anyhow.

Transparency of an autofs filesystem (as I'm calling it) is the situation
where, given a map

/usr	/man1	server:/usr/man1
	/man2	server:/usr/man2

the filesystem /usr contains, say, a directory lib that needs to remain
available while the automounted directories are also visible.

>
> >>>Mmm. The vfsmount_lock is available to modules in 2.6. At least it was in
> >>>test11. I'm sure I compiled the module under 2.6 as well???
> >>>
> >>>I thought that, taking the dcache_lock was the correct thing to do when
> >>>traversing a dentry list?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>Walking dentrys still takes the dcache_lock, however walking vfsmounts
> >>takes the vfsmount_lock.  dcache_lock is no longer used for fast path
> >>walking either (to the best of my understanding).
> >>
> >>find . -name '*.[ch]' -not -path '*SCCS*' | xargs grep vfsmount_lock |
> >>grep EXPORT
> >>
> >>
> >
> >Strange. How does the module compile I wonder? How does it load without
> >unresolved symbols? Another little mystery to work on.
> >
> >
> >
> No, your module doesn't use vfsmount_lock.  At least the module in
> autofs4-2.4-module-20031201.tar.gz doesn't.

This is the 2.4 code. I do (or thought I was able to) use the vfsmount_lock
in the 2.6 patches I have in
kernel.org/pub/linux/kernel/people/raven/autofs4-2.6. This is bad for me.

>
> >>The raciness comes from the fact that we now support the lazy-mounting
> >>of multimount offsets using embedded direct mounts.  Autofs4 mounts all
> >>(or as much as it can) from the multimount all together, and unmounts it
> >>all on expiry.
> >>
> >>
> >
> >But 4.1 does lazy mount for several maps. Mounts that are triggered
> >during the umount step of the expire are put on a wait queue along with
> >the task requesting the umount. I think autofs always worked that way.
> >
> >
> >
> This isn't lazy mounting per se.  If you are talking about autofs4's use
> of AUTOFS_INF_EXPIRING, it's there to prevent somebody from walking into
> a multimount while it is expiring.

Or any umount when sending the expire request to userspace.





* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 21:02         ` H. Peter Anvin
@ 2004-01-09 21:52           ` Mike Waychison
  0 siblings, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 21:52 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: trond.myklebust, viro, linux-kernel, raven, thockin


H. Peter Anvin wrote:

>Mike Waychison wrote:
>  
>
>>H. Peter Anvin wrote:
>>
>>    
>>
>>>My point is that it's what you get for having an automounter.
>>>
>>>We can't solve Sun's designed-in braindamage, unfortunately.  This is
>>>partially why I'd like people to consider the scope of what automounting
>>>does; there are tons of policy issues not all of which are going to be
>>>appropriate in all contexts.  To some degree, if you have to have an
>>>automounter you have already lost.
>>> 
>>>      
>>>
>>However, we can solve Linux's designed-in braindamage.
>>
>>    
>>
>
>I was referring to the visibility of server-side mount points in NFS 2/3
>and the fact that most of the uses of the automounter is to work around
>this shortcoming.  This is a protocol limitation.
>
>  
>
It's just a different way of looking at it.  NFS exports filesystems, 
not namespaces.  It's the server implementation that decides it should 
try to map these exports to its local namespace.  Indeed, this is what 
exportfs and /etc/exports try to do.  Nobody said this mapping made 
a lot of sense.

>(Don't get me started on stuff like "plus lines" in map, which breaks
>the map paradigm completely.  That's brokenness on a whole other level,
>but which can be reasonably ignored.)
>
>  
>
You're on your own on that one.  I don't see it as an issue as the 
semantics are pretty well defined.

>It's trivial to crash most filesystem drivers (or get to a security leak
>level) by feeding them deliberately bad input.  Robustness against
>corruption in Linux has been with respect to likely data corruption much
>more than deliberate attacks.  It's a major effort; security-auditing
>every filesystem driver.
>  
>
Ok.  Thanks for clearing that up. 

-- 
Mike Waychison

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 20:54             ` H. Peter Anvin
@ 2004-01-09 21:43               ` Mike Waychison
  0 siblings, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 21:43 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michael Clark, Ian Kent, autofs mailing list, Kernel Mailing List


H. Peter Anvin wrote:

>Mike Waychison wrote:
>  
>
>>This is an interesting approach to killing off a mountpoint.  However,
>>the problem in question is not the destruction of the mountpoints, but
>>rather being able to
>>check_activity_of_a_hierarchy_of_mountpoints/unmount_them_together
>>atomically.  This cannot be done cleanly in userspace even when given an
>>interface to do the check, someone can race in before userspace
>>initiates the unmounts.  The alternative is to have userspace detach the
>>hierarchy of mountpoints using the '-l' option to umount(8), but then we
>>may still unnecessarily unmount the filesystem while someone is in it.
>>I think that both HPA and I agree that this capability is needed in
>>order to support lazy mounting of multimounts properly.    The issue
>>that remains is *how* to do it.
>>
>>    
>>
>
>I would argue even stronger: allowing the administrator to umount
>directories manually is a hard requirement.  This means that partial
>hierarchies *will* occur.  Thus, relying on the hierarchy being
>atomically destructed is inherently broken.
>  
>
Yes, but they shouldn't occur due to normal operation of the system.  
Yes, the administrator can manually prune things away, yet the remaining 
bits should still be able to expire atomically.

On the other end of the spectrum is the situation where if I had 
accessed my homedir, /home/mikew, and then I manually mounted something 
in /home/mikew/mnt as root in another window, /home/mikew should _not_ 
expire.  /home/mikew/mnt is not managed by the automounter, so it 
shouldn't be expired by it either.
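
The expiry rule in these two paragraphs can be written down as a small
Python sketch.  The Mount class, its managed and busy fields, and
may_expire are all hypothetical; the sketch shows only the decision, not
the atomic check-and-unmount that the rest of this thread is about.

```python
class Mount:
    """Toy description of one mountpoint in a hierarchy (illustrative only)."""

    def __init__(self, path, managed=True, busy=False, children=()):
        self.path = path
        self.managed = managed    # True if the automounter created this mount
        self.busy = busy          # True if someone is currently using the mount
        self.children = list(children)


def may_expire(mount):
    """Return True only if the whole hierarchy is idle and automounter-managed."""
    if mount.busy or not mount.managed:
        return False
    return all(may_expire(child) for child in mount.children)


home = Mount("/home/mikew")
idle_before = may_expire(home)

# Root mounts something by hand below the automounted home directory.
home.children.append(Mount("/home/mikew/mnt", managed=False))
idle_after = may_expire(home)
```

Before the manual mount the idle home directory may expire; afterwards
the unmanaged mount beneath it holds the whole hierarchy in place.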

>This means that constructing the hierarchy with direct-mount automount
>triggers in between the filesystems is mandatory; you get lazy mounting
>for free, then -- it's a userspace policy decision whether or not to
>release the waiting processes before the hierarchy is complete or not.
>
>  
>
Yes, and this policy in my proposal is handled by the automount 
useragent.  The system is constructed such that any waiting processes 
are released when the useragent dies off.  If userspace wanted to let 
people in before it finished construction, it would fork and exit in the 
parent process.
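
The "fork and exit in the parent" pattern might look roughly like this in
Python on a POSIX system.  This is only a sketch: useragent,
spawn_and_wait and finish_mounts are hypothetical names, and
spawn_and_wait merely stands in for whatever releases the waiting
processes once the useragent exits.

```python
import os


def useragent(finish_mounts, release_early):
    """Body of a hypothetical useragent process; never returns."""
    if release_early:
        if os.fork() != 0:
            # The original process exits at once, so waiters are released early.
            os._exit(0)
    try:
        # The child (or the only process, if we did not fork) finishes the work.
        finish_mounts()
    finally:
        os._exit(0)


def spawn_and_wait(finish_mounts, release_early):
    """Stand-in for the waiting side: start the useragent, wait for it to exit."""
    pid = os.fork()
    if pid == 0:
        useragent(finish_mounts, release_early)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

With release_early set, spawn_and_wait returns as soon as the original
useragent process exits, while the forked child carries on constructing
the hierarchy in the background.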

>Now, once you recognize that the administrator needs to be able to do
>umounts, expiry in userspace becomes quite trivial, since expiry is
>inherently probabilistic: it can simply mimic an administrator preening
>the trees, and if it fails, stop (or re-mount the submounts, policy
>decision.)  Having a simple kernel-assist to avoid needless umount
>operations is a good thing if (and only if!) it's cheap, but it doesn't
>have to be foolproof.
>
>  
>
But it doesn't work as a daemon when you have namespaces created left 
and right.  It *would maybe* work as a cron job, if cron was namespace 
aware.

>Again, the atomicity constraint that umounting a filesystem needs to
>destroy the mount traps above it derives from the need to cleanly deal
>with nonatomic destruction.
>
>  
>
??

>>The time required to unmount something is constant if we detach the
>>mountpoint using a lazy umount.
>>
>>    
>>
>
>You probably don't want to do that -- you could end up with some really
>odd timing-related bugs if you then re-mount the filesystem.  It's also
>unnecessary, since expiry is not a triggered event and therefore doesn't
>keep anything that needs to happen from happening.
>
>  
>
Off the top of my head, I don't see any issues, but you are right in 
that something may creep up. 

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 20:37       ` Mike Waychison
@ 2004-01-09 21:02         ` H. Peter Anvin
  2004-01-09 21:52           ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-09 21:02 UTC (permalink / raw)
  To: Mike Waychison; +Cc: trond.myklebust, viro, linux-kernel, raven, thockin

Mike Waychison wrote:
> H. Peter Anvin wrote:
> 
>> My point is that it's what you get for having an automounter.
>>
>> We can't solve Sun's designed-in braindamage, unfortunately.  This is
>> partially why I'd like people to consider the scope of what automounting
>> does; there are tons of policy issues not all of which are going to be
>> appropriate in all contexts.  To some degree, if you have to have an
>> automounter you have already lost.
>>  
> 
> However, we can solve Linux's designed-in braindamage.
> 

I was referring to the visibility of server-side mount points in NFS 2/3
and the fact that most uses of the automounter are to work around
this shortcoming.  This is a protocol limitation.

(Don't get me started on stuff like "plus lines" in map, which breaks
the map paradigm completely.  That's brokenness on a whole other level,
but which can be reasonably ignored.)

>> in particular there is no security
>> against root.  Stupid tricks like remapping uid 0 are just that; stupid
>> tricks without any real security value.  You know this, of course.
>> However, if you think the automounter doesn't have the privilege to
>> access the remote server but the user does, then that's false security.
>
> No, the security lies in the fact that the remote server knows the user
> is privileged to access it.  It's a side issue that the mount itself is
> made using an automounter.

Again, it doesn't matter if the user passes credentials to an
automounter or to the kernel.

>> Linux at this point has no ability to support actual user-mounted
>> filesystems.  There are things that could be done to remedy this, but it
>> would require massive changes to every filesystem driver as well as to
>> the VFS. 
> 
> ??  As part of our research into namespaces, we at Sun have gone through
> and tried to identify the number of semantic changes required to achieve
> user-privileged mounting, however we never saw the need to do anything
> special at all in 'each filesystem driver'.  The issue is one of a
> permission model and should be out of scope for individual filesystems.

It's trivial to crash most filesystem drivers (or get to a security leak
level) by feeding them deliberately bad input.  Robustness against
corruption in Linux has been with respect to likely data corruption much
more than deliberate attacks.  Security-auditing every filesystem driver
is a major effort.

	-hpa



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 20:28           ` Mike Waychison
@ 2004-01-09 20:54             ` H. Peter Anvin
  2004-01-09 21:43               ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-09 20:54 UTC (permalink / raw)
  To: Mike Waychison
  Cc: Michael Clark, Ian Kent, autofs mailing list, Kernel Mailing List

Mike Waychison wrote:
> 
> This is an interesting approach to killing off a mountpoint.  However,
> the problem in question is not the destruction of the mountpoints, but
> rather being able to
> check_activity_of_a_hierarchy_of_mountpoints/unmount_them_together
> atomically.  This cannot be done cleanly in userspace even when given an
> interface to do the check: someone can race in before userspace
> initiates the unmounts.  The alternative is to have userspace detach the
> hierarchy of mountpoints using the '-l' option to umount(8), but then we
> may still unnecessarily unmount the filesystem while someone is in it.
> I think that both HPA and I agree that this capability is needed in
> order to support lazy mounting of multimounts properly.    The issue
> that remains is *how* to do it.
> 

I would argue even stronger: allowing the administrator to umount
directories manually is a hard requirement.  This means that partial
hierarchies *will* occur.  Thus, relying on the hierarchy being
atomically destructed is inherently broken.

This means that constructing the hierarchy with direct-mount automount
triggers in between the filesystems is mandatory; you get lazy mounting
for free, then -- it's a userspace policy decision whether or not to
release the waiting processes before the hierarchy is complete or not.

Now, once you recognize that the administrator needs to be able to do
umounts, expiry in userspace becomes quite trivial, since expiry is
inherently probabilistic: it can simply mimic an administrator preening
the trees, and if it fails, stop (or re-mount the submounts, policy
decision.)  Having a simple kernel-assist to avoid needless umount
operations is a good thing if (and only if!) it's cheap, but it doesn't
have to be foolproof.
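
As a rough illustration of expiry that "mimics an administrator", here is a minimal Python sketch. It assumes a caller-supplied `umount` callable that raises `OSError` with `EBUSY` when a filesystem is in use; `expire_tree` and `make_fake_umount` are hypothetical names, and the fake stands in for the real umount operation so the sketch can run unprivileged.

```python
import errno


def expire_tree(mountpoints, umount):
    """Try to unmount a hierarchy leaf-first; stop at the first busy mount.

    mountpoints: iterable of absolute paths managed by the automounter.
    umount: callable taking a path; raises OSError(EBUSY) if it is in use.
    Returns the list of paths that were actually unmounted.
    """
    done = []
    # Deepest paths first, so submounts are tried before their parents.
    for path in sorted(mountpoints, key=lambda p: p.count("/"), reverse=True):
        try:
            umount(path)
        except OSError as exc:
            if exc.errno != errno.EBUSY:
                raise
            break  # in use: stop here (re-mounting `done` is the other policy)
        done.append(path)
    return done


def make_fake_umount(busy):
    """Hypothetical stand-in for the real umount: paths in `busy` fail."""
    def umount(path):
        if path in busy:
            raise OSError(errno.EBUSY, "Device or resource busy", path)
    return umount
```

Nothing here needs to be atomic: a failure simply leaves a partial hierarchy behind, which is the state the design already has to tolerate once an administrator can umount by hand.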

Again, the atomicity constraint that umounting a filesystem needs to
destroy the mount traps above it derives from the need to cleanly deal
with nonatomic destruction.

>
> The time required to unmount something is constant if we detach the
> mountpoint using a lazy umount.
> 

You probably don't want to do that -- you could end up with some really
odd timing-related bugs if you then re-mount the filesystem.  It's also
unnecessary, since expiry is not a triggered event and therefore doesn't
keep anything that needs to happen from happening.

	-hpa



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 18:32         ` Ian Kent
@ 2004-01-09 20:52           ` Mike Waychison
  2004-01-10  6:05             ` Ian Kent
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 20:52 UTC (permalink / raw)
  To: Ian Kent; +Cc: H. Peter Anvin, autofs mailing list, Kernel Mailing List


Ian Kent wrote:

>On Thu, 8 Jan 2004, H. Peter Anvin wrote:
>
>  
>
>>Ian Kent wrote:
>>    
>>
>>>If wildcard map entries are not in autofs v3 then Jeremy implemented this
>>>in v4.
>>>
>>>      
>>>
>>v3 has had wildcard map entries and substitutions for a very, very, very
>>long time... it was a v2 feature, in fact.
>>
>>    
>>
>>>And yes the host map is basically a program map and that's all. Worse, as
>>>pointed out in the paper it mounts everything under it. This is a source
>>>of stress for mount and umount. I have put in a fair bit of time on ugly
>>>hacks to work around this. This same problem is also evident in startup
>>>and shutdown for master maps with a good number of entries (~50 or more).
>>>A consequence of the current multiple daemon approach.
>>>      
>>>
>>This is why one wants to implement a mount tree with "direct mount
>>pads"; which also means keeping some state in the daemon.
>>
>>For example, let's say one has a mount tree like:
>>
>>/foo		server1:/export/foo \
>>/foo/bar	server1:/export/bar \
>>/bar		server2:/export/bar
>>
>>... then you actually have four different filesystems involved: first,
>>some kind of "scaffolding" (this can be part of the autofs filesystem
>>itself or a ramfs) that holds the "foo" and "bar" directories, and then
>>foo, foo/bar, and bar.
>>
>>Consider the following implementation: when one encounters the above,
>>the daemon stashes this away as an already-encountered map entry (in
>>case the map entries change, we don't want to be inconsistent), creates
>>a ramfs for the scaffolding, creates the "foo" and "bar" subdirectories
>>and mount-traps "foo" and "bar".  Then it releases userspace.  When it
>>encounters an access on "foo", it gets invoked again, looks it up in its
>>"partial mounts" state, then mounts "foo" and mount-traps "foo/bar",
>>then releases userspace.
>>
>>    
>>
>
>Umm. The cross filesystem problem again.
>
>This may sound a little silly but it may be able to be done using
>stackable filesystem methods (a la Zadok et al.). I'm thinking of an
>autofs filesystem stacked on a host filesystem. The dentries corresponding
>to mount points marked in some way and the mount occurring under it, on top
>of the host filesystem. Yes I know it sounds ugly but maybe it's not.
>Maybe it's actually quite simple. I can't give an opinion yet as I'm still
>thinking it through and haven't done any feasibility. However, this
>approach would lend itself to providing autofs filesystem transparency. A
>requirement as yet not discussed.
>
>Ian
>
>  
>
Doing stackable filesystems is still an area of OS research.  It turns 
out to be a very hard problem to solve (if it's possible at all).   
Although there are systems in the wild that appear to work, they are 
usually sub-optimal because there remain a lot of issues such as 
maintaining coherent caches, as well as just staying coherent given that 
one filesystem may be directly accessible while also accessed from 
another overlaid filesystem.

Not really something you'd want to waste a lot of time on unless you're 
looking for a PhD thesis. ;)

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 18:20         ` Ian Kent
  2004-01-09 20:06           ` Mike Waychison
@ 2004-01-09 20:51           ` Jim Carter
  2004-01-10  5:56             ` Ian Kent
  1 sibling, 1 reply; 82+ messages in thread
From: Jim Carter @ 2004-01-09 20:51 UTC (permalink / raw)
  To: Ian Kent; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

On Sat, 10 Jan 2004, Ian Kent wrote:
> On Thu, 8 Jan 2004, Mike Waychison wrote:
> > This module will have its own new autofs module (hopefully named
> > something other than autofs to avoid confusion/mishaps).  The VFS will

autofs v3 -> autofs.o
autofs v4 -> autofs4.o
May I suggest autofs5.o?  It should still be named "autofs-something",
after all.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)


* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 21:13     ` H. Peter Anvin
  2004-01-08 22:20       ` J. Bruce Fields
@ 2004-01-09 20:37       ` Mike Waychison
  2004-01-09 21:02         ` H. Peter Anvin
  1 sibling, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 20:37 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: trond.myklebust, viro, linux-kernel, raven, thockin


H. Peter Anvin wrote:

>My point is that it's what you get for having an automounter.
>
>We can't solve Sun's designed-in braindamage, unfortunately.  This is
>partially why I'd like people to consider the scope of what automounting
>does; there are tons of policy issues not all of which are going to be
>appropriate in all contexts.  To some degree, if you have to have an
>automounter you have already lost.
>  
>
However, we can solve Linux's designed-in braindamage.

>Also, your global machine credential is to some degree "all the security
>you get."  Any security which isn't enforced by the filesystem driver
>doesn't exist in a Unix environment;
>

What does this mean?   I don't understand.

> in particular there is no security
>against root.  Stupid tricks like remapping uid 0 are just that; stupid
>tricks without any real security value.  You know this, of course.
>However, if you think the automounter doesn't have the privilege to
>access the remote server but the user does, then that's false security.
>
>  
>
No, the security lies in the fact that the remote server knows the user 
is privileged to access it.  It's a side issue that the mount itself is 
made using an automounter.

>Linux at this point has no ability to support actual user-mounted
>filesystems.  There are things that could be done to remedy this, but it
>would require massive changes to every filesystem driver as well as to
>the VFS.  
>
??  As part of our research into namespaces, we at Sun have gone through 
and tried to identify the number of semantic changes required to achieve 
user-privileged mounting, however we never saw the need to do anything 
special at all in 'each filesystem driver'.  The issue is one of a 
permission model and should be out of scope for individual filesystems.

>Would it be desirable?  Absolutely.  However, it's partially
>the quagmire that got the HURD stuck for a very long time, even though
>they had the huge advantage of being able to run their filesystem
>drivers in a nonprivileged context.
>
>  
>
Other systems such as Plan 9 have done it though..    If anything is 
keeping us from doing it, it's the traditional unix mount semantics and 
the security models that have been built on top of them.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 23:42         ` Michael Clark
@ 2004-01-09 20:28           ` Mike Waychison
  2004-01-09 20:54             ` H. Peter Anvin
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 20:28 UTC (permalink / raw)
  To: Michael Clark
  Cc: H. Peter Anvin, Ian Kent, autofs mailing list, Kernel Mailing List


Michael Clark wrote:

> On 01/09/04 01:34, H. Peter Anvin wrote:
>
>> In many ways this returns to the simplicity of the autofs v3 design 
>> where the atomicity constraints were guaranteed by the VFS itself, 
>> *as long as* mount traps can be atomically destroyed with umounting 
>> the underlying filesystem.
>
>
> Do we need to revive Tigran's forced unmount patch 'badfs' ala FreeBSD's
> deadfs? Although it doesn't guarantee atomic unmount, it could help
> a lot with the tendency to get stuck autofs mounts.
>
>   http://tinyurl.com/2hto8
>
> I've been long waiting for this functionality in mainline.


This is an interesting approach to killing off a mountpoint.  However, 
the problem in question is not the destruction of the mountpoints, but 
rather being able to 
check_activity_of_a_hierarchy_of_mountpoints/unmount_them_together 
atomically.  This cannot be done cleanly in userspace even when given an 
interface to do the check: someone can race in before userspace 
initiates the unmounts.  The alternative is to have userspace detach the 
hierarchy of mountpoints using the '-l' option to umount(8), but then we 
may still unnecessarily unmount the filesystem while someone is in it. 

I think that both HPA and I agree that this capability is needed in 
order to support lazy mounting of multimounts properly.    The issue 
that remains is *how* to do it. 
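
The race described above can be made concrete with a small toy model. This is a sketch in Python under loud assumptions: `MountTree`, its per-mount use counts, and the single lock standing in for kernel-side locking are all hypothetical and only illustrate the check-then-unmount window, not any real kernel interface.

```python
import threading


class MountTree:
    """Toy model of a mount hierarchy with per-mount use counts."""

    def __init__(self, paths):
        self.users = {p: 0 for p in paths}  # path -> number of active users
        self.mounted = set(paths)
        self.lock = threading.Lock()        # stands in for kernel locking

    def enter(self, path):
        """A process walks into a mount; fails if it is already gone."""
        with self.lock:
            if path not in self.mounted:
                return False
            self.users[path] += 1
            return True

    def is_idle(self):
        with self.lock:
            return all(self.users[p] == 0 for p in self.mounted)

    def unmount_all(self):
        with self.lock:
            self.mounted.clear()

    def expire_atomically(self):
        """Check and unmount under one lock: nobody can slip in between."""
        with self.lock:
            if any(self.users[p] for p in self.mounted):
                return False
            self.mounted.clear()
            return True


def racy_expire(tree, intruder=None):
    """Userspace-style expiry: check, then unmount, as two separate steps."""
    if not tree.is_idle():
        return False
    if intruder is not None:
        intruder()       # models another process scheduled in the window
    tree.unmount_all()   # pulls the hierarchy out from under the intruder
    return True
```

In `racy_expire` the intruder's `enter` succeeds and the hierarchy is torn down anyway; `expire_atomically` either sees the user and refuses, or finishes before anyone can enter.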

>
> I wonder if binding badfs over the mountpoint at the beginning of the
> potentially lengthy unmount process would improve the atomicity
> to userspace. ie although the unmount would proceed in the background,
> badfs would have been mounted at that point at the start of the process
> - mounts are atomic no?
>
> ~mc
>
The time required to unmount something is constant if we detach the 
mountpoint using a lazy umount.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 19:41 ` H. Peter Anvin
  2004-01-08 20:08   ` trond.myklebust
@ 2004-01-09 20:16   ` Mike Waychison
  1 sibling, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 20:16 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: trond.myklebust, viro, linux-kernel, raven, thockin


H. Peter Anvin wrote:

>trond.myklebust@fys.uio.no wrote:
>  
>
>>Finally, because the upcall is done in the user's own context, you avoid
>>the whole problem of automounter credentials that are a constant plague
>>to all those daemon-based implementations when working in an environment
>>where you have strong authentication.
>>If anyone wants evidence of how broken the whole daemon thing is, then see
>>the workarounds that had to be made in RFC-2623 to disable strong
>>authentication for GETATTR etc. on the NFSv2/v3 mount point.
>>
>>    
>>
>
>It's not broken as much as what you want to do is outside the scope of
>automount.  automount is one particular user of these facilities, and as
>you correctly point out, it can't solve the problems for all of them.
>The right thing for AFS and NFSv4 is clearly to do something different.
>
>  
>
If automount is going to be mounting NFS shares for users, I don't see 
how this is out of scope.

>Mount traps by themselves are not sufficient for automount, which is why
>I think we will always have a special "autofs" filesystem, for the
>simple reason that automount in typical use doesn't either have an a
>priori complete list of directories!  Even with ghosting you might find
>that you're accessing a new key which has not yet been ghosted, and it
>needs to be handled correctly.  Additionally, not all map types can be
>enumerated, and some aren't even finite in size (consider /net, program
>maps and wildcard map entries.)  Thus, for indirect mountpoints you
>still need a filesystem which can trap on non-enumerated entries.
>
>  
>
Yup.

>That being said, mount traps in particular, and possibly this "trap
>filesystem" are more generic kernel facilities which should be of use to
>other things than automount.  AFS/NFSv4 are the obvious examples, quite
>possibly other things like intermezzo might be interested, and we don't
>want to have to reinvent the wheel every time.
>
>  
>
I could see AFS using these mounttraps, however I don't see any benefit 
for NFS.   If anything, the migration issue is about getting rid of the 
daemon, not mounttraps.  The issues I think Trond is putting forward are:

a) The kernel needs to initiate a remount, but doesn't have nameservice 
functionality.

b) User credentials are needed to perform the initial mount itself 
because some servers don't allow non-authenticated calls to the MOUNT 
program, keeping the system from grabbing a root filehandle.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-09 18:20         ` Ian Kent
@ 2004-01-09 20:06           ` Mike Waychison
  2004-01-10  5:43             ` Ian Kent
  2004-01-09 20:51           ` Jim Carter
  1 sibling, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-09 20:06 UTC (permalink / raw)
  To: Ian Kent; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List


Ian Kent wrote:

>On Thu, 8 Jan 2004, Mike Waychison wrote:
>
>  
>
>>The direct map 'triggers' will be taken care of by another filesystem
>>with a magic root directory that will catch traversals using some
>>follow_link magic.   I wrote a prototype for this last summer, but
>>haven't released it as the userspace stuff completely does not fit in
>>with the existing daemon that was out at the time due to the mess of
>>glue that was pgids, pipes and processes.   It worked in the simple
>>case, but it didn't extend to being able to direct mount an indirect
>>map, nor was it able to do the lazy mounting in multimounts as I had
>>desired.
>>    
>>
>
>Is this the stuff that Al Viro is working on?
>
>  
>
Al is proposing doing the same thing directly in the VFS instead of 
using a magic filesystem as I've described in the document.  

> Indeed, I
>haven't solved my requirement of a transparent autofs filesystem aka.
>Solaris automounter again. A difficult problem that will require
>considerable effort.
>
>  
>
What do you mean by this?  Something that doesn't show up in 
/proc/mounts?  I don't see this as much of an issue..  On any decently 
large machine, there are so many entries anyway that /etc/mtab and 
/proc/mounts become humanly unparseable anyhow.

>>>>This is the subtle difference between direct and indirect maps.   The
>>>>direct map keys are absolute paths, not path components.  We are
>>>>implementing direct mounts as individual filesystems that will trap on
>>>>traversal into their base directory.  This filesystem has no idea where
>>>>it is located as far as the user is concerned.  We need to tell the
>>>>filesystem directly so that the usermode helper can look it up.
>>>>Conversely, the indirect map uses the sub-directory name as a mapkey.
>>>>
>>>>
>>>>        
>>>>
>>>I'm not sure what you are saying here. Does this mean there is a mount for
>>>every direct mount (this might be what you call a trigger)?
>>>
>>>
>>>
>>>      
>>>
>>Yes, it is its own filesystem (type autofs).  This is needed because we
>>need to overlay direct triggers within NFS filesystems for multimounts.
>>    
>>
>
>Ahh. I see, you are talking about the cross filesystem problem. I haven't
>solved that in what I have done either. Fortunately I still get a good
>hit rate in satisfying peoples' needs as in practice many people don't use
>full automounter functionality.
>
>  
>
Yup.  But still, one of the nice things about a full automounter 
solution is accessing a netapp with hundreds of exports through /net in 
a reasonably fast way.
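
For readers following the direct versus indirect distinction quoted earlier in this message, here is a minimal Python sketch of how the map key might be derived in each case. The function names are hypothetical, and passing the trigger's location to the filesystem at mount time is an assumption drawn from the description, not a confirmed interface.

```python
import posixpath


def indirect_key(autofs_root, accessed_path):
    """Indirect map: the key is the first path component below the root."""
    rel = posixpath.relpath(accessed_path, autofs_root)
    return rel.split("/")[0]


def direct_key(trigger_location):
    """Direct map: the key is the absolute path of the trigger itself.

    The trigger filesystem cannot work this out on its own, so the location
    is assumed to be handed to it when it is mounted.
    """
    return posixpath.normpath(trigger_location)


def lookup(map_entries, key):
    """Look the key up in an in-memory map; returns None if it is absent."""
    return map_entries.get(key)
```

So an access to `/home/mikew/src` under an indirect mount at `/home` yields the key `mikew`, while a direct trigger sitting at `/usr/local` is looked up under the full path `/usr/local`.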

>>??  Really?  I find that hard to believe.  I thought Solaris shared its
>>automounter with HPUX and AIX.  I may be wrong though.
>>    
>>
>
>Old versions perhaps. AIX 4.x was the last I used. It was definitely like
>that then. 500+ automounts tends to clutter the mount display a bit.
>
>  
>
Could be.  Either way, on a system with a thousand NFS shares 
automounted, I'm not really concerned about what the mounttable looks 
like.  It won't be human parseable anyway.

>>>Mmm. The vfsmount_lock is available to modules in 2.6. At least it was in
>>>test11. I'm sure I compiled the module under 2.6 as well???
>>>
>>>I thought that, taking the dcache_lock was the correct thing to do when
>>>traversing a dentry list?
>>>
>>>
>>>
>>>      
>>>
>>Walking dentrys still takes the dcache_lock, however walking vfsmounts
>>takes the vfsmount_lock.  dcache_lock is no longer used for fast path
>>walking either (to the best of my understanding).
>>
>>find . -name '*.[ch]' -not -path '*SCCS*' | xargs grep vfsmount_lock |
>>grep EXPORT
>>    
>>
>
>Strange. How does the module compile I wonder? How does it load without
>unresolved symbols? Another little mystery to work on.
>
>  
>
No, your module doesn't use vfsmount_lock.  At least the module in 
autofs4-2.4-module-20031201.tar.gz doesn't.

>>The raciness comes from the fact that we now support the lazy-mounting
>>of multimount offsets using embedded direct mounts.  Autofs4 mounts all
>>(or as much as it can) from the multimount all together, and unmounts it
>>all on expiry.
>>    
>>
>
>But 4.1 does lazy mount for several maps. Mounts that are triggered
>during the umount step of the expire are put on a wait queue along with
>the task requesting the umount. I think autofs always worked that way.
>
>  
>
This isn't lazy mounting per se.  If you are talking about autofs4's use 
of AUTOFS_INF_EXPIRING, it's there to prevent somebody from walking into 
a multimount while it is expiring. 

>>>Hang on. From the discussion my impression of a lazy mount is that it is
>>>not actually mounted!
>>>
>>>
>>>
>>>      
>>>
>>Lazy _un_mounts as opposed to lazy mounts. Lazy unmounts are described
>>in umount(8):
>>    
>>
>
>But will this leave the filesystem in a consistent state and allow further
>mount activity on that mount point?
>  
>
The underlying autofs filesystem?  Sure.


-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 17:34       ` H. Peter Anvin
  2004-01-08 19:41         ` Mike Waychison
  2004-01-08 23:42         ` Michael Clark
@ 2004-01-09 18:32         ` Ian Kent
  2004-01-09 20:52           ` Mike Waychison
  2 siblings, 1 reply; 82+ messages in thread
From: Ian Kent @ 2004-01-09 18:32 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

On Thu, 8 Jan 2004, H. Peter Anvin wrote:

> Ian Kent wrote:
> >
> > If wildcard map entries are not in autofs v3 then Jeremy implemented this
> > in v4.
> >
>
> v3 has had wildcard map entries and substitutions for a very, very, very
> long time... it was a v2 feature, in fact.
>
> > And yes the host map is basically a program map and that's all. Worse, as
> > pointed out in the paper it mounts everything under it. This is a source
> > of stress for mount and umount. I have put in a fair bit of time on ugly
> > hacks to work around this. This same problem is also evident in startup
> > and shutdown for master maps with a good number of entries (~50 or more).
> > A consequence of the current multiple daemon approach.
>
> This is why one wants to implement a mount tree with "direct mount
> pads"; which also means keeping some state in the daemon.
>
> For example, let's say one has a mount tree like:
>
> /foo		server1:/export/foo \
> /foo/bar	server1:/export/bar \
> /bar		server2:/export/bar
>
> ... then you actually have four different filesystems involved: first,
> some kind of "scaffolding" (this can be part of the autofs filesystem
> itself or a ramfs) that holds the "foo" and "bar" directories, and then
> foo, foo/bar, and bar.
>
> Consider the following implementation: when one encounters the above,
> the daemon stashes this away as an already-encountered map entry (in
> case the map entries change, we don't want to be inconsistent), creates
> a ramfs for the scaffolding, creates the "foo" and "bar" subdirectories
> and mount-traps "foo" and "bar".  Then it releases userspace.  When it
> encounters an access on "foo", it gets invoked again, looks it up in its
> "partial mounts" state, then mounts "foo" and mount-traps "foo/bar",
> then releases userspace.
>

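The lazy-mounting scheme quoted above can be sketched as a small state machine. This is a toy Python model under stated assumptions: `LazyMultimount`, its `access` method, and the in-memory `traps` and `mounted` tables are hypothetical names that only track what would be mounted or trapped; no real mount calls are made.

```python
class LazyMultimount:
    """Toy model of the daemon's "partial mounts" state for one map entry."""

    def __init__(self, entry):
        # Stash a private copy so later map changes cannot make us inconsistent.
        self.entry = dict(entry)   # offset path -> "server:/export"
        self.mounted = {}          # offset path -> source actually mounted
        self.traps = set()         # offsets currently armed with a mount trap
        # Scaffolding step: trap only offsets with no enclosing offset.
        for path in self.entry:
            if self._parent_offset(path) is None:
                self.traps.add(path)

    def _parent_offset(self, path):
        """Nearest enclosing offset from the same entry, or None."""
        best = None
        for other in self.entry:
            if other != path and path.startswith(other + "/"):
                if best is None or len(other) > len(best):
                    best = other
        return best

    def access(self, path):
        """A process hit a trap: mount it, then arm traps for its children."""
        if path not in self.traps:
            return False
        self.traps.discard(path)
        self.mounted[path] = self.entry[path]
        for child in self.entry:
            if self._parent_offset(child) == path:
                self.traps.add(child)
        return True
```

For the three-line example entry, construction arms traps on `/foo` and `/bar` only; the first access to `/foo` mounts it and arms `/foo/bar`, which stays unmounted until somebody actually walks into it.
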
Umm. The cross filesystem problem again.

This may sound a little silly but it may be able to be done using
stackable filesystem methods (a la Zadok et al.). I'm thinking of an
autofs filesystem stacked on a host filesystem. The dentries corresponding
to mount points marked in some way and the mount occurring under it, on top
of the host filesystem. Yes I know it sounds ugly but maybe it's not.
Maybe it's actually quite simple. I can't give an opinion yet as I'm still
thinking it through and haven't done any feasibility. However, this
approach would lend itself to providing autofs filesystem transparency. A
requirement as yet not discussed.

Ian






* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 15:39       ` Mike Waychison
@ 2004-01-09 18:20         ` Ian Kent
  2004-01-09 20:06           ` Mike Waychison
  2004-01-09 20:51           ` Jim Carter
  0 siblings, 2 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-09 18:20 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Thu, 8 Jan 2004, Mike Waychison wrote:

> >
> >Mike can you enlighten me with a few words about how namespaces are useful
> >in the design. I have not seen or heard much about them so please be
> >gentle.
> >
> >
>

Think I have enough on namespaces to understand your proposal now. Thanks.

> >What is the form of the trigger talked about? Identifying the automount
> >points in the autofs filesystem has always been hard and error prone.
> >
> >
> >
> I don't understand what you mean by the identifying part.   However, the
> 'trigger' would be the traditional method used in autofsv3/4 for indirect
> maps and probably based off what you already have for doing the browsing
> stuff.
>
> The direct map 'triggers' will be taken care of by another filesystem
> with a magic root directory that will catch traversals using some
> follow_link magic.   I wrote a prototype for this last summer, but
> haven't released it as the userspace stuff completely does not fit in
> with the existing daemon that was out at the time due to the mess of
> glue that was pgids, pipes and processes.   It worked in the simple
> case, but it didn't extend to being able to direct mount an indirect
> map, nor was it able to do the lazy mounting in multimounts as I had
> desired.

Is this the stuff that Al Viro is working on?

>
> >Please clarify what we are talking about WRT kernel support for
> >automount. Is the plan a new kernel module or are we talking about
> >unspecified 'in VFS' support or both?
> >
> >
> >
> This module will have its own new autofs module (hopefully named
> something other than autofs to avoid confusion/mishaps).  The VFS will
> have native support for expiry.  The VFS will also be slightly extended
> to allow the super_block cloning on namespace clone (although this can
> probably hold off a while, it's more a semantic issue than anything else).

Yep. Got that as well.

>
> >
> >Yes. In 4.1 NIS, LDAP and file maps are browsable for both direct and
> >indirect maps. The browsability, only, requires my kernel patch.
> >The daemon detects the updated modules' presence, and if the option is
> >specified 'ghosts' the directories, mounting them only when accessed.
> >
> >
> >
> What is the difference between Solaris's -browse and your ghosting then?

Well I don't know, nothing really. I was working to the requirement of
providing browsable mount trees. The 'doing it properly' was secondary to
satisfying my spec. Mind you, there are a number of things I haven't done.
Since I don't have a need for tree-mounts (closest would be multi-mount) I
haven't done anything there. As you say, in v4 they are mount/umount
everything. Consequently, only the top level leaves are browsable. Indeed, I
haven't solved my requirement of a transparent autofs filesystem, a la the
Solaris automounter, either. A difficult problem that will require
considerable effort.

>
> >>lstat('dir')
> >>chdir('dir')
> >>lstat('.')
> >>
> >>
> >
> >This suggestion has been made by others several times but doesn't seem
> >to be a problem in practice. In all my testing I have only been able to
> >find one case that doesn't work as needed when ghosted. This is the
> >situation where a home directory in a map exported from a server, is
> >actually not available (eg does not exist) and someone logs into the
> >account using wu-ftpd. In this case wu-ftpd thinks all is ok but of course
> >an error is returned when the directory access is attempted. In fact an
> >error should have been returned at login. Further, I believe this can be
> >solved with as little as an additional revalidate call in sys_stat (I
> >think the problem call was sys_stst ???).
> >
> >
> >
> The find(1) issue is fairly recent.   This check was added some time
> within the last two years (?) and only appears in the latest distros.
>
> Another problem was the ACL patches for ls(1) and friends.  I *really*
> think they should be calling lgetxattr instead of getxattr.  They even
> explicitly check via an lstat _beforehand_ to verify whether the file is
> S_ISLNK, and only then call getxattr if it isn't.  Why not extend
> it?   I dunno.

Looks like I have more testing to do to get a better feel for the way this
behaves.
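
For reference, the lstat/chdir/lstat sequence quoted above amounts to a
sanity check that chdir() landed in the directory that lstat() described.
Below is a minimal userspace sketch, assuming the check compares st_dev and
st_ino. The helper name is made up for illustration and this is not find's
actual code.

```python
import os
import tempfile


def chdir_lands_where_expected(path):
    """Return True if chdir(path) ends up in the directory lstat(path) described.

    Mirrors the lstat('dir') / chdir('dir') / lstat('.') sequence. If a mount
    appears on 'dir' between the two lstat calls, st_dev/st_ino will differ.
    """
    before = os.lstat(path)      # lstat: does not follow a final symlink
    saved = os.getcwd()
    try:
        os.chdir(path)           # the traversal that could trigger a mount
        after = os.lstat(".")
    finally:
        os.chdir(saved)          # always restore the original directory
    return (before.st_dev, before.st_ino) == (after.st_dev, after.st_ino)


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    plain = os.path.join(base, "plain")
    os.mkdir(plain)
    print("plain directory consistent:", chdir_lands_where_expected(plain))
```

A ghosted directory that gets covered by a real mount during the chdir()
would make this comparison fail, which is presumably why such callers
complain.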

>
> >>This is the subtle difference between direct and indirect maps.   The
> >>direct map keys are absolute paths, not path components.  We are
> >>implementing direct mounts as individual filesystems that will trap on
> >>traversal into their base directory.  This filesystem has no idea where
> >>it is located as far as the user is concerned.  We need to tell the
> >>filesystem directly so that the usermode helper can look it up.
> >>Conversely, the indirect map uses the sub-directory name as a mapkey.
> >>
> >>
> >
> >I'm not sure what you are saying here. Does this mean there is a mount for
> >every direct mount (this might be what you call a trigger)?
> >
> >
> >
> Yes, it is its own filesystem (type autofs).  This is needed because we
> need to overlay direct triggers within NFS filesystems for multimounts.

Ahh. I see, you are talking about the cross filesystem problem. I haven't
solved that in what I have done either. Fortunately I still get a good
hit rate in satisfying people's needs as in practice many people don't use
full automounter functionality.

>
> Browsing however obviously doesn't need that because we control the
> parent directory.
>
> >AIX implemented automounts by mounting everything in each map. This
> >made the mount listing very ugly.
> >
> >
> >
> ??  Really?  I find that hard to believe.  I thought Solaris shared its
> automounter with HPUX and AIX.  I may be wrong though.

Old versions perhaps. AIX 4.x was the last I used. It was definitely like
that then. 500+ automounts tends to clutter the mount display a bit.

> >Mmm. The vfsmount_lock is available to modules in 2.6. At least it was in
> >test11. I'm sure I compiled the module under 2.6 as well???
> >
> >I thought that, taking the dcache_lock was the correct thing to do when
> >traversing a dentry list?
> >
> >
> >
> Walking dentrys still takes the dcache_lock, however walking vfsmounts
> takes the vfsmount_lock.  dcache_lock is no longer used for fast path
> walking either (to the best of my understanding).
>
> find . -name '*.[ch]' -not -path '*SCCS*' | xargs grep vfsmount_lock |
> grep EXPORT

Strange. How does the module compile I wonder? How does it load without
unresolved symbols? Another little mystery to work on.

>
> shows no results for vfsmount_lock being exported to modules in 2.6.
>
> >
> >The autofs4 module blocks (auto) mounts during the umount callback.
> >Surely this is the sensible thing to do.
> >
> >
> >
> The raciness comes from the fact that we now support the lazy-mounting
> of multimount offsets using embedded direct mounts.  Autofs4 mounts all
> (or as much as it can) from the multimount all together, and unmounts it
> all on expiry.

But 4.1 does lazy mount for several maps. Mounts that are triggered
during the umount step of the expire are put on a wait queue along with
the task requesting the umount. I think autofs always worked that way.

> >
> >Hang on. From the discussion my impression of a lazy mount is that it is
> >not actually mounted!
> >
> >
> >
> Lazy _un_mounts as opposed to lazy mounts. Lazy unmounts are described
> in umount(8):

But will this leave the filesystem in a consistent state and allow further
mount activity on that mount point?

Ian



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 17:34       ` H. Peter Anvin
  2004-01-08 19:41         ` Mike Waychison
@ 2004-01-08 23:42         ` Michael Clark
  2004-01-09 20:28           ` Mike Waychison
  2004-01-09 18:32         ` Ian Kent
  2 siblings, 1 reply; 82+ messages in thread
From: Michael Clark @ 2004-01-08 23:42 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ian Kent, Mike Waychison, autofs mailing list, Kernel Mailing List

On 01/09/04 01:34, H. Peter Anvin wrote:
> In many ways this returns to the simplicity of the autofs v3 design 
> where the atomicity constraints were guaranteed by the VFS itself, *as
> long as* mount traps can be atomically destroyed with umounting the 
> underlying filesystem.

Do we need to revive Tigran's forced unmount patch 'badfs', a la FreeBSD's
deadfs? Although it doesn't guarantee atomic unmount, it could help
a lot with the tendency to get stuck autofs mounts.

   http://tinyurl.com/2hto8

I've been long waiting for this functionality in mainline.

I wonder if binding badfs over the mountpoint at the beginning of the
potentially lengthy unmount process would improve the atomicity
to userspace. ie although the unmount would proceed in the background,
badfs would have been mounted at that point at the start of the process
- mounts are atomic no?

~mc


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 22:20       ` J. Bruce Fields
@ 2004-01-08 22:24         ` H. Peter Anvin
  0 siblings, 0 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-08 22:24 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: trond.myklebust, viro, linux-kernel, raven, Michael.Waychison, thockin

J. Bruce Fields wrote:
>
> On Thu, Jan 08, 2004 at 01:13:24PM -0800, H. Peter Anvin wrote:
> 
>>Also, your global machine credential is to some degree "all the security
>>you get."  Any security which isn't enforced by the filesystem driver
>>doesn't exist in a Unix environment; in particular there is no security
>>against root.
> 
> I only have to trust root on the nfs client machines that I actually
> use.  (In fact, I only really have to trust those machines with a
> short-lived ticket, preventing even those machines from impersonating me
> beyond a limited time.)
> 

And when that ticket expires, you better have NFS itself know how to
renew its credentials, or you're up the creek.  Nothing that autofs can
help you with.

> 
>>Stupid tricks like remapping uid 0 are just that; stupid
>>tricks without any real security value.  You know this, of course.
>>However, if you think the automounter doesn't have the privilege to
>>access the remote server but the user does, then that's false security.
> 
> If the server requires kerberos credentials that only a user has, then
> the automounter can't do anything until the user coughs up those
> credentials.
> 

True, but giving them to a privileged daemon is no different than giving
them to the kernel in that way.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 21:13     ` H. Peter Anvin
@ 2004-01-08 22:20       ` J. Bruce Fields
  2004-01-08 22:24         ` H. Peter Anvin
  2004-01-09 20:37       ` Mike Waychison
  1 sibling, 1 reply; 82+ messages in thread
From: J. Bruce Fields @ 2004-01-08 22:20 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: trond.myklebust, viro, linux-kernel, raven, Michael.Waychison, thockin

On Thu, Jan 08, 2004 at 01:13:24PM -0800, H. Peter Anvin wrote:
> Also, your global machine credential is to some degree "all the security
> you get."  Any security which isn't enforced by the filesystem driver
> doesn't exist in a Unix environment; in particular there is no security
> against root.

I only have to trust root on the nfs client machines that I actually
use.  (In fact, I only really have to trust those machines with a
short-lived ticket, preventing even those machines from impersonating me
beyond a limited time.)

> Stupid tricks like remapping uid 0 are just that; stupid
> tricks without any real security value.  You know this, of course.
> However, if you think the automounter doesn't have the privilege to
> access the remote server but the user does, then that's false security.

If the server requires kerberos credentials that only a user has, then
the automounter can't do anything until the user coughs up those
credentials.

--Bruce Fields

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 20:08   ` trond.myklebust
@ 2004-01-08 21:13     ` H. Peter Anvin
  2004-01-08 22:20       ` J. Bruce Fields
  2004-01-09 20:37       ` Mike Waychison
  0 siblings, 2 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-08 21:13 UTC (permalink / raw)
  To: trond.myklebust; +Cc: viro, linux-kernel, raven, Michael.Waychison, thockin

trond.myklebust@fys.uio.no wrote:
> 
> My point is that the above problem crops up in almost *all* combinations
> of automounter daemon with remote filesystem and strong authentication.
> In order to correctly mount the remote filesystem, the automounter
> itself needs a minimum set of remote privileges (typically it needs to be
> able to browse the remote filesystem).
> 
> RFC-2623 describes how to add RPCSEC_GSS to NFSv2/v3. The
> workarounds (hacks really) that I refer to above had to be deliberately
> added in order to make Sun's automounter work in this environment.
> The alternative would have been to have a global "machine" credential
> for use by the automounter when browsing /net. Hardly secure...
> 

My point is that it's what you get for having an automounter.

We can't solve Sun's designed-in braindamage, unfortunately.  This is
partially why I'd like people to consider the scope of what automounting
does; there are tons of policy issues not all of which are going to be
appropriate in all contexts.  To some degree, if you have to have an
automounter you have already lost.

Also, your global machine credential is to some degree "all the security
you get."  Any security which isn't enforced by the filesystem driver
doesn't exist in a Unix environment; in particular there is no security
against root.  Stupid tricks like remapping uid 0 are just that; stupid
tricks without any real security value.  You know this, of course.
However, if you think the automounter doesn't have the privilege to
access the remote server but the user does, then that's false security.

Linux at this point has no ability to support actual user-mounted
filesystems.  There are things that could be done to remedy this, but it
would require massive changes to every filesystem driver as well as to
the VFS.  Would it be desirable?  Absolutely.  However, it's partially
the quagmire that got the HURD stuck for a very long time, even though
they had the huge advantage of being able to run their filesystem
drivers in a nonprivileged context.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 18:20     ` Jim Carter
@ 2004-01-08 21:01       ` H. Peter Anvin
  0 siblings, 0 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-08 21:01 UTC (permalink / raw)
  To: Jim Carter; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

Jim Carter wrote:
> 
>>For justification of its worth, some institutions have file servers
>>that export hundreds or even thousands of shares over NFS.   As /net is
>>really just a kind of executable indirect map that returns multimounts
>>for each hostname used as a key,  just doing 'cd /net/hostname' may
>>potentially mount hundreds of filesystems.  This is not cool!
> 
> Definitely not cool.  But some users (yours truly among them) do "alias ls
> 'ls -F'", which requires "ls" to stat (and thus mount) every exported
> filesystem.  More uncool, and I don't see any non-disgusting way around it.
> 

No, it doesn't... this has been covered several times already.  It
requires ls to *lstat* the point; it only does a stat() if the resulting
entry is S_IFLNK.  The same is true for GUI tools.  There is a fairly
easy way to distinguish lstat() from virtually all other filesystem
calls -- it doesn't invoke follow_link.  So the answer is simply to
create an inode which is S_IFDIR but has a follow_link method.  The
follow_link method triggers a mount.  This is called a "pseudo-symlink
directory" or sometimes "ghost directory".
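
A userspace sketch of the lstat-then-stat calling pattern described above.
classify() is a hypothetical helper, not code from ls, and it only
illustrates which calls a lister needs to make. The ghost directory inode
itself lives in the kernel and is not modelled here.

```python
import os
import stat
import tempfile


def classify(path):
    """Return (kind, followed): the entry kind, and whether stat() was needed."""
    st = os.lstat(path)                      # never follows the final component
    if not stat.S_ISLNK(st.st_mode):
        kind = "dir" if stat.S_ISDIR(st.st_mode) else "other"
        return kind, False                   # lstat alone was enough
    try:
        target = os.stat(path)               # only symlinks are followed
    except OSError:
        return "dangling symlink", True
    if stat.S_ISDIR(target.st_mode):
        return "symlink to dir", True
    return "symlink to other", True


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    os.mkdir(os.path.join(base, "sub"))
    os.symlink(os.path.join(base, "sub"), os.path.join(base, "link"))
    for name in sorted(os.listdir(base)):
        print(name, classify(os.path.join(base, name)))
```

An entry that reports S_IFDIR to lstat() never reaches the stat() branch, so
listing a directory full of ghost entries in this way should not follow any
of them.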

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 19:41 ` H. Peter Anvin
@ 2004-01-08 20:08   ` trond.myklebust
  2004-01-08 21:13     ` H. Peter Anvin
  2004-01-09 20:16   ` Mike Waychison
  1 sibling, 1 reply; 82+ messages in thread
From: trond.myklebust @ 2004-01-08 20:08 UTC (permalink / raw)
  To: hpa; +Cc: viro, linux-kernel, raven, Michael.Waychison, thockin

>> If anyone wants evidence of how broken the whole daemon thing is, then
>> see the workarounds that had to be made in RFC-2623 to disable strong
>> authentication for GETATTR etc. on the NFSv2/v3 mount point.
>>
>
> It's not broken as much as what you want to do is outside the scope of
> automount.  automount is one particular user of these facilities, and as
> you correctly point out, it can't solve the problems for all of them.
> The right thing for AFS and NFSv4 is clearly to do something different.

My point is that the above problem crops up in almost *all* combinations
of automounter daemon with remote filesystem and strong authentication.
In order to correctly mount the remote filesystem, the automounter
itself needs a minimum set of remote privileges (typically it needs to be
able to browse the remote filesystem).

RFC-2623 describes how to add RPCSEC_GSS to NFSv2/v3. The
workarounds (hacks really) that I refer to above had to be deliberately
added in order to make Sun's automounter work in this environment.
The alternative would have been to have a global "machine" credential
for use by the automounter when browsing /net. Hardly secure...

> That being said, mount traps in particular, and possibly this "trap
> filesystem" are more generic kernel facilities which should be of use to
> other things than automount.  AFS/NFSv4 are the obvious examples, quite
> possibly other things like intermezzo might be interested, and we don't
> want to have to reinvent the wheel every time.

Certainly. I believe CIFS might also have a similar mechanism.

Cheers,
   Trond



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 17:34       ` H. Peter Anvin
@ 2004-01-08 19:41         ` Mike Waychison
  2004-01-08 23:42         ` Michael Clark
  2004-01-09 18:32         ` Ian Kent
  2 siblings, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-08 19:41 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ian Kent, Mike Waychison, autofs mailing list, Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 2726 bytes --]

H. Peter Anvin wrote:
> Ian Kent wrote:
> 
>>
>> If wildcard map entries are not in autofs v3 then Jeremy implemented this
>> in v4.
>>
> 
> v3 has had wildcard map entries and substitutions for a very, very, very 
> long time... it was a v2 feature, in fact.
> 
>> And yes the host map is basically a program map and that's all. Worse, as
>> pointed out in the paper it mounts everything under it. This is a source
>> of stress for mount and umount. I have put in a fair bit of time on ugly
>> hacks to work around this. This same problem is also evident in startup
>> and shutdown for master maps with a good number of entries (~50 or more).
>> A consequence of the current multiple daemon approach.
> 
> 
> This is why one wants to implement a mount tree with "direct mount 
> pads"; which also means keeping some state in the daemon.
> 
> For example, let's say one has a mount tree like:
> 
> /foo        server1:/export/foo \
> /foo/bar    server1:/export/bar \
> /bar        server2:/export/bar
> 
> ... then you actually have four different filesystems involved: first,
> some kind of "scaffolding" (this can be part of the autofs filesystem 
> itself or a ramfs) that hold the "foo" and "bar" directories, and then 
> foo, foo/bar, and bar.
> 
> Consider the following implementation: when one encounters the above, 
> the daemon stashes this away as an already-encountered map entry (in 
> case the map entries change, we don't want to be inconsistent), creates 
> a ramfs for the scaffolding, creates the "foo" and "bar" subdirectories 
> and mount-traps "foo" and "bar".  Then it releases userspace.  When it 
> encounters an access on "foo", it gets invoked again, looks it up in its 
> "partial mounts" state, then mounts "foo" and mount-traps "foo/bar", 
> then releases userspace.
> 
> In many ways this returns to the simplicity of the autofs v3 design 
> where the atomicity constraints were guaranteed by the VFS itself, *as
> long as* mount traps can be atomically destroyed with umounting the 
> underlying filesystem.
> 

Great!

This is exactly what I found when looking into the situation.  However, 
namespaces still break automounting unless you can rid yourself of the 
daemon.  Move events into call_usermodehelper calls in current's 
namespace and maintain what little state you need as a set of tokens.
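
The lazy mount-tree scheme quoted above can be modelled in a few lines of
userspace code. This is only a sketch under stated assumptions: MountTree
and its methods are hypothetical names, "mounting" is plain bookkeeping in
a dict, and no real mount(2) calls or kernel mount traps are involved. The
same state could equally be carried as a set of tokens rather than held in
a long-lived process.

```python
class MountTree:
    """Userspace model of one multimount entry with lazily mounted offsets."""

    def __init__(self, entry):
        # Stash a private copy so later map changes cannot make us inconsistent.
        self.entry = dict(entry)
        self.mounted = {}      # offset -> source that has been "mounted"
        self.traps = set()     # offsets currently armed as mount traps
        # Initially trap only the offsets that sit directly on the scaffolding.
        for offset in self.entry:
            if self._parent(offset) is None:
                self.traps.add(offset)

    def _parent(self, offset):
        """Return the nearest enclosing offset in the entry, or None."""
        parent = offset.rsplit("/", 1)[0]
        while parent:
            if parent in self.entry:
                return parent
            parent = parent.rsplit("/", 1)[0]
        return None

    def access(self, offset):
        """Simulate stepping on a trap: mount it, then arm its direct children."""
        if offset not in self.traps:
            return False       # not a live trap, nothing to do
        self.traps.discard(offset)
        self.mounted[offset] = self.entry[offset]
        for child in self.entry:
            if self._parent(child) == offset:
                self.traps.add(child)
        return True
```

With the three-line example entry, constructing the tree arms traps on /foo
and /bar only; the trap on /foo/bar appears once /foo has been stepped on
and mounted.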


-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[-- Attachment #2: Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 19:32 trond.myklebust
@ 2004-01-08 19:41 ` H. Peter Anvin
  2004-01-08 20:08   ` trond.myklebust
  2004-01-09 20:16   ` Mike Waychison
  0 siblings, 2 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-08 19:41 UTC (permalink / raw)
  To: trond.myklebust; +Cc: viro, linux-kernel, raven, Michael.Waychison, thockin

trond.myklebust@fys.uio.no wrote:
> 
> Finally, because the upcall is done in the user's own context, you avoid
> the whole problem of automounter credentials that are a constant plague
> to all those daemon-based implementations when working in an environment
> where you have strong authentication.
> If anyone wants evidence of how broken the whole daemon thing is, then see
> the workarounds that had to be made in RFC-2623 to disable strong
> authentication for GETATTR etc. on the NFSv2/v3 mount point.
> 

It's not broken as much as what you want to do is outside the scope of
automount.  automount is one particular user of these facilities, and as
you correctly point out, it can't solve the problems for all of them.
The right thing for AFS and NFSv4 is clearly to do something different.

Mount traps by themselves are not sufficient for automount, which is why
I think we will always have a special "autofs" filesystem, for the
simple reason that automount in typical use doesn't have an a priori
complete list of directories either!  Even with ghosting you might find
that you're accessing a new key which has not yet been ghosted, and it
needs to be handled correctly.  Additionally, not all map types can be
enumerated, and some aren't even finite in size (consider /net, program
maps and wildcard map entries.)  Thus, for indirect mountpoints you
still need a filesystem which can trap on non-enumerated entries.
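
To illustrate why a wildcard map cannot be enumerated, here is a small
sketch of an indirect map lookup with a "*" key and "&" substitution, as in
the usual automount map syntax. The lookup() helper and the server names
are made up for the example.

```python
def lookup(indirect_map, key):
    """Resolve key in an indirect map, honouring a '*' wildcard entry."""
    if key in indirect_map:
        location = indirect_map[key]      # explicit entry wins
    elif "*" in indirect_map:
        location = indirect_map["*"]      # wildcard matches any other key
    else:
        return None                       # no such key in this map
    return location.replace("&", key)     # '&' stands for the looked-up key


if __name__ == "__main__":
    home = {
        "*": "fileserver:/export/home/&",
        "shared": "fileserver:/export/shared",
    }
    print(sorted(home))                   # the only keys that can be listed
    print(lookup(home, "alice"))          # resolves although never listed
```

Listing the map yields only "*" and "shared", yet any name at all resolves,
so the filesystem has to trap lookups on names it has never seen.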

That being said, mount traps in particular, and possibly this "trap
filesystem" are more generic kernel facilities which should be of use to
other things than automount.  AFS/NFSv4 are the obvious examples, quite
possibly other things like intermezzo might be interested, and we don't
want to have to reinvent the wheel every time.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
@ 2004-01-08 19:32 trond.myklebust
  2004-01-08 19:41 ` H. Peter Anvin
  0 siblings, 1 reply; 82+ messages in thread
From: trond.myklebust @ 2004-01-08 19:32 UTC (permalink / raw)
  To: viro, linux-kernel; +Cc: raven, hpa, Michael.Waychison, thockin

On Thu, 08/01/2004 at 13:31,
viro@parcelfarce.linux.theplanet.co.uk wrote:

> Special vfsmount mounted somewhere; has no superblock associated with
> it; attempt to step on it triggers event; normal result of that event
> is to get a normal mount on top of it, at which point usual chaining
> logics will make sure that we don't see the trap until it's uncovered
> by removal of covering filesystem.  Trap (and everything mounted on
> it, etc.) can be removed by normal lazy umount.
>
> Basically, it's a single-point analog of autofs done entirely in VFS.
> The job of automounter is to maintain the traps and react to events.

What if the trap is set by the filesystem? I'm thinking about AFS
volumes and NFSv4 migration events here.

Both these need something that goes beyond the current autofs "daemon
waiting on top of a single trap" thinking.

In the NFSv4 migration case we can be walking down the filesystem path
and enter a directory where we are basically told by the server that
"this volume has been moved" and are given a list of replicated
servername/pathname fields. Those then need to be interpreted in
userland by means of an upcall of some sort, and the new volume needs to
be mounted.

Neither autofs3 nor autofs4 can currently help us do this, because we
don't a priori have a complete list of directories on which to start a
bunch of "automount" daemons (and it wouldn't help anyway since a server
failover event etc. might cause the list to change).

Setting up our own traps, however, and then doing the upcall by means of
an exec & an "intelligent" mount program (as Mike & co. propose) OTOH,
would very much simplify matters, since that allows us to do simple
string parameter passing from the kernel to direct how the mount is to
be set up.
It still leaves the final policy decisions of which server to mount &
where in userland.

Finally, because the upcall is done in the user's own context, you avoid
the whole problem of automounter credentials that are a constant plague
to all those daemon-based implementations when working in an environment
where you have strong authentication.
If anyone wants evidence of how broken the whole daemon thing is, then see
the workarounds that had to be made in RFC-2623 to disable strong
authentication for GETATTR etc. on the NFSv2/v3 mount point.

Cheers,
  Trond





^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 22:55   ` Mike Waychison
                       ` (2 preceding siblings ...)
  2004-01-08 12:35     ` Ian Kent
@ 2004-01-08 18:20     ` Jim Carter
  2004-01-08 21:01       ` H. Peter Anvin
  3 siblings, 1 reply; 82+ messages in thread
From: Jim Carter @ 2004-01-08 18:20 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Kernel Mailing List, autofs mailing list

On Wed, 7 Jan 2004, Mike Waychison wrote:
> Jim Carter wrote:

> >  That's not too bad, since we rely on UNIX file permissions
> >or ACLs for security, not visibility in the automount map.  If an indirect
> >map entry was formerly absent but now present, presumably the userspace
> >helper will consult the then-prevailing automount map and find it
> >successfully.
>
> Yes, but then when the other namespace accesses this entry and attempts
> to mount it and no longer finds it in the map, it is unhashed and not
> enumerated as a cache entry, which is still valid in the first
> namespace.  This cache coherency is a subtle point.  The main point is
> that without super_block cloning, we are left with two namespaces that
> can effectively alter each other's automount policy by remounting the
> filesystem.

So for browsing ("ls" an indirect map's mountpoint without statting each
file), one namespace will see targets not in its version of the map, or the
other namespace will fail to see targets in its map.  Hmm, in the strict
userspace helper model, how does the helper get the file list into the
kernel module's data structures?  Perhaps we need an "inverse stat" ioctl
to pass a stat struct down to the kernel.  Plus another ioctl or a special
variant of mkdir, to populate the kernel's view of an indirect map with
names, but not stat data.  Running a pipe/socket/etc. between the kernel
and userspace is yucky.  By the way, IPSec handles the problem by letting
its userspace daemon create a socket with address family PF_KEY.

(About multimounts:)

> This is pretty much needed no matter how you look at it.   If you set it
> up so that it peeked at the NFS share for /usr/src to get permission
> information, you also have to verify that it contains a directory
> 'linux'.  This doesn't seem like much, but these things can change from
> underneath us.

I don't see that.  What I do see is, if /usr/src/linux is an autofs direct
map, and /usr/src is also a direct (or indirect?) map, then both
/usr/src/linux and /usr/src must have autofs filesystems (local kernel data
structures) mounted on them at all times, whether or not the NFS
filesystems were mounted.  And when /usr/src eventually gets NFS mounted,
the /usr/src/linux autofs FS has to percolate upward, and percolate back
when /usr/src is unmounted.  Or else, after /usr/src is NFS mounted you
need some magic (the multimount mechanism) to install an autofs filesystem
on /usr/src/linux.  The two approaches are very similar, but I think the
difference is that in Sun's implementation you have this special feature
with syntax and logic to support it, whereas as described by me, the man
page would just say "don't worry about autofs mount points located in a
filesystem that isn't mounted yet; we'll take care of it one way or
another."

> For justification of its worth, some institutions have file servers
> that export hundreds or even thousands of shares over NFS.   As /net is
> really just a kind of executable indirect map that returns multimounts
> for each hostname used as a key,  just doing 'cd /net/hostname' may
> potentially mount hundreds of filesystems.  This is not cool!

Definitely not cool.  But some users (yours truly among them) do "alias ls
'ls -F'", which requires "ls" to stat (and thus mount) every exported
filesystem.  More uncool, and I don't see any non-disgusting way around it.

> >So the helper's umount() will fail.  OK, it failed.  The kernel module
> >should not recognize the mounted dir as being gone, until the module itself
> >has seen that it's gone.  This policy also helps in cases where the sysop
> >manually unmounts an automounted directory for repair purposes.

> But this leads to races which cause partial expiries to occur in autofs4.

But it's a fact of life that some umounts will fail.  Perhaps that's one
reason why I'm dragging my heels so hard about the multimounts: they depend
on being mounted and unmounted as a unit, and that atomicity can't be
guaranteed.  Whereas if the subdir and containing dir are unmounted
independently, the use counts will ensure that the subdir is unmounted
first, and the containing dir is unmounted (and the subdir's autofs FS
mount is put back in a "storage" state) only after successful unmounting of
the subdir.
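
A toy model of that leaf-to-root ordering, as a sketch only: expire() is a
hypothetical function, the "use count" is reduced to a set of busy paths
plus a check for anything still mounted below, and nothing is actually
unmounted.

```python
def expire(mounts, busy):
    """Return the mount points to unmount in one expiry pass, in order.

    mounts: iterable of mounted paths; busy: set of paths still in use.
    A mount is pinned if it is busy itself or if anything is mounted below it.
    """
    remaining = set(mounts)
    order = []
    # Deepest paths first, so children are considered before their parents.
    for path in sorted(remaining, key=lambda p: (-p.count("/"), p)):
        has_child = any(
            other.startswith(path + "/") for other in remaining if other != path
        )
        if path in busy or has_child:
            continue                      # pinned: leave it mounted
        remaining.discard(path)
        order.append(path)
    return order
```

With nothing busy, /usr/src/linux goes first and /usr/src follows; if
/usr/src/linux is busy, neither is touched, and no atomic unit is needed to
get that behaviour.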

Aha, I hear someone snarling, "you can't umount the containing dir if an
autofs FS is mounted on the subdir, and conversely, you can't mount the
subdir autofs FS until after the containing dir is mounted".  So the autofs
private data for the containing dir needs a chain saying "there are
supposed to be autofs filesystems mounted on these subdirs" (relative paths
or "offsets").  Perhaps we're both talking about the same mechanism for
multimounts, but I'm just resisting some of the extras that go with them,
such as the atomicity and the special syntax.

> >A filesystem is "in use" if anything is mounted on its subdirs.  That
> >precludes premature auto-unmounting of a containing directory, in the case
> >of a multi-mount or jimc's recommended non-implementation thereof.  I don't
> >see that a multi-mount stack needs to expire as a unit -- just let the
> >components expire normally, leaf to root.  It doesn't bother jimc that some
> >members are mounted and some aren't; by the principle of lazy mounting,
> >that's what we're trying to accomplish.

> The thing is that we use autofs filesystems as traps.  Following from
> the previous /usr/src/linux example:

---- snip most of example ----

> Now, Assume that nobody is using /usr/src and /usr/src/linux.   The
> first fs to expire is going to be the nfs from hostb on /usr/src/linux
>
> # cat /proc/mounts
> rootfs /
> autofs /usr/src
> hosta:/src /usr/src
> autofs /usr/src/linux
>
> Next, /usr/src should go.  The thing is, we do _not_ want to unmount the
> autofs filesystem at /usr/src/linux before unmounting the nfs filesystem
> at /usr/src because that would open ourselves up to a user coming in and
> doing chdir(/usr/src/linux).  We would not catch the traversal because our
> trigger on 'linux' is gone.  We also shouldn't unmount the nfs
> filesystem from hosta now, because somebody is using it.

Solution: do a "move" remount, remounting the NFS filesystem from /usr/src
to /tmp/_garbage/src.  In the instant after that finishes, a wayward user
does "cd /usr/src/linux".  Since only the autofs FS is currently on
/usr/src, it triggers and forks another userspace helper to mount
serverA:/export/src on /usr/src, and it *atomically* mounts an autofs FS
on /usr/src/linux before signalling the caller that /usr/src is ready for
use.  Then when the first userspace helper regains the CPU, all the stuff
on /tmp/_garbage/src would be broken down with no need to worry about race
conditions.
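
The sequence above can be walked through with a toy model.  This is only a
sketch in Python: the "mount table" is a plain dict, the function names
(move_top, trigger, tear_down) are invented for illustration, and nothing
here calls mount(2).  The paths are the ones from the example in this thread.

```python
# Toy model only: the mount table maps a mount point to the stack of
# filesystems mounted there (the last element is the visible one).

def move_top(table, old, new):
    """Model of a "move" remount: relocate the top mount at `old`, and
    everything mounted beneath it, to `new`."""
    moved = {new: [table[old].pop()]}
    for point in [p for p in table if p.startswith(old + "/")]:
        moved[new + point[len(old):]] = table.pop(point)
    table.update(moved)

def trigger(table, point, source, offsets):
    """Model of the autofs trap firing: mount `source` on `point`, then
    plant an autofs trap on each offset before the caller is released."""
    if table[point][-1] == "autofs":
        table[point].append(source)
        for off in offsets:
            table[point + "/" + off] = ["autofs"]

def tear_down(table, root):
    """Unmount everything at or below `root`, deepest mount points first."""
    doomed = [p for p in table if p == root or p.startswith(root + "/")]
    for point in sorted(doomed, key=lambda p: p.count("/"), reverse=True):
        del table[point]

table = {
    "/usr/src": ["autofs", "hosta:/src"],
    "/usr/src/linux": ["autofs"],
}
move_top(table, "/usr/src", "/tmp/_garbage/src")     # expiry helper starts
trigger(table, "/usr/src", "hosta:/src", ["linux"])  # wayward user arrives
tear_down(table, "/tmp/_garbage/src")                # helper finishes
print(table)
```

In the model the main tree ends up fully populated again and the garbage
subtree is gone, whichever order the last two steps run in.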

Minor detail, applying to both Sun-style multimounts and my ideas: can you
"mount" an autofs FS without statting its mount point?  Probably not.
This means that the kernel has to run the userspace helper twice, once to
mount the containing dir and again to implant the autofs FS on the subdir,
before reporting to the caller that the containing dir is ready.
Alternatively the helper should infer that the subdir needs an autofs FS
when it's mounting the containing dir (potentially needing to consult every
map file and NIS map in the system to figure that out).  Hmm, am I arguing
in favor of the special syntax of Sun multimounts?

More on /tmp/_garbage: when a server crashes and you aren't sure whether
forced or lazy unmounts will get rid of the mount structures, if you move
the mount into /tmp/_garbage then the main automount tree will still be
functional.  A problem I see from time to time is, serverX is rebooted, the
client has a stale NFS filehandle, and I can't make the broken mount
disappear, hence can't mount that filesystem from the revived serverX.
This is particularly a problem on Solaris 2.6; on Linux I can usually
recover by sufficiently many "umount -f" or "umount -l" or "kill -9".

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 12:00     ` Ian Kent
  2004-01-08 15:39       ` Mike Waychison
@ 2004-01-08 17:34       ` H. Peter Anvin
  2004-01-08 19:41         ` Mike Waychison
                           ` (2 more replies)
  1 sibling, 3 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-08 17:34 UTC (permalink / raw)
  To: Ian Kent; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

Ian Kent wrote:
> 
> If wildcard map entries are not in autofs v3 then Jeremy implemented this
> in v4.
> 

v3 has had wildcard map entries and substitutions for a very, very, very 
long time... it was a v2 feature, in fact.
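
For readers who have not used them, wildcard entries and substitutions can
be illustrated roughly as follows.  This is a Python sketch, not autofs
code: it assumes the usual map convention that a "*" key matches any lookup
and that "&" in the location is replaced by the key.  The map contents and
the lookup() name are made up for the example.

```python
# Illustrative only: a tiny in-memory indirect map.

def lookup(automap, key):
    """Return the mount location for `key`, or None if nothing matches."""
    if key in automap:
        location = automap[key]
    elif "*" in automap:
        location = automap["*"]
    else:
        return None
    return location.replace("&", key)

home_map = {
    "shared": "server1:/export/shared",
    "*": "server2:/export/home/&",
}
print(lookup(home_map, "shared"))  # exact entry wins over the wildcard
print(lookup(home_map, "alice"))   # falls through to the wildcard
```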

> And yes the host map is basically a program map and that's all. Worse, as
> pointed out in the paper it mounts everything under it. This is a source
> of stress for mount and umount. I have put in a fair bit of time on ugly
> hacks to work around this. This same problem is also evident in startup
> and shutdown for master maps with a good number of entries (~50 or more).
> A consequence of the current multiple daemon approach.

This is why one wants to implement a mount tree with "direct mount 
pads"; which also means keeping some state in the daemon.

For example, let's say one has a mount tree like:

/foo		server1:/export/foo \
/foo/bar	server1:/export/bar \
/bar		server2:/export/bar

... then you actually have four different filesystems involved: first, 
some kind of "scaffolding" (this can be part of the autofs filesystem 
itself or a ramfs) that holds the "foo" and "bar" directories, and then 
foo, foo/bar, and bar.

Consider the following implementation: when one encounters the above, 
the daemon stashes this away as an already-encountered map entry (in 
case the map entries change, we don't want to be inconsistent), creates 
a ramfs for the scaffolding, creates the "foo" and "bar" subdirectories 
and mount-traps "foo" and "bar".  Then it releases userspace.  When it 
encounters an access on "foo", it gets invoked again, looks it up in its 
"partial mounts" state, then mounts "foo" and mount-traps "foo/bar", 
then releases userspace.
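
A rough sketch of the daemon-side "partial mounts" state described above,
in Python.  The offsets and locations are the ones from the example map
entry; the MountTree class and its method names are invented for
illustration, and the model only tracks which offsets are mounted and which
are covered by a mount trap.

```python
# A sketch of the bookkeeping only; no filesystems are touched.

class MountTree:
    def __init__(self, entry):
        self.entry = dict(entry)   # stashed copy: later map edits don't affect us
        self.mounted = {}          # offset -> location, for real mounts
        self.traps = set()         # offsets currently covered by a mount trap
        self._arm(parent="")       # scaffolding: trap the top-level offsets

    def _parent(self, offset):
        """Longest other offset in the entry that contains `offset`."""
        best = ""
        for other in self.entry:
            if offset.startswith(other + "/") and len(other) > len(best):
                best = other
        return best

    def _arm(self, parent):
        """Trap every unmounted offset whose nearest ancestor is `parent`
        ("" stands for the scaffolding root)."""
        for offset in self.entry:
            if offset not in self.mounted and self._parent(offset) == parent:
                self.traps.add(offset)

    def access(self, offset):
        """A traversal hit the trap on `offset`: mount it, arm its children."""
        if offset in self.traps:
            self.traps.remove(offset)
            self.mounted[offset] = self.entry[offset]
            self._arm(parent=offset)
        return self.mounted.get(offset)

tree = MountTree({
    "/foo": "server1:/export/foo",
    "/foo/bar": "server1:/export/bar",
    "/bar": "server2:/export/bar",
})
print(sorted(tree.traps))   # ['/bar', '/foo']
tree.access("/foo")
print(sorted(tree.traps))   # ['/bar', '/foo/bar']
```

Copying the entry at construction time stands in for "stashing" it, so a
later change to the map does not alter a tree that is already half mounted.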

In many ways this returns to the simplicity of the autofs v3 design 
where the atomicity constraints were guaranteed by the VFS itself, *as 
long as* mount traps can be atomically destroyed with umounting the 
underlying filesystem.

	-hpa


* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 12:29     ` Olivier Galibert
  2004-01-08 13:20       ` Robin Rosenberg
@ 2004-01-08 16:23       ` Mike Waychison
  1 sibling, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-08 16:23 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Kernel Mailing List


Olivier Galibert wrote:
> On Wed, Jan 07, 2004 at 05:55:23PM -0500, Mike Waychison wrote:
> 
>>Yes, an 'ls' actually does an lstat on every file.
> 
> 
> I guess you haven't met the plague called color-ls yet.  Lucky you.
> 
> Most modern file browsers also seem to feel obligated to follow
> symlinks to check whether they're dangling.  A mis-click on "up" when
> you're on your home directory could cause a beautiful mount-storm.

Why would any file browser or even ls feel compelled to 'stat' something 
right after an 'lstat' says it is not a symbolic link though?
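
The call pattern being asked for looks roughly like this in Python, where
os.lstat() and os.stat() map onto the system calls of the same names.  The
classify() helper and its return strings are invented for the example.

```python
import os
import stat

def classify(path):
    """lstat() first; only pay for a following stat() when it is a symlink."""
    info = os.lstat(path)                 # never follows the final component
    if not stat.S_ISLNK(info.st_mode):
        return "not-a-link"               # no second call needed
    try:
        os.stat(path)                     # follows the link; may trigger a mount
    except OSError:
        return "dangling-link"            # any failure to resolve the target
    return "live-link"

print(classify("."))
```

With this ordering an automount trigger that is not a symbolic link is
never stat()ed, so listing its parent directory does not mount it.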

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 12:00     ` Ian Kent
@ 2004-01-08 15:39       ` Mike Waychison
  2004-01-09 18:20         ` Ian Kent
  2004-01-08 17:34       ` H. Peter Anvin
  1 sibling, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-08 15:39 UTC (permalink / raw)
  To: Ian Kent; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List


Ian Kent wrote:

>Don't expect we'll get many readers of posts this long ...
>
>On Wed, 7 Jan 2004, Mike Waychison wrote:
>
>Mike can you enlighten me with a few words about how namespaces are useful
>in the design. I have not seen or heard much about them so please be
>gentle.
>  
>

Your best bet to learn more about namespaces is probably to read 
copy_namespace() in fs/namespace.c.  There isn't much to google for, 
other than the CLONE_NEWNS flag for clone(2).  Basically, the idea is 
that you can give a new process its own independent mount table to play 
with.  Any changes to it are not seen by any other processes and vice-versa.

As for usefulness, the use of namespaces in general is up for debate.  
IMHO, namespaces in Linux are ill designed, however I'm told that their 
uses are still far off and it is understood that they break several 
things.  

AFAIK, the long-term goal of namespaces is to one day be able to do 
user-privileged mounting.  Basically allowing users to play in their 
own sandbox mounttable, mounting/moving/binding/unmounting filesystems 
as they see fit, without affecting the overall security of the machine 
and without disturbing other users.  Someone correct me here if I'm wrong.

>I don't understand the super block cloning problem you describe either.
>Some words on that would be greatly appreciated as well.
>
>  
>
One of the benefits of namespace cloning is complete mount configuration 
isolation between processes.  In my eyes, automounting is a part of that 
configuration.  To over-simplify the problem, any given filesystem may 
have a single set of mount options.  When a namespace is cloned, every 
mounted filesystem is shared between the two namespaces.  Now we have 
the problem that a change in mount options in one namespace affects the 
other.  This breaks the mountpoint isolation namespaces tried to achieve. 

The 'quick-fix' to this is that filesystems should be allowed to 
determine if they should clone themselves when a namespace is cloned.  
This would ensure that each namespace now has its own copy of the 
filesystem, each with individual sets of mount options.
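
The sharing problem, and the effect of the 'quick-fix', can be shown with a
toy model in Python.  The SuperBlock class, the clone_sb argument and the
"timeout" option are all invented for illustration; they are not kernel
structures or real mount options, only stand-ins for "one set of options
per filesystem" and "a per-filesystem decision to clone".

```python
# Toy model of the sharing problem; these classes are not kernel structures.
import copy

class SuperBlock:
    def __init__(self, fstype, options):
        self.fstype = fstype
        self.options = dict(options)   # one set of options per filesystem

def clone_namespace(mounts, clone_sb=()):
    """Copy a mount table (mount point -> SuperBlock).

    The table itself is always copied, so mounts and unmounts stay private.
    The SuperBlock objects are shared unless their fstype is listed in
    `clone_sb`, which stands in for the proposed per-filesystem hook.
    """
    return {
        point: copy.deepcopy(sb) if sb.fstype in clone_sb else sb
        for point, sb in mounts.items()
    }

parent = {"/home": SuperBlock("autofs", {"timeout": 600})}

shared = clone_namespace(parent)
shared["/home"].options["timeout"] = 60
print(parent["/home"].options["timeout"])   # 60: the change leaked across

private = clone_namespace(parent, clone_sb=("autofs",))
private["/home"].options["timeout"] = 5
print(parent["/home"].options["timeout"])   # still 60: the clone is isolated
```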

>What is the form of the trigger talked about? Identifying the automount
>points in the autofs filesystem has always been hard and error prone.
>
>  
>
I don't understand what you mean by the identifying part.   However, the 
'trigger' would be the traditional method used in autofsv3/4 for indirect 
maps and probably based off what you already have for doing the browsing 
stuff.

The direct map 'triggers' will be taken care of by another filesystem 
with a magic root directory that will catch traversals using some 
follow_link magic.   I wrote a prototype for this last summer, but 
haven't released it as the userspace stuff completely does not fit in 
with the existing daemon that was out at the time due to the mess of 
glue that was pgids, pipes and processes.   It worked in the simple 
case, but it didn't extend to being able to direct mount an indirect 
map, nor was it able to do the lazy mounting in multimounts as I had 
desired.

>Please clarify what we are talking about WRT kernel support for
>automount. Is the plan a new kernel module or are we talking about
>unspecified 'in VFS' support or both?
>
>  
>
This design will have its own new autofs module (hopefully named 
something other than autofs to avoid confusion/mishaps).  The VFS will 
have native support for expiry.  The VFS will also be slightly extended 
to allow the super_block cloning on namespace clone (although this can 
probably hold off a while, it's more a semantic issue than anything else).

>
>Yes. In 4.1 NIS, LDAP and file maps are browsable for both direct and
>indirect maps. The browsability, only, requires my kernel patch.
>The daemon detects the updated module's presence and, if the option is
>specified, 'ghosts' the directories, mounting them only when accessed.
>
>  
>
What is the difference between Solaris's -browse and your ghosting then? 

>>lstat('dir')
>>chdir('dir')
>>lstat('.')
>>    
>>
>
>This suggestion has been made by others several times but doesn't seem
>to be a problem in practice. In all my testing I have only been able to
>find one case that doesn't work as needed when ghosted. This is the
>situation where a home directory in a map exported from a server, is
>actually not available (eg does not exist) and someone logs into the
>account using wu-ftpd. In this case wu-ftpd thinks all is ok but of course
>an error is returned when the directory access is attempted. In fact an
>error should have been returned at login. Further, I believe this can be
>solved with as little as an additional revalidate call in sys_stat (I
>think the problem call was sys_stst ???).
>
>  
>
The find(1) issue is fairly recent.   This check was added some time 
within the last two years (?) and only appears in the latest distros.
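
The three-call sequence quoted above can be reproduced as follows.  This is
a Python sketch that assumes the point of the check is to compare device
and inode numbers before and after the chdir; the safe_chdir() name is
borrowed loosely and the function is not find's actual code.

```python
import os

def safe_chdir(name):
    """lstat(name), chdir(name), lstat('.') and compare device/inode.

    Returns True when '.' is the directory that was examined beforehand.
    """
    before = os.lstat(name)
    os.chdir(name)
    after = os.lstat(".")
    return (before.st_dev, before.st_ino) == (after.st_dev, after.st_ino)

print(safe_chdir("."))   # True for an ordinary directory
```

If `name` is a trigger, the chdir causes the real filesystem to be mounted
over it, so '.' then lives on a different device from the one the first
lstat saw and the comparison fails.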

Another problem was the ACL patches for ls(1) and friends.  I *really* 
think they should be calling lgetxattr instead of getxattr.  They even 
explicitly check via an lstat _beforehand_ to verify whether the file 
is S_ISLNK, and only call getxattr if it isn't.  Why not extend 
it?   I dunno.

>>This is the subtle difference between direct and indirect maps.   The
>>direct map keys are absolute paths, not path components.  We are
>>implementing direct mounts as individual filesystems that will trap on
>>traversal into their base directory.  This filesystem has no idea where
>>it is located as far as the user is concerned.  We need to tell the
>>filesystem directly so that the usermode helper can look it up.
>>Conversely, the indirect map uses the sub-directory name as a mapkey.
>>    
>>
>
>I'm not sure what you are saying here. Does this mean there is a mount for
>every direct mount (this might be what you call a trigger)?
>
>  
>
Yes, it is its own filesystem (type autofs).  This is needed because we 
need to overlay direct triggers within NFS filesystems for multimounts.

Browsing however obviously doesn't need that because we control the 
parent directory.
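
The difference in map keys between the two kinds of trigger can be put in a
few lines of Python.  This is only a sketch: the map_key() function and the
example paths in the usage lines are invented, not part of any autofs
interface.

```python
import os.path

def map_key(trigger_path, map_type, mount_root=None):
    """Return the key the usermode helper would look up.

    direct:   the key is the absolute path of the trigger itself.
    indirect: the key is the sub-directory name under the map's mount root.
    """
    if map_type == "direct":
        return trigger_path
    return os.path.relpath(trigger_path, mount_root).split(os.sep)[0]

print(map_key("/usr/src", "direct"))                       # /usr/src
print(map_key("/home/alice/projects", "indirect", "/home"))  # alice
```

A direct trigger cannot derive its key from its own name, which is why the
filesystem has to be told where it is mounted.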

>AIX implemented automounts by mounting everything in each map. This
>made the mount listing very ugly.
>
>  
>
??  Really?  I find that hard to believe.  I thought Solaris shared its 
automounter with HPUX and AIX.  I may be wrong though.

>This sounds like the stat/lstat question again.
>
>I have been able to provide lazy mounts in 4.1 with directory
>browsing but have had to resort to internal sub-mounts when browsing is
>not requested or available. This process sounds similar to some of the
>discussion of multi-mount maps in the paper.
>
>  
>
Yup. We use your browsing stuff for indirect maps with -browse, and we 
use nested direct triggers for the offsets within the multimounts.

>>
>>    
>>
>>>>5.4 Expiry
>>>>
>>>>
>>>>        
>>>>
>>>
>>>      
>>>
>>>>Handling expiry of mounts is difficult to get right.  Several different
>>>>aspects need to be considered before being able to properly perform
>>>>expiry.
>>>>
>>>>
>>>>        
>>>>
>>>The current daemon (with latest patches) seems to get it right most of the
>>>time.
>>>
>>>
>>>
>>>      
>>>
>>It's the rest of the time we want to deal with.  I know Ian has done a
>>lot of good work on this over the past few months and I hope we will be
>>able to use his insight to get everything right.
>>
>>    
>>
>>>>The autofs filesystem really should know as little about VFS internal
>>>>structures as possible.  In this case, the filesystem code is charged
>>>>with walking across mountpoints and manually counting reference counts.
>>>>This task is much better left to the VFS internals.
>>>>
>>>>
>>>>        
>>>>
>>>Someone with a more thorough understanding of the code should comment on
>>>this, but I didn't notice the module rooting through VFS data; it looks
>>>like it relies on use counts maintained by the VFS layer, similar to what
>>>mount(2) relies on to declare a mount to be busy.
>>>
>>>
>>>
>>>      
>>>
>>It manually walks through dentry trees and vfsmount trees (albeit the v3
>>code doesn't do the latter). It manually does reference count checks for
>>business which can change over time.  It also has to do this all with
>>locking, by grabbing vfs specific locks.  I'm pretty sure these
>>structures are _not_ meant to be traversed by anything outside the vfs
>>and the fact that autofs has gotten away with it is a remnant of the
>>fact that dcache_lock used to encompass a lot.  In fact, in 2.5, the
>>vfsmount structures that autofs walks have split locks and now use
>>vfsmount_lock, which isn't exported to modules at all.
>>
>>This is a good example of why this stuff should probably be merged into
>>VFS,  autofs4 has yet to be updated to use this lock.  This comes with
>>the decision to a) no longer support it as a module, only built in, or
>>b) make vfsmount_lock accessible to modules.
>>
>>But yes, someone with a more thorough understanding of the code should
>>comment  :)
>>    
>>
>
>Mmm. The vfsmount_lock is available to modules in 2.6. At least it was in
>test11. I'm sure I compiled the module under 2.6 as well???
>
>I thought that, taking the dcache_lock was the correct thing to do when
>traversing a dentry list?
>
>  
>
Walking dentries still takes the dcache_lock; however, walking vfsmounts 
takes the vfsmount_lock.  dcache_lock is no longer used for fast path 
walking either (to the best of my understanding).

find . -name '*.[ch]' -not -path '*SCCS*' | xargs grep vfsmount_lock | 
grep EXPORT

shows no results for vfsmount_lock being exported to modules in 2.6.

>In any case after a mail discussion with Maneesh Soni regarding the
>autofs4 expiry code I rewrote it. Maneesh felt that using reference counts
>was unreliable and recommended that it use VFS api calls where possible. I
>did that and that code is now part of my autofs4 module kit for 2.4 and is
>also present in the patch set I offered to Andrew Morton for inclusion
>in 2.6. It seems to work well. The dentry structures are traversed
>and the dcache_lock is obtained as needed. When I can go no further
>within the autofs filesystem I resort to traversing the vfsmount
>structures to check the mount counts. Maybe we can get some useful code
>from this.
>
>  
>
I haven't had the chance to step through your new module code 
completely.  sorry.

>>>>Unmounting the filesystem from userspace is racy, as any program can
>>>>begin using a mount between the time the daemon has received a path to
>>>>expire and the time it actually makes the umount(2) system call.
>>>>
>>>>
>>>>        
>>>>
>>>So the helper's umount() will fail.  OK, it failed.  The kernel module
>>>should not recognize the mounted dir as being gone, until the module itself
>>>has seen that it's gone.  This policy also helps in cases where the sysop
>>>manually unmounts an automounted directory for repair purposes.
>>>      
>>>
>
>The autofs4 module blocks (auto) mounts during the umount callback.
>Surely this is the sensible thing to do.
>
>  
>
The raciness comes from the fact that we now support the lazy-mounting 
of multimount offsets using embedded direct mounts.  Autofs4 mounts all 
(or as much as it can) from the multimount all together, and unmounts it 
all on expiry.

>>>As pointed out in 5.5.1, when the maps change a userspace program will have
>>>to detect some added or deleted items.  This program will have to run
>>>separately in the context of every namespace.  Thus, we should probably
>>>burden the sysop with remembering to run it if he wants his new/deleted
>>>maps to be recognized. But we'll have to use some ioctl to stimulate the
>>>kernel module to enumerate all known namespaces and run the updater for
>>>each one.
>>>
>>>      
>>>
>>Nah.   I leave that as a namespace-aware cron job problem ;)
>>    
>>
>
>More info please?
>Cloning namespaces?
>
>  
>
I think this 'stimulation', as you called it, should be the responsibility of 
the namespace cloner.  They could fork off their own little daemon that 
will call 'automount update' every so often.

>>Lazy unmounts appear immediately in your system.
>>
>>This may not be the only functionality needed, yes.  I'm sure there are
>>more options required given the circumstances of the kill.  I probably
>>shouldn't have mentioned the lazy unmounting for the forced expiry.
>>
>>I'd be interested to hear more about the different types of
>>(expire/kill) operations that sysadmins prefer.
>>    
>>
>
>Hang on. From the discussion my impression of a lazy mount is that it is
>not actually mounted!
>
>  
>
Lazy _un_mounts as opposed to lazy mounts. Lazy unmounts are described 
in umount(8):

       -l     Lazy unmount. Detach the filesystem from the filesystem
              hierarchy now, and cleanup all references to the filesystem
              as soon as it is not busy anymore.  (Requires kernel 2.4.11
              or later.)
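
The semantics in that excerpt can be modelled in a few lines of Python.
The Namespace class below is invented for illustration and touches no real
mounts; it only tracks what is visible in the hierarchy and what is still
held open.  (The system call behind "umount -l" is umount2(2) with
MNT_DETACH, which the model does not call.)

```python
# Toy model of lazy unmount semantics; names here are hypothetical.

class Namespace:
    def __init__(self):
        self.visible = {}      # mount point -> filesystem name
        self.users = {}        # filesystem name -> open reference count
        self.detached = set()  # lazily unmounted, still busy

    def mount(self, point, fs):
        self.visible[point] = fs
        self.users[fs] = 0

    def open(self, point):
        fs = self.visible[point]
        self.users[fs] += 1
        return fs

    def close(self, fs):
        self.users[fs] -= 1
        if fs in self.detached and self.users[fs] == 0:
            self._release(fs)             # deferred cleanup happens here

    def lazy_umount(self, point):
        fs = self.visible.pop(point)      # gone from the hierarchy at once
        if self.users[fs] == 0:
            self._release(fs)
        else:
            self.detached.add(fs)         # cleaned up when no longer busy

    def _release(self, fs):
        self.detached.discard(fs)
        del self.users[fs]

ns = Namespace()
ns.mount("/net/hostb", "hostb:/export")
handle = ns.open("/net/hostb")
ns.lazy_umount("/net/hostb")
print("/net/hostb" in ns.visible)   # False: detached immediately
print(ns.detached)                  # {'hostb:/export'}: still busy
ns.close(handle)
print(ns.detached)                  # set(): released once idle
```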

HTH,

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 12:29     ` Olivier Galibert
@ 2004-01-08 13:20       ` Robin Rosenberg
  2004-01-08 16:23       ` Mike Waychison
  1 sibling, 0 replies; 82+ messages in thread
From: Robin Rosenberg @ 2004-01-08 13:20 UTC (permalink / raw)
  To: Olivier Galibert, Kernel Mailing List

On Thursday 8 January 2004 13.29, Olivier Galibert wrote:
> On Wed, Jan 07, 2004 at 05:55:23PM -0500, Mike Waychison wrote:
> > Yes, an 'ls' actually does an lstat on every file.
>
> I guess you haven't met the plague called color-ls yet.  Lucky you.
>
> Most modern file browsers also seem to feel obligated to follow
> symlinks to check whether they're dangling.  A mis-click on "up" when
> you're on your home directory could cause a beautiful mount-storm.
>

Not to mention the more complex graphical environments like Konqueror in KDE, which produces a 
nice icon with a preview of whatever a link points to. It also scans directories in 
order to tag the large icon with even smaller icons indicating what type of files the directory 
contains. It is very nice, but very different from ls.

-- robin



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-08 12:35     ` Ian Kent
@ 2004-01-08 13:08       ` Ian Kent
  0 siblings, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-08 13:08 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Thu, 8 Jan 2004, Ian Kent wrote:

Oh! This should have related to the comments about removing autofs from
the kernel.

Sorry about the confusion.

> On Wed, 7 Jan 2004, Mike Waychison wrote:
>
> >
> > This is a good example of why this stuff should probably be merged into
> > VFS,  autofs4 has yet to be updated to use this lock.  This comes with
> > the decision to a) no longer support it as a module, only built in, or
> > b) make vfsmount_lock accessible to modules.
>
> Please don't say it this way.
>
> A new implementation may mean current autofs becomes deprecated but
> this is a deprecation process, not a slash and burn, and needs to be
> managed.
>




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 22:55   ` Mike Waychison
  2004-01-08 12:00     ` Ian Kent
  2004-01-08 12:29     ` Olivier Galibert
@ 2004-01-08 12:35     ` Ian Kent
  2004-01-08 13:08       ` Ian Kent
  2004-01-08 18:20     ` Jim Carter
  3 siblings, 1 reply; 82+ messages in thread
From: Ian Kent @ 2004-01-08 12:35 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List

On Wed, 7 Jan 2004, Mike Waychison wrote:

>
> This is a good example of why this stuff should probably be merged into
> VFS,  autofs4 has yet to be updated to use this lock.  This comes with
> the decision to a) no longer support it as a module, only built in, or
> b) make vfsmount_lock accessible to modules.

Please don't say it this way.

A new implementation may mean current autofs becomes deprecated but
this is a deprecation process, not a slash and burn, and needs to be
managed.

Ian




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 22:55   ` Mike Waychison
  2004-01-08 12:00     ` Ian Kent
@ 2004-01-08 12:29     ` Olivier Galibert
  2004-01-08 13:20       ` Robin Rosenberg
  2004-01-08 16:23       ` Mike Waychison
  2004-01-08 12:35     ` Ian Kent
  2004-01-08 18:20     ` Jim Carter
  3 siblings, 2 replies; 82+ messages in thread
From: Olivier Galibert @ 2004-01-08 12:29 UTC (permalink / raw)
  To: Kernel Mailing List

On Wed, Jan 07, 2004 at 05:55:23PM -0500, Mike Waychison wrote:
> Yes, an 'ls' actually does an lstat on every file.

I guess you haven't met the plague called color-ls yet.  Lucky you.

Most modern file browsers also seem to feel obligated to follow
symlinks to check whether they're dangling.  A mis-click on "up" when
you're on your home directory could cause a beautiful mount-storm.

  OG.



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 22:55   ` Mike Waychison
@ 2004-01-08 12:00     ` Ian Kent
  2004-01-08 15:39       ` Mike Waychison
  2004-01-08 17:34       ` H. Peter Anvin
  2004-01-08 12:29     ` Olivier Galibert
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-08 12:00 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Jim Carter, autofs mailing list, Kernel Mailing List


Don't expect we'll get many readers of posts this long ...

On Wed, 7 Jan 2004, Mike Waychison wrote:

Mike can you enlighten me with a few words about how namespaces are useful
in the design. I have not seen or heard much about them so please be
gentle.

I don't understand the super block cloning problem you describe either.
Some words on that would be greatly appreciated as well.

What is the form of the trigger talked about? Identifying the automount
points in the autofs filesystem has always been hard and error prone.

Please clarify what we are talking about WRT kernel support for
automount. Is the plan a new kernel module or are we talking about
unspecified 'in VFS' support or both?

> >
> >Solaris 2.6 and above has the -browse option on indirect maps, so the set
> >of subdirs potentially mountable can be seen, without mounting them. I
> >don't see where this is implemented in Linux, nor do I see how it's done,
> >documented in Solaris NFS man pages, but I didn't put a lot of time into
> >the search.
> >
>
> Yes.   Ian Kent has something similar in his release of autofs 4.1.0
> called ghosting.  Unfortunately, I haven't had the chance to play with
> it very much.

Yes. In 4.1 NIS, LDAP and file maps are browsable for both direct and
indirect maps. The browsability, only, requires my kernel patch.
The daemon detects the updated module's presence and, if the option is
specified, 'ghosts' the directories, mounting them only when accessed.

>
> >I *hope* rpc.mountd has an opcode to enumerate every
> >filesystem it's willing to export.
> >
>
> # showmount -e hostname    ?
>
> >Does it "stat" and return the stat
> >data?  That would be important for "ls".
> >
> >
> >
> Yes, an 'ls' actually does an lstat on every file.   This is cool
> because it doesn't follow links, which is how direct mounts and most
> likely browsing will work.   There are other cases where userspace will
> inadvertently stat (instead of lstat) or getxattr (instead of lgetxattr)
> and these will need to be fixed.
>
> Other known things that will break is gnu find(1).   For some reason, it
> now does:
>
> lstat('dir')
> chdir('dir')
> lstat('.')

This suggestion has been made by others several times but doesn't seem
to be a problem in practice. In all my testing I have only been able to
find one case that doesn't work as needed when ghosted. This is the
situation where a home directory in a map exported from a server, is
actually not available (eg does not exist) and someone logs into the
account using wu-ftpd. In this case wu-ftpd thinks all is ok but of course
an error is returned when the directory access is attempted. In fact an
error should have been returned at login. Further, I believe this can be
solved with as little as an additional revalidate call in sys_stat (I
think the problem call was sys_stst ???).

> >
> >
> In some environments, maps change fairly often (a couple times a day).
> A timeout of 10 or 15 minutes is reasonable to me for this timeout to
> occur.  Of course, the way things are setup, a stale entry will still
> fail and return ENOENT if it has been removed from the maps since the
> last browse update.

My thoughts on map info and caching of it will come when I have had more
time to digest your paper.

> This is the subtle difference between direct and indirect maps.   The
> direct map keys are absolute paths, not path components.  We are
> implementing direct mounts as individual filesystems that will trap on
> traversal into their base directory.  This filesystem has no idea where
> it is located as far as the user is concerned.  We need to tell the
> filesystem directly so that the usermode helper can look it up.
> Conversely, the indirect map uses the sub-directory name as a mapkey.

I'm not sure what you are saying here. Does this mean there is a mount for
every direct mount (this might be what you call a trigger)?

AIX implemented automounts by mounting everything in each map. This
made the mount listing very ugly.

>
> >What is the significance of "lazy mount"?  I don't see the word "lazy" in
> >any of the Solaris NFS or automount docs I looked at.  In sec. 5.3.1
> >you say it means "mount only when accessed".  Thus the whole idea of autofs
> >is to "lazy mount" vast numbers of filesystems.  Right?
> >

>
> The key is the 'as needed' bit, something we don't have in Linux yet.
>
> As justification of its worth, some institutions have file servers
> that export hundreds or even thousands of shares over NFS.   As /net is
> really just a kind of executable indirect map that returns multimounts
> for each hostname used as a key,  just doing 'cd /net/hostname' may
> potentially mount hundreds of filesystems.  This is not cool!

This sounds like the stat/lstat question again.

I have been able to provide lazy mounts in 4.1 with directory
browsing but have had to resort to internal sub-mounts when browsing is
not requested or available. This process sounds similar to some of the
discussion of multi-mount maps in the paper.

>
>
>
> >>5.4 Expiry
> >>
> >>
> >
> >
> >
> >>Handling expiry of mounts is difficult to get right.  Several different
> >>aspects need to be considered before being able to properly perform
> >>expiry.
> >>
> >>
> >
> >The current daemon (with latest patches) seems to get it right most of the
> >time.
> >
> >
> >
> It's the rest of the time we want to deal with.  I know Ian has done a
> lot of good work on this over the past few months and I hope we will be
> able to use his insight to get everything right.
>
> >>The autofs filesystem really should know as little about VFS internal
> >>structures as possible.  In this case, the filesystem code is charged
> >>with walking across mountpoints and manually counting reference counts.
> >>This task is much better left to the VFS internals.
> >>
> >>
> >
> >Someone with a more thorough understanding of the code should comment on
> >this, but I didn't notice the module rooting through VFS data; it looks
> >like it relies on use counts maintained by the VFS layer, similar to what
> >mount(2) relies on to declare a mount to be busy.
> >
> >
> >
> It manually walks through dentry trees and vfsmount trees (albeit the v3
> code doesn't do the latter). It manually does reference count checks for
> business which can change over time.  It also has to do this all with
> locking, by grabbing vfs specific locks.  I'm pretty sure these
> structures are _not_ meant to be traversed by anything outside the vfs
> and the fact that autofs has gotten away with it is a remnant of the
> fact that dcache_lock used to encompass a lot.  In fact, in 2.5, the
> vfsmount structures that autofs walks have split locks and now use
> vfsmount_lock, which isn't exported to modules at all.
>
> This is a good example of why this stuff should probably be merged into
> VFS,  autofs4 has yet to be updated to use this lock.  This comes with
> the decision to a) no longer support it as a module, only built in, or
> b) make vfsmount_lock accessible to modules.
>
> But yes, someone with a more thorough understanding of the code should
> comment  :)

Mmm. The vfsmount_lock is available to modules in 2.6. At least it was in
test11. I'm sure I compiled the module under 2.6 as well???

I thought that, taking the dcache_lock was the correct thing to do when
traversing a dentry list?

In any case after a mail discussion with Maneesh Soni regarding the
autofs4 expiry code I rewrote it. Maneesh felt that using reference counts
was unreliable and recommended that it use VFS api calls where possible. I
did that and that code is now part of my autofs4 module kit for 2.4 and is
also present in the patch set I offered to Andrew Morton for inclusion
in 2.6. It seems to work well. The dentry structures are traversed
and the dcache_lock is obtained as needed. When I can go no further
within the autofs filesystem I resort to traversing the vfsmount
structures to check the mount counts. Maybe we can get some useful code
from this.

>
> >>Unmounting the filesystem from userspace is racy, as any program can
> >>begin using a mount between the time the daemon has received a path to
> >>expire and the time it actually makes the umount(2) system call.
> >>
> >>
> >
> >So the helper's umount() will fail.  OK, it failed.  The kernel module
> >should not recognize the mounted dir as being gone, until the module itself
> >has seen that it's gone.  This policy also helps in cases where the sysop
> >manually unmounts an automounted directory for repair purposes.

The autofs4 module blocks (auto) mounts during the umount callback.
Surely this is the sensible thing to do.

> >
> >>These points suggest that the kernel's VFS sub-system should be charged
> >>with handling expiry.
> >>
> >>
> >
> >The point is well taken that a VFS layer expiry mechanism would be welcomed
> >by many filesystems.  But autofs has to work with the kernel as it lies
> >now.
> >
> >
> >
> Why? Things change in the kernel all the time.  Please note, we will be
> doing development against 2.6.

Mmm ... expiry in VFS ... later also.

>
> I'd like to see an independent patch out there for those who want it on
> 2.4, but the fact of the matter is that a lot has changed since 2.4 and
> the amount of work required may not be worth it.
>
> >>As described above, we may be installing multiple mounts upon each
> >>trigger. This tree of mounts will need to expire together as an atomic
> >>unit.  We will need to register this block of mounts to some expiry
> >>system.  This will be done by performing a remount on the base
> >>automounted filesystem after any nested offset mounts have been installed
> >>
> >>
> >
> >A filesystem is "in use" if anything is mounted on its subdirs.  That
> >precludes premature auto-unmounting of a containing directory, in the case
> >of a multi-mount or jimc's recommended non-implementation thereof.  I don't
> >see that a multi-mount stack needs to expire as a unit -- just let the
> >components expire normally, leaf to root.  It doesn't bother jimc that some
> >members are mounted and some aren't; by the principle of lazy mounting,
> >that's what we're trying to accomplish.

My understanding of the multi-mount/tree mounts is flawed. Don't look to
autofs v4 for correct functionality ... bummer ... missed that.

>
> >>5.5 Handling Changing Maps
> >>
> >>
> >
> >The whole issue of changed maps is closely related to the case of cloning a
> >namespace and discovering that an autofs map is non-identical in the new
> >namespace.
> >
> >As pointed out in 5.5.1, when the maps change a userspace program will have
> >to detect some added or deleted items.  This program will have to run
> >separately in the context of every namespace.  Thus, we should probably
> >burden the sysop with remembering to run it if he wants his new/deleted
> >maps to be recognized. But we'll have to use some ioctl to stimulate the
> >kernel module to enumerate all known namespaces and run the updater for
> >each one.
> >
> >
> >
> Nah.   I leave that as a namespace-aware cron job problem ;)

More info please?
Cloning namespaces?

>
>
> >>5.5.2 Forcing Expiry to Occur
> >>
> >>
> >
> >When I do this the reason is generally that I'm going to take down a
> >server.  Then I don't want "lazy unmounts"; I want immediate unmounts that
> >will be fatal to the processes using the filesystem.  When the server is
> >already dead, then I may do a lazy unmount with the expectation that the
> >structure will never be cleaned up until the client is rebooted, but at
> >least the client can continue to run.
> >
> >
> >
> Lazy unmounts appear immediately in your system.
>
> This may not be the only functionality needed, yes.  I'm sure there are
> more options required given the circumstances of the kill.  I probably
> shouldn't have mentioned the lazy unmounting for the forced expiry.
>
> I'd be interested to hear more about the different types of
> (expire/kill) operations that sysadmins prefer.

Hang on. From the discussion my impression of a lazy mount is that it is
not actually mounted!

Indeed, why should it be, it's basically a directory or a dentry in the
kernel.

>
>
> >>7 Scalability
> >>
> >>
> >
> >Necessarily mount(8) is used to mount filesystems, since only it has all
> >the spaghetti code and pseudo-object-oriented executables to deal with the
> >various filesystem types.  Hence at least one process (and most likely a
> >parent shell script) is expected per mount.  We need to be frugal in
> >writing the userspace helper (and this is a reason to roll our own, not use
> >hotplug), but the idea of using a userspace helper to mount, rather than a
> >persistent daemon, doesn't sound scary to me.
> >
> >For me the biggest attraction of a Solaris-style automount upgrade is
> >the ability to create wildcard maps with substitutible variables, e.g.
> >rather than having a kludgey programmatic map that creates little map
> >files on the fly looking like "* tupelo:/&", a host map can be implemented
> >via "* $SERVER:/&".  Of course Solaris has a native "-host" map type,
> >which is also good.
> >
> >
> >
> The substitution stuff I think Ian had worked on: Ian correct me if I'm
> wrong here.
>
> The -host map really does act like an executable indirect map.   This
> is traditionally implemented on Linux as scripts, but that does keep you
> from using 'The Same Automounter Maps' on linux and solaris.   (It's
> also a big Linux customer complaint afaict).

If wildcard map entries are not in autofs v3 then Jeremy implemented this
in v4.
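
For readers unfamiliar with the map syntax under discussion, here is a
minimal sketch of how a wildcard entry such as "* tupelo:/&" or
"* $SERVER:/&" could be expanded for a given lookup key. It is not the
autofs implementation. The function name expand_map_entry and the
`variables` dictionary are assumptions made for illustration, and the
meaning of $SERVER is whatever the caller supplies.

```python
import re


def expand_map_entry(location, key, variables=None):
    """Expand a wildcard map location for one lookup key.

    '&' is replaced by the key and $NAME or ${NAME} by a value from
    `variables` (hypothetical: supplied by the caller). Unknown
    variables are left untouched.
    """
    variables = variables or {}

    def substitute(match):
        name = match.group(1) or match.group(2)
        return variables.get(name, match.group(0))

    # Variables first, then the key, so a key containing '$' is not re-expanded.
    expanded = re.sub(r"\$(?:\{(\w+)\}|(\w+))", substitute, location)
    return expanded.replace("&", key)
```

With the entry "* tupelo:/&", a lookup of the key "jimc" would yield
"tupelo:/jimc" under these assumptions.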

And yes the host map is basically a program map and that's all. Worse, as
pointed out in the paper it mounts everything under it. This is a source
of stress for mount and umount. I have put in a fair bit of time on ugly
hacks to work around this. This same problem is also evident in startup
and shutdown for master maps with a good number of entries (~50 or more).
A consequence of the current multiple daemon approach.

Ian



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 21:14 ` Jim Carter
  2004-01-07 22:55   ` Mike Waychison
@ 2004-01-08  0:48   ` Ian Kent
  1 sibling, 0 replies; 82+ messages in thread
From: Ian Kent @ 2004-01-08  0:48 UTC (permalink / raw)
  To: Jim Carter; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

On Wed, 7 Jan 2004, Jim Carter wrote:

>
> > The exception to this rule is when the map entry for /home contains the
> > option 'browse':
>
> Solaris 2.6 and above has the -browse option on indirect maps, so the set
> of subdirs potentially mountable can be seen, without mounting them. I
> don't see where this is implemented in Linux, nor do I see how it's done,
> documented in Solaris NFS man pages, but I didn't put a lot of time into
> the search.  I *hope* rpc.mountd has an opcode to enumerate every
> filesystem it's willing to export.  Does it "stat" and return the stat
> data?  That would be important for "ls".

So, even after our most recent email conversation, you still haven't
checked out autofs 4.1.0 and my kernel module kit.




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 23:47           ` Mike Waychison
@ 2004-01-07 23:56             ` Jeff Garzik
  2004-01-12 16:57               ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff Garzik @ 2004-01-07 23:56 UTC (permalink / raw)
  To: Mike Waychison; +Cc: H. Peter Anvin, linux-kernel

Mike Waychison wrote:
> You wouldn't put a bdflush daemon in userspace either would you?  The 
> loop in question is just that; (overly simplified):
> 
> while (1) {
>     f = ask_kernel_if_anything_looks_inactive();
>     if (f) {
>         try_to_umount(f);
>         continue;
>     } else {
>         sleep(x seconds);
>     }
> }
> 
> My point is, if this is the only active action done by userspace, why 
> open it up to being broken?


You're still using arguments -against- putting software in the kernel. 
You don't decrease software's chances of "being broken" by putting it in 
the kernel, the opposite occurs -- you increase the likelihood of making 
the entire system unstable.  This is one point that Solaris and Win32 
have both missed :)

	Jeff





* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 21:24         ` Jeff Garzik
@ 2004-01-07 23:47           ` Mike Waychison
  2004-01-07 23:56             ` Jeff Garzik
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-07 23:47 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Mike Waychison, H. Peter Anvin, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1789 bytes --]

Jeff Garzik wrote:
> Mike Waychison wrote:
> 
>> To put it into perspective, I'm calling for the following major 
>> changes:
> 
> [...]
> 
>> 2) move the loop that used to spin around and ask kernelspace if there 
>> was anything to expire into the VFS as well, where it won't be killed.
> 
> [...]
> 
>> (1) and (2) shouldn't be hard at all to do considering David Howells 
>> has done the majority of this already. (3) is needed in order to 
>> manage direct mounts properly for when they are 'covered'.  
>> Admittedly, (4) comes off as an ugly hack.
>>
>> Also, (2) was the only 'active' task the automount daemon was doing. 
>> Everything else it did can be rewritten in the form of a usermode 
>> helper that runs only when it is needed.  This simplifies the 
>> userspace code a lot.
> 
> 
> Just going by your own explanation here, #2 should not be in the kernel.
> 
> If we are moving daemons into the kernel just because they won't be killed, 
> we'll have Oracle in-kernel before you know it.  Completely spurious 
> reason.
> 

You wouldn't put a bdflush daemon in userspace either would you?  The 
loop in question is just that; (overly simplified):

while (1) {
	f = ask_kernel_if_anything_looks_inactive();
	if (f) {
		try_to_umount(f);
		continue;
	} else {
		sleep(x seconds);
	}
}

My point is, if this is the only active action done by userspace, why 
open it up to being broken?
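
The loop above can be written out as a runnable sketch. This is only an
illustration of the polling structure, not the automount daemon's code:
find_inactive and try_umount are hypothetical callables standing in for
the kernel query and the umount(2) call, and max_idle_polls exists only
so the sketch can terminate.

```python
import time


def expire_loop(find_inactive, try_umount, interval,
                sleep=time.sleep, max_idle_polls=None):
    """Poll for inactive mounts and try to unmount each one.

    find_inactive() returns a mount path or None (hypothetical kernel query).
    try_umount(path) returns True on success (hypothetical umount wrapper).
    Returns the list of paths that were actually unmounted.
    """
    idle_polls = 0
    expired = []
    while max_idle_polls is None or idle_polls < max_idle_polls:
        path = find_inactive()
        if path is not None:
            # Another process may start using the mount between the
            # query and the umount attempt, so failure is expected here.
            if try_umount(path):
                expired.append(path)
            continue
        idle_polls += 1
        sleep(interval)
    return expired
```

Note that the sketch inherits the window between the query and the
unmount attempt; a failed try_umount is simply skipped.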

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[-- Attachment #2: Type: application/pgp-signature, Size: 251 bytes --]


* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 21:11         ` Mike Fedyk
@ 2004-01-07 23:40           ` Jesper Juhl
  0 siblings, 0 replies; 82+ messages in thread
From: Jesper Juhl @ 2004-01-07 23:40 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Mike Waychison, H. Peter Anvin, linux-kernel



On Wed, 7 Jan 2004, Mike Fedyk wrote:

> On Wed, Jan 07, 2004 at 04:04:41PM -0500, Mike Waychison wrote:
> > H. Peter Anvin wrote:
> >
> > >>Also when /home or other important fs are mounted via autofs there is
> > >>not much practical difference between a hung kernel and a hung
> > >>daemon. You have to reboot the system anyways.
> > >
> > >
> > >a) Guess which one is easier to debug?
> >
> > When they may both equally hang your machine, neither.
>
> Let's see.
>
> If it's in userspace, then setup your debug area in an area your system
> doesn't depend on, and wham, the hang won't affect the entire system anymore.
>
> Also, if you have /home automounted then it only affects the users on /home,
> and root's $home should be /home...
>

From a user point of view I have to agree with you. Keeping it out of the
kernel makes perfect sense to me.

Easier to test your setup - errors will not hang the box.

In the case the implementation is buggy a daemon can easily be restarted
nightly without disrupting other things running on the box (a nightly
reboot is not as friendly).


From a developer point of view, I also agree.

Debugging kernel code is in general a much harder thing to do than
debugging a userspace daemon. I'd also guess that more people will be
inclined to contribute development time to a userspace program than a
kernel based implementation - just the fact that it's in-kernel will be
perceived as having a much higher barrier-to-entry and I suspect that fact
alone might discourage potential contributors.


- Jesper Juhl



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 21:14 ` Jim Carter
@ 2004-01-07 22:55   ` Mike Waychison
  2004-01-08 12:00     ` Ian Kent
                       ` (3 more replies)
  2004-01-08  0:48   ` Ian Kent
  1 sibling, 4 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-07 22:55 UTC (permalink / raw)
  To: Jim Carter; +Cc: Kernel Mailing List, autofs mailing list

[-- Attachment #1: Type: text/plain, Size: 21233 bytes --]

Hi Jim

Thanks for taking the time to read the document thoroughly and for the 
great feedback!  

Please see responses inlined below.

Jim Carter wrote:

>On Tue, 6 Jan 2004, Mike Waychison wrote:
>  
>
>>We've spent some time over the past couple months researching how Linux
>>autofs can be brought to a level that is comparable to that found on
>>other major Unix systems out there.
>>
>>ftp://ftp-eng.cobalt.com/pub/whitepapers/autofs/towards_a_modern_autofs.txt
>>ftp://ftp-eng.cobalt.com/pub/whitepapers/autofs/towards_a_modern_autofs.pdf
>>    
>>
>
>Mounting on a file descriptor is nice but it takes work for all filesystems
>to perform it.  Not to discourage work toward this goal, I suggest not
>entangling autofs with that work.  Instead, if we're doing the userspace
>helper thing, the kernel knows the process group of the helper it started.
>Do "oz" mode for that PG, and revoke the privilege when it exits.  Do the
>same thing again for unmounting.
>
>If the userspace helper is invoked in the triggering process' namespace,
>any full paths given to it will be resolved in that namespace.  This
>bypasses one of the main justifications for having autofs work only with FD
>mounts.
>
>If a sysop mounts autofs filesystems (installs triggers), that will and
>should happen in the namespace inhabited by him, not in any cloned
>namespaces.  Without needing to wait for someone to work through kernel
>politics and make FD mounts happen.
>
>  
>

Yes, this is most likely the way it will happen.  Note that I 'mounted 
on a file descriptor' in the examples
for multimounts by doing a fchdir(fd) and a mount --move 
/tmp/<unique_dir> '.'    Using file descriptors is however important for 
maintaining up to date direct mounts on the system. 

>>The exception to this rule is when the map entry for /home contains the
>>option 'browse':
>>    
>>
>
>Solaris 2.6 and above has the -browse option on indirect maps, so the set
>of subdirs potentially mountable can be seen, without mounting them. I
>don't see where this is implemented in Linux, nor do I see how it's done,
>documented in Solaris NFS man pages, but I didn't put a lot of time into
>the search.  
>

Yes.   Ian Kent has something similar in his release of autofs 4.1.0 
called ghosting.  Unfortunately, I haven't had the chance to play with 
it very much.

>I *hope* rpc.mountd has an opcode to enumerate every
>filesystem it's willing to export.  
>

# showmount -e hostname    ?

>Does it "stat" and return the stat
>data?  That would be important for "ls".
>
>  
>
Yes, an 'ls' actually does an lstat on every file.   This is cool 
because it doesn't follow links, which is how direct mounts and most 
likely browsing will work.   There are other cases where userspace will 
inadvertently stat (instead of lstat) or getxattr (instead of lgetxattr) 
and these will need to be fixed.

Another known thing that will break is GNU find(1).   For some reason, it 
now does:

lstat('dir')
chdir('dir')
lstat('.')

and compares st_dev and st_ino from the two lstat calls.

This obviously breaks when you use browsing and direct mounts.
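
The check described above can be reproduced with a short sketch. It
mirrors the lstat/chdir/lstat sequence only; it is not find's source,
and the function name same_dir_after_chdir is made up for illustration.

```python
import os
import tempfile


def same_dir_after_chdir(path):
    """Return True if '.' after chdir(path) is the object lstat(path) saw.

    Compares st_dev and st_ino from the two lstat calls, as described
    above. The original working directory is restored afterwards.
    """
    before = os.lstat(path)
    old_cwd = os.getcwd()
    os.chdir(path)
    try:
        after = os.lstat(".")
    finally:
        os.chdir(old_cwd)
    return (before.st_dev, before.st_ino) == (after.st_dev, after.st_ino)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as plain_dir:
        print(same_dir_after_chdir(plain_dir))
```

On an ordinary directory the two results match. When the chdir itself
triggers a mount, '.' is the root of the newly mounted filesystem, so
st_dev differs and the comparison fails.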

>>In order to maintain some form of coherency between changing maps, these
>>dummy directory entries will remain in place within the dcache so that
>>the kernel doesn't need to query the usermode helper as often.  These
>>entries will periodically timeout and will be unhashed from the dcache.
>>    
>>
>
>Browsetimeout -- Each autofs instance necessarily has an in-core list of
>its subdirectories.  If the caller stats any of these and that one (or
>alternatively, any of the known subdirs) is not in the dcache, the module
>needs to run the helper again, refreshing all dcache entries.  But you
>still need a timeout because the mode etc. might change on the server, but
>it's rare.  Let's avoid committing a lot of coding effort and CPU time to
>supporting events that might happen once per year.
>
>  
>
In some environments, maps change fairly often (a couple times a day).  
A timeout of 10 or 15 minutes seems reasonable to me.  Of course, the 
way things are set up, a stale entry will still fail and return ENOENT 
if it has been removed from the maps since the last browse update.

>>Executing the usermode helper within the namespace of the triggering
>>application does have a problem when browsing is used.  We are caching
>>map keys in kernelspace and can run into coherency problems when an
>>autofs super_block is associated with multiple namespaces which have
>>differing automount maps in /etc. This kind of situation may occur if a
>>namespace is cloned and a new /etc directory with a different auto_home
>>map is mounted.
>>    
>>
>
>The uncloned superblock problem is discussed later in the paper.  It looks
>to me like the VFS layer ought to be responsible for cloning superblocks.
>Not to discourage work towards that goal, but I suggest not delaying autofs
>until it happens.  The result is that some users will see mount points
>(mounted or potentially mountable) that within-namespace policy says should
>be invisible.
>
Agreed.  This can hold off until later, as it isn't necessarily an easy 
thing to do either.

>  That's not too bad, since we rely on UNIX file permissions
>or ACLs for security, not visibility in the automount map.  If an indirect
>map entry was formerly absent but now present, presumably the userspace
>helper will consult the then-prevailing automount map and find it
>successfully.
>
>  
>
Yes, but then when the other namespace accesses this entry and attempts 
to mount it and no longer finds it in the map, it is unhashed and no 
longer enumerated as a cache entry, even though it is still valid in the 
first namespace.  This cache coherency is a subtle point.  The main point 
is that without super_block cloning, we are left with two namespaces that 
can effectively alter each other's automount policy by remounting the 
filesystem.

>>Sect. 5.2 Direct Maps
>>    
>>
>
>  
>
>>2) The map key for the direct mount entry is now passed as a new mount
>>option called 'mapkey'.
>>    
>>
>
>I don't quite see the need for the mapkey mount option.  It seems to me
>that the name of the mount point is always equal to the map key.  In my
>model, mounting on open FDs isn't going to be implementable, and so the
>userspace helper has to know the full path name of the mount point, anyway.
>
>  
>
This is the subtle difference between direct and indirect maps.   The 
direct map keys are absolute paths, not path components.  We are 
implementing direct mounts as individual filesystems that will trap on 
traversal into their base directory.  This filesystem has no idea where 
it is located as far as the user is concerned.  We need to tell the 
filesystem directly so that the usermode helper can look it up.  
Conversely, the indirect map uses the sub-directory name as a mapkey.

As noted, we don't actually rely on this value as an absolute path.  
This means that we can move or bind the direct mount trapping 
filesystem.   As for mounting on open fd's, the fchdir(fd); mount --move 
/tmp/foo '.' still works.

>>5.3 Multimounts and Offsets
>>    
>>
>
>  
>
>>/usr/src                hosta:/export/src	\
>>            /linux      hostb:/export/linuxsrc
>>    
>>
>
>Suppose someone accesses /usr/src/linux.  Is it not true that both the
>original process and mount(8) have to first access /usr/src, triggering
>automounting of hostA:/export/src, and only when the stat info and readdir
>from that step have come through at least twice, can they go on to monkey
>with /usr/src/linux, triggering mounting of hostB:/export/otherlinux? Thus
>I don't see the need for multimounts.  The conceptual idea of mounting both
>dirs "as a unit" is maybe attractive when not looked at too closely, but it
>seems to me that by just punting, you get infinitesimally slower service to
>the user and a significant section of logic avoided in the code.
>
>  
>
This is pretty much needed no matter how you look at it.   If you set it 
up so that it peeked at the NFS share for /usr/src to get permission 
information, you also have to verify that it contains a directory 
'linux'.  This doesn't seem like much, but these things can change from 
underneath us.

My understanding of NFS is that you cannot 'pin' a directory on the 
server in order to keep it there as your mountpoint in the client.  You 
have to simply look it up and pin it in the client.  If you don't mount 
/usr/src, then you also won't have permission changes on its base 
directory reflected on your system either.

>The kernel would need to know to install an autofs structure (trigger) on
>/usr/src/linux even though /usr/src was represented by only an autofs
>structure, not actually mounted yet, just like we see in procfs.  I doubt
>that's a showstopper, although you'd have to write the kernel code
>carefully.  The example of userD/server{1,2} indicates that you intend for
>the autofs structure, with nothing mounted on it, ought to be a really
>existing and traversable directory on whose subdirs other autofs FS's can
>be mounted.  Good.
>
>But in sec. 5.3.2 I see you making filesystem dirs in /tmp which seem to
>substitute for the synthetic autofs directories.  Bad, if I've understood
>the example.  Comments suggest that you need the /tmp directory to avoid
>setting off the autofs trigger.  Better: if a synthetic autofs directory
>has no corresponding entry in an automount map, you don't mount anything on
>it.  But if it *does* have a map entry, you need to mount it in order to
>stat it (the server's instance) to determine if the user has permission to
>traverse it, before even considering whether to mount the subdir. Remember
>that in my model I'm leaving aside FD mounts, so traversing containing
>directories by name is a valid concept.
>
>  
>
The directory /tmp/<unique_dir> is _not_ a synthetic autofs directory, 
it is a point where we perform our mounts before we move them.  The 
synthetic directories for multimounts w/o root offsets are handled by a 
tmpfs filesystem simply because it reduces code duplication.

>What is the significance of "lazy mount"?  I don't see the word "lazy" in
>any of the Solaris NFS or automount docs I looked at.  In sec. 5.3.1
>you say it means "mount only when accessed".  Thus the whole idea of autofs
>is to "lazy mount" vast numbers of filesystems.  Right?
>
>  
>
The term 'lazy mount' as used in the document refers to lazily mounting 
the offsets (subdirectories) of a multimount on an as needed basis.  
From the Solaris 9 automount(1M) manpage:

  Multiple Mounts
     A multiple mount entry takes the form:
                                                                                 

     key [-mount-options] [[mountpoint] [-mount-options] location...]...
                                                                                 

     The initial /[mountpoint] is optional for  the  first  mount
     and  mandatory  for  all  subsequent  mounts.  The  optional
     mountpoint is taken as a pathname relative to the  directory
     named  by  key.  If  mountpoint  is  omitted  in  the  first
     occurrence, a mountpoint of / (root) is implied.
                                                                                 

     Given an entry in the indirect map for /src
                                                                                 

     beta     -ro\
       /           svr1,svr2:/export/src/beta  \
       /1.0        svr1,svr2:/export/src/beta/1.0  \
       /1.0/man    svr1,svr2:/export/src/beta/1.0/man
                                                                                 

     All offsets must exist on the server under  beta.  automount
     will   automatically  mount  /src/beta,  /src/beta/1.0,  and
     /src/beta/1.0/man, as needed,  from  either  svr1  or  svr2,
     whichever host is nearest and responds first.

The key is the 'as needed' bit, something we don't have in Linux yet.  

As justification of its worth, some institutions have file servers 
that export hundreds or even thousands of shares over NFS.   As /net is 
really just a kind of executable indirect map that returns multimounts 
for each hostname used as a key,  just doing 'cd /net/hostname' may 
potentially mount hundreds of filesystems.  This is not cool! 
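
To make the 'as needed' idea concrete, here is a rough sketch that
parses a multi-mount entry like the manpage example and reports which
offsets would have to be mounted to reach a given path. It is a
simplification under stated assumptions, not the Solaris or Linux
parser: locations are assumed to have the host:/path form, only the
options that precede the first mountpoint are kept, and the names
parse_multimount and offsets_needed are invented here.

```python
def parse_multimount(entry):
    """Split a multi-mount entry into (key, options, {offset: [locations]}).

    Backslash-newline continuations are joined first. Tokens starting
    with '/' are treated as offsets, tokens starting with '-' as options,
    and everything else as a location (assumed to look like host:/path).
    """
    tokens = entry.replace("\\\n", " ").split()
    key, rest = tokens[0], tokens[1:]
    options = None
    offsets = {}
    current = "/"  # an omitted first mountpoint implies '/'
    for token in rest:
        if token.startswith("-"):
            if not offsets:
                options = token  # per-offset options are ignored in this sketch
        elif token.startswith("/"):
            current = token
            offsets.setdefault(current, [])
        else:
            offsets.setdefault(current, []).append(token)
    return key, options, offsets


def offsets_needed(offsets, subpath):
    """Offsets that must be mounted to reach subpath (relative to the key)."""
    target = "/" + subpath.strip("/")
    return [offset for offset in sorted(offsets, key=len)
            if offset == "/" or target == offset
            or target.startswith(offset + "/")]
```

For the manpage entry, reaching beta/1.0/README would need only the
offsets "/" and "/1.0", leaving "/1.0/man" unmounted until someone
walks into it.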
                                                                               


>>5.4 Expiry
>>    
>>
>
>  
>
>>Handling expiry of mounts is difficult to get right.  Several different
>>aspects need to be considered before being able to properly perform
>>expiry.
>>    
>>
>
>The current daemon (with latest patches) seems to get it right most of the
>time.
>
>  
>
It's the rest of the time we want to deal with.  I know Ian has done a 
lot of good work on this over the past few months and I hope we will be 
able to use his insight to get everything right.

>>The autofs filesystem really should know as little about VFS internal
>>structures as possible.  In this case, the filesystem code is charged
>>with walking across mountpoints and manually counting reference counts.
>>This task is much better left to the VFS internals.
>>    
>>
>
>Someone with a more thorough understanding of the code should comment on
>this, but I didn't notice the module rooting through VFS data; it looks
>like it relies on use counts maintained by the VFS layer, similar to what
>mount(2) relies on to declare a mount to be busy.
>
>  
>
It manually walks through dentry trees and vfsmount trees (albeit the v3 
code doesn't do the latter). It manually does reference count checks for 
busyness, which can change over time.  It also has to do this all with 
locking, by grabbing vfs specific locks.  I'm pretty sure these 
structures are _not_ meant to be traversed by anything outside the vfs 
and the fact that autofs has gotten away with it is a remnant of the 
fact that dcache_lock used to encompass a lot.  In fact, in 2.5, the 
vfsmount structures that autofs walks have split locks and now use 
vfsmount_lock, which isn't exported to modules at all.

This is a good example of why this stuff should probably be merged into 
VFS: autofs4 has yet to be updated to use this lock.  This comes with 
the decision to a) no longer support it as a module, only built in, or 
b) make vfsmount_lock accessible to modules.

But yes, someone with a more thorough understanding of the code should 
comment  :) 

>>Unmounting the filesystem from userspace is racy, as any program can
>>begin using a mount between the time the daemon has received a path to
>>expire and the time it actually makes the umount(2) system call.
>>    
>>
>
>So the helper's umount() will fail.  OK, it failed.  The kernel module
>should not recognize the mounted dir as being gone, until the module itself
>has seen that it's gone.  This policy also helps in cases where the sysop
>manually unmounts an automounted directory for repair purposes.
>  
>
But this leads to races which cause partial expiries to occur in autofs4.

>A common problem is stale NFS filehandles, and in this case we'd like the
>userspace helper to be aggressive in using "umount -f" or other advanced
>techniques.  The freedom to fail is important here.
>  
>
I'd much much rather see umount -l happen.  At least with -l, there is a 
slight chance that the file system will come back and the processes 
affected will be able to continue operating as usual.

>  
>
>>These points suggest that the kernel's VFS sub-system should be charged
>>with handling expiry.
>>    
>>
>
>The point is well taken that a VFS layer expiry mechanism would be welcomed
>by many filesystems.  But autofs has to work with the kernel as it lies
>now.
>
>  
>
Why? Things change in the kernel all the time.  Please note, we will be 
doing development against 2.6. 

I'd like to see an independent patch out there for those who want it on 
2.4, but the fact of the matter is that a lot has changed since 2.4 and 
the amount of work required may not be worth it.

>>As described above, we may be installing multiple mounts upon each
>>trigger. This tree of mounts will need to expire together as an atomic
>>unit.  We will need to register this block of mounts to some expiry
>>system.  This will be done by performing a remount on the base
>>automounted filesystem after any nested offset mounts have been installed
>>    
>>
>
>A filesystem is "in use" if anything is mounted on its subdirs.  That
>precludes premature auto-unmounting of a containing directory, in the case
>of a multi-mount or jimc's recommended non-implementation thereof.  I don't
>see that a multi-mount stack needs to expire as a unit -- just let the
>components expire normally, leaf to root.  It doesn't bother jimc that some
>members are mounted and some aren't; by the principle of lazy mounting,
>that's what we're trying to accomplish.
>
>  
>
The thing is that we use autofs filesystems as traps.  Following from 
the previous /usr/src/linux example:

# cat /proc/mounts
rootfs /
autofs /usr/src
# cd /usr/src
# cat /proc/mounts
rootfs /
autofs /usr/src
hosta:/src /usr/src
autofs /usr/src/linux
# cd linux
# cat /proc/mounts
rootfs /
autofs /usr/src
hosta:/src /usr/src
autofs /usr/src/linux
hostb:/linux /usr/src/linux
# cd /

Now, assume that nobody is using /usr/src and /usr/src/linux.   The 
first fs to expire is going to be the nfs from hostb on /usr/src/linux:

# cat /proc/mounts
rootfs /
autofs /usr/src
hosta:/src /usr/src
autofs /usr/src/linux

Next, /usr/src should go.  The thing is, we do _not_ want to unmount the 
autofs filesystem at /usr/src/linux before unmounting the nfs filesystem 
at /usr/src because that would open ourselves up to a user coming in and 
doing chdir(/usr/src/linux).  We would not catch the traversal because 
our trigger on 'linux' is gone.  We also shouldn't unmount the nfs 
filesystem from hosta now, because somebody is using it.  

However, if we had removed the two filesystems together atomically, 
then everything works fine.

Does that clear it up a bit?  
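
As a rough illustration of the grouping described above, the sketch
below picks out the mounts that would have to go away together when a
base automount expires. It only computes the set and its order; the
atomicity itself would have to come from the kernel, which this does
not model. The function name expiry_unit and the (source, mountpoint)
tuple layout are assumptions made for the example.

```python
def expiry_unit(mounts, base):
    """Return the mounts to remove together when `base` expires.

    `mounts` is a list of (source, mountpoint) pairs in mount order, as
    in the /proc/mounts listings above. The first entry at `base` is
    taken to be the autofs trigger and is kept, so that the next access
    can mount again. The result is ordered innermost (last mounted) first.
    """
    prefix = base.rstrip("/") + "/"
    under = [m for m in mounts if m[1] == base or m[1].startswith(prefix)]
    return list(reversed(under[1:]))
```

For the table shown after hostb's filesystem has expired, the unit for
/usr/src is the autofs trigger on /usr/src/linux followed by the nfs
mount from hosta, and the base trigger on /usr/src stays in place.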

>>5.5 Handling Changing Maps
>>    
>>
>
>The whole issue of changed maps is closely related to the case of cloning a
>namespace and discovering that an autofs map is non-identical in the new
>namespace.
>
>As pointed out in 5.5.1, when the maps change a userspace program will have
>to detect some added or deleted items.  This program will have to run
>separately in the context of every namespace.  Thus, we should probably
>burden the sysop with remembering to run it if he wants his new/deleted
>maps to be recognized. But we'll have to use some ioctl to stimulate the
>kernel module to enumerate all known namespaces and run the updater for
>each one.
>
>  
>
Nah.   I leave that as a namespace-aware cron job problem ;)


>>5.5.2 Forcing Expiry to Occur
>>    
>>
>
>When I do this the reason is generally that I'm going to take down a
>server.  Then I don't want "lazy unmounts"; I want immediate unmounts that
>will be fatal to the processes using the filesystem.  When the server is
>already dead, then I may do a lazy unmount with the expectation that the
>structure will never be cleaned up until the client is rebooted, but at
>least the client can continue to run.
>
>  
>
Lazy unmounts appear immediately in your system.  

This may not be the only functionality needed, yes.  I'm sure there are 
more options required given the circumstances of the kill.  I probably 
shouldn't have mentioned the lazy unmounting for the forced expiry. 

I'd be interested to hear more about the different types of 
(expire/kill) operations that sysadmins prefer.


>>7 Scalability
>>    
>>
>
>Necessarily mount(8) is used to mount filesystems, since only it has all
>the spaghetti code and pseudo-object-oriented executables to deal with the
>various filesystem types.  Hence at least one process (and most likely a
>parent shell script) is expected per mount.  We need to be frugal in
>writing the userspace helper (and this is a reason to roll our own, not use
>hotplug), but the idea of using a userspace helper to mount, rather than a
>persistent daemon, doesn't sound scary to me.
>
>For me the biggest attraction of a Solaris-style automount upgrade is
>the ability to create wildcard maps with substitutable variables, e.g.
>rather than having a kludgey programmatic map that creates little map
>files on the fly looking like "* tupelo:/&", a host map can be implemented
>via "* $SERVER:/&".  Of course Solaris has a native "-host" map type,
>which is also good.
>
>  
>
The substitution stuff I think Ian had worked on: Ian correct me if I'm 
wrong here.

The -host map really does act like an executable indirect map.   This 
is traditionally implemented on Linux as scripts, but that does keep you 
from using 'The Same Automounter Maps' on Linux and Solaris.   (It's 
also a big Linux customer complaint afaict).

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 


[-- Attachment #2: Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 21:04       ` Mike Waychison
  2004-01-07 21:11         ` Mike Fedyk
@ 2004-01-07 21:24         ` Jeff Garzik
  2004-01-07 23:47           ` Mike Waychison
  1 sibling, 1 reply; 82+ messages in thread
From: Jeff Garzik @ 2004-01-07 21:24 UTC (permalink / raw)
  To: Mike Waychison; +Cc: H. Peter Anvin, linux-kernel

Mike Waychison wrote:
> To put it into perspective, I'm calling for the following major 
> changes:
[...]
> 2) move the loop that used to spin around and ask kernelspace if there 
> was anything to expire into the VFS as well, where it won't be killed.
[...]
> (1) and (2) shouldn't be hard at all to do considering David Howells has 
> done the majority of this already. (3) is needed in order to manage 
> direct mounts properly for when they are 'covered'.  Admittedly, (4) 
> comes off as an ugly hack.
> 
> Also, (2) was the only 'active' task the automount daemon was doing. 
> Everything else it did can be rewritten in the form of a usermode helper 
> that runs only when it is needed.  This simplifies the userspace code a 
> lot.

Just going by your own explanation here, #2 should not be in the kernel.

If we're moving daemons into the kernel just because they won't be killed, 
we'll have Oracle in-kernel before you know it.  Completely spurious reason.

	Jeff




^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 19:55 Mike Waychison
  2004-01-06 21:01 ` [autofs] " H. Peter Anvin
@ 2004-01-07 21:14 ` Jim Carter
  2004-01-07 22:55   ` Mike Waychison
  2004-01-08  0:48   ` Ian Kent
  1 sibling, 2 replies; 82+ messages in thread
From: Jim Carter @ 2004-01-07 21:14 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Kernel Mailing List, autofs mailing list

On Tue, 6 Jan 2004, Mike Waychison wrote:
> We've spent some time over the past couple months researching how Linux
> autofs can be brought to a level that is comparable to that found on
> other major Unix systems out there.
>
> ftp://ftp-eng.cobalt.com/pub/whitepapers/autofs/towards_a_modern_autofs.txt
> ftp://ftp-eng.cobalt.com/pub/whitepapers/autofs/towards_a_modern_autofs.pdf

Mounting on a file descriptor is nice but it takes work for all filesystems
to perform it.  Not to discourage work toward this goal, I suggest not
entangling autofs with that work.  Instead, if we're doing the userspace
helper thing, the kernel knows the process group of the helper it started.
Do "oz" mode for that PG, and revoke the privilege when it exits.  Do the
same thing again for unmounting.

If the userspace helper is invoked in the triggering process' namespace,
any full paths given to it will be resolved in that namespace.  This
bypasses one of the main justifications for having autofs work only with FD
mounts.

If a sysop mounts autofs filesystems (installs triggers), that will and
should happen in the namespace inhabited by him, not in any cloned
namespaces.  Without needing to wait for someone to work through kernel
politics and make FD mounts happen.

> The exception to this rule is when the map entry for /home contains the
> option 'browse':

Solaris 2.6 and above has the -browse option on indirect maps, so the set
of subdirs potentially mountable can be seen, without mounting them. I
don't see where this is implemented in Linux, nor can I find how it's done
documented in the Solaris NFS man pages, but I didn't put a lot of time into
the search.  I *hope* rpc.mountd has an opcode to enumerate every
filesystem it's willing to export.  Does it "stat" and return the stat
data?  That would be important for "ls".

> In order to maintain some form of coherency between changing maps, these
> dummy directory entries will remain in place within the dcache so that
> the kernel doesn't need to query the usermode helper as often.  These
> entries will periodically timeout and will be unhashed from the dcache.

Browsetimeout -- Each autofs instance necessarily has an in-core list of
its subdirectories.  If the caller stats any of these and that one (or
alternatively, any of the known subdirs) is not in the dcache, the module
needs to run the helper again, refreshing all dcache entries.  But you
still need a timeout because the mode etc. might change on the server, but
it's rare.  Let's avoid committing a lot of coding effort and CPU time to
supporting events that might happen once per year.

> Executing the usermode helper within the namespace of the triggering
> application does have a problem when browsing is used.  We are caching
> map keys in kernelspace and can run into coherency problems when an
> autofs super_block is associated with multiple namespaces which have
> differing automount maps in /etc. This kind of situation may occur if a
> namespace is cloned and a new /etc directory with a different auto_home
> map is mounted.

The uncloned superblock problem is discussed later in the paper.  It looks
to me like the VFS layer ought to be responsible for cloning superblocks.
Not to discourage work towards that goal, but I suggest not delaying autofs
until it happens.  The result is that some users will see mount points
(mounted or potentially mountable) that within-namespace policy says should
be invisible.  That's not too bad, since we rely on UNIX file permissions
or ACLs for security, not visibility in the automount map.  If an indirect
map entry was formerly absent but now present, presumably the userspace
helper will consult the then-prevailing automount map and find it
successfully.

> Sect. 5.2 Direct Maps

> 2) The map key for the direct mount entry is now passed as a new mount
> option called 'mapkey'.

I don't quite see the need for the mapkey mount option.  It seems to me
that the name of the mount point is always equal to the map key.  In my
model, mounting on open FDs isn't going to be implementable, and so the
userspace helper has to know the full path name of the mount point, anyway.

> 5.3 Multimounts and Offsets

> /usr/src                hosta:/export/src	\
>             /linux      hostb:/export/linuxsrc

Suppose someone accesses /usr/src/linux.  Is it not true that both the
original process and mount(8) have to first access /usr/src, triggering
automounting of hostA:/export/src, and only when the stat info and readdir
from that step have come through at least twice, can they go on to monkey
with /usr/src/linux, triggering mounting of hostB:/export/linuxsrc? Thus
I don't see the need for multimounts.  The conceptual idea of mounting both
dirs "as a unit" is maybe attractive when not looked at too closely, but it
seems to me that by just punting, you get infinitesimally slower service to
the user and a significant section of logic avoided in the code.

The kernel would need to know to install an autofs structure (trigger) on
/usr/src/linux even though /usr/src was represented by only an autofs
structure, not actually mounted yet, just like we see in procfs.  I doubt
that's a showstopper, although you'd have to write the kernel code
carefully.  The example of userD/server{1,2} indicates that you intend for
the autofs structure, with nothing mounted on it, to be a really
existing and traversable directory on whose subdirs other autofs FS's can
be mounted.  Good.

But in sec. 5.3.2 I see you making filesystem dirs in /tmp which seem to
substitute for the synthetic autofs directories.  Bad, if I've understood
the example.  Comments suggest that you need the /tmp directory to avoid
setting off the autofs trigger.  Better: if a synthetic autofs directory
has no corresponding entry in an automount map, you don't mount anything on
it.  But if it *does* have a map entry, you need to mount it in order to
stat it (the server's instance) to determine if the user has permission to
traverse it, before even considering whether to mount the subdir. Remember
that in my model I'm leaving aside FD mounts, so traversing containing
directories by name is a valid concept.

What is the significance of "lazy mount"?  I don't see the word "lazy" in
any of the Solaris NFS or automount docs I looked at.  In sec. 5.3.1
you say it means "mount only when accessed".  Thus the whole idea of autofs
is to "lazy mount" vast numbers of filesystems.  Right?

> 5.4 Expiry

> Handling expiry of mounts is difficult to get right.  Several different
> aspects need to be considered before being able to properly perform
> expiry.

The current daemon (with latest patches) seems to get it right most of the
time.

> The autofs filesystem really should know as little about VFS internal
> structures as possible.  In this case, the filesystem code is charged
> with walking across mountpoints and manually counting reference counts.
> This task is much better left to the VFS internals.

Someone with a more thorough understanding of the code should comment on
this, but I didn't notice the module rooting through VFS data; it looks
like it relies on use counts maintained by the VFS layer, similar to what
mount(2) relies on to declare a mount to be busy.

> Unmounting the filesystem from userspace is racy, as any program can
> begin using a mount between the time the daemon has received a path to
> expire and the time it actually makes the umount(2) system call.

So the helper's umount() will fail.  OK, it failed.  The kernel module
should not recognize the mounted dir as being gone, until the module itself
has seen that it's gone.  This policy also helps in cases where the sysop
manually unmounts an automounted directory for repair purposes.

A common problem is stale NFS filehandles, and in this case we'd like the
userspace helper to be aggressive in using "umount -f" or other advanced
techniques.  The freedom to fail is important here.
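
A minimal sketch of that policy, under stated assumptions: the helper is
free to fail, and the decision that a mount is gone is taken only from a
fresh look at the mount table, never from the helper's exit status.  The
names is_mounted() and expire() are hypothetical, and the mount table is
assumed to be /proc/mounts-style text with the mount point in the second
field.

```python
import subprocess


def is_mounted(mountpoint, mounts_text):
    """Check a /proc/mounts-style listing for `mountpoint` (second field)."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mountpoint:
            return True
    return False


def expire(mountpoint, read_mounts, run=subprocess.run, force=False):
    """Try to unmount, then report what the mount table actually says.

    `read_mounts` is a callable returning the current mount table text and
    `run` executes the command; both are injectable so the logic can be
    exercised without root privileges.
    """
    argv = ["umount", "-f", mountpoint] if force else ["umount", mountpoint]
    run(argv)  # may fail (e.g. the mount became busy); that is acceptable
    return not is_mounted(mountpoint, read_mounts())
```

On a real system this might be called as
expire("/usr/src", lambda: open("/proc/mounts").read()); a False result
simply means the mount is still there and can be retried later.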

> These points suggest that the kernel's VFS sub-system should be charged
> with handling expiry.

The point is well taken that a VFS layer expiry mechanism would be welcomed
by many filesystems.  But autofs has to work with the kernel as it lies
now.

> As described above, we may be installing multiple mounts upon each
> trigger. This tree of mounts will need to expire together as an atomic
> unit.  We will need to register this block of mounts to some expiry
> system.  This will be done by performing a remount on the base
> automounted filesystem after any nested offset mounts have been installed

A filesystem is "in use" if anything is mounted on its subdirs.  That
precludes premature auto-unmounting of a containing directory, in the case
of a multi-mount or jimc's recommended non-implementation thereof.  I don't
see that a multi-mount stack needs to expire as a unit -- just let the
components expire normally, leaf to root.  It doesn't bother jimc that some
members are mounted and some aren't; by the principle of lazy mounting,
that's what we're trying to accomplish.
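
The leaf-to-root policy can be sketched in a few lines of Python.  This is
a toy model, not autofs code: expire_leaf_to_root() is a hypothetical name,
mounts are plain path strings, and the model deliberately ignores autofs
trigger mounts and any race with new accesses.

```python
def expire_leaf_to_root(mounted, in_use):
    """Return the order in which idle mounts could be expired.

    A mount is eligible only if nothing is mounted beneath it and no
    process is using it, so containing directories always go last.
    """
    mounted = set(mounted)
    order = []
    progress = True
    while progress:
        progress = False
        # Deepest paths first, so children are considered before parents.
        for path in sorted(mounted, key=lambda p: p.count("/"), reverse=True):
            prefix = path.rstrip("/") + "/"
            has_child = any(o.startswith(prefix) for o in mounted if o != path)
            if not has_child and path not in in_use:
                mounted.remove(path)
                order.append(path)
                progress = True
                break  # rescan, since removing a leaf may free its parent
    return order


print(expire_leaf_to_root(["/usr/src", "/usr/src/linux"], in_use=set()))
# ['/usr/src/linux', '/usr/src']
```

If /usr/src/linux is in use, nothing expires: the leaf is busy and the
parent still has something mounted on one of its subdirs.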

> 5.5 Handling Changing Maps

The whole issue of changed maps is closely related to the case of cloning a
namespace and discovering that an autofs map is non-identical in the new
namespace.

As pointed out in 5.5.1, when the maps change a userspace program will have
to detect some added or deleted items.  This program will have to run
separately in the context of every namespace.  Thus, we should probably
burden the sysop with remembering to run it if he wants his new/deleted
maps to be recognized. But we'll have to use some ioctl to stimulate the
kernel module to enumerate all known namespaces and run the updater for
each one.

> 5.5.2 Forcing Expiry to Occur

When I do this the reason is generally that I'm going to take down a
server.  Then I don't want "lazy unmounts"; I want immediate unmounts that
will be fatal to the processes using the filesystem.  When the server is
already dead, then I may do a lazy unmount with the expectation that the
structure will never be cleaned up until the client is rebooted, but at
least the client can continue to run.

> 7 Scalability

Necessarily mount(8) is used to mount filesystems, since only it has all
the spaghetti code and pseudo-object-oriented executables to deal with the
various filesystem types.  Hence at least one process (and most likely a
parent shell script) is expected per mount.  We need to be frugal in
writing the userspace helper (and this is a reason to roll our own, not use
hotplug), but the idea of using a userspace helper to mount, rather than a
persistent daemon, doesn't sound scary to me.

For me the biggest attraction of a Solaris-style automount upgrade is
the ability to create wildcard maps with substitutable variables, e.g.
rather than having a kludgey programmatic map that creates little map
files on the fly looking like "* tupelo:/&", a host map can be implemented
via "* $SERVER:/&".  Of course Solaris has a native "-host" map type,
which is also good.
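
To make the substitution concrete, here is a rough sketch of how one
wildcard entry might be expanded for a looked-up key.  expand_wildcard()
is a hypothetical name, and the rules shown ("&" becomes the key, "$NAME"
becomes a variable's value) are an assumption that covers only the two
examples above, not the full Solaris map syntax.

```python
def expand_wildcard(key, location, variables):
    """Expand the location field of a "*" map entry for `key`."""
    # Substitute $NAME variables first, longest names first, so that
    # $SERVER2 is not mistaken for $SERVER followed by "2".
    for name in sorted(variables, key=len, reverse=True):
        location = location.replace("$" + name, variables[name])
    # "&" stands for the key that matched the "*" wildcard.
    return location.replace("&", key)


print(expand_wildcard("jimc", "$SERVER:/&", {"SERVER": "tupelo"}))
# tupelo:/jimc
```

With the variable set per client or per site, the same one-line map
replaces the generated "* tupelo:/&" files.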


James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 17:55             ` H. Peter Anvin
@ 2004-01-07 21:13               ` Mike Waychison
  0 siblings, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-07 21:13 UTC (permalink / raw)
  To: Kernel Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 3434 bytes --]

H. Peter Anvin wrote:

> Mike Waychison wrote:
>
>> This is clearly not 'all of userspace'.  Autofs is an exception.  As 
>> is /etc/mtab.  The way I see it, automounting is a 'mount facility', 
>> as are namespaces.  The two should be made to work together.  Yes, 
>> mount(8) should probably be fixed one way or another as well due to 
>> /etc/mtab breakage. Why? Because it too is a mount facility.
>>
>> There are a couple problems inherent with namespaces.  Most of these 
>> are mount facilities that are broken such as mentioned above.  They 
>> *should* be fixed to work nicely.
>
>
> For that one needs to know how the namespaces are used, not just how 
> they are implemented.  There was a long discussion on this on #kernel 
> yesterday, by the way.
>
The one between you and viro?   I read the logs last night.  I didn't
see much discussion at all.

>> Other parts of userspace get confused with namespaces, eg: cron and 
>> atd.  These programs clearly need infrastructure added that somehow 
>> allows for arbitrary namespace joining/saving.  If you have 
>> suggestions for how we can solve this issue, please do let me know.  
>> I'm stumped :\  I'd be more than happy to discuss this with you.
>
>
> Do they?  In order for that to be a "clearly", I believe one needs to 
> understand how namespaces are used in practice.  It may not be 
> desirable or even possible; this starts getting into a policy decision.
>
Yes.  It is somewhat policy, but as it currently stands, the kernel not
being able to give userspace the option is itself quite restricting.

>> One not-so-far fetched approach would be to associate cron/at jobs 
>> with automount configurations so that a namespace can be 
>> re-constructed at runtime.
>
>
> I am not entirely sure what you mean with this, but it sounds 
> incredibly dangerous to me.
>
>
Basically, consider Plan 9 namespaces (I admit I'm no expert on Plan 9
:).  I'm told it allows one to somehow share namespaces with other users
/ processes as long as they exist.   Linux has a per-process namespace
model that (currently) doesn't allow this to happen.  Once all processes
that share a namespace die, the entire namespace ceases to exist.  In
order to 'reconstruct' a namespace, a service could somehow say: "Use
this automount configuration" (manually by creating a fresh namespace,
removing all non-essential mounts, and installing new mount-traps within
this namespace).

This of course has corner cases like the chicken and egg problem where
the configuration files have to be available in that namespace already,
but with some thought we could figure that out.  (Something similar to
the way 2.6 uses rootfs could be used to strap into this fresh
namespace, entirely from userspace).

This would in effect allow me to say "Take a snapshot of my namespace"
(this would probably require more help from individual filesystem
implementations in order to get all the mount information used) which
would dump an automount map that could later be used to lazily recreate it.
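
As a very rough sketch of the "snapshot" half of that idea, assuming only
what /proc/mounts already exposes (device, mount point, type, options):
mounts_to_direct_map() is a hypothetical name, the output is shaped like a
direct map ("key -options location"), and the sketch ignores local block
devices, escaped characters in paths, and everything a filesystem would
need to add to make the result complete.

```python
def mounts_to_direct_map(mounts_text, skip_types=("rootfs", "proc", "autofs")):
    """Turn /proc/mounts-style text into direct-map-style lines."""
    lines = []
    for raw in mounts_text.splitlines():
        fields = raw.split()
        if len(fields) < 4:
            continue  # not a full "device mountpoint fstype options" line
        device, mountpoint, fstype, options = fields[:4]
        if fstype in skip_types:
            continue  # pseudo filesystems and the triggers themselves
        lines.append(f"{mountpoint}\t-fstype={fstype},{options}\t{device}")
    return "\n".join(lines)


sample = (
    "rootfs / rootfs rw 0 0\n"
    "hosta:/src /usr/src nfs rw,hard 0 0\n"
)
print(mounts_to_direct_map(sample))
```

For the sample above this emits a single entry for /usr/src pointing at
hosta:/src, which a fresh namespace could then mount lazily.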

Just a thought.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



[-- Attachment #1.2: file:///tmp/nsmail.pgp --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

[-- Attachment #2: Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 21:04       ` Mike Waychison
@ 2004-01-07 21:11         ` Mike Fedyk
  2004-01-07 23:40           ` Jesper Juhl
  2004-01-07 21:24         ` Jeff Garzik
  1 sibling, 1 reply; 82+ messages in thread
From: Mike Fedyk @ 2004-01-07 21:11 UTC (permalink / raw)
  To: Mike Waychison; +Cc: H. Peter Anvin, linux-kernel

On Wed, Jan 07, 2004 at 04:04:41PM -0500, Mike Waychison wrote:
> H. Peter Anvin wrote:
> 
> >>Also when /home or other important fs are mounted via autofs there is
> >>not much practical difference between a hung kernel and a hung
> >>daemon. You have to reboot the system anyways.
> >
> >
> >a) Guess which one is easier to debug?
> 
> When they may both equally hang your machine, neither.

Let's see.

If it's in userspace, then set up your debug area in an area your system
doesn't depend on, and wham, the hang won't affect the entire system anymore.

Also, if you have /home automounted then it only affects the users on /home,
and root's $HOME shouldn't be on /home...

Though, you can debug in-kernel code with UML...

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 17:50     ` H. Peter Anvin
@ 2004-01-07 21:04       ` Mike Waychison
  2004-01-07 21:11         ` Mike Fedyk
  2004-01-07 21:24         ` Jeff Garzik
  0 siblings, 2 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-07 21:04 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2256 bytes --]

H. Peter Anvin wrote:

>> Also when /home or other important fs are mounted via autofs there is
>> not much practical difference between a hung kernel and a hung
>> daemon. You have to reboot the system anyways.
> 
> 
> a) Guess which one is easier to debug?

When they may both equally hang your machine, neither.

> b) Do people around here really believe that putting things in the 
> kernel magically makes them work right?
> 

No magic involved.

When atomicity is needed wrt. mountpoints, moving the logic into the 
kernel is a much simpler solution.

How much code was required to handle the corner cases and races in the 
existing autofs implementations?

To put it into perspective, I'm calling for the following major changes:

1) move expiry logic out of autofs and into the VFS where others can use 
it and notice when it breaks when VFS internals change.  For example, I 
just noticed that autofs4 in 2.6 hasn't been updated to grab the new 
vfsmount_lock instead of dcache_lock in certain circumstances.

2) move the loop that used to spin around and ask kernelspace if there 
was anything to expire into the VFS as well, where it won't be killed.

3) introduce some way to let userspace walk the mountpoints using file 
descriptors as references.

4) figure out a way to get super_blocks to clone so that we can have 
some consistent automount functionality across cloned namespaces.

(1) and (2) shouldn't be hard at all to do considering David Howells has 
done the majority of this already. (3) is needed in order to manage 
direct mounts properly for when they are 'covered'.  Admittedly, (4) 
comes off as an ugly hack.

Also, (2) was the only 'active' task the automount daemon was doing. 
Everything else it did can be rewritten in the form of a usermode helper 
that runs only when it is needed.  This simplifies the userspace code a lot.


-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[-- Attachment #2: Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07 16:19           ` Mike Waychison
@ 2004-01-07 17:55             ` H. Peter Anvin
  2004-01-07 21:13               ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-07 17:55 UTC (permalink / raw)
  To: linux-kernel

Mike Waychison wrote:

> This is clearly not 'all of userspace'.  Autofs is an exception.  As is 
> /etc/mtab.  The way I see it, automounting is a 'mount facility', as are 
> namespaces.  The two should be made to work together.  Yes, mount(8) 
> should probably be fixed one way or another as well due to /etc/mtab 
> breakage. Why? Because it too is a mount facility.
> 
> There are a couple problems inherent with namespaces.  Most of these are 
> mount facilities that are broken such as mentioned above.  They *should* 
> be fixed to work nicely.

For that one needs to know how the namespaces are used, not just how 
they are implemented.  There was a long discussion on this on #kernel 
yesterday, by the way.

> Other parts of userspace get confused with namespaces, eg: cron and atd. 
>  These programs clearly need infrastructure added that somehow allows 
> for arbitrary namespace joining/saving.  If you have suggestions for how 
> we can solve this issue, please do let me know.  I'm stumped :\  I'd be 
> more than happy to discuss this with you.

Do they?  In order for that to be a "clearly", I believe one needs to 
understand how namespaces are used in practice.  It may not be desirable 
or even possible; this starts getting into a policy decision.

> One not-so-far fetched approach would be to associate cron/at jobs with 
> automount configurations so that a namespace can be re-constructed at 
> runtime.

I am not entirely sure what you mean with this, but it sounds incredibly 
dangerous to me.

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-07  4:21   ` Andi Kleen
@ 2004-01-07 17:50     ` H. Peter Anvin
  2004-01-07 21:04       ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-07 17:50 UTC (permalink / raw)
  To: linux-kernel

Andi Kleen wrote:
> 
> I personally would be in favour of doing it all in the kernel because
> autofs3 and autofs4 are not fully compatible and break in subtle ways
> when not matching and in my experience when you have autofs3 compiled
> into the kernel the system happens to have an autofs 4 daemon
> installed and vice versa. Doing it in the kernel would avoid this
> nasty dependency problem.
> 

"Don't do that then."  Really.  Originally the autofs v3 filesystem was 
called "autofs" and the autofs v4 filesystem was called "autofs4" and 
the intent was that you should *never* run them across versions.

Jeremy tried nevertheless to be compatible (mistake #1) and Linus then 
renamed the autofs4 filesystem "autofs" (mistake #2).  There was no good 
reason for this and it should never have happened -- it broke the design 
that was intended to make sure the above wasn't going to be a problem.

> Also when /home or other important fs are mounted via autofs there is
> not much practical difference between a hung kernel and a hung
> daemon. You have to reboot the system anyways.

a) Guess which one is easier to debug?
b) Do people around here really believe that putting things in the 
kernel magically makes them work right?

	-hpa


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 22:20         ` H. Peter Anvin
@ 2004-01-07 16:19           ` Mike Waychison
  2004-01-07 17:55             ` H. Peter Anvin
  0 siblings, 1 reply; 82+ messages in thread
From: Mike Waychison @ 2004-01-07 16:19 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Tim Hockin, autofs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2431 bytes --]

H. Peter Anvin wrote:
> Tim Hockin wrote:
> 
>>On Tue, Jan 06, 2004 at 02:06:34PM -0800, H. Peter Anvin wrote:
>>
>>>
>>>First of all, I'll be blunt: namespaces currently provide zero benefit
>>>in Linux, and virtually no one uses them.  I have discussed this with
>>>Linus in the past, and neither one of us see namespaces as being worth
>>
>>Let's get rid of them, then.  Make life that much easier.
>>
> 
> 
> That's what the Linux community is doing, de facto.  The Linux userspace
> simply is not set up to handle namespaces, and the autofs daemon is no
> exception.  Consider such a simple thing as /etc/mtab - /proc/mounts
> which is necessary for most of the mount(8) functionality to work.  It
> doesn't support namespaces and really cannot be made to.
> 
> namespace support in Linux is at the best a far-off future goal.  It is
> one thing to put in infrastructure, especially since it has some other
> nice benefits; it's another thing to revamp all of userspace to use it;
> it's nowhere close and autofs is no exception.
> 

This is clearly not 'all of userspace'.  Autofs is an exception.  As is 
/etc/mtab.  The way I see it, automounting is a 'mount facility', as are 
namespaces.  The two should be made to work together.  Yes, mount(8) 
should probably be fixed one way or another as well due to /etc/mtab 
breakage. Why? Because it too is a mount facility.

There are a couple problems inherent with namespaces.  Most of these are 
mount facilities that are broken such as mentioned above.  They *should* 
be fixed to work nicely.

Other parts of userspace get confused with namespaces, eg: cron and atd. 
  These programs clearly need infrastructure added that somehow allows 
for arbitrary namespace joining/saving.  If you have suggestions for how 
we can solve this issue, please do let me know.  I'm stumped :\  I'd be 
more than happy to discuss this with you.

One not-so-far fetched approach would be to associate cron/at jobs with 
automount configurations so that a namespace can be re-constructed at 
runtime.


-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[-- Attachment #2: Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
       [not found] ` <1b6CO-3v0-15@gated-at.bofh.it>
@ 2004-01-07  4:21   ` Andi Kleen
  2004-01-07 17:50     ` H. Peter Anvin
  0 siblings, 1 reply; 82+ messages in thread
From: Andi Kleen @ 2004-01-07  4:21 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Michael.Waychison

"H. Peter Anvin" <hpa@zytor.com> writes:

>  A dead daemon is a
> painful recovery, admitted.  It is also a THIS SHOULD NOT HAPPEN
> condition.  By cramming it into the kernel, you're in fact making the
> system less stable, not more, because the kernel being tainted with
> faulty code is a total system malfunction; a crashed userspace daemon is
> "merely" a messy cleanup.  In practice, the autofs daemon does not die
> unless a careless system administrator kills it.  It is a non-problem.

I personally would be in favour of doing it all in the kernel because
autofs3 and autofs4 are not fully compatible and break in subtle ways
when not matching and in my experience when you have autofs3 compiled
into the kernel the system happens to have an autofs 4 daemon
installed and vice versa. Doing it in the kernel would avoid this
nasty dependency problem.

Also when /home or other important fs are mounted via autofs there is
not much practical difference between a hung kernel and a hung
daemon. You have to reboot the system anyways.

-Andi

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 23:34 Ogden, Aaron A.
@ 2004-01-06 23:47 ` Tim Hockin
  0 siblings, 0 replies; 82+ messages in thread
From: Tim Hockin @ 2004-01-06 23:47 UTC (permalink / raw)
  To: Ogden, Aaron A.
  Cc: thockin, H. Peter Anvin, autofs mailing list, Mike Waychison,
	Kernel Mailing List

On Tue, Jan 06, 2004 at 05:34:08PM -0600, Ogden, Aaron A. wrote:
> autofs work like Solaris autofs.  Is Sun willing to devote man-hours to
> help implement the new autofs?  I think Ian has done a tremendous job

Yes. :)


* RE: [autofs] [RFC] Towards a Modern Autofs
@ 2004-01-06 23:34 Ogden, Aaron A.
  2004-01-06 23:47 ` Tim Hockin
  0 siblings, 1 reply; 82+ messages in thread
From: Ogden, Aaron A. @ 2004-01-06 23:34 UTC (permalink / raw)
  To: Tim Hockin
  Cc: thockin, H. Peter Anvin, autofs mailing list, Mike Waychison,
	Kernel Mailing List



> -----Original Message-----
> From: Tim Hockin [mailto:thockin@hockin.org] 
> Sent: Tuesday, January 06, 2004 4:48 PM
> To: Ogden, Aaron A.
> Cc: thockin@Sun.COM; H. Peter Anvin; autofs mailing list; 
> Mike Waychison; Kernel Mailing List
> Subject: Re: [autofs] [RFC] Towards a Modern Autofs
> 
> 
> On Tue, Jan 06, 2004 at 04:28:59PM -0600, Ogden, Aaron A. wrote:
> > Solaris there is a command called 'automount' that tells the kernel to
> > re-read the automount maps, perhaps it resets the autofs subsystem in
> > the kernel as well.  If linux autofs had the same capability we might
> > not need the daemon, but until then, having the daemon in userland is a
> > good thing.
> 
> That's more or less exactly what is proposed.
> 

Excellent!  I haven't read through the proposal yet but I have it open
in another window.  :-)
The detailed proposal you've written implies that Sun as a whole has
given serious thought to the problem, which IMHO is how to make linux
autofs work like Solaris autofs.  Is Sun willing to devote man-hours to
help implement the new autofs?  I think Ian has done a tremendous job
with autofs4 but the more minds we throw at the problem the better.


* Re: [autofs] [RFC] Towards a Modern Autofs
       [not found]       ` <20040106221502.GA7398@hockin.org>
@ 2004-01-06 22:20         ` H. Peter Anvin
  2004-01-07 16:19           ` Mike Waychison
  0 siblings, 1 reply; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-06 22:20 UTC (permalink / raw)
  To: Tim Hockin; +Cc: autofs, linux-kernel

Tim Hockin wrote:
> On Tue, Jan 06, 2004 at 02:06:34PM -0800, H. Peter Anvin wrote:
> 
>>>Can you maybe share some details?  I think this design moves MORE state to
>>>userspace (expiry aside).  The "state" in kernel is really mostly sent back
>>>to userspace.  No more passing pipes into the kernel (state) or tracking the
>>>pgid of the daemon (state).
>>
>>If you want to fire up a new daemon, all that state that was supposed to
>>be kept in userspace has to be reconstructed.  That means the kernel has
>>to have all that information; this would include stuff like what kind of
>>umount policy you want for each key entry (the current daemon doesn't do
>>that because it doesn't have the proper state.)
> 
> I'm not really sure what you're saying, here.  I'm sorry.  Not trying to be
> thick, just not understanding.
> 
> What umount policy?  What state is supposed to be kept in userspace that isn't?
> 

The current autofs daemon, for example, does not handle different
procedures on umount.  This is particularly important when you have
mount trees.

> 
>>>The daemon as it stands does NOT handle namespaces, does NOT handle expiry
>>>well, and is a pretty sad copy of an old design.
>>
>>First of all, I'll be blunt: namespaces currently provide zero benefit
>>in Linux, and virtually no one uses them.  I have discussed this with
>>Linus in the past, and neither one of us sees namespaces as being worth
> 
> Let's get rid of them, then.  Make life that much easier.
> 

That's what the Linux community is doing, de facto.  The Linux userspace
simply is not set up to handle namespaces, and the autofs daemon is no
exception.  Consider such a simple thing as /etc/mtab - /proc/mounts
which is necessary for most of the mount(8) functionality to work.  It
doesn't support namespaces and really cannot be made to.
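
To make the /proc/mounts point concrete, here is a minimal sketch in Python
that parses text in the /proc/mounts format.  It is only an illustration,
not code from mount(8) or the autofs daemon; the helper name parse_mounts
and the sample table are assumptions made up for this example.  Note that
none of the fields identifies a namespace: a reader simply gets one table.

```python
import re
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    device: str
    mountpoint: str
    fstype: str
    options: List[str]


# /proc/mounts writes space, tab, newline and backslash as 3-digit octal
# escapes (for example, a space becomes \040).
_OCTAL = re.compile(r"\\([0-7]{3})")


def _unescape(field: str) -> str:
    return _OCTAL.sub(lambda m: chr(int(m.group(1), 8)), field)


def parse_mounts(text: str) -> List[MountEntry]:
    """Parse /proc/mounts-style text (hypothetical helper for this sketch)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        entries.append(
            MountEntry(_unescape(device), _unescape(mountpoint),
                       fstype, options.split(","))
        )
    return entries


# Made-up sample table, not captured from a real system.
SAMPLE = """\
/dev/hda1 / ext3 rw 0 0
automount(pid1234) /home autofs rw,fd=5,pgrp=1234 0 0
server:/export/my\\040dir /mnt/my\\040dir nfs rw,hard 0 0
"""

entries = parse_mounts(SAMPLE)
autofs_points = [e.mountpoint for e in entries if e.fstype == "autofs"]
```

Every tool that reads the table this way sees only the mounts of whatever
namespace it happens to run in, which is the limitation described above.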

Namespace support in Linux is at best a far-off future goal.  It is
one thing to put in infrastructure, especially since it has some other
nice benefits; it's another thing to revamp all of userspace to use it;
it's nowhere close and autofs is no exception.

	-hpa



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 22:06     ` H. Peter Anvin
@ 2004-01-06 22:17       ` Tim Hockin
       [not found]       ` <20040106221502.GA7398@hockin.org>
  1 sibling, 0 replies; 82+ messages in thread
From: Tim Hockin @ 2004-01-06 22:17 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: autofs mailing list, Kernel Mailing List

(sorry for the resend, forgot to CC the lists)

On Tue, Jan 06, 2004 at 02:06:34PM -0800, H. Peter Anvin wrote:
> > Can you maybe share some details?  I think this design moves MORE state to
> > userspace (expiry aside).  The "state" in kernel is really mostly sent back
> > to userspace.  No more passing pipes into the kernel (state) or tracking the
> > pgid of the daemon (state).
> 
> If you want to fire up a new daemon, all that state that was supposed to
> be kept in userspace has to be reconstructed.  That means the kernel has
> to have all that information; this would include stuff like what kind of
> umount policy you want for each key entry (the current daemon doesn't do
> that because it doesn't have the proper state.)

I'm not really sure what you're saying, here.  I'm sorry.  Not trying to be
thick, just not understanding.

What umount policy?  What state is supposed to be kept in userspace that isn't?

> > The daemon as it stands does NOT handle namespaces, does NOT handle expiry
> > well, and is a pretty sad copy of an old design.
> 
> First of all, I'll be blunt: namespaces currently provide zero benefit
> in Linux, and virtually no one uses them.  I have discussed this with
> Linus in the past, and neither one of us sees namespaces as being worth

Let's get rid of them, then.  Make life that much easier.


* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 21:50   ` Tim Hockin
@ 2004-01-06 22:06     ` H. Peter Anvin
  2004-01-06 22:17       ` Tim Hockin
       [not found]       ` <20040106221502.GA7398@hockin.org>
  0 siblings, 2 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-06 22:06 UTC (permalink / raw)
  To: thockin; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

Tim Hockin wrote:
> On Tue, Jan 06, 2004 at 01:01:46PM -0800, H. Peter Anvin wrote:
> 
>>Finally, throwing out the daemon is a huge step backwards.  Most of the
>>problems with autofs v3 (and to a lesser extent v4) are due to the
>>*lack* of state in userspace (the current daemon is mostly stateless);
>>putting additional state in userspace would be a benefit in my experience.
> 
> Can you maybe share some details?  I think this design moves MORE state to
> userspace (expiry aside).  The "state" in kernel is really mostly sent back
> to userspace.  No more passing pipes into the kernel (state) or tracking the
> pgid of the daemon (state).
> 

If you want to fire up a new daemon, all that state that was supposed to
be kept in userspace has to be reconstructed.  That means the kernel has
to have all that information; this would include stuff like what kind of
umount policy you want for each key entry (the current daemon doesn't do
that because it doesn't have the proper state.)

>>Pardon me for sounding harsh, but I'm seriously sick of the oft-repeated
>>idiocy that effectively boils down to "the daemon can die and would lose
>>its state, so let's put it all in the kernel."  A dead daemon is a
>>painful recovery, admitted.  It is also a THIS SHOULD NOT HAPPEN
>  
> But it *does* happen.

I don't believe it happens to any significant degree in cases where you
wouldn't have a kernel panic if you put the stuff in the kernel, *or* a
careless system administrator killed it.  In fact, I suspect it's
virtually all the latter.

>>condition.  By cramming it into the kernel, you're in fact making the
>>system less stable, not more, because the kernel being tainted with
>>faulty code is a total system malfunction; a crashed userspace daemon is
> 
> I don't think this design crams anything into the kernel.  It doesn't put a
> whole lot more into the kernel than is currently in there (expiry and new
> mount stuff, aside).  All the work still happens in userland.
> 
> The daemon as it stands does NOT handle namespaces, does NOT handle expiry
> well, and is a pretty sad copy of an old design.

First of all, I'll be blunt: namespaces currently provide zero benefit
in Linux, and virtually no one uses them.  I have discussed this with
Linus in the past, and neither one of us sees namespaces as being worth
jumping through hoops to support.  That being said, it's doable by either
having different daemons for different namespaces (useful for policy) or
by having them gain access to the requisite namespaces.

Second, what you say about the state of the daemon is obviously true.
autofs v3 was developed on Linux 2.0 which had a vastly different VFS,
and it has by and large bitrotted.  Furthermore, at that point Linux
didn't support threading in any useful way, which meant that keeping the
appropriate state in the daemon was too painful -- hence the largely
stateless design with its associated problems.

>>"merely" a messy cleanup.  In practice, the autofs daemon does not die
>>unless a careless system administrator kills it.  It is a non-problem.
> 
> I have some customers I'd love to send to you, if you really think that's
> true.

As root, I can kill the system too by doing "cat /dev/zero > /dev/mem".
If you do stupid shit as root you're dead.  What's the news?

	-hpa



* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 21:01 ` [autofs] " H. Peter Anvin
  2004-01-06 21:44   ` Mike Waychison
@ 2004-01-06 21:50   ` Tim Hockin
  2004-01-06 22:06     ` H. Peter Anvin
  1 sibling, 1 reply; 82+ messages in thread
From: Tim Hockin @ 2004-01-06 21:50 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Mike Waychison, autofs mailing list, Kernel Mailing List

On Tue, Jan 06, 2004 at 01:01:46PM -0800, H. Peter Anvin wrote:
> Finally, throwing out the daemon is a huge step backwards.  Most of the
> problems with autofs v3 (and to a lesser extent v4) are due to the
> *lack* of state in userspace (the current daemon is mostly stateless);
> putting additional state in userspace would be a benefit in my experience.

Can you maybe share some details?  I think this design moves MORE state to
userspace (expiry aside).  The "state" in kernel is really mostly sent back
to userspace.  No more passing pipes into the kernel (state) or tracking the
pgid of the daemon (state).

> Pardon me for sounding harsh, but I'm seriously sick of the oft-repeated
> idiocy that effectively boils down to "the daemon can die and would lose
> its state, so let's put it all in the kernel."  A dead daemon is a
> painful recovery, admitted.  It is also a THIS SHOULD NOT HAPPEN

But it *does* happen.

> condition.  By cramming it into the kernel, you're in fact making the
> system less stable, not more, because the kernel being tainted with
> faulty code is a total system malfunction; a crashed userspace daemon is

I don't think this design crams anything into the kernel.  It doesn't put a
whole lot more into the kernel than is currently in there (expiry and new
mount stuff, aside).  All the work still happens in userland.

The daemon as it stands does NOT handle namespaces, does NOT handle expiry
well, and is a pretty sad copy of an old design.

> "merely" a messy cleanup.  In practice, the autofs daemon does not die
> unless a careless system administrator kills it.  It is a non-problem.

I have some customers I'd love to send to you, if you really think that's
true.


* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 21:01 ` [autofs] " H. Peter Anvin
@ 2004-01-06 21:44   ` Mike Waychison
  2004-01-06 21:50   ` Tim Hockin
  1 sibling, 0 replies; 82+ messages in thread
From: Mike Waychison @ 2004-01-06 21:44 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Kernel Mailing List, autofs mailing list


Hi Peter,

H. Peter Anvin wrote:

>Mike Waychison wrote:
>  
>
>>The attached paper was written in an attempt to design an automount system
>>with complete Solaris-style autofs functionality.  This includes
>>browsing, direct maps and lazy mounting of multimounts.  The paper can
>>also be found online at:
>>                                                                              
>>    
>>
>
>Sorry to sound like sour grapes, but this is a requirements document,
>not a proposed implementation.  
>
You surely read the whole thing, didn't you?

>Furthermore, as I have expressed before,
>I think your claim that expiry should be done in the VFS to be incorrect.
>  
>
Why?  You haven't convinced me that it should be elsewhere. 

>I think you're on the completely wrong track, because you're starting
>with the wrong problem.  The implementation needs to start with the VFS
>implementation and derive from that.
>  
>

In which sense?   Re-design it?

>Finally, throwing out the daemon is a huge step backwards.  Most of the
>problems with autofs v3 (and to a lesser extent v4) are due to the
>*lack* of state in userspace (the current daemon is mostly stateless);
>putting additional state in userspace would be a benefit in my experience.
>  
>
Bull.   Having a single process for each autofs filesystem is state in 
itself.   Eg:

- setup an auto_home map on /home
- mkdir /home2
- mount --bind /home /home2

The state that you manage with your automount processes themselves is 
now inconsistent with what the kernel has.  
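
A toy model of the three steps above, sketched in Python.  It is only an
illustration of the bookkeeping problem, not real automounter code: the
names kernel_mounts, daemon_by_path, start_automount and bind_mount, the
filesystem id string and the pid are all made up for this example.

```python
from typing import Dict, List, Tuple

# Kernel-side view: (mountpoint, filesystem id) pairs.  Hypothetical model.
kernel_mounts: List[Tuple[str, str]] = []

# Userspace view: one automount process per path it was started for.
daemon_by_path: Dict[str, int] = {}


def start_automount(path: str, fs_id: str, pid: int) -> None:
    """Mount an autofs filesystem and record the process that serves it."""
    kernel_mounts.append((path, fs_id))
    daemon_by_path[path] = pid


def bind_mount(src: str, dst: str) -> None:
    """Make the filesystem at src visible at dst; no daemon is informed."""
    fs_id = next(fs for path, fs in kernel_mounts if path == src)
    kernel_mounts.append((dst, fs_id))


def untracked_mountpoints() -> List[str]:
    """Mountpoints the kernel knows about but no automount process covers."""
    return [path for path, _ in kernel_mounts if path not in daemon_by_path]


start_automount("/home", "autofs:auto_home", pid=1234)  # auto_home on /home
bind_mount("/home", "/home2")                            # mount --bind
print(untracked_mountpoints())  # prints ['/home2']
```

The kernel now has two mountpoints for the same autofs filesystem, while
the per-mountpoint process table still only knows about /home.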

>Pardon me for sounding harsh, but I'm seriously sick of the oft-repeated
>idiocy that effectively boils down to "the daemon can die and would lose
>its state, so let's put it all in the kernel."  A dead daemon is a
>painful recovery, admitted.  It is also a THIS SHOULD NOT HAPPEN
>condition.  
>

You've completely discarded the fact that a daemon breaks namespaces in 
your argument.

You somehow mistook the arguments I've presented and assume that we get 
rid of the daemon solely so that we eliminate state in userspace.  The 
point of getting rid of the daemon is that tying a single process to 
each mountpoint:

- breaks on mount --bind operations
- breaks on namespace clones

These _can_ be circumvented by using a single process daemon which 
catches _ALL_ automount requests from the kernel, however:

- There are NO facilities for changing namespaces, and there don't 
appear to be any plans to implement them.   This doesn't only affect the 
mount operations themselves, but also reading the /etc/auto_* maps in 
the different namespace.
- This limits a running system to _exactly_ one policy system for 
handling automount points.  Differing namespaces may have different 
automounter maps and even automounters themselves if they want to under 
the scheme I've outlined.

Also, the current implementation uses pathnames to do everything.  This 
breaks:

- mountpoint binds in another way
- mountpoint moves
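
To illustrate the pathname issue for moves, here is a small Python sketch.
It assumes, purely for the sake of the example, that each mount has a
stable integer id; that id, the MountTable class and the paths are
hypothetical and are not something the current kernel interface hands to
the daemon.

```python
from typing import Dict


class MountTable:
    """Hypothetical mount table: stable integer id -> current path."""

    def __init__(self) -> None:
        self._next_id = 1
        self.path_of: Dict[int, str] = {}

    def mount(self, path: str) -> int:
        mount_id = self._next_id
        self._next_id += 1
        self.path_of[mount_id] = path
        return mount_id

    def move(self, mount_id: int, new_path: str) -> None:
        # The mount keeps its identity; only its location changes.
        self.path_of[mount_id] = new_path

    def has_path(self, path: str) -> bool:
        return path in self.path_of.values()


table = MountTable()
home_id = table.mount("/home")
remembered_path = "/home"          # what a pathname-based automounter keeps

table.move(home_id, "/export/home")

path_still_valid = table.has_path(remembered_path)  # False: the path is stale
id_still_valid = home_id in table.path_of           # True: identity survives
```

Anything keyed on the remembered pathname loses track of the mount after
the move, while a reference to the mount itself would not.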

My goal here is to fix all of the mountpoint logic in automounting that 
relies on there being a single namespace. 

Now, going back to your argument of reliability and reconnectivity, yes, 
I agree that the daemon dying is something that _SHOULD NOT HAPPEN_.  
But it does in practice.  Getting rid of the daemon the way I've 
outlined simply eliminates that from ever happening as an added bonus.

>By cramming it into the kernel, you're in fact making the
>system less stable, not more, because the kernel being tainted with
>faulty code is a total system malfunction; a crashed userspace daemon is
>"merely" a messy cleanup.  In practice, the autofs daemon does not die
>unless a careless system administrator kills it.  It is a non-problem.
>  
>
"Faulty code"?    I haven't even presented you with code yet.  Nice.

Somehow, you got the impression that the system I've proposed would be 
more complex than what we have today, when in fact I believe it's a lot 
simpler.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 




* Re: [autofs] [RFC] Towards a Modern Autofs
  2004-01-06 19:55 Mike Waychison
@ 2004-01-06 21:01 ` H. Peter Anvin
  2004-01-06 21:44   ` Mike Waychison
  2004-01-06 21:50   ` Tim Hockin
  2004-01-07 21:14 ` Jim Carter
  1 sibling, 2 replies; 82+ messages in thread
From: H. Peter Anvin @ 2004-01-06 21:01 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Kernel Mailing List, autofs mailing list

Mike Waychison wrote:
> 
> The attached paper was written in an attempt to design an automount system
> with complete Solaris-style autofs functionality.  This includes
> browsing, direct maps and lazy mounting of multimounts.  The paper can
> also be found online at:
>                                                                               

Sorry to sound like sour grapes, but this is a requirements document,
not a proposed implementation.  Furthermore, as I have expressed before,
I think your claim that expiry should be done in the VFS to be incorrect.

I think you're on the completely wrong track, because you're starting
with the wrong problem.  The implementation needs to start with the VFS
implementation and derive from that.

Finally, throwing out the daemon is a huge step backwards.  Most of the
problems with autofs v3 (and to a lesser extent v4) are due to the
*lack* of state in userspace (the current daemon is mostly stateless);
putting additional state in userspace would be a benefit in my experience.

Pardon me for sounding harsh, but I'm seriously sick of the oft-repeated
idiocy that effectively boils down to "the daemon can die and would lose
its state, so let's put it all in the kernel."  A dead daemon is a
painful recovery, admitted.  It is also a THIS SHOULD NOT HAPPEN
condition.  By cramming it into the kernel, you're in fact making the
system less stable, not more, because the kernel being tainted with
faulty code is a total system malfunction; a crashed userspace daemon is
"merely" a messy cleanup.  In practice, the autofs daemon does not die
unless a careless system administrator kills it.  It is a non-problem.

	-hpa



end of thread, other threads:[~2004-01-14 15:58 UTC | newest]

Thread overview: 82+ messages
2004-01-06 22:28 [autofs] [RFC] Towards a Modern Autofs Ogden, Aaron A.
2004-01-06 22:41 ` Mike Fedyk
2004-01-06 22:47 ` Tim Hockin
2004-01-06 22:53 ` Paul Raines
2004-01-07 23:14 ` Jim Carter
2004-01-07 23:32   ` H. Peter Anvin
2004-01-08 12:52     ` Ian Kent
2004-01-08 18:31       ` viro
2004-01-09 18:43         ` Ian Kent
2004-01-09 19:41         ` Mike Waychison
2004-01-09 19:57           ` H. Peter Anvin
2004-01-09 21:31             ` Mike Waychison
2004-01-09 21:36               ` H. Peter Anvin
  -- strict thread matches above, loose matches on Subject: below --
2004-01-08 19:32 trond.myklebust
2004-01-08 19:41 ` H. Peter Anvin
2004-01-08 20:08   ` trond.myklebust
2004-01-08 21:13     ` H. Peter Anvin
2004-01-08 22:20       ` J. Bruce Fields
2004-01-08 22:24         ` H. Peter Anvin
2004-01-09 20:37       ` Mike Waychison
2004-01-09 21:02         ` H. Peter Anvin
2004-01-09 21:52           ` Mike Waychison
2004-01-09 20:16   ` Mike Waychison
     [not found] <1b5GC-29h-1@gated-at.bofh.it>
     [not found] ` <1b6CO-3v0-15@gated-at.bofh.it>
2004-01-07  4:21   ` Andi Kleen
2004-01-07 17:50     ` H. Peter Anvin
2004-01-07 21:04       ` Mike Waychison
2004-01-07 21:11         ` Mike Fedyk
2004-01-07 23:40           ` Jesper Juhl
2004-01-07 21:24         ` Jeff Garzik
2004-01-07 23:47           ` Mike Waychison
2004-01-07 23:56             ` Jeff Garzik
2004-01-12 16:57               ` Mike Waychison
2004-01-13  7:39                 ` Ian Kent
2004-01-06 23:34 Ogden, Aaron A.
2004-01-06 23:47 ` Tim Hockin
2004-01-06 19:55 Mike Waychison
2004-01-06 21:01 ` [autofs] " H. Peter Anvin
2004-01-06 21:44   ` Mike Waychison
2004-01-06 21:50   ` Tim Hockin
2004-01-06 22:06     ` H. Peter Anvin
2004-01-06 22:17       ` Tim Hockin
     [not found]       ` <20040106221502.GA7398@hockin.org>
2004-01-06 22:20         ` H. Peter Anvin
2004-01-07 16:19           ` Mike Waychison
2004-01-07 17:55             ` H. Peter Anvin
2004-01-07 21:13               ` Mike Waychison
2004-01-07 21:14 ` Jim Carter
2004-01-07 22:55   ` Mike Waychison
2004-01-08 12:00     ` Ian Kent
2004-01-08 15:39       ` Mike Waychison
2004-01-09 18:20         ` Ian Kent
2004-01-09 20:06           ` Mike Waychison
2004-01-10  5:43             ` Ian Kent
2004-01-12 13:07               ` Mike Waychison
2004-01-12 16:01                 ` raven
2004-01-12 16:26                   ` Mike Waychison
2004-01-12 22:50                     ` Tim Hockin
2004-01-12 23:28                       ` Mike Waychison
2004-01-13  1:30                       ` Ian Kent
2004-01-12 16:28                   ` raven
2004-01-12 16:58                     ` Mike Waychison
2004-01-13  1:54                       ` Ian Kent
2004-01-13 19:01                         ` Mike Waychison
2004-01-14 15:58                           ` raven
2004-01-13 18:46                   ` Mike Waychison
2004-01-09 20:51           ` Jim Carter
2004-01-10  5:56             ` Ian Kent
2004-01-08 17:34       ` H. Peter Anvin
2004-01-08 19:41         ` Mike Waychison
2004-01-08 23:42         ` Michael Clark
2004-01-09 20:28           ` Mike Waychison
2004-01-09 20:54             ` H. Peter Anvin
2004-01-09 21:43               ` Mike Waychison
2004-01-09 18:32         ` Ian Kent
2004-01-09 20:52           ` Mike Waychison
2004-01-10  6:05             ` Ian Kent
2004-01-08 12:29     ` Olivier Galibert
2004-01-08 13:20       ` Robin Rosenberg
2004-01-08 16:23       ` Mike Waychison
2004-01-08 12:35     ` Ian Kent
2004-01-08 13:08       ` Ian Kent
2004-01-08 18:20     ` Jim Carter
2004-01-08 21:01       ` H. Peter Anvin
2004-01-08  0:48   ` Ian Kent
