* whither NFS umount?
@ 2010-10-12 16:29 Chuck Lever
  2010-10-12 17:04 ` Trond Myklebust
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-12 16:29 UTC (permalink / raw)
  To: Linux NFS Mailing List

I've been looking at a bug where "mount.nfs -o remount" wipes all the mount options for that mount out of /etc/mtab, thereby making umounting break.

This is a tough nut to crack in user space... Not even util-linux-ng seems to get mtab option rewriting correct in this case.

Jeff suggested a few weeks ago that we should just chuck user space umount and go with a kernel umount implementation.  I'm beginning to think that is a good strategy, even though a UMNT request is advisory.

  + Only the kernel knows when the last instance of a shared mount point is gone -- only then should a UMNT be sent to the server

  + The kernel might do a delayed lazy UMNT.  It would avoid sending a UMNT until the client is actually done using the export.  Today we just don't send UMNT at all in this case

  + The kernel preserves the original mount options in an internal data structure rather than in /etc/mtab, even after a remount.  This eliminates the NFS requirement for /etc/mtab -- one step closer to getting rid of it

  + The kernel already handles umounts for under-the-cover NFSv4 mounts, right?

  + The kernel is the authority on what is an NFSv4 mount point, so it knows exactly what kind of umount to do every time (send a UMNT or not)

  + There is already a UMNT client in the kernel, used when the kernel's MNT request fails such that a UMNT is needed
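
To make the /etc/mtab point concrete, here is a minimal sketch of recovering an NFS mount's options from a kernel-exported mount table in the /proc/mounts format rather than from /etc/mtab. The function names and the sample line are made up for illustration; this is not code from nfs-utils or the kernel.

```python
import re


def unescape_mnt(field):
    """Decode the octal escapes (e.g. \\040 for a space) used in mount tables."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def nfs_mount_options(table_text, mount_point):
    """Return the options of the last NFS entry mounted on mount_point, or None.

    table_text is the content of a file in /proc/mounts format:
    device, directory, fstype, comma-separated options, then two numbers.
    """
    found = None
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, directory, fstype, opts = fields[:4]
        if fstype not in ("nfs", "nfs4"):
            continue
        if unescape_mnt(directory) != mount_point:
            continue
        # Later entries shadow earlier ones on the same directory.
        found = {}
        for opt in opts.split(","):
            key, _, value = opt.partition("=")
            found[key] = value
    return found


# Hypothetical table content, for illustration only.
SAMPLE = (
    "server:/export /mnt/nfs nfs rw,vers=3,mountproto=udp 0 0\n"
    "/dev/sda1 / ext4 rw 0 0\n"
)
options = nfs_mount_options(SAMPLE, "/mnt/nfs")
```

Because the kernel's table is regenerated from its own state, a remount cannot wipe the options the way an mtab rewrite can.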

Thoughts, comments?

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-12 16:29 whither NFS umount? Chuck Lever
@ 2010-10-12 17:04 ` Trond Myklebust
       [not found]   ` <1286903046.24878.13.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Trond Myklebust @ 2010-10-12 17:04 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Linux NFS Mailing List

On Tue, 2010-10-12 at 12:29 -0400, Chuck Lever wrote:
> I've been looking at a bug where "mount.nfs -o remount" wipes all the mount options for that mount out of /etc/mtab, thereby making umounting break.
> 
> This is a tough nut to crack in user space... Not even util-linux-ng seems to get mtab option rewriting correct in this case.
> 
> Jeff suggested a few weeks ago that we should just chuck user space umount and go with a kernel umount implementation.  I'm beginning to think that is a good strategy, even though a UMNT request is advisory.
> 
>   + Only the kernel knows when the last instance of a shared mount point is gone -- only then should a UMNT be sent to the server
> 
>   + The kernel might do a delayed lazy UMNT.  It would avoid sending a UMNT until the client is actually done using the export.  Today we just don't send UMNT at all in this case
> 
>   + The kernel preserves the original mount options in an internal data structure rather than in /etc/mtab, even after a remount.  This eliminates the NFS requirement for /etc/mtab -- one step closer to getting rid of it
> 
>   + The kernel already handles umounts for under-the-cover NFSv4 mounts, right?
> 
>   + The kernel is the authority on what is an NFSv4 mount point, so it knows exactly what kind of umount to do every time (send a UMNT or not)
> 
>   + There is already a UMNT client in the kernel, used when the kernel's MNT request fails such that a UMNT is needed
> 
> Thoughts, comments?
> 

UMNT is an advisory thing. If it causes problems, then let's just drop
it.

Cheers
  Trond


* Re: whither NFS umount?
       [not found]   ` <1286903046.24878.13.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2010-10-12 17:57     ` Chuck Lever
  2010-10-12 19:18       ` Jeff Layton
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-12 17:57 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List


On Oct 12, 2010, at 1:04 PM, Trond Myklebust wrote:

> On Tue, 2010-10-12 at 12:29 -0400, Chuck Lever wrote:
>> I've been looking at a bug where "mount.nfs -o remount" wipes all the mount options for that mount out of /etc/mtab, thereby making umounting break.
>> 
>> This is a tough nut to crack in user space... Not even util-linux-ng seems to get mtab option rewriting correct in this case.
>> 
>> Jeff suggested a few weeks ago that we should just chuck user space umount and go with a kernel umount implementation.  I'm beginning to think that is a good strategy, even though a UMNT request is advisory.
>> 
>>  + Only the kernel knows when the last instance of a shared mount point is gone -- only then should a UMNT be sent to the server
>> 
>>  + The kernel might do a delayed lazy UMNT.  It would avoid sending a UMNT until the client is actually done using the export.  Today we just don't send UMNT at all in this case
>> 
>>  + The kernel preserves the original mount options in an internal data structure rather than in /etc/mtab, even after a remount.  This eliminates the NFS requirement for /etc/mtab -- one step closer to getting rid of it
>> 
>>  + The kernel already handles umounts for under-the-cover NFSv4 mounts, right?
>> 
>>  + The kernel is the authority on what is an NFSv4 mount point, so it knows exactly what kind of umount to do every time (send a UMNT or not)
>> 
>>  + There is already a UMNT client in the kernel, used when the kernel's MNT request fails such that a UMNT is needed
>> 
>> Thoughts, comments?
>> 
> 
> UMNT is an advisory thing. If it causes problems, then let's just drop
> it.

Out of interest, what would that look like?  Would we rip out all code from nfsumount.c and network.c that handles UMNT calls?

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-12 17:57     ` Chuck Lever
@ 2010-10-12 19:18       ` Jeff Layton
  2010-10-12 19:44         ` Trond Myklebust
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff Layton @ 2010-10-12 19:18 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Trond Myklebust, Linux NFS Mailing List

On Tue, 12 Oct 2010 13:57:53 -0400
Chuck Lever <chuck.lever@oracle.com> wrote:

> 
> On Oct 12, 2010, at 1:04 PM, Trond Myklebust wrote:
> 
> > On Tue, 2010-10-12 at 12:29 -0400, Chuck Lever wrote:
> >> I've been looking at a bug where "mount.nfs -o remount" wipes all the mount options for that mount out of /etc/mtab, thereby making umounting break.
> >> 
> >> This is a tough nut to crack in user space... Not even util-linux-ng seems to get mtab option rewriting correct in this case.
> >> 
> >> Jeff suggested a few weeks ago that we should just chuck user space umount and go with a kernel umount implementation.  I'm beginning to think that is a good strategy, even though a UMNT request is advisory.
> >> 
> >>  + Only the kernel knows when the last instance of a shared mount point is gone -- only then should a UMNT be sent to the server
> >> 
> >>  + The kernel might do a delayed lazy UMNT.  It would avoid sending a UMNT until the client is actually done using the export.  Today we just don't send UMNT at all in this case
> >> 
> >>  + The kernel preserves the original mount options in an internal data structure rather than in /etc/mtab, even after a remount.  This eliminates the NFS requirement for /etc/mtab -- one step closer to getting rid of it
> >> 
> >>  + The kernel already handles umounts for under-the-cover NFSv4 mounts, right?
> >> 
> >>  + The kernel is the authority on what is an NFSv4 mount point, so it knows exactly what kind of umount to do every time (send a UMNT or not)
> >> 
> >>  + There is already a UMNT client in the kernel, used when the kernel's MNT request fails such that a UMNT is needed
> >> 
> >> Thoughts, comments?
> >> 
> > 
> > UMNT is an advisory thing. If it causes problems, then let's just drop
> > it.
> 
> Out of interest, what would that look like?  Would we rip out all code from nfsumount.c and network.c that handles UMNT calls?
> 

I think the part that causes problems is having userspace do this. In
theory, if the kernel were in charge of sending the UMNT, then it's not
really a problem since it knows when to do it. If we have code that
sends a UMNT already, why not do a best-effort UMNT call from the
kernel when we tear down the sb?
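
As a toy model of that idea (not kernel code): count the local mounts that share one export, and attempt a UMNT only when the last one goes away, ignoring any failure since the call is best-effort. The class name, method names, and the send_umnt callback are all hypothetical.

```python
class SharedExportRefs:
    """Toy model of mounts sharing one (server, export) pair."""

    def __init__(self):
        self._refs = {}  # (server, export) -> number of local mounts

    def mounted(self, server, export):
        key = (server, export)
        self._refs[key] = self._refs.get(key, 0) + 1

    def unmounted(self, server, export, send_umnt):
        """Drop one reference; on the last one, try a best-effort UMNT.

        Returns True if this was the last reference, whether or not the
        UMNT attempt itself succeeded.
        """
        key = (server, export)
        count = self._refs.get(key, 0)
        if count == 0:
            raise KeyError(f"{server}:{export} is not mounted")
        if count > 1:
            self._refs[key] = count - 1
            return False
        del self._refs[key]
        try:
            send_umnt(server, export)
        except OSError:
            pass  # advisory call: a failure must not block the unmount
        return True
```

User space cannot maintain such a count reliably, which is the crux of the argument for doing this in the kernel if it is done at all.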

Either way, eliminating umount.nfs would be nice...

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-12 19:18       ` Jeff Layton
@ 2010-10-12 19:44         ` Trond Myklebust
  2010-10-12 19:52           ` Jeff Layton
  2010-10-13 17:40           ` Steve Dickson
  0 siblings, 2 replies; 36+ messages in thread
From: Trond Myklebust @ 2010-10-12 19:44 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Chuck Lever, Linux NFS Mailing List

On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:

> I think the part that causes problems is having userspace do this. In
> theory, if the kernel were in charge of sending the UMNT, then it's not
> really a problem since it knows when to do it. If we have code that
> sends a UMNT already, why not do a best-effort UMNT call from the
> kernel when we tear down the sb?

Purely for the pleasure of allowing the server to maintain inaccurate
statistics about who is currently mounting what? I think not...

You can get far more accurate results by replacing the MNT/UMNT state
counter with a purely server-based scheme to track who accessed one or
more files on each exported partition in the past 5 minutes or so. That
would even work with NFSv4...
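
A rough sketch of what such a server-side scheme might look like, assuming the server can hook every file access and call record_access(). The class, its methods, and the injectable clock are hypothetical; the 300-second default mirrors the "5 minutes or so" above.

```python
import time


class RecentClients:
    """Track which clients touched each export within a sliding time window."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._last_seen = {}  # (export, client) -> time of most recent access

    def record_access(self, export, client):
        self._last_seen[(export, client)] = self._clock()

    def active_clients(self, export):
        """Return the sorted clients seen on export within the window."""
        cutoff = self._clock() - self._window
        # Forget entries that have aged out so the table does not grow forever.
        self._last_seen = {
            key: seen for key, seen in self._last_seen.items() if seen >= cutoff
        }
        return sorted(c for (e, c) in self._last_seen if e == export)
```

Nothing here depends on MNT or UMNT traffic, so a client that crashes or never sends UMNT simply ages out of the table.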

> Either way, eliminating umount.nfs would be nice...

Agreed.

Trond


* Re: whither NFS umount?
  2010-10-12 19:44         ` Trond Myklebust
@ 2010-10-12 19:52           ` Jeff Layton
  2010-10-12 19:59             ` Chuck Lever
  2010-10-12 20:21             ` Trond Myklebust
  2010-10-13 17:40           ` Steve Dickson
  1 sibling, 2 replies; 36+ messages in thread
From: Jeff Layton @ 2010-10-12 19:52 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Chuck Lever, Linux NFS Mailing List

On Tue, 12 Oct 2010 15:44:09 -0400
Trond Myklebust <Trond.Myklebust@netapp.com> wrote:

> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> 
> > I think the part that causes problems is having userspace do this. In
> > theory, if the kernel were in charge of sending the UMNT, then it's not
> > really a problem since it knows when to do it. If we have code that
> > sends a UMNT already, why not do a best-effort UMNT call from the
> > kernel when we tear down the sb?
> 
> Purely for the pleasure of allowing the server to maintain inaccurate
> statistics about who is currently mounting what? I think not...
> 
> You can get far more accurate results by replacing the MNT/UMNT state
> counter with a purely server-based scheme to track who accessed one or
> more files on each exported partition in the past 5 minutes or so. That
> would even work with NFSv4...
> 

True, but for better or worse, UMNT is part of the protocol. It seems
like we ought to do our best to implement it, even if it is
fundamentally flawed.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-12 19:52           ` Jeff Layton
@ 2010-10-12 19:59             ` Chuck Lever
  2010-10-12 20:21             ` Trond Myklebust
  1 sibling, 0 replies; 36+ messages in thread
From: Chuck Lever @ 2010-10-12 19:59 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Linux NFS Mailing List


On Oct 12, 2010, at 3:52 PM, Jeff Layton wrote:

> On Tue, 12 Oct 2010 15:44:09 -0400
> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> 
>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
>> 
>>> I think the part that causes problems is having userspace do this. In
>>> theory, if the kernel were in charge of sending the UMNT, then it's not
>>> really a problem since it knows when to do it. If we have code that
>>> sends a UMNT already, why not do a best-effort UMNT call from the
>>> kernel when we tear down the sb?
>> 
>> Purely for the pleasure of allowing the server to maintain inaccurate
>> statistics about who is currently mounting what? I think not...
>> 
>> You can get far more accurate results by replacing the MNT/UMNT state
>> counter with a purely server-based scheme to track who accessed one or
>> more files on each exported partition in the past 5 minutes or so. That
>> would even work with NFSv4...
>> 
> 
> True, but for better or worse, UMNT is part of the protocol. It seems
> like we ought to do our best to implement it, even if it is
> fundamentally flawed.

I would like to see the kernel send UMNT, but I don't have a strong technical reason for it.  It's just a best-effort kind of thing.  Besides, we should be conservative in what we send (i.e., follow the protocol as closely as possible), and liberal in what we receive.

In any event, whether the kernel sends a UMNT or no-one does, we can rip it out of umount.nfs now, I think.

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-12 19:52           ` Jeff Layton
  2010-10-12 19:59             ` Chuck Lever
@ 2010-10-12 20:21             ` Trond Myklebust
  2010-10-12 20:26               ` Jeff Layton
  1 sibling, 1 reply; 36+ messages in thread
From: Trond Myklebust @ 2010-10-12 20:21 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Chuck Lever, Linux NFS Mailing List

On Tue, 2010-10-12 at 15:52 -0400, Jeff Layton wrote:
> On Tue, 12 Oct 2010 15:44:09 -0400
> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> 
> > On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> > 
> > > I think the part that causes problems is having userspace do this. In
> > > theory, if the kernel were in charge of sending the UMNT, then it's not
> > > really a problem since it knows when to do it. If we have code that
> > > sends a UMNT already, why not do a best-effort UMNT call from the
> > > kernel when we tear down the sb?
> > 
> > Purely for the pleasure of allowing the server to maintain inaccurate
> > statistics about who is currently mounting what? I think not...
> > 
> > You can get far more accurate results by replacing the MNT/UMNT state
> > counter with a purely server-based scheme to track who accessed one or
> > more files on each exported partition in the past 5 minutes or so. That
> > would even work with NFSv4...
> > 
> 
> True, but for better or worse, UMNT is part of the protocol. It seems
> like we ought to do our best to implement it, even if it is
> fundamentally flawed.
> 

UMNTALL is part of the same protocol, and yet we have never implemented
that. Just because something is documented, it doesn't mean we have to
do it...

The bottom line is that UMNT doesn't do what it advertises itself as
doing, and so we should not waste space supporting it in the kernel. We
shouldn't do so in userspace either.
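
For reference, UMNT and UMNTALL are procedures of the MOUNT version 3 protocol. The numbers below follow RFC 1813 (Appendix I); the Python names are illustrative only and are not taken from any existing code.

```python
from enum import IntEnum

MOUNT_PROGRAM = 100005  # RPC program number of the MOUNT service
MOUNT_VERSION3 = 3


class Mount3Proc(IntEnum):
    """MOUNT version 3 procedure numbers."""

    NULL = 0
    MNT = 1
    DUMP = 2
    UMNT = 3
    UMNTALL = 4
    EXPORT = 5


# Procedures declared with a void result: the client gets no status back,
# which is one reason UMNT can only ever be advisory.
RETURNS_NO_STATUS = frozenset(
    {Mount3Proc.NULL, Mount3Proc.UMNT, Mount3Proc.UMNTALL}
)
```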

Trond


* Re: whither NFS umount?
  2010-10-12 20:21             ` Trond Myklebust
@ 2010-10-12 20:26               ` Jeff Layton
  2010-10-12 20:34                 ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff Layton @ 2010-10-12 20:26 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Chuck Lever, Linux NFS Mailing List

On Tue, 12 Oct 2010 16:21:00 -0400
Trond Myklebust <Trond.Myklebust@netapp.com> wrote:

> On Tue, 2010-10-12 at 15:52 -0400, Jeff Layton wrote:
> > On Tue, 12 Oct 2010 15:44:09 -0400
> > Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> > 
> > > On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> > > 
> > > > I think the part that causes problems is having userspace do this. In
> > > > theory, if the kernel were in charge of sending the UMNT, then it's not
> > > > really a problem since it knows when to do it. If we have code that
> > > > sends a UMNT already, why not do a best-effort UMNT call from the
> > > > kernel when we tear down the sb?
> > > 
> > > Purely for the pleasure of allowing the server to maintain inaccurate
> > > statistics about who is currently mounting what? I think not...
> > > 
> > > You can get far more accurate results by replacing the MNT/UMNT state
> > > counter with a purely server-based scheme to track who accessed one or
> > > more files on each exported partition in the past 5 minutes or so. That
> > > would even work with NFSv4...
> > > 
> > 
> > True, but for better or worse, UMNT is part of the protocol. It seems
> > like we ought to do our best to implement it, even if it is
> > fundamentally flawed.
> > 
> 
> UMNTALL is part of the same protocol, and yet we have never implemented
> that. Just because something is documented, it doesn't mean we have to
> do it...
> 
> The bottom line is that UMNT doesn't do what it advertises itself as
> doing, and so we should not waste space supporting it in the kernel. We
> shouldn't do so in userspace either.
> 

Fair enough. Like Chuck I don't have strong feelings about it, other
than seeing no need to continue shipping umount.nfs.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-12 20:26               ` Jeff Layton
@ 2010-10-12 20:34                 ` Chuck Lever
  2010-10-12 20:50                   ` Jeff Layton
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-12 20:34 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Linux NFS Mailing List


On Oct 12, 2010, at 4:26 PM, Jeff Layton wrote:

> On Tue, 12 Oct 2010 16:21:00 -0400
> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> 
>> On Tue, 2010-10-12 at 15:52 -0400, Jeff Layton wrote:
>>> On Tue, 12 Oct 2010 15:44:09 -0400
>>> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
>>> 
>>>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
>>>> 
>>>>> I think the part that causes problems is having userspace do this. In
>>>>> theory, if the kernel were in charge of sending the UMNT, then it's not
>>>>> really a problem since it knows when to do it. If we have code that
>>>>> sends a UMNT already, why not do a best-effort UMNT call from the
>>>>> kernel when we tear down the sb?
>>>> 
>>>> Purely for the pleasure of allowing the server to maintain inaccurate
>>>> statistics about who is currently mounting what? I think not...
>>>> 
>>>> You can get far more accurate results by replacing the MNT/UMNT state
>>>> counter with a purely server-based scheme to track who accessed one or
>>>> more files on each exported partition in the past 5 minutes or so. That
>>>> would even work with NFSv4...
>>>> 
>>> 
>>> True, but for better or worse, UMNT is part of the protocol. It seems
>>> like we ought to do our best to implement it, even if it is
>>> fundamentally flawed.
>>> 
>> 
>> UMNTALL is part of the same protocol, and yet we have never implemented
>> that. Just because something is documented, it doesn't mean we have to
>> do it...
>> 
>> The bottom line is that UMNT doesn't do what it advertises itself as
>> doing, and so we should not waste space supporting it in the kernel. We
>> shouldn't do so in userspace either.
>> 
> 
> Fair enough. Like Chuck I don't have strong feelings about it, other
> than seeing no need to continue shipping umount.nfs.

Careful... we still need umount.nfs, which is a link to mount.nfs.  Something in user space has to kick off local namespace changes, even if the server isn't affected.

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-12 20:34                 ` Chuck Lever
@ 2010-10-12 20:50                   ` Jeff Layton
  2010-10-12 21:19                     ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff Layton @ 2010-10-12 20:50 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Trond Myklebust, Linux NFS Mailing List

On Tue, 12 Oct 2010 16:34:45 -0400
Chuck Lever <chuck.lever@oracle.com> wrote:

> 
> On Oct 12, 2010, at 4:26 PM, Jeff Layton wrote:
> 
> > On Tue, 12 Oct 2010 16:21:00 -0400
> > Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> > 
> >> On Tue, 2010-10-12 at 15:52 -0400, Jeff Layton wrote:
> >>> On Tue, 12 Oct 2010 15:44:09 -0400
> >>> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> >>> 
> >>>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> >>>> 
> >>>>> I think the part that causes problems is having userspace do this. In
> >>>>> theory, if the kernel were in charge of sending the UMNT, then it's not
> >>>>> really a problem since it knows when to do it. If we have code that
> >>>>> sends a UMNT already, why not do a best-effort UMNT call from the
> >>>>> kernel when we tear down the sb?
> >>>> 
> >>>> Purely for the pleasure of allowing the server to maintain inaccurate
> >>>> statistics about who is currently mounting what? I think not...
> >>>> 
> >>>> You can get far more accurate results by replacing the MNT/UMNT state
> >>>> counter with a purely server-based scheme to track who accessed one or
> >>>> more files on each exported partition in the past 5 minutes or so. That
> >>>> would even work with NFSv4...
> >>>> 
> >>> 
> >>> True, but for better or worse, UMNT is part of the protocol. It seems
> >>> like we ought to do our best to implement it, even if it is
> >>> fundamentally flawed.
> >>> 
> >> 
> >> UMNTALL is part of the same protocol, and yet we have never implemented
> >> that. Just because something is documented, it doesn't mean we have to
> >> do it...
> >> 
> >> The bottom line is that UMNT doesn't do what it advertises itself as
> >> doing, and so we should not waste space supporting it in the kernel. We
> >> shouldn't do so in userspace either.
> >> 
> > 
> > Fair enough. Like Chuck I don't have strong feelings about it, other
> > than seeing no need to continue shipping umount.nfs.
> 
> Careful... we still need umount.nfs, which is a link to mount.nfs.  Something in user space has to kick off local namespace changes, even if the server isn't affected.
> 

Most other filesystems don't need a umount helper. If there isn't one
present then /bin/umount takes care of the umount() call and cleaning
the entry out of /etc/mtab.

My understanding was that umount.nfs only existed because it was needed
to handle the UMNT RPC. Without that, there's really no need for it.

...or am I overlooking something?

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-12 20:50                   ` Jeff Layton
@ 2010-10-12 21:19                     ` Chuck Lever
  2010-10-13  1:00                       ` Jeff Layton
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-12 21:19 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Linux NFS Mailing List


On Oct 12, 2010, at 4:50 PM, Jeff Layton wrote:

> On Tue, 12 Oct 2010 16:34:45 -0400
> Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> 
>> On Oct 12, 2010, at 4:26 PM, Jeff Layton wrote:
>> 
>>> On Tue, 12 Oct 2010 16:21:00 -0400
>>> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
>>> 
>>>> On Tue, 2010-10-12 at 15:52 -0400, Jeff Layton wrote:
>>>>> On Tue, 12 Oct 2010 15:44:09 -0400
>>>>> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
>>>>> 
>>>>>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
>>>>>> 
>>>>>>> I think the part that causes problems is having userspace do this. In
>>>>>>> theory, if the kernel were in charge of sending the UMNT, then it's not
>>>>>>> really a problem since it knows when to do it. If we have code that
>>>>>>> sends a UMNT already, why not do a best-effort UMNT call from the
>>>>>>> kernel when we tear down the sb?
>>>>>> 
>>>>>> Purely for the pleasure of allowing the server to maintain inaccurate
>>>>>> statistics about who is currently mounting what? I think not...
>>>>>> 
>>>>>> You can get far more accurate results by replacing the MNT/UMNT state
>>>>>> counter with a purely server-based scheme to track who accessed one or
>>>>>> more files on each exported partition in the past 5 minutes or so. That
>>>>>> would even work with NFSv4...
>>>>>> 
>>>>> 
>>>>> True, but for better or worse, UMNT is part of the protocol. It seems
>>>>> like we ought to do our best to implement it, even if it is
>>>>> fundamentally flawed.
>>>>> 
>>>> 
>>>> UMNTALL is part of the same protocol, and yet we have never implemented
>>>> that. Just because something is documented, it doesn't mean we have to
>>>> do it...
>>>> 
>>>> The bottom line is that UMNT doesn't do what it advertises itself as
>>>> doing, and so we should not waste space supporting it in the kernel. We
>>>> shouldn't do so in userspace either.
>>>> 
>>> 
>>> Fair enough. Like Chuck I don't have strong feelings about it, other
>>> than seeing no need to continue shipping umount.nfs.
>> 
>> Careful... we still need umount.nfs, which is a link to mount.nfs.  Something in user space has to kick off local namespace changes, even if the server isn't affected.
>> 
> 
> Most other filesystems don't need a umount helper. If there isn't one
> present then /bin/umount takes care of the umount() call and cleaning
> the entry out of /etc/mtab.
> 
> My understanding was that umount.nfs only existed because it was needed
> to handle the UMNT RPC. Without that, there's really no need for it.
> 
> ...or am I overlooking something?

umount.nfs still handles "users=" which has special behavior for NFS mounts, for example.  Does that matter?

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-12 21:19                     ` Chuck Lever
@ 2010-10-13  1:00                       ` Jeff Layton
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff Layton @ 2010-10-13  1:00 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Trond Myklebust, Linux NFS Mailing List

On Tue, 12 Oct 2010 17:19:11 -0400
Chuck Lever <chuck.lever@oracle.com> wrote:

> 
> On Oct 12, 2010, at 4:50 PM, Jeff Layton wrote:
> 
> > On Tue, 12 Oct 2010 16:34:45 -0400
> > Chuck Lever <chuck.lever@oracle.com> wrote:
> > 
> >> 
> >> On Oct 12, 2010, at 4:26 PM, Jeff Layton wrote:
> >> 
> >>> On Tue, 12 Oct 2010 16:21:00 -0400
> >>> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> >>> 
> >>>> On Tue, 2010-10-12 at 15:52 -0400, Jeff Layton wrote:
> >>>>> On Tue, 12 Oct 2010 15:44:09 -0400
> >>>>> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> >>>>> 
> >>>>>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> >>>>>> 
> >>>>>>> I think the part that causes problems is having userspace do this. In
> >>>>>>> theory, if the kernel were in charge of sending the UMNT, then it's not
> >>>>>>> really a problem since it knows when to do it. If we have code that
> >>>>>>> sends a UMNT already, why not do a best-effort UMNT call from the
> >>>>>>> kernel when we tear down the sb?
> >>>>>> 
> >>>>>> Purely for the pleasure of allowing the server to maintain inaccurate
> >>>>>> statistics about who is currently mounting what? I think not...
> >>>>>> 
> >>>>>> You can get far more accurate results by replacing the MNT/UMNT state
> >>>>>> counter with a purely server-based scheme to track who accessed one or
> >>>>>> more files on each exported partition in the past 5 minutes or so. That
> >>>>>> would even work with NFSv4...
> >>>>>> 
> >>>>> 
> >>>>> True, but for better or worse, UMNT is part of the protocol. It seems
> >>>>> like we ought to do our best to implement it, even if it is
> >>>>> fundamentally flawed.
> >>>>> 
> >>>> 
> >>>> UMNTALL is part of the same protocol, and yet we have never implemented
> >>>> that. Just because something is documented, it doesn't mean we have to
> >>>> do it...
> >>>> 
> >>>> The bottom line is that UMNT doesn't do what it advertises itself as
> >>>> doing, and so we should not waste space supporting it in the kernel. We
> >>>> shouldn't do so in userspace either.
> >>>> 
> >>> 
> >>> Fair enough. Like Chuck I don't have strong feelings about it, other
> >>> than seeing no need to continue shipping umount.nfs.
> >> 
> >> Careful... we still need umount.nfs, which is a link to mount.nfs.  Something in user space has to kick off local namespace changes, even if the server isn't affected.
> >> 
> > 
> > Most other filesystems don't need a umount helper. If there isn't one
> > present then /bin/umount takes care of the umount() call and cleaning
> > the entry out of /etc/mtab.
> > 
> > My understanding was that umount.nfs only existed because it was needed
> > to handle the UMNT RPC. Without that, there's really no need for it.
> > 
> > ...or am I overlooking something?
> 
> umount.nfs still handles "users=" which has special behavior for NFS mounts, for example.  Does that matter?
> 

Does it? I don't see where it does anything with "users=", but it does
handle "users" and "user=". So does /bin/umount though, and I think the
semantics are the same for both that and umount.nfs.
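
A simplified sketch of those semantics as described in mount(8): with "users" any user may unmount, while with "user" only the user recorded as "user=<name>" in the mount table may. The function is hypothetical, ignores the related "owner" and "group" options, and expects the options as a dict of key/value pairs.

```python
def may_unmount(options, caller, is_root=False):
    """Decide whether caller may unmount, given the mount's recorded options.

    options maps option names to values, e.g. {"rw": "", "user": "alice"}.
    """
    if is_root:
        return True
    if "users" in options:
        # Any user may unmount, not just the one who mounted.
        return True
    # "user": only the user whose name was recorded at mount time.
    return options.get("user") == caller
```

If both /bin/umount and umount.nfs apply the same rule, this check by itself is no reason to keep the helper.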

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-12 19:44         ` Trond Myklebust
  2010-10-12 19:52           ` Jeff Layton
@ 2010-10-13 17:40           ` Steve Dickson
  2010-10-13 18:13             ` Jeff Layton
  2010-10-13 18:18             ` Trond Myklebust
  1 sibling, 2 replies; 36+ messages in thread
From: Steve Dickson @ 2010-10-13 17:40 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Jeff Layton, Chuck Lever, Linux NFS Mailing List

Sorry for joining late... 

On 10/12/2010 03:44 PM, Trond Myklebust wrote:
> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> 
>> I think the part that causes problems is having userspace do this. In
>> theory, if the kernel were in charge of sending the UMNT, then it's not
>> really a problem since it knows when to do it. If we have code that
>> sends a UMNT already, why not do a best-effort UMNT call from the
>> kernel when we tear down the sb?
> 
> Purely for the pleasure of allowing the server to maintain inaccurate
> statistics about who is currently mounting what? I think not...
> 
> You can get far more accurate results by replacing the MNT/UMNT state
> counter with a purely server-based scheme to track who accessed one or
> more files on each exported partition in the past 5 minutes or so. That
> would even work with NFSv4...
> 
>> Either way, eliminating umount.nfs would be nice...
> 
> Agreed.

I'm having a hard time understanding this logic...

Why do we think we (the Linux community) can simply
throw away an established part of the protocol just because
we deem it advisory... Now maybe in our implementation UMNT is
advisory, and it might even be advisory in the spec, but how do we
know that in other NFS implementations it is not advisory but actually needed?
We don't know and we can't know....

Now when our implementation becomes an NFSv4-only implementation,
I say fine; eliminate all the protocols that go along
with both v2 and v3. But until then let's just leave
the legacy protocols alone and move forward with more meaningful
efforts...

steved.
 




* Re: whither NFS umount?
  2010-10-13 17:40           ` Steve Dickson
@ 2010-10-13 18:13             ` Jeff Layton
  2010-10-13 18:45               ` Steve Dickson
  2010-10-13 18:18             ` Trond Myklebust
  1 sibling, 1 reply; 36+ messages in thread
From: Jeff Layton @ 2010-10-13 18:13 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Trond Myklebust, Chuck Lever, Linux NFS Mailing List

On Wed, 13 Oct 2010 13:40:37 -0400
Steve Dickson <SteveD@redhat.com> wrote:

> Sorry for joining late... 
> 
> On 10/12/2010 03:44 PM, Trond Myklebust wrote:
> > On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> > 
> >> I think the part that causes problems is having userspace do this. In
> >> theory, if the kernel were in charge of sending the UMNT, then it's not
> >> really a problem since it knows when to do it. If we have code that
> >> sends a UMNT already, why not do a best-effort UMNT call from the
> >> kernel when we tear down the sb?
> > 
> > Purely for the pleasure of allowing the server to maintain inaccurate
> > statistics about who is currently mounting what? I think not...
> > 
> > You can get far more accurate results by replacing the MNT/UMNT state
> > counter with a purely server-based scheme to track who accessed one or
> > more files on each exported partition in the past 5 minutes or so. That
> > would even work with NFSv4...
> > 
> >> Either way, eliminating umount.nfs would be nice...
> > 
> > Agreed.
> I having a hard time understanding this logic... 
> 
> Why do we think we (the Linux community) can simply 
> throw way an established part of the protocol just because 
> we deem it advisory... Now maybe in our implementation UMNT its
> advisory and it might even be advisory in the spec, but how do we 
> know with  other NFS implementation is not advisory, its actually needed.
> We don't known and we can't known....
> 
> Now when our implementation becomes an NFSv4 only implementation, 
> I say fine; Eliminate all the protocols that go along
> with both v2 and v3. But until then lets just have leave
> the legacy protocols along and move forward in more meaningful 
> efforts... 
> 

It's not clear to me what you're advocating here...

umount.nfs is clearly broken today -- it sends UMNT calls even when it
shouldn't and there are problems with mtab handling.

The latter part is the main problem. The question is though -- should
we fix umount.nfs or just do away with it?

My position is that it really serves no good purpose these days. Only
the kernel knows when a UMNT call should be sent, so that really has no
place in umount.nfs. Once that piece is out of userspace, we might as
well just let /bin/umount do everything and not bother with umount.nfs.
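
As a rough illustration of the bookkeeping this implies, here is a sketch in
Python (for readability only; it is not kernel code). The ExportTracker class
and the send_umnt callback are made-up names standing in for whatever would
track shared superblocks and issue the UMNT RPC. The point is just that a UMNT
goes out only when the last mount of a given export disappears, and that a
failed UMNT never fails the unmount itself.

```python
from collections import Counter


class ExportTracker:
    """Counts live mounts per (server, export); hypothetical illustration."""

    def __init__(self, send_umnt):
        # send_umnt is an assumed callback standing in for the UMNT RPC.
        self._mounts = Counter()
        self._send_umnt = send_umnt

    def mount(self, server, export):
        self._mounts[(server, export)] += 1

    def umount(self, server, export):
        key = (server, export)
        if self._mounts[key] == 0:
            raise ValueError(f"{server}:{export} is not mounted")
        self._mounts[key] -= 1
        if self._mounts[key] == 0:
            del self._mounts[key]
            # Best effort: UMNT is advisory, so errors are ignored.
            try:
                self._send_umnt(server, export)
            except OSError:
                pass
```

Userspace cannot do this reliably because it has no view of when the last
user of a shared mount goes away.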

A separate question is whether we should send a UMNT request at all.
Trond has basically said "don't bother" to that one. I don't feel
strongly either way, but wouldn't be opposed to sending a best-effort
UMNT call from the kernel if it makes some servers happy.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-13 17:40           ` Steve Dickson
  2010-10-13 18:13             ` Jeff Layton
@ 2010-10-13 18:18             ` Trond Myklebust
  2010-10-13 19:28               ` Steve Dickson
  1 sibling, 1 reply; 36+ messages in thread
From: Trond Myklebust @ 2010-10-13 18:18 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Chuck Lever, Linux NFS Mailing List

On Wed, 2010-10-13 at 13:40 -0400, Steve Dickson wrote:
> Sorry for joining late... 
> 
> On 10/12/2010 03:44 PM, Trond Myklebust wrote:
> > On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
> > 
> >> I think the part that causes problems is having userspace do this. In
> >> theory, if the kernel were in charge of sending the UMNT, then it's not
> >> really a problem since it knows when to do it. If we have code that
> >> sends a UMNT already, why not do a best-effort UMNT call from the
> >> kernel when we tear down the sb?
> > 
> > Purely for the pleasure of allowing the server to maintain inaccurate
> > statistics about who is currently mounting what? I think not...
> > 
> > You can get far more accurate results by replacing the MNT/UMNT state
> > counter with a purely server-based scheme to track who accessed one or
> > more files on each exported partition in the past 5 minutes or so. That
> > would even work with NFSv4...
> > 
> >> Either way, eliminating umount.nfs would be nice...
> > 
> > Agreed.
> I having a hard time understanding this logic... 
> 
> Why do we think we (the Linux community) can simply 
> throw way an established part of the protocol just because 
> we deem it advisory... Now maybe in our implementation UMNT its
> advisory and it might even be advisory in the spec, but how do we 
> know with  other NFS implementation is not advisory, its actually needed.
> We don't known and we can't known....

Yes we do know!

Anything that relies on a _stateful_ protocol that doesn't have a way to
deal with the fact that clients may go away and never return is
inherently broken. That lesson is exactly why we moved to making state
subject to a lease in NFSv4.

Furthermore, it is not as if we have more than a semi-working
implementation of this now: we don't implement UMNTALL on client reboot
(I doubt that even Solaris bothers doing that) and we don't get UMNT
right if the same filesystem is mounted twice on the same client.

IOW: if there are servers that really do require UMNT to work, then they
will already be learning the errors of their assumptions with today's
client.

> Now when our implementation becomes an NFSv4 only implementation, 
> I say fine; Eliminate all the protocols that go along
> with both v2 and v3. But until then lets just have leave
> the legacy protocols along and move forward in more meaningful 
> efforts... 

For the reasons stated above, I see no need to put UMNT support in the
kernel, nor do I want yet another upcall mechanism in order to make
UMNTALL work.
For the same reasons, I don't care if people keep it or throw it out
from the userland utilities.

Trond


* Re: whither NFS umount?
  2010-10-13 18:13             ` Jeff Layton
@ 2010-10-13 18:45               ` Steve Dickson
       [not found]                 ` <4CB5FE65.3090409-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-13 18:45 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Chuck Lever, Linux NFS Mailing List



On 10/13/2010 02:13 PM, Jeff Layton wrote:
> On Wed, 13 Oct 2010 13:40:37 -0400
> Steve Dickson <SteveD@redhat.com> wrote:
> 
>> Sorry for joining late... 
>>
>> On 10/12/2010 03:44 PM, Trond Myklebust wrote:
>>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
>>>
>>>> I think the part that causes problems is having userspace do this. In
>>>> theory, if the kernel were in charge of sending the UMNT, then it's not
>>>> really a problem since it knows when to do it. If we have code that
>>>> sends a UMNT already, why not do a best-effort UMNT call from the
>>>> kernel when we tear down the sb?
>>>
>>> Purely for the pleasure of allowing the server to maintain inaccurate
>>> statistics about who is currently mounting what? I think not...
>>>
>>> You can get far more accurate results by replacing the MNT/UMNT state
>>> counter with a purely server-based scheme to track who accessed one or
>>> more files on each exported partition in the past 5 minutes or so. That
>>> would even work with NFSv4...
>>>
>>>> Either way, eliminating umount.nfs would be nice...
>>>
>>> Agreed.
>> I having a hard time understanding this logic... 
>>
>> Why do we think we (the Linux community) can simply 
>> throw way an established part of the protocol just because 
>> we deem it advisory... Now maybe in our implementation UMNT its
>> advisory and it might even be advisory in the spec, but how do we 
>> know with  other NFS implementation is not advisory, its actually needed.
>> We don't known and we can't known....
>>
>> Now when our implementation becomes an NFSv4 only implementation, 
>> I say fine; Eliminate all the protocols that go along
>> with both v2 and v3. But until then lets just have leave
>> the legacy protocols along and move forward in more meaningful 
>> efforts... 
>>
> 
> It's not clear to me what you're advocating here...
I'm advocating that we do not remove the umount.nfs command, which
in turn would not remove the UMNT calls. It's just not
worth the pain... because there is no gain!

> 
> umount.nfs is clearly broken today -- it sends UMNT calls even when it
> shouldn't and there are problems with mtab handling.
Well that's a bug, so let's fix it... But the fix should not
be to completely remove the call.

> 
> The latter part is the main problem. The question is though -- should
> we fix umount.nfs or just do away with it?
Fix it... IMHO..

> 
> My position is that it really serves no good purpose these days. Only
> the kernel knows when a UMNT call should be sent, so that really has no
> place in umount.nfs. Once that piece is out of userspace, we might as
> well just let /bin/umount do everything and not bother with umount.nfs.
Just curious... how would we maintain backwards compatibility with old
kernels?

Again, why even put effort into things like this? All the testing
something like this could require... all the (potential) breakage of
third-party applications.... There is absolutely no reason to
do this... because, again, there is no real gain to it... IMHO...

> 
> A separate question is whether we should sent a UMNT request at all.
> Trond has basically said "don't bother" to that one. I don't feel
> strongly either way, but wouldn't be opposed to sending a best-effort
> UMNT call from the kernel if it makes some servers happy.
> 
I would say send the UMNT, since it does not cause any pain to send it
versus the pain that could be caused by not sending it...

This is a perfect example of fixing something that is not
broken... We can put our energy in a better place than worrying
about things like this... IMHO...

steved.


* Re: whither NFS umount?
       [not found]                 ` <4CB5FE65.3090409-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
@ 2010-10-13 18:56                   ` Jeff Layton
  2010-10-13 18:58                     ` Jeff Layton
       [not found]                     ` <20101013145601.468acc2a-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  0 siblings, 2 replies; 36+ messages in thread
From: Jeff Layton @ 2010-10-13 18:56 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Trond Myklebust, Chuck Lever, Linux NFS Mailing List

On Wed, 13 Oct 2010 14:45:57 -0400
Steve Dickson <SteveD@redhat.com> wrote:

> I would say send the UMNT, since it does not cause any pain to send it
> verses the pain that could be cause by not sending it...
> 
> This is a perfect example of fixing something that is not
> broken... We can put our energy in better place that worrying
> about things like this... IMHO...

But it *is* broken. As Chuck pointed out, the main problem is that mtab
handling is broken on remounts. That's a real problem that needs to be
fixed.

I agree that our time is better spent elsewhere. I just think that we
ought to make that happen by eliminating the unnecessary umount helper.
The less code that we need to maintain, the better...

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-13 18:56                   ` Jeff Layton
@ 2010-10-13 18:58                     ` Jeff Layton
       [not found]                     ` <20101013145601.468acc2a-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  1 sibling, 0 replies; 36+ messages in thread
From: Jeff Layton @ 2010-10-13 18:58 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Trond Myklebust, Chuck Lever, Linux NFS Mailing List

On Wed, 13 Oct 2010 14:56:01 -0400
Jeff Layton <jlayton@redhat.com> wrote:

> On Wed, 13 Oct 2010 14:45:57 -0400
> Steve Dickson <SteveD@redhat.com> wrote:
> 
> > I would say send the UMNT, since it does not cause any pain to send it
> > verses the pain that could be cause by not sending it...
> > 
> > This is a perfect example of fixing something that is not
> > broken... We can put our energy in better place that worrying
> > about things like this... IMHO...
> 
> But it *is* broken. As Chuck pointed out, the main problem is that mtab
> handling is broken on remounts. That's a real problem that needs to be
> fixed.
> 
> I agree that our time is better spent elsewhere. I just think that we
> ought to make that happen by eliminating the unnecessary umount helper.
> The less code that we need to maintain, the better...
> 

Sorry, let me clarify... we'll still need to fix -o remount handling in
mount.nfs. At the same time though, we can reduce our maintenance
burden by getting rid of umount.nfs.

I just don't think it serves much of a purpose these days...

-- 
Jeff Layton <jlayton@redhat.com>


* Re: whither NFS umount?
  2010-10-13 18:18             ` Trond Myklebust
@ 2010-10-13 19:28               ` Steve Dickson
  2010-10-14 14:00                 ` J. Bruce Fields
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-13 19:28 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Jeff Layton, Chuck Lever, Linux NFS Mailing List



On 10/13/2010 02:18 PM, Trond Myklebust wrote:
> On Wed, 2010-10-13 at 13:40 -0400, Steve Dickson wrote:
>> Sorry for joining late... 
>>
>> On 10/12/2010 03:44 PM, Trond Myklebust wrote:
>>> On Tue, 2010-10-12 at 15:18 -0400, Jeff Layton wrote:
>>>
>>>> I think the part that causes problems is having userspace do this. In
>>>> theory, if the kernel were in charge of sending the UMNT, then it's not
>>>> really a problem since it knows when to do it. If we have code that
>>>> sends a UMNT already, why not do a best-effort UMNT call from the
>>>> kernel when we tear down the sb?
>>>
>>> Purely for the pleasure of allowing the server to maintain inaccurate
>>> statistics about who is currently mounting what? I think not...
>>>
>>> You can get far more accurate results by replacing the MNT/UMNT state
>>> counter with a purely server-based scheme to track who accessed one or
>>> more files on each exported partition in the past 5 minutes or so. That
>>> would even work with NFSv4...
>>>
>>>> Either way, eliminating umount.nfs would be nice...
>>>
>>> Agreed.
>> I having a hard time understanding this logic... 
>>
>> Why do we think we (the Linux community) can simply 
>> throw way an established part of the protocol just because 
>> we deem it advisory... Now maybe in our implementation UMNT its
>> advisory and it might even be advisory in the spec, but how do we 
>> know with  other NFS implementation is not advisory, its actually needed.
>> We don't known and we can't known....
> 
> Yes we do know!
> 
> Anything that relies on a _stateful_ protocol that doesn't have a way to
> deal with the fact that clients may go away and never return is
> inherently broken. That lesson is exactly why we moved to making state
> subject to a lease in NFSv4.
> 
> Furthermore, it is not as if we have more than a semi-working
> implementation of this now: we don't implement UMNTALL on client reboot
> (I doubt that even Solaris bothers doing that) and we don't get UMNT
> right if the same filesystem is mounted twice on the same client.
> 
> IOW: if there are servers that really do require UMNT to work, then they
> will already be learning the errors of their assumptions with today's
> client.
Your reasoning is very solid... I agree, if servers, for some reason,
are dependent on this state they are broken. But *not* staying
compatible with broken servers is not an option, at least from
where I view the world.. ;-)
  
> 
>> Now when our implementation becomes an NFSv4 only implementation, 
>> I say fine; Eliminate all the protocols that go along
>> with both v2 and v3. But until then lets just have leave
>> the legacy protocols along and move forward in more meaningful 
>> efforts... 
> 
> For the reasons state above, I see no need to put UMNT support in the
> kernel, nor do I want yet another upcall mechanism in order to make
> UMNTALL work.
Fine... 

> For the same reasons, I don't care if people keep it or throw it out
> from the userland utilities.
Unfortunately I do! 8-) 

steved.




* Re: whither NFS umount?
       [not found]                     ` <20101013145601.468acc2a-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
@ 2010-10-13 19:31                       ` Steve Dickson
  2010-10-13 20:47                         ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-13 19:31 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, Chuck Lever, Linux NFS Mailing List

On 10/13/2010 02:56 PM, Jeff Layton wrote:
> On Wed, 13 Oct 2010 14:45:57 -0400
> Steve Dickson <SteveD@redhat.com> wrote:
> 
>> I would say send the UMNT, since it does not cause any pain to send it
>> verses the pain that could be cause by not sending it...
>>
>> This is a perfect example of fixing something that is not
>> broken... We can put our energy in better place that worrying
>> about things like this... IMHO...
> 
> But it *is* broken. As Chuck pointed out, the main problem is that mtab
> handling is broken on remounts. That's a real problem that needs to be
> fixed.
Fine... Let's just focus on that issue...

> 
> I agree that our time is better spent elsewhere. I just think that we
> ought to make that happen by eliminating the unnecessary umount helper.
> The less code that we need to maintain, the better...
In general I agree... but removing functionality (i.e. umount.nfs)
can cause more pain than just leaving things as is...

steved.



* Re: whither NFS umount?
  2010-10-13 19:31                       ` Steve Dickson
@ 2010-10-13 20:47                         ` Chuck Lever
  2010-10-13 23:19                           ` Steve Dickson
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-13 20:47 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List


On Oct 13, 2010, at 3:31 PM, Steve Dickson wrote:

> On 10/13/2010 02:56 PM, Jeff Layton wrote:
>> On Wed, 13 Oct 2010 14:45:57 -0400
>> Steve Dickson <SteveD@redhat.com> wrote:
>> 
>>> I would say send the UMNT, since it does not cause any pain to send it
>>> verses the pain that could be cause by not sending it...
>>> 
>>> This is a perfect example of fixing something that is not
>>> broken... We can put our energy in better place that worrying
>>> about things like this... IMHO...
>> 
>> But it *is* broken. As Chuck pointed out, the main problem is that mtab
>> handling is broken on remounts. That's a real problem that needs to be
>> fixed.
> Fine... Lets just focus on that issue...  

Actually...

Removing UMNT support _is_ the proposed fix for the "-o remount" issue.  The reason we depend on /etc/mtab is that umount needs to know how to do the unmount.  A "-o remount" wipes that information.  It turns out that there is no correct implementation of remount rewriting the options in /etc/mtab.  util-linux-ng does not even get this right, and mount(8) claims that /etc/mtab will not be reliable after a remount.

If we no longer have to perform a UMNT, then there is no need to worry about the contents of /etc/mtab, for any reason, and it can go away.  I believe we are the only file system that is holding up the complete removal of /etc/mtab.

I think there's no reason to keep UMNT in user space; it simply can never work correctly there.  I believe a kernel space implementation would be simple, and would work correctly much more often than user space UMNT does now.  I don't agree with Trond that we should not do it there, but that's an argument I can't win.

>> I agree that our time is better spent elsewhere. I just think that we
>> ought to make that happen by eliminating the unnecessary umount helper.
>> The less code that we need to maintain, the better...
> In general I agree... but removing functionality (i.e. umount.nfs)
> can cause more pain than just leaving things as is...

Is there any specific evidence that there will be pain if umount.nfs is removed?  I can't think of any use case it would harm.  It's already broken in many cases, and we have never heard of a specific application complaint about it.  If there had been a complaint we would have fixed it already.  UMNT is clearly vestigial.

-- 
chuck[dot]lever[at]oracle[dot]com



* Re: whither NFS umount?
  2010-10-13 20:47                         ` Chuck Lever
@ 2010-10-13 23:19                           ` Steve Dickson
  2010-10-14 15:29                             ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-13 23:19 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

Hey... 

On 10/13/2010 04:47 PM, Chuck Lever wrote:
> 
> On Oct 13, 2010, at 3:31 PM, Steve Dickson wrote:
> 
>> On 10/13/2010 02:56 PM, Jeff Layton wrote:
>>> On Wed, 13 Oct 2010 14:45:57 -0400
>>> Steve Dickson <SteveD@redhat.com> wrote:
>>>
>>>> I would say send the UMNT, since it does not cause any pain to send it
>>>> verses the pain that could be cause by not sending it...
>>>>
>>>> This is a perfect example of fixing something that is not
>>>> broken... We can put our energy in better place that worrying
>>>> about things like this... IMHO...
>>>
>>> But it *is* broken. As Chuck pointed out, the main problem is that mtab
>>> handling is broken on remounts. That's a real problem that needs to be
>>> fixed.
>> Fine... Lets just focus on that issue...  
> 
> Actually...
> 
> Removing UMNT support _is_ the proposed fix for the "-o remount" issue.  
> The reason we depend on /etc/mtab is because umount needs to know how to do 
> the unmount.  
I'm in the middle of testing one of your patches that uses /proc/mounts
to figure out how to unmount things... Which I think is the correct way
to do it... Breaking our dependence on /etc/mtab is a good thing... IMHO...
 
> A "-o remount" wipes that information.  It turns out that 
> there is no correct implementation of remount rewriting the options in /etc/mtab.
> utils-linux-ng does not even get this right, and mount(8) claims 
> that /etc/mtab will not be reliable after a remount.
Another reason for us to move away from /etc/mtab... 

> 
> If we no longer have to perform a UMNT, then there is no need to worry about
> the contents of /etc/mtab, for any reason, and it can go away.  I believe we 
> are the only file system that is holding up the complete removal of /etc/mtab.
I guess I don't understand the justification for not sending an industry-wide
accepted and expected protocol message because we have a bug in our
code... It is a protocol message every single NFS client sends and
has sent for many, many years... Now we are not going to adhere to this
industry practice because we don't think it's necessary, since it
will make it hard for us to fix a bug?

Let's try to fix this bug in the least disruptive way, a way that does not
change a common industry practice...
 
> 
> I think there's no reason to keep UMNT in user space; it simply can never 
> work correctly there.  I believe a kernel space implementation would be 
> simple, and would work correctly much more often than user space UMNT does now.
> I don't agree with Trond that we should not do it there, but that's an argument 
> I can't win.
I can understand the reasoning... As you know I'm always in favour of
keeping things in user space because it's always easier to update a user
package than a kernel...
 
> 
>>> I agree that our time is better spent elsewhere. I just think that we
>>> ought to make that happen by eliminating the unnecessary umount helper.
>>> The less code that we need to maintain, the better...
>> In general I agree... but removing functionality (i.e. umount.nfs)
>> can cause more pain than just leaving things as is...
> 
> Is there any specific evidence that there will be pain if umount.nfs is removed? 
> I can't think of any use case it would harm.  It's already broken in many cases,
> and we have never heard of a specific application complaint about it.  If there 
> had been a complaint we would have fixed it already.  UMNT is clearly vestigial.
No, I do not have any evidence.... It's just pure paranoia about the fact that we
could easily break third-party applications that depend on the
existence of that binary... Plus removing the binary is simply not needed
to fix a bug in the processing of /etc/mtab... It's just overkill... IMHO...

I'll take a look and see if I can come up with a way of fixing the remount
bug without removing the umount.nfs binary, while continuing to
send out the UMNT message... It has to be possible...

steved.



* Re: whither NFS umount?
  2010-10-13 19:28               ` Steve Dickson
@ 2010-10-14 14:00                 ` J. Bruce Fields
  2010-10-14 14:17                   ` Trond Myklebust
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2010-10-14 14:00 UTC (permalink / raw)
  To: Steve Dickson
  Cc: Trond Myklebust, Jeff Layton, Chuck Lever, Linux NFS Mailing List

On Wed, Oct 13, 2010 at 03:28:41PM -0400, Steve Dickson wrote:
> 
> 
> On 10/13/2010 02:18 PM, Trond Myklebust wrote:
> > Yes we do know!
> > 
> > Anything that relies on a _stateful_ protocol that doesn't have a way to
> > deal with the fact that clients may go away and never return is
> > inherently broken. That lesson is exactly why we moved to making state
> > subject to a lease in NFSv4.

And yet, we try to make lockd work as well as we can.

(OK, it's a question of degree: nlm isn't as broken as UMNT.)

> > Furthermore, it is not as if we have more than a semi-working
> > implementation of this now: we don't implement UMNTALL on client reboot
> > (I doubt that even Solaris bothers doing that) and we don't get UMNT
> > right if the same filesystem is mounted twice on the same client.
> > 
> > IOW: if there are servers that really do require UMNT to work, then they
> > will already be learning the errors of their assumptions with today's
> > client.
> You reasoning is very solid... I agree, if servers, for some reason,
> are depended on this state they are broken. But *not* staying 
> compatible with broken server is not an option, at least from 
> where I view the world.. ;-)

I'm inclined to agree, but a) I haven't thought about it much, b)
posting a patch would be more effective than repeating the same point,
and c) we shouldn't take this as an excuse not to fix the server.

So, on the server side...  As Trond says, we'd like to know which
clients have been active recently.  (In the v4 case, the lease gives
that a precise meaning.  In the v2/v3 case, I suppose we just pick a
number to use as a timeout.  We could use the v4 lease time.)  It's only
the kernel that knows which clients have been active recently.  So we'd
need a way for the client to export that information to userspace.

Greg's per-client statistics might be a good starting point:

	http://thread.gmane.org/gmane.linux.nfs/25537/focus=25534

Though that's just per-ip address, which isn't ideal for v4.  (In the v4
case the client identifier/owner allows us to distinguish e.g. between
multiple clients on the same machine, or behind the same NAT.)  I think
that wouldn't be hard to fix.  (Maybe just add on some ascii-escaped
representation of the clientid to the end of the same line?)
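
To make the "recently active" idea concrete, here is a minimal sketch in
Python (illustration only, not nfsd code). The ActiveClients class, its method
names, and the 90-second default are all assumptions made up for the example.
The key can be whatever identifies a client: an IP address for v2/v3, or an
(address, clientid) pair for v4.

```python
import time


class ActiveClients:
    """Remembers when each client was last seen; hypothetical illustration."""

    def __init__(self, timeout=90.0, clock=time.monotonic):
        # timeout is an arbitrary placeholder; the v4 lease time could be used.
        self._timeout = timeout
        self._clock = clock
        self._last_seen = {}

    def note_request(self, client_id):
        # Called for every request the server processes from this client.
        self._last_seen[client_id] = self._clock()

    def active(self):
        # Forget clients that have been silent for longer than the timeout.
        cutoff = self._clock() - self._timeout
        self._last_seen = {
            c: t for c, t in self._last_seen.items() if t >= cutoff
        }
        return set(self._last_seen)
```

Unlike a MNT/UMNT counter, this needs no cooperation from the client, and a
client that disappears without saying goodbye simply ages out.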

--b.


* Re: whither NFS umount?
  2010-10-14 14:00                 ` J. Bruce Fields
@ 2010-10-14 14:17                   ` Trond Myklebust
       [not found]                     ` <1287065841.3015.233.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Trond Myklebust @ 2010-10-14 14:17 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Steve Dickson, Jeff Layton, Chuck Lever, Linux NFS Mailing List

On Thu, 2010-10-14 at 10:00 -0400, J. Bruce Fields wrote:
> So, on the server side...  As Trond says, we'd like to know which
> clients have been active recently.  (In the v4 case, the lease gives
> that a precise meaning.  In the v2/v3 case, I suppose we just pick a
> number to use as a timeout.  We could use the v4 lease time.)  It's only
> the kernel that knows which clients have been active recently.  So we'd
> need a way for the client to export that information to userspace.
> 
> Greg's per-client statistics might be a good starting point:
> 
> 	http://thread.gmane.org/gmane.linux.nfs/25537/focus=25534
> 
> Though that's just per-ip address, which isn't ideal for v4.  (In the v4
> case the client identifier/owner allows us to distinguish e.g. between
> multiple clients on the same machine, or behind the same NAT.)  I think
> that wouldn't be hard to fix.  (Maybe just add on some ascii-escaped
> representation of the clientid to the end of the same line?)

Relying on the lease isn't good enough. While an NFSv4.1 client is
indeed required to renew its lease before it can do any useful work, an
NFSv4.0 client is only required to establish a lease in order to OPEN a
file.
The Linux NFS client will, for instance, defer establishing a lease
until first open, and will stop renewing that lease when nobody is
holding any open file state.

One alternative to using the lease state is to use the TCP connection
state and/or the TCP/UDP port number. That's pretty much what we use to
distinguish separate NFS clients in the replay cache anyway...

Cheers
  Trond


* Re: whither NFS umount?
       [not found]                     ` <1287065841.3015.233.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2010-10-14 14:34                       ` J. Bruce Fields
  0 siblings, 0 replies; 36+ messages in thread
From: J. Bruce Fields @ 2010-10-14 14:34 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Steve Dickson, Jeff Layton, Chuck Lever, Linux NFS Mailing List

On Thu, Oct 14, 2010 at 10:17:21AM -0400, Trond Myklebust wrote:
> On Thu, 2010-10-14 at 10:00 -0400, J. Bruce Fields wrote:
> > So, on the server side...  As Trond says, we'd like to know which
> > clients have been active recently.  (In the v4 case, the lease gives
> > that a precise meaning.  In the v2/v3 case, I suppose we just pick a
> > number to use as a timeout.  We could use the v4 lease time.)  It's only
> > the kernel that knows which clients have been active recently.  So we'd
> > need a way for the client to export that information to userspace.
> > 
> > Greg's per-client statistics might be a good starting point:
> > 
> > 	http://thread.gmane.org/gmane.linux.nfs/25537/focus=25534
> > 
> > Though that's just per-ip address, which isn't ideal for v4.  (In the v4
> > case the client identifier/owner allows us to distinguish e.g. between
> > multiple clients on the same machine, or behind the same NAT.)  I think
> > that wouldn't be hard to fix.  (Maybe just add on some ascii-escaped
> > representation of the clientid to the end of the same line?)
> 
> Relying on the lease isn't good enough. While an NFSv4.1 client is
> indeed required to renew its lease before it can do any useful work, an
> NFSv4.0 client is only required to establish a lease in order to OPEN a
> file.
> The Linux NFS client will, for instance, defer establishing a lease
> until first open, and will stop renewing that lease when nobody is
> holding any open file state.

Good point, thanks.

--b.

> One alternative to using the lease state is to use the TCP connection
> state and/or the TCP/UDP port number. That's pretty much what we use to
> distinguish separate NFS clients in the replay cache anyway...


* Re: whither NFS umount?
  2010-10-13 23:19                           ` Steve Dickson
@ 2010-10-14 15:29                             ` Chuck Lever
  2010-10-14 18:27                               ` Steve Dickson
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-14 15:29 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List


On Oct 13, 2010, at 7:19 PM, Steve Dickson wrote:

> Hey... 
> 
> On 10/13/2010 04:47 PM, Chuck Lever wrote:
>> 
>> On Oct 13, 2010, at 3:31 PM, Steve Dickson wrote:
>> 
>>> On 10/13/2010 02:56 PM, Jeff Layton wrote:
>>>> On Wed, 13 Oct 2010 14:45:57 -0400
>>>> Steve Dickson <SteveD@redhat.com> wrote:
>>>> 
>>>>> I would say send the UMNT, since it does not cause any pain to send it
>>>>> verses the pain that could be cause by not sending it...
>>>>> 
>>>>> This is a perfect example of fixing something that is not
>>>>> broken... We can put our energy in better place that worrying
>>>>> about things like this... IMHO...
>>>> 
>>>> But it *is* broken. As Chuck pointed out, the main problem is that mtab
>>>> handling is broken on remounts. That's a real problem that needs to be
>>>> fixed.
>>> Fine... Lets just focus on that issue...  
>> 
>> Actually...
>> 
>> Removing UMNT support _is_ the proposed fix for the "-o remount" issue.  
>> The reason we depend on /etc/mtab is because umount needs to know how to do 
>> the unmount.  
> I'm in the middle of testing one of your patches that that uses /proc/mounts
> to figure out how to unmount things... Which I think is the correct way
> to do it... Breaking our dependence on /etc/mtab is a good thing... IMHO...
> 
>> A "-o remount" wipes that information.  It turns out that 
>> there is no correct implementation of remount rewriting the options in /etc/mtab.
>> utils-linux-ng does not even get this right, and mount(8) claims 
>> that /etc/mtab will not be reliable after a remount.
> Another reason for us to move away from /etc/mtab...

Using /proc/mounts to determine whether a mount point is NFSv4 or NFS is OK.

However, /proc/mounts doesn't contain the original mount options used to negotiate the mount.  Ideally, we want to use the original mount command line options, and not the exact options that were negotiated, when doing the umount.

The example I use to explain why /proc/mounts is not appropriate is a simple mount "vers=3".  It negotiates with the server and discovers the mount service is on port 4545.  The mount proceeds.  After a while the server reboots, and puts the mount service on port 32769.  Our /proc/mounts still has "mountport=4545" since that's what was negotiated at mount time and passed to the kernel.  When umount tries the UMNT request, it will read /proc/mounts, send the UMNT request to port 4545, and fail.
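A minimal sketch of that failure, for illustration only. The /proc/mounts line, the address, and the helper name parse_mount_options are made up for this example; only the two port numbers come from the scenario above.

```python
def parse_mount_options(option_string):
    """Split a comma-separated mount option string into a dict."""
    options = {}
    for item in option_string.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flag-style options such as "rw" carry no value.
        options[key] = value if sep else None
    return options


# Hypothetical /proc/mounts entry for the "vers=3" mount described above.
proc_mounts_line = (
    "server:/export /mnt nfs "
    "rw,vers=3,proto=tcp,mountproto=udp,mountport=4545,addr=192.0.2.10 0 0"
)
negotiated = parse_mount_options(proc_mounts_line.split()[3])

# What the user actually typed on the command line.
original = parse_mount_options("vers=3")

# Assumed: the port the server's mount service moved to after its reboot.
current_server_mountport = 32769

stale = int(negotiated["mountport"]) != current_server_mountport
print("mountport recorded in /proc/mounts:", negotiated["mountport"])
print("user asked for a specific mountport:", "mountport" in original)
print("recorded port no longer matches the server:", stale)
```

The recorded port is a by-product of negotiation, not something the user asked for, so nothing is lost by discarding it and asking the server again.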

Now, we could just say "screw it" and use the negotiated options rather than the user-specified ones.  But that means there would still be cases where the UMNT could have been performed correctly, yet fails.

Another choice would be to store the original NFS-related command line mount options somewhere other than /etc/mtab, and leave them alone during a remount.  If we can't perform UMNT in the kernel, I'd prefer this type of solution to either updating /etc/mtab on a remount or using /proc/mounts during umount.

How about a file under /var/lib/nfs ?

>> If we no longer have to perform a UMNT, then there is no need to worry about
>> the contents of /etc/mtab, for any reason, and it can go away.  I believe we 
>> are the only file system that is holding up the complete removal of /etc/mtab.
> I guess don't understand the justification of not send an industry
> wide accepted and expected protocol message because we have bug in our 
> code... Is a protocol message every single NFS client sends and
> has sent for many many years... Now we are not going adhere to this
> industry practice because we don't think its necessary since it 
> will make things hard for us to fix bug? 

I'm sympathetic to that, though I also agree with Trond that UMNT is probably not relevant these days.  Since it is so unreliable, I think most implementations avoid depending on it for anything.

> Lets try to fix this  bug in least disruptive way? A way that does not 
> change an industry common practice...
> 
>> 
>> I think there's no reason to keep UMNT in user space; it simply can never 
>> work correctly there.  I believe a kernel space implementation would be 
>> simple, and would work correctly much more often than user space UMNT does now.
>> I don't agree with Trond that we should not do it there, but that's an argument 
>> I can't win.
> I can understand the reasoning... As you know I'm always in favour of
> keeping things in user space because its always easier updated a user 
> packages than a kernel... 
> 
>> 
>>>> I agree that our time is better spent elsewhere. I just think that we
>>>> ought to make that happen by eliminating the unnecessary umount helper.
>>>> The less code that we need to maintain, the better...
>>> In general I agree... but removing functionality (i.e. umount.nfs)
>>> can cause more pain than just leaving things as is...
>> 
>> Is there any specific evidence that there will be pain if umount.nfs is removed? 
>> I can't think of any use case it would harm.  It's already broken in many cases,
>> and we have never heard of a specific application complaint about it.  If there 
>> had been a complaint we would have fixed it already.  UMNT is clearly vestigial.
> No I do not any evidence.... its just pure paranoia in the fact we will
> could easily break third party applications that will depend on the 
> existence of that binary... Plus removing the binary is simply not needed
> to fix a bug in the processing of /etc/mtab... Its just overkill... IMHO...

For a single bug, perhaps it is overkill.  However, there are a host of bugs around UMNT, not just this one.  A narrow fix for this bug will not address any of the other problems.  Removing umount.nfs and possibly performing UMNT in the kernel is a solution to all of these issues.

> I'll take a look and see if I can come up with a way to fixing the remount
> bug without remove the umount.nfs binary as well as continuing to
> send out the UMNT message... It has to be possible... 

The key problem is rewriting the mount options after the remount, based on what's already in /etc/mtab.  It's certainly possible, but it's going to add a lot of code. I think you will soon find that removing umount.nfs is significantly easier than trying to rewrite these options!

Also, you probably want a solution that fixes the legacy mount code as well as the text-based code.  I've looked at a solution that fits in the common code in mount.c to try addressing this.

Note that I never said it couldn't be done... just that a broader and simpler solution to all of this is to get rid of UMNT in user space.

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-14 15:29                             ` Chuck Lever
@ 2010-10-14 18:27                               ` Steve Dickson
  2010-10-14 19:13                                 ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-14 18:27 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

Hey... 

On 10/14/2010 11:29 AM, Chuck Lever wrote:
> On Oct 13, 2010, at 7:19 PM, Steve Dickson wrote:
> 
> Using /proc/mounts to determine whether a mount point is NFSv4 or NFS is OK.
> However, /proc/mounts doesn't contain the original mount options used 
> to negotiate the mount.  Ideally, we want to use the original mount command 
> line options, and not the exact options that were negotiated, when doing the umount.
>
> The example I use to explain why /proc/mounts is not appropriate is a 
> simple mount "vers=3".  It negotiates with the server and discovers the 
> mount service is on port 4545.  The mount proceeds.  After a while the 
> server reboots, and puts the mount service on port 32769.  Our /proc/mounts 
> still has "mountport=4545" since that's what was negotiated at mount time 
> and passed to the kernel.  When umount tries the UMNT request, it will 
> read /proc/mounts, send the UMNT request to port 4545, and fail.
Either don't use the mountport that's in /proc/mounts or better yet
be prepared to do a GETPORT of the service when the UMNT call
fails with 'RPC: Program not registered'. Something that is 
commonly done.. 
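A rough sketch of that retry logic, with the RPC layer faked out. Everything here is hypothetical (send_umnt_with_fallback, make_fake_server, and the exception class are illustration names, not nfs-utils functions); a real client would issue the rpcbind GETPORT and MNT-protocol calls where the two callables are invoked.

```python
class ProgramNotRegistered(Exception):
    """Stands in for an 'RPC: Program not registered' style failure."""


def send_umnt_with_fallback(send_umnt, getport, recorded_port):
    """Try the recorded mountd port first; on failure, rediscover and retry."""
    if recorded_port is not None:
        try:
            send_umnt(recorded_port)
            return recorded_port
        except (ProgramNotRegistered, ConnectionError):
            pass  # recorded port looks stale; fall through to GETPORT
    fresh_port = getport()
    send_umnt(fresh_port)
    return fresh_port


def make_fake_server(listening_port):
    """Build fake UMNT and GETPORT callables for a server on one port."""
    def send_umnt(port):
        if port != listening_port:
            raise ProgramNotRegistered(f"nothing answering on port {port}")

    def getport():
        return listening_port

    return send_umnt, getport


# The server rebooted and moved mountd from 4545 to 32769.
send_umnt, getport = make_fake_server(32769)
used_port = send_umnt_with_fallback(send_umnt, getport, recorded_port=4545)
print("UMNT delivered on port", used_port)
```

With this shape it does not matter much whether the recorded port came from /etc/mtab or /proc/mounts, as long as the fallback path is always available.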

> 
> Now, we could just say "screw it" and use the negotiated options rather 
> than the user-specified ones.  But that means there are still cases we 
> could perform the UMNT correctly which would not work.
Even the information in /etc/mtab can go stale. 
   mount mountport=4545 /mnt; server restart ; umount /mnt
where the server comes up on a different port.

Granted this is a pilot error by the admin since if the 
port is  specified on the command line the server better
be listening on that port. The point being there can be stale 
information in any static list.

So regardless of which list is read (/etc/mtab or /proc/mounts)
we need to be prepared to deal with static information...

> 
> Another choice would be to store the original NFS-related command line
> mount options somewhere other than /etc/mtab, and leave them alone during 
> a remount.  If we can't perform UMNT in the kernel, I'd prefer this type
> of solution to either updating /etc/mtab on a remount or using /proc/mounts 
> during umount.
> 
> How about a file under /var/lib/nfs ?
I refer to my above statement... No matter where we put the list, there
is always a chance that the information in the list will go stale...

So I would suggest we use /proc/mounts since that holds the last known
valid options and then be prepared to deal with stale options.

> 
>>> If we no longer have to perform a UMNT, then there is no need to worry about
>>> the contents of /etc/mtab, for any reason, and it can go away.  I believe we 
>>> are the only file system that is holding up the complete removal of /etc/mtab.
>> I guess don't understand the justification of not send an industry
>> wide accepted and expected protocol message because we have bug in our 
>> code... Is a protocol message every single NFS client sends and
>> has sent for many many years... Now we are not going adhere to this
>> industry practice because we don't think its necessary since it 
>> will make things hard for us to fix bug? 
> 
> I'm sympathetic to that, though I also agree with Trond that UMNT is
> probably not relevant these days.  Since it is so unreliable, I think
> most implementations avoid depending on it for anything.
> 
Just today I found a bug talking about an older HPUX server not
being compatible with a Fedora client... So the assumptions you guys
are making are very worrisome... It's an assumption I simply can't make...


>>> Is there any specific evidence that there will be pain if umount.nfs is removed? 
>>> I can't think of any use case it would harm.  It's already broken in many cases,
>>> and we have never heard of a specific application complaint about it.  If there 
>>> had been a complaint we would have fixed it already.  UMNT is clearly vestigial.
>> No I do not any evidence.... its just pure paranoia in the fact we will
>> could easily break third party applications that will depend on the 
>> existence of that binary... Plus removing the binary is simply not needed
>> to fix a bug in the processing of /etc/mtab... Its just overkill... IMHO...
> 
> For a single bug, perhaps it is overkill.  However, there are a host of bugs 
> around UMNT, not just this one.  A narrow fix for this bug will not address 
> any of the other problems.  Removing umount.nfs and possibly performing UMNT 
> in the kernel is a solution to all of these issues.
Go ahead and point me to those bug reports on those issues so I can 
get a better understanding... 

> 
>> I'll take a look and see if I can come up with a way to fixing the remount
>> bug without remove the umount.nfs binary as well as continuing to
>> send out the UMNT message... It has to be possible... 
> 
> The key problem is rewriting the mount options after the remount, based on what's 
> already in /etc/mtab.  It's certainly possible, but it's going to add a lot of 
> code. I think you will soon find that removing umount.nfs is significantly 
> easier than trying to rewrite these options!
> 
> Also, you probably want a solution that fixes the legacy mount code as well
> as the text-based code.  I've looked at a solution that fits in the common
> code in mount.c to try addressing this.
> 
> Note that I never said it couldn't be done... just that a broader and
> simpler solution to all of this is to get rid of UMNT in user space.
> 
At the end of the day I am dead against removing umount.nfs and stopping
sending the UMNT messages until it can be conclusively demonstrated
we will continue to be compatible with all of the new and older servers out there.

Now if this causes more work then please send it my way, since I'm
the one being the stick in the mud, I'm the one that should do the
extra work... 

steved.


* Re: whither NFS umount?
  2010-10-14 18:27                               ` Steve Dickson
@ 2010-10-14 19:13                                 ` Chuck Lever
  2010-10-14 21:24                                   ` Steve Dickson
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-14 19:13 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List


On Oct 14, 2010, at 2:27 PM, Steve Dickson wrote:

> Hey... 
> 
> On 10/14/2010 11:29 AM, Chuck Lever wrote:
>> On Oct 13, 2010, at 7:19 PM, Steve Dickson wrote:
>> 
>> Using /proc/mounts to determine whether a mount point is NFSv4 or NFS is OK.
>> However, /proc/mounts doesn't contain the original mount options used 
>> to negotiate the mount.  Ideally, we want to use the original mount command 
>> line options, and not the exact options that were negotiated, when doing the umount.
>> 
>> The example I use to explain why /proc/mounts is not appropriate is a 
>> simple mount "vers=3".  It negotiates with the server and discovers the 
>> mount service is on port 4545.  The mount proceeds.  After a while the 
>> server reboots, and puts the mount service on port 32769.  Our /proc/mounts 
>> still has "mountport=4545" since that's what was negotiated at mount time 
>> and passed to the kernel.  When umount tries the UMNT request, it will 
>> read /proc/mounts, send the UMNT request to port 4545, and fail.
> Either don't use the mountport that's in /proc/mounts or better yet
> be prepared to do a GETPORT of the service when the UMNT call
> fails with 'RPC: Program not registered'. Something that is 
> commonly done.. 
> 
>> 
>> Now, we could just say "screw it" and use the negotiated options rather 
>> than the user-specified ones.  But that means there are still cases we 
>> could perform the UMNT correctly which would not work.
> Even the information in /etc/mtab can go stale. 
>   mount mountport=4545 /mnt; server restart ; umount /mnt
> were the server comes up on a different port. 
> 
> Granted this is a pilot error by the admin since if the 
> port is  specified on the command line the server better
> be listening on that port. The point being there can be stale 
> information in any static list.
> 
> So regardless of which list is read (/etc/mtab or /proc/mounts)
> we need to be prepared deal with static information...  
> 
>> 
>> Another choice would be to store the original NFS-related command line
>> mount options somewhere other than /etc/mtab, and leave them alone during 
>> a remount.  If we can't perform UMNT in the kernel, I'd prefer this type
>> of solution to either updating /etc/mtab on a remount or using /proc/mounts 
>> during umount.
>> 
>> How about a file under /var/lib/nfs ?
> I refer to my above statement... No matter where we put the list, there
> is always a chance of the information in the list to go stale... 
> 
> So I would suggest we use /proc/mounts since that is the last known
> valid options and then be prepared to deal with stale options.

It sounds like we agree that umount.nfs needs to do some negotiation (and it already does this today).  What I'm suggesting is we can be more clever about what the starting point is for the umount negotiation.

My goal for text-based mounts (until now) has been to keep the originally specified mount options in /etc/mtab, and use those as the starting point for negotiation during the umount.  This appears to have been the intent of mount.nfs's use of /etc/mtab, all along (but we agree /etc/mtab is probably a bad place to put this info).  The original mount options will have the best chance of working.  Ideally these would be recorded after config file processing, but before version and transport negotiation (unlike /proc/mounts, which would have the post-negotiation options).

The mount protocol information in /proc/mounts can be very very stale.  If the mount point is very long lived, as it is for static mount points on server-class systems, the client may have been up for months, while the NFS servers can have rebooted multiple times during that time span.  Each server reboot can result in the mount port changing, for example.

/proc/mounts has the specific set of options that were the result of negotiation during the mount process.  Those will work sometimes, but I think they have a good chance of not working in some cases.  If umount.nfs starts with /proc/mounts, how can it know which of "vers=" and "proto=" and "port=" and "mountport=" were specified on the original command line (and thus are required to make the mount work) and which were negotiated by mount.nfs (and thus may have changed since the original mount)?
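To make the question concrete, here is a small sketch. The function name classify and the option values are invented for illustration; the point is only that the classification needs the original option set as an input, which /proc/mounts alone cannot supply.

```python
# Options that mount.nfs may either take from the user or negotiate itself.
NEGOTIABLE = ("vers", "proto", "port", "mountport")


def classify(original, negotiated):
    """Label each negotiable option as user-specified or negotiated."""
    result = {}
    for key in NEGOTIABLE:
        if key in original:
            result[key] = "user-specified"   # must be honored at umount time
        elif key in negotiated:
            result[key] = "negotiated"       # safe to drop and renegotiate
    return result


# Assumed example: the user typed only "vers=3"; the rest was negotiated.
original = {"vers": "3"}
negotiated = {"vers": "3", "proto": "tcp", "mountport": "4545"}

labels = classify(original, negotiated)
for key, label in labels.items():
    print(f"{key}: {label}")
```

If the user had typed "mountport=4545" to get through a firewall, the negotiated dict would look exactly the same, and only the original dict would reveal that the port must not be renegotiated.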

So, preserving the original mount options somewhere and using that as the starting point for negotiation during the umount is the best way to ensure that a UMNT request will get to the server, in my opinion, not least because that's the intent of the code we have now.

I think there are more cases when using /proc/mounts will be worse than using /etc/mtab, and thus we'll get worse behavior on UMNT than we have today in some cases.  If this weren't true, I think we would have embraced /proc/mounts already.  I consider a change to use /proc/mounts as risky as a change to not send UMNT at all.

So, I'm OK with keeping umount.nfs around for the time being, but maybe I have to put my foot down and say we mustn't use /proc/mounts for anything but deciding whether the mount point is an NFSv4 mount.  I'm happy to volunteer code, and also happy to collaborate with you on a fix.  I've already spent a lot of time poking at this and coding prototypes, so I'm "invested."

To summarize: instead of relying on /etc/mtab, also use an NFS-specific place to record the same information.  umount.nfs can use that instead of /etc/mtab.  And by the way, we don't touch this information during a remount... heh.  That guarantees that we preserve existing good behaviors of umount.nfs, continue to update /etc/mtab as documented, until maybe it goes away, but eliminate our functional dependence on it.
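Something along these lines, as a sketch only. The file name, the JSON format, and the function names are placeholders I made up for illustration; a real implementation would also need locking against concurrent mount and umount commands.

```python
import json
import os
import tempfile


def _load(store_path):
    """Read the table of recorded options, or an empty one if none exists."""
    try:
        with open(store_path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def _save(store_path, table):
    """Write the table via a temporary file so readers never see a partial file."""
    tmp_path = store_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(table, f)
    os.replace(tmp_path, store_path)


def record_mount(store_path, mount_point, options, remount=False):
    """Record the original options at mount time; a remount leaves them alone."""
    if remount:
        return
    table = _load(store_path)
    table[mount_point] = options
    _save(store_path, table)


def options_for_umount(store_path, mount_point):
    """Return the originally specified options, or None if nothing was recorded."""
    return _load(store_path).get(mount_point)


def forget_mount(store_path, mount_point):
    """Drop the record once the umount has completed."""
    table = _load(store_path)
    table.pop(mount_point, None)
    _save(store_path, table)


# A temporary directory stands in for wherever the real file would live.
store = os.path.join(tempfile.mkdtemp(), "nfs-mount-options.json")
record_mount(store, "/mnt", "vers=3")
record_mount(store, "/mnt", "remount,ro", remount=True)
print(options_for_umount(store, "/mnt"))
```

Because the remount path never writes to the store, the options umount.nfs starts from are always the ones given on the original mount command line.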

-- 
chuck[dot]lever[at]oracle[dot]com






* Re: whither NFS umount?
  2010-10-14 19:13                                 ` Chuck Lever
@ 2010-10-14 21:24                                   ` Steve Dickson
  2010-10-14 22:22                                     ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-14 21:24 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

On 10/14/2010 03:13 PM, Chuck Lever wrote:
> 
> On Oct 14, 2010, at 2:27 PM, Steve Dickson wrote:
> 
>> Hey... 
>>
>> On 10/14/2010 11:29 AM, Chuck Lever wrote:
>>> On Oct 13, 2010, at 7:19 PM, Steve Dickson wrote:
>>>
>>> Using /proc/mounts to determine whether a mount point is NFSv4 or NFS is OK.
>>> However, /proc/mounts doesn't contain the original mount options used 
>>> to negotiate the mount.  Ideally, we want to use the original mount command 
>>> line options, and not the exact options that were negotiated, when doing the umount.
>>>
>>> The example I use to explain why /proc/mounts is not appropriate is a 
>>> simple mount "vers=3".  It negotiates with the server and discovers the 
>>> mount service is on port 4545.  The mount proceeds.  After a while the 
>>> server reboots, and puts the mount service on port 32769.  Our /proc/mounts 
>>> still has "mountport=4545" since that's what was negotiated at mount time 
>>> and passed to the kernel.  When umount tries the UMNT request, it will 
>>> read /proc/mounts, send the UMNT request to port 4545, and fail.
>> Either don't use the mountport that's in /proc/mounts or better yet
>> be prepared to do a GETPORT of the service when the UMNT call
>> fails with 'RPC: Program not registered'. Something that is 
>> commonly done.. 
>>
>>>
>>> Now, we could just say "screw it" and use the negotiated options rather 
>>> than the user-specified ones.  But that means there are still cases we 
>>> could perform the UMNT correctly which would not work.
>> Even the information in /etc/mtab can go stale. 
>>   mount mountport=4545 /mnt; server restart ; umount /mnt
>> were the server comes up on a different port. 
>>
>> Granted this is a pilot error by the admin since if the 
>> port is  specified on the command line the server better
>> be listening on that port. The point being there can be stale 
>> information in any static list.
>>
>> So regardless of which list is read (/etc/mtab or /proc/mounts)
>> we need to be prepared deal with static information...  
>>
>>>
>>> Another choice would be to store the original NFS-related command line
>>> mount options somewhere other than /etc/mtab, and leave them alone during 
>>> a remount.  If we can't perform UMNT in the kernel, I'd prefer this type
>>> of solution to either updating /etc/mtab on a remount or using /proc/mounts 
>>> during umount.
>>>
>>> How about a file under /var/lib/nfs ?
>> I refer to my above statement... No matter where we put the list, there
>> is always a chance of the information in the list to go stale... 
>>
>> So I would suggest we use /proc/mounts since that is the last known
>> valid options and then be prepared to deal with stale options.
> 
> It sounds like we agree that umount.nfs needs to do some negotiation 
> (and it already does this today).  What I'm suggesting is we can be more 
> clever about what the starting point is for the umount negotiation
Yes.. I do agree umount will need to do some negotiation...
.
> 
> My goal for text-based mounts (until now) has been to keep the originally
> specified mount options in /etc/mtab, and use those as the starting point
> for negotiation during the umount.  This appears to have been the intent of
> mount.nfs's use of /etc/mtab, all along (but we agree /etc/mtab is probably 
> a bad place to put this info).  The original mount options will have the best
> chance of working.  Ideally these would be recorded after config file 
> processing, but before version and transport negotiation (unlike /proc/mounts,
> which would have the post-negotiation options).
> 
> The mount protocol information in /proc/mounts can be very very stale
Well the mount(8) man page seems to disagree with you:

    When the proc filesystem is mounted (say at /proc), the files
    /etc/mtab and /proc/mounts have very similar contents. The former
    has somewhat more information, such as the mount options used, but
    is not necessarily up-to-date (cf. the -n option below). It is
    possible to replace /etc/mtab by a symbolic link to /proc/mounts,
    and especially when you have very large number of mounts things
    will be much faster with that symlink, but some information is lost
    that way, and in particular using the "user" option will fail.

They are basically saying you should replace /etc/mtab with /proc/mounts.
. 
> If the mount point is very long lived, as it is for static mount points on 
> server-class systems, the client may have been up for months, while the 
> NFS servers can have rebooted multiple times during that time span.  
> Each server reboot can result in the mount port changing, for example.
> /proc/mounts has the specific set of options that were the result of 
> negotiation during the mount process.  Those will work sometimes, but I 
> think those actually have a good chance of not working in some cases.
>
> If umount.nfs starts with /proc/mounts, how can it know which of "vers=" 
> and "proto=" and "port=" and "mountport=" were specified on the original 
> command line (and thus are required to make the mount work) and those which
> were negotiated by mount.nfs (and thus may have changed since the original mount)?
Well I don't believe either the proto= or vers= will change
over a server reboot since the values in /proc/mounts are the
ones that were negotiated...  I do agree both the "port=" and
"mountport=" can go stale... So maybe we should just never use them...

> So, preserving the original mount options somewhere and using that as the
> starting point for negotiation during the umount is the best way to ensure
> that a UMNT request will get to the server, in my opinion, not the least 
> because that's the intent of the code we have now.
> 
> I think there are more cases when using /proc/mounts will be worse than 
> using /etc/mtab, and thus we'll get worse behavior on UMNT than we have
> today in some cases.  If this weren't true, I think we would have 
> embraced /proc/mounts already.  I consider a change to use /proc/mounts 
> as risky as a change to not send UMNT at all.
Can you outline these cases? The only thing I think can go stale
is the port numbers... Everything else should stay relatively 
valid... as I just stated... 

> 
> So, I'm OK with keeping umount.nfs around for the time being, but
> maybe I have to put my foot down and say we mustn't use /proc/mounts
> for anything but deciding whether the mount point is an NFSv4 mount.  
> I'm happy to volunteer code, and also happy to collaborate with you on a fix.
> I've already spent a lot of time poking at this and coding prototypes, 
> so I'm "invested."
Well, talking with the upstream maintainer of the mount command,
as soon as the new libmount makes an appearance, there is
a very real possibility /etc/mtab will be going away... He
says it will be replaced with something like /var/run/mount/something

So maybe we start looking into how to make /proc/mounts work.
> 
> To summarize: instead of relying on /etc/mtab, also use an NFS-specific 
> place to record the same information.  umount.nfs can use that 
> instead of /etc/mtab.  And by the way, we don't touch this information
> during a remount... heh.  That guarantees that we preserve existing 
> good behaviors of umount.nfs, continue to update /etc/mtab as documented,
> until maybe it goes away, but eliminate our functional dependence on it.
> 
If the info in /etc/mtab is not updated on remounts, then what is
the issue we are talking about? Just curious, will the info in /proc/mounts
be updated on remounts?

steved.


* Re: whither NFS umount?
  2010-10-14 21:24                                   ` Steve Dickson
@ 2010-10-14 22:22                                     ` Chuck Lever
  2010-10-15 13:11                                       ` Steve Dickson
  0 siblings, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-14 22:22 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List


On Oct 14, 2010, at 5:24 PM, Steve Dickson wrote:

> On 10/14/2010 03:13 PM, Chuck Lever wrote:
>> 
>> On Oct 14, 2010, at 2:27 PM, Steve Dickson wrote:
>> 
>>> Hey... 
>>> 
>>> On 10/14/2010 11:29 AM, Chuck Lever wrote:
>>>> On Oct 13, 2010, at 7:19 PM, Steve Dickson wrote:
>>>> 
>>>> Using /proc/mounts to determine whether a mount point is NFSv4 or NFS is OK.
>>>> However, /proc/mounts doesn't contain the original mount options used 
>>>> to negotiate the mount.  Ideally, we want to use the original mount command 
>>>> line options, and not the exact options that were negotiated, when doing the umount.
>>>> 
>>>> The example I use to explain why /proc/mounts is not appropriate is a 
>>>> simple mount "vers=3".  It negotiates with the server and discovers the 
>>>> mount service is on port 4545.  The mount proceeds.  After a while the 
>>>> server reboots, and puts the mount service on port 32769.  Our /proc/mounts 
>>>> still has "mountport=4545" since that's what was negotiated at mount time 
>>>> and passed to the kernel.  When umount tries the UMNT request, it will 
>>>> read /proc/mounts, send the UMNT request to port 4545, and fail.
>>> Either don't use the mountport that's in /proc/mounts or better yet
>>> be prepared to do a GETPORT of the service when the UMNT call
>>> fails with 'RPC: Program not registered'. Something that is 
>>> commonly done.. 
>>> 
>>>> 
>>>> Now, we could just say "screw it" and use the negotiated options rather 
>>>> than the user-specified ones.  But that means there are still cases we 
>>>> could perform the UMNT correctly which would not work.
>>> Even the information in /etc/mtab can go stale. 
>>>  mount mountport=4545 /mnt; server restart ; umount /mnt
>>> were the server comes up on a different port. 
>>> 
>>> Granted this is a pilot error by the admin since if the 
>>> port is  specified on the command line the server better
>>> be listening on that port. The point being there can be stale 
>>> information in any static list.
>>> 
>>> So regardless of which list is read (/etc/mtab or /proc/mounts)
>>> we need to be prepared deal with static information...  
>>> 
>>>> 
>>>> Another choice would be to store the original NFS-related command line
>>>> mount options somewhere other than /etc/mtab, and leave them alone during 
>>>> a remount.  If we can't perform UMNT in the kernel, I'd prefer this type
>>>> of solution to either updating /etc/mtab on a remount or using /proc/mounts 
>>>> during umount.
>>>> 
>>>> How about a file under /var/lib/nfs ?
>>> I refer to my above statement... No matter where we put the list, there
>>> is always a chance of the information in the list to go stale... 
>>> 
>>> So I would suggest we use /proc/mounts since that is the last known
>>> valid options and then be prepared to deal with stale options.
>> 
>> It sounds like we agree that umount.nfs needs to do some negotiation 
>> (and it already does this today).  What I'm suggesting is we can be more 
>> clever about what the starting point is for the umount negotiation
> Yes.. I do agree umount will need to do some negotiation...
> .
>> 
>> My goal for text-based mounts (until now) has been to keep the originally
>> specified mount options in /etc/mtab, and use those as the starting point
>> for negotiation during the umount.  This appears to have been the intent of
>> mount.nfs's use of /etc/mtab, all along (but we agree /etc/mtab is probably 
>> a bad place to put this info).  The original mount options will have the best
>> chance of working.  Ideally these would be recorded after config file 
>> processing, but before version and transport negotiation (unlike /proc/mounts,
>> which would have the post-negotiation options).
>> 
>> The mount protocol information in /proc/mounts can be very very stale
> Well the mount(8) man page seems to disagrees with you:
> 
>    When  the  proc  filesystem is mounted (say at /proc), the files
>    /etc/mtab and /proc/mounts have very similar contents. The  former  
>    has  somewhat  more  information, such as the mount options used, but is not    
>    necessarily  up-to-date  (cf.  the  -n  option below).  It is possible to replace  
>    /etc/mtab by a symbolic link to /proc/mounts, and especially when you have
>    very large number of mounts things will be much faster with that symlink,
>    but some information is lost that way, and in particular using the "user"
>    option will fail.
> 
> They are basically say you should replace /etc/mtab with /proc/mounts.

Right, that text is not written with NFS in mind, unfortunately.  I thought it was common knowledge that replacing /etc/mtab with a link was bad for NFS.

Notice they call out support for the "user" mount option explicitly here.  That seems to be an important feature for network file systems.

> . 
>> If the mount point is very long lived, as it is for static mount points on 
>> server-class systems, the client may have been up for months, while the 
>> NFS servers can have rebooted multiple times during that time span.  
>> Each server reboot can result in the mount port changing, for example.
>> /proc/mounts has the specific set of options that were the result of 
>> negotiation during the mount process.  Those will work sometimes, but I 
>> think those actually have a good chance of not working in some cases.
>> 
>> If umount.nfs starts with /proc/mounts, how can it know which of "vers=" 
>> and "proto=" and "port=" and "mountport=" were specified on the original 
>> command line (and thus are required to make the mount work) and those which
>> were negotiated by mount.nfs (and thus may have changed since the original mount)?
> Well I don't believe either the proto= or vers= will change
> over a server reboot since the values in /proc/mounts are the
> were negotiated to...  I do agree both the "port=" and 
> "mountport=" can go stale... So many be should just never use them... 

Vers= won't change, which is why we can trust /proc/mounts to tell us what NFS version to use for the umount.

mountvers= may go stale, mountproto= can go stale, mountport= can also go stale.  For umount, we don't care about port=.  The problem is we can't tell whether mountproto and mountport in /proc/mounts were specified on the command line (say, to punch through a firewall) or were negotiated by user space (and are thus safe to ignore and renegotiate).

The relationship between mounthost and mountaddr can also change over time.  /proc/mounts has mountaddr.  We really want to look up mounthost again to be reliable.
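
To make the categories above concrete, here is a rough sketch (Python, purely illustrative, not nfs-utils code) that sorts a /proc/mounts-style option string into the groups just described.  The helper names and the grouping sets are assumptions made for this example; note that nothing in the string itself says whether a value was specified by the user or negotiated.

```python
# Sketch only: sort NFS mount options by the staleness concerns described
# above.  The sets and helper names are illustrative assumptions, not part
# of nfs-utils or the kernel.

STABLE = {"vers"}                       # NFS version is trusted for umount
MAY_GO_STALE = {"mountvers", "mountproto", "mountport", "mountaddr"}
IGNORED_FOR_UMOUNT = {"port"}           # umount does not care about port=


def parse_options(optstring):
    """Split 'a=1,b,c=2' into a dict; bare flags map to None."""
    opts = {}
    for item in optstring.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else None
    return opts


def classify(optstring):
    """Group options into stable, may_go_stale, ignored, and other."""
    result = {"stable": {}, "may_go_stale": {}, "ignored": {}, "other": {}}
    for key, value in parse_options(optstring).items():
        if key in STABLE:
            result["stable"][key] = value
        elif key in MAY_GO_STALE:
            result["may_go_stale"][key] = value
        elif key in IGNORED_FOR_UMOUNT:
            result["ignored"][key] = value
        else:
            result["other"][key] = value
    return result
```

Everything that lands in "may_go_stale" is a candidate for renegotiation at umount time.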

>> So, preserving the original mount options somewhere and using that as the
>> starting point for negotiation during the umount is the best way to ensure
>> that a UMNT request will get to the server, in my opinion, not the least 
>> because that's the intent of the code we have now.
>> 
>> I think there are more cases when using /proc/mounts will be worse than 
>> using /etc/mtab, and thus we'll get worse behavior on UMNT than we have
>> today in some cases.  If this weren't true, I think we would have 
>> embraced /proc/mounts already.  I consider a change to use /proc/mounts 
>> as risky as a change to not send UMNT at all.
> Can you outline these cases? The only thing I think can go stale
> is the port numbers... Everything else should stay relatively 
> valid... as I just stated... 

See above.  The options we care about for doing an umount reliably can go stale, and there's no way to tell if the information in /proc/mounts was specified on purpose or negotiated automatically.

>> So, I'm OK with keeping umount.nfs around for the time being, but
>> maybe I have to put my foot down and say we mustn't use /proc/mounts
>> for anything but deciding whether the mount point is an NFSv4 mount.  
>> I'm happy to volunteer code, and also happy to collaborate with you on a fix.
>> I've already spent a lot of time poking at this and coding prototypes, 
>> so I'm "invested."
> Well talking with the upstream maintainer of the mount command
> as soon as the new libmount makes an appearance, there is 
> a very really possibility /etc/mtab will be going away... He
> says it will be replace with something like /var/run/mount/something
> 
> So maybe we start looking into how to make /proc/mounts work.

I agree that we should work towards unlinking our mount subcommands from relying on /etc/mtab.  I don't think the impending presence of libmount mandates the use of /proc/mounts, though.

>> To summarize: instead of relying on /etc/mtab, also use an NFS-specific 
>> place to record the same information.  umount.nfs can use that 
>> instead of /etc/mtab.  And by the way, we don't touch this information
>> during a remount... heh.  That guarantees that we preserve existing 
>> good behaviors of umount.nfs, continue to update /etc/mtab as documented,
>> until maybe it goes away, but eliminate our functional dependence on it.
>> 
> If the info in /etc/mntab is not updated on remounts, then what is 
> the issue we are talking about? Just curious, will the info in /proc/mounts
> be updated on remounts?

/etc/mtab would still be updated on remounts, and would still have the bug where "remount" would wipe the options.  But we would no longer depend on that destroyed information to perform the umount reliably.

This new stash of information I'm proposing would not be altered by a remount.  It sounds like we would need to store only the MNT protocol related options, described above.

In /proc/mounts, the NFS-specific mount options aren't supposed to change at all on a remount.  Only the generic mount options ("sync", "ro", etc) should change.

-- 
chuck[dot]lever[at]oracle[dot]com





^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: whither NFS umount?
  2010-10-14 22:22                                     ` Chuck Lever
@ 2010-10-15 13:11                                       ` Steve Dickson
  2010-10-15 13:41                                         ` Jeff Layton
  2010-10-15 16:00                                         ` Chuck Lever
  0 siblings, 2 replies; 36+ messages in thread
From: Steve Dickson @ 2010-10-15 13:11 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

Good Morning,

On 10/14/2010 06:22 PM, Chuck Lever wrote:
> 
> On Oct 14, 2010, at 5:24 PM, Steve Dickson wrote:
>
>>> The mount protocol information in /proc/mounts can be very very stale
>> Well the mount(8) man page seems to disagrees with you:
>>
>>    When  the  proc  filesystem is mounted (say at /proc), the files
>>    /etc/mtab and /proc/mounts have very similar contents. The  former  
>>    has  somewhat  more  information, such as the mount options used, but is not    
>>    necessarily  up-to-date  (cf.  the  -n  option below).  It is possible to replace  
>>    /etc/mtab by a symbolic link to /proc/mounts, and especially when you have
>>    very large number of mounts things will be much faster with that symlink,
>>    but some information is lost that way, and in particular using the "user"
>>    option will fail.
>>
>> They are basically say you should replace /etc/mtab with /proc/mounts.
> 
> Right, that text is not written with NFS in mind, unfortunately. 
> I thought it was common knowledge that replacing /etc/mtab with a link
> was bad for NFS.
> 
> Notice they call out support for the "user" mount option explicitly here.
> That seems to be an important feature for network file systems.
My point is staleness.... BOTH /etc/mtab and /proc/mounts "can be very very stale"

>> . 
>>> If the mount point is very long lived, as it is for static mount points on 
>>> server-class systems, the client may have been up for months, while the 
>>> NFS servers can have rebooted multiple times during that time span.  
>>> Each server reboot can result in the mount port changing, for example.
>>> /proc/mounts has the specific set of options that were the result of 
>>> negotiation during the mount process.  Those will work sometimes, but I 
>>> think those actually have a good chance of not working in some cases.
>>>
>>> If umount.nfs starts with /proc/mounts, how can it know which of "vers=" 
>>> and "proto=" and "port=" and "mountport=" were specified on the original 
>>> command line (and thus are required to make the mount work) and those which
>>> were negotiated by mount.nfs (and thus may have changed since the original mount)?
>> Well I don't believe either the proto= or vers= will change
>> over a server reboot since the values in /proc/mounts are the
>> were negotiated to...  I do agree both the "port=" and 
>> "mountport=" can go stale... So many be should just never use them... 
> 
> Vers= won't change, which is why we can trust /proc/mounts to tell us what 
> NFS version to use for the umount.
proto= will not go stale either... 

> 
> mountvers= may go stale, 
> mountproto= can go stale, 
I think these going stale would be highly unlikely, but recoverable... 

> mountport= can also go stale.  For umount, we don't care about port=.  
True any port value can easily go stale... 

> The problem is we can't tell whether mountproto and mountport in /proc/mounts
> was specified on the command line (say, to punch through a firewall) or 
> was negotiated by user space (and is thus safe to ignore and renegotiate).
We shouldn't care whether those options were specified or negotiated.
The values in /proc/mounts are the ones that worked! So at one point 
in time we know all the values in /proc/mounts were valid (since the
entry exists). This is something that cannot be said about options
specified on the command line.

> 
> The relationship between mounthost and mountaddr can also change over time.
> /proc/mounts has mountaddr.  We really want to look up mounthost again to be reliable.
Fine... Add that to the list that needs to be updated once the first call fails... 

> 
>>> So, preserving the original mount options somewhere and using that as the
>>> starting point for negotiation during the umount is the best way to ensure
>>> that a UMNT request will get to the server, in my opinion, not the least 
>>> because that's the intent of the code we have now.
>>>
>>> I think there are more cases when using /proc/mounts will be worse than 
>>> using /etc/mtab, and thus we'll get worse behavior on UMNT than we have
>>> today in some cases.  If this weren't true, I think we would have 
>>> embraced /proc/mounts already.  I consider a change to use /proc/mounts 
>>> as risky as a change to not send UMNT at all.
>> Can you outline these cases? The only thing I think can go stale
>> is the port numbers... Everything else should stay relatively 
>> valid... as I just stated... 
> 
> See above.  The options we care about for doing an umount reliably can go 
> stale, and there's no way to tell if the information in /proc/mounts 
> was specified on purpose or negotiated automatically.
My point is it really does not matter... the values in /proc/mounts
allowed the mount to succeed at one point and that's not a bad place
to start from... IMHO... 

> 
>>> So, I'm OK with keeping umount.nfs around for the time being, but
>>> maybe I have to put my foot down and say we mustn't use /proc/mounts
>>> for anything but deciding whether the mount point is an NFSv4 mount.  
>>> I'm happy to volunteer code, and also happy to collaborate with you on a fix.
>>> I've already spent a lot of time poking at this and coding prototypes, 
>>> so I'm "invested."
>> Well talking with the upstream maintainer of the mount command
>> as soon as the new libmount makes an appearance, there is 
>> a very really possibility /etc/mtab will be going away... He
>> says it will be replace with something like /var/run/mount/something
>>
>> So maybe we start looking into how to make /proc/mounts work.
> 
> I agree that we should work towards unlinking our mount subcommands from 
> relying on /etc/mtab.  I don't think the impending presence of 
> libmount mandates the use of /proc/mounts, though.
True... All I'm trying to point out is the information in /proc/mounts and 
/etc/mtab can be equally as good and equally as bad at any point
in time. 
 
Now that there is a real possibility that /etc/mtab could be deprecated,
I think we should start looking into making the info in /proc/mounts 
work, since /proc/mounts is not going anywhere... 

> 
>>> To summarize: instead of relying on /etc/mtab, also use an NFS-specific 
>>> place to record the same information.  umount.nfs can use that 
>>> instead of /etc/mtab.  And by the way, we don't touch this information
>>> during a remount... heh.  That guarantees that we preserve existing 
>>> good behaviors of umount.nfs, continue to update /etc/mtab as documented,
>>> until maybe it goes away, but eliminate our functional dependence on it.
>>>
>> If the info in /etc/mntab is not updated on remounts, then what is 
>> the issue we are talking about? Just curious, will the info in /proc/mounts
>> be updated on remounts?
> 
> /etc/mtab would still be updated on remounts, and would still have the 
> bug where "remount" would wipe the options.  But we would no longer depend
> on that destroyed information to perform the umount reliably.
The remount would wipe out the *original* options, basically overriding
them with the updated options... As long as we have a mechanism to
retry the UMNT if the first call fails, I don't see this as being a 
problem... 
 
> 
> This new stash of information I'm proposing would not be altered by a
> remount.  It sounds like we would need to store only the MNT protocol 
> related options, described above.
> 
> In /proc/mounts, the NFS-specific mount options aren't supposed to 
> change at all on a remount.  Only the generic mount options 
> ("sync", "ro", etc) should change.
Could you please point me to where the above rule is mandated...
I had no idea there were rules of what can and cannot be
changed in /proc/mounts... tia...

steved.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: whither NFS umount?
  2010-10-15 13:11                                       ` Steve Dickson
@ 2010-10-15 13:41                                         ` Jeff Layton
  2010-10-15 16:00                                         ` Chuck Lever
  1 sibling, 0 replies; 36+ messages in thread
From: Jeff Layton @ 2010-10-15 13:41 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Chuck Lever, Trond Myklebust, Linux NFS Mailing List

On Fri, 15 Oct 2010 09:11:55 -0400
Steve Dickson <SteveD@redhat.com> wrote:

> > 
> >>> To summarize: instead of relying on /etc/mtab, also use an NFS-specific 
> >>> place to record the same information.  umount.nfs can use that 
> >>> instead of /etc/mtab.  And by the way, we don't touch this information
> >>> during a remount... heh.  That guarantees that we preserve existing 
> >>> good behaviors of umount.nfs, continue to update /etc/mtab as documented,
> >>> until maybe it goes away, but eliminate our functional dependence on it.
> >>>
> >> If the info in /etc/mntab is not updated on remounts, then what is 
> >> the issue we are talking about? Just curious, will the info in /proc/mounts
> >> be updated on remounts?
> > 
> > /etc/mtab would still be updated on remounts, and would still have the 
> > bug where "remount" would wipe the options.  But we would no longer depend
> > on that destroyed information to perform the umount reliably.
> The remount would wipe out the *original* options, basically overriding
> them with the updated options... As long as we have a mechanism to
> retry the UMNT if the first call fails, I don't see this as being a 
> problem... 
>  

FWIW, I think the only info we really need to worry about preserving
in mtab is the "user=" and "users" options. Without those, you can have
user mountable filesystems that may become un-unmountable by those
users.

This is true whether umount.nfs stays or not.

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: whither NFS umount?
  2010-10-15 13:11                                       ` Steve Dickson
  2010-10-15 13:41                                         ` Jeff Layton
@ 2010-10-15 16:00                                         ` Chuck Lever
  2010-10-15 20:08                                           ` Steve Dickson
  1 sibling, 1 reply; 36+ messages in thread
From: Chuck Lever @ 2010-10-15 16:00 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

Steve -

On Oct 15, 2010, at 9:11 AM, Steve Dickson wrote:

> Good Morning,
> 
> On 10/14/2010 06:22 PM, Chuck Lever wrote:
>> 
>> On Oct 14, 2010, at 5:24 PM, Steve Dickson wrote:
>> 
>>>> The mount protocol information in /proc/mounts can be very very stale
>>> Well the mount(8) man page seems to disagrees with you:
>>> 
>>>   When  the  proc  filesystem is mounted (say at /proc), the files
>>>   /etc/mtab and /proc/mounts have very similar contents. The  former  
>>>   has  somewhat  more  information, such as the mount options used, but is not    
>>>   necessarily  up-to-date  (cf.  the  -n  option below).  It is possible to replace  
>>>   /etc/mtab by a symbolic link to /proc/mounts, and especially when you have
>>>   very large number of mounts things will be much faster with that symlink,
>>>   but some information is lost that way, and in particular using the "user"
>>>   option will fail.
>>> 
>>> They are basically say you should replace /etc/mtab with /proc/mounts.
>> 
>> Right, that text is not written with NFS in mind, unfortunately. 
>> I thought it was common knowledge that replacing /etc/mtab with a link
>> was bad for NFS.
>> 
>> Notice they call out support for the "user" mount option explicitly here.
>> That seems to be an important feature for network file systems.
> My point is staleness.... BOTH /etc/mtab and /proc/mounts "can be very very stale"
> 
>>> . 
>>>> If the mount point is very long lived, as it is for static mount points on 
>>>> server-class systems, the client may have been up for months, while the 
>>>> NFS servers can have rebooted multiple times during that time span.  
>>>> Each server reboot can result in the mount port changing, for example.
>>>> /proc/mounts has the specific set of options that were the result of 
>>>> negotiation during the mount process.  Those will work sometimes, but I 
>>>> think those actually have a good chance of not working in some cases.
>>>> 
>>>> If umount.nfs starts with /proc/mounts, how can it know which of "vers=" 
>>>> and "proto=" and "port=" and "mountport=" were specified on the original 
>>>> command line (and thus are required to make the mount work) and those which
>>>> were negotiated by mount.nfs (and thus may have changed since the original mount)?
>>> Well I don't believe either the proto= or vers= will change
>>> over a server reboot since the values in /proc/mounts are the
>>> were negotiated to...  I do agree both the "port=" and 
>>> "mountport=" can go stale... So many be should just never use them... 
>> 
>> Vers= won't change, which is why we can trust /proc/mounts to tell us what 
>> NFS version to use for the umount.
> proto= will not go stale either... 

That's not trivial.  Remember that proto= controls both the NFS protocol and the mount protocol.

The NFS protocol will not go stale, but the server can easily change the mount protocols it serves, and the clients are none the wiser until they attempt another mount operation.

>> mountvers= may go stale, 
>> mountproto= can go stale, 
> I think these going stale would be highly unlikely, but recoverable... 
> 
>> mountport= can also go stale.  For umount, we don't care about port=.  
> True any port value can easily go stale... 
> 
>> The problem is we can't tell whether mountproto and mountport in /proc/mounts
>> was specified on the command line (say, to punch through a firewall) or 
>> was negotiated by user space (and is thus safe to ignore and renegotiate).
> We shouldn't care whether those options were specified or negotiated.

I disagree.

If these options were specified options, it's likely that they were used to get through a firewall.  In that case, we can use these settings and expect the UMNT to work as well as the MNT did.

On the other hand, if these were negotiated options, there are some common cases where the /proc/mounts options won't work.  Your "workaround," if these don't work, is to fall back and retry the UMNT by negotiating.  So: what mount options do you use to start the negotiation with?

We don't need to try and fall back.  Start with the original command line options and negotiate as you did for the original MNT request.  That means you perform a single UMNT try.  My way is less complicated on the wire in these cases, and has more consistent results.  What's more, it's what the code already does today.

> The values in /proc/mounts are the ones that worked! So at one point 
> in time we know all the values in /proc/mounts were valid (since the
> entry exists). This is something that cannot be said about options
> specified on the command line.

Dude, that makes no sense.  The options in /proc/mounts were derived from the options on the command line.  So if the options in /proc/mounts worked, BY DEFINITION the command line options in /etc/mtab worked.

We really must start the negotiation from the original command line options.  Going with the options in /proc/mounts first and then renegotiating if they fail is ass-backwards.

I should also point out that older kernels did not display this information.  Before the kernel grokked MNT (2.6.22?), none of this information appeared in /proc/mounts.  I'm guessing that using /proc/mounts simply won't work on older kernels the way you want it to.

>> The relationship between mounthost and mountaddr can also change over time.
>> /proc/mounts has mountaddr.  We really want to look up mounthost again to be reliable.
> Fine... Add that to the list that needs to be updated once the first call fails... 

We don't need a "first call" and a "second call."  One call, done with the correct starting options, is all that is necessary, if you start with the original command line options.  Retrying UMNT here is new (and unnecessary) behavior.  The client has everything it needs to negotiate the correct settings the first time.

> 
>>>> So, I'm OK with keeping umount.nfs around for the time being, but
>>>> maybe I have to put my foot down and say we mustn't use /proc/mounts
>>>> for anything but deciding whether the mount point is an NFSv4 mount.  
>>>> I'm happy to volunteer code, and also happy to collaborate with you on a fix.
>>>> I've already spent a lot of time poking at this and coding prototypes, 
>>>> so I'm "invested."
>>> Well talking with the upstream maintainer of the mount command
>>> as soon as the new libmount makes an appearance, there is 
>>> a very really possibility /etc/mtab will be going away... He
>>> says it will be replace with something like /var/run/mount/something
>>> 
>>> So maybe we start looking into how to make /proc/mounts work.
>> 
>> I agree that we should work towards unlinking our mount subcommands from 
>> relying on /etc/mtab.  I don't think the impending presence of 
>> libmount mandates the use of /proc/mounts, though.
> True... All I'm trying to point out is the information in /proc/mounts and 
> /etc/mtab can be equally as good and equally as bad at any point
> in time. 

These are not equivalent pieces of information.

/etc/mtab can contain something like "vers=3" and /proc/mounts might contain "vers=3,mountport=4545,mountproto=tcp,mountaddr=192.168.1.77".  During a UMNT, the latter three of the options might be completely wrong.  Which one do you try first to negotiate?  With the /etc/mtab options, you have a clear starting point from which to re-derive the /proc/mounts options, and a clear method (the same one MNT uses) to derive fresh options.
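
A small sketch of that difference (Python, illustrative only; the helper names are made up for this example).  Given the saved original options, the negotiated-only options fall out as a simple difference.  Given only /proc/mounts, there is nothing to compare against, which is the point.

```python
# Sketch only: find options that appear in /proc/mounts but were not part
# of the original mount options.  Helper names are illustrative.

def parse_options(optstring):
    """Split 'a=1,b,c=2' into a dict; bare flags map to None."""
    opts = {}
    for item in optstring.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else None
    return opts


def negotiated_only(original, proc_mounts):
    """Return options present in proc_mounts but absent from original."""
    orig = parse_options(original)
    current = parse_options(proc_mounts)
    return {k: v for k, v in current.items() if k not in orig}


# Using the example strings from the paragraph above:
extra = negotiated_only(
    "vers=3",
    "vers=3,mountport=4545,mountproto=tcp,mountaddr=192.168.1.77",
)
# extra holds mountport, mountproto and mountaddr: the values that were
# negotiated and may need to be derived again at UMNT time.
```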

In utils/mount/stropts.c:nfs_do_mount_v3v2(), the extra_opts string, which is eventually planted in /etc/mtab, is generated _before_ the specific mount options are negotiated by nfs_rewrite_pmap_mount_options() and sent to mount(2).  This is on purpose: we want to record the original mount options here so that umount.nfs can use them to figure out how to do the UMNT in exactly the same way the MNT was negotiated.

On unmount, umount.nfs reads those options from /etc/mtab, and then calls nfs_probe_mntport() to negotiate the settings it needs to do the UMNT.  This already does a negotiation based on the original mount options.  I'm saying: let's be conservative, and not change this logic, because this has the best chance of working.

The options in /proc/mounts worked at one point in time, but /etc/mtab has the options that are probably used every time you do the mount.  They are basically copied from /etc/fstab or the command line.  So we know that, no matter what the server does, the /etc/mtab options are tested and known to allow the client to negotiate the correct settings.

> Now that there is a real possibility that /etc/mtab could be deprecated,
> I think we should start looking into making the info in /proc/mounts 
> work, since /proc/mounts is not going anywhere... 

Again, I agree /etc/mtab should be deprecated, but we must not use /proc/mounts for this purpose.  Save the original mount options and use them for the umount.  That way the negotiation behavior of the MNT and the behavior of the UMNT follow exactly the same rules.

Let's just write these options to another place besides /etc/mtab, and read them from that place during unmount.  The only change I'm talking about is putting a second copy of these options on disk somewhere.
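
As a rough illustration of what such a second copy could look like (Python, a sketch under assumptions: the class name, the JSON format, and the file location are all hypothetical, and nothing like this exists in nfs-utils today):

```python
# Sketch only: a hypothetical NFS-specific record of original mount
# options, keyed by mount point.  Format and location are assumptions.
import json
import os


class OptionStash:
    def __init__(self, path):
        self.path = path          # caller chooses where the record lives

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def _save(self, data):
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(data, f)
        os.replace(tmp, self.path)    # rename over the old file

    def record_mount(self, mountpoint, options):
        """Save the original options when the mount is first made."""
        data = self._load()
        data[mountpoint] = options
        self._save(data)

    def record_remount(self, mountpoint, options):
        """Deliberately a no-op: a remount must not alter the record."""
        return

    def lookup(self, mountpoint):
        """Return the original options, or None if nothing was saved."""
        return self._load().get(mountpoint)

    def forget(self, mountpoint):
        """Drop the record once the umount is done."""
        data = self._load()
        data.pop(mountpoint, None)
        self._save(data)
```

The only behavior that matters here is that record_remount() leaves the saved options alone, so umount always starts from what the original mount started from.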

> 
>> 
>>>> To summarize: instead of relying on /etc/mtab, also use an NFS-specific 
>>>> place to record the same information.  umount.nfs can use that 
>>>> instead of /etc/mtab.  And by the way, we don't touch this information
>>>> during a remount... heh.  That guarantees that we preserve existing 
>>>> good behaviors of umount.nfs, continue to update /etc/mtab as documented,
>>>> until maybe it goes away, but eliminate our functional dependence on it.
>>>> 
>>> If the info in /etc/mntab is not updated on remounts, then what is 
>>> the issue we are talking about? Just curious, will the info in /proc/mounts
>>> be updated on remounts?
>> 
>> /etc/mtab would still be updated on remounts, and would still have the 
>> bug where "remount" would wipe the options.  But we would no longer depend
>> on that destroyed information to perform the umount reliably.
> The remount would wipe out the *original* options, basically overriding
> them with the updated options... As long as we have a mechanism to
> retry the UMNT if the first call fails, I don't see this as being a 
> problem... 

The "problem" is where do you start the renegotiation?  You need the original mount options to do that reliably.

> 
>> This new stash of information I'm proposing would not be altered by a
>> remount.  It sounds like we would need to store only the MNT protocol 
>> related options, described above.
>> 
>> In /proc/mounts, the NFS-specific mount options aren't supposed to 
>> change at all on a remount.  Only the generic mount options 
>> ("sync", "ro", etc) should change.
> Could you please point me to where the above rule is mandated...
> I had no idea there were rules of what can and cannot be
> changed in /proc/mounts... tia...

Go look at the kernel mount code in fs/nfs/super.c, and you will see that we don't allow any NFS options save a select few to change on a remount.  nfs_compare_remount_data() requires that almost all the important options, like rsize, transport, and server address, stay unchanged.  But that's a red herring, I think.
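
The idea, very loosely, looks like the sketch below (Python, illustrative only).  This is not the kernel code: the real check compares fields of kernel data structures, and the set of "generic" options used here is an assumption made for the example.

```python
# Sketch only: allow a remount when nothing but generic options changed.
# GENERIC_OPTIONS is an assumed set chosen for illustration.

GENERIC_OPTIONS = {"ro", "rw", "sync", "async"}


def parse_options(optstring):
    """Split 'a=1,b,c=2' into a dict; bare flags map to None."""
    opts = {}
    for item in optstring.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else None
    return opts


def remount_allowed(old, new):
    """True when the NFS-specific options are identical in old and new."""
    old_nfs = {k: v for k, v in parse_options(old).items()
               if k not in GENERIC_OPTIONS}
    new_nfs = {k: v for k, v in parse_options(new).items()
               if k not in GENERIC_OPTIONS}
    return old_nfs == new_nfs
```

With this rule, changing "rw" to "ro" passes, while changing rsize does not.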

We've relented on getting rid of umount.nfs, and we've relented on not performing UMNT.  You won both of those.  I think it's time for you to concede and do this little tiny piece my way, not the least because my way results in less change in behavior than using /proc/mounts, is more efficient on the wire, on average, is backwards compatible with older kernels (unlike using /proc/mounts), and will result in more reliable UMNT results in all cases.

-- 
chuck[dot]lever[at]oracle[dot]com





^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: whither NFS umount?
  2010-10-15 16:00                                         ` Chuck Lever
@ 2010-10-15 20:08                                           ` Steve Dickson
  2010-10-18 15:18                                             ` Chuck Lever
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Dickson @ 2010-10-15 20:08 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

On 10/15/2010 12:00 PM, Chuck Lever wrote:
> On Oct 15, 2010, at 9:11 AM, Steve Dickson wrote:
>>>> . 
>>>>> If the mount point is very long lived, as it is for static mount points on 
>>>>> server-class systems, the client may have been up for months, while the 
>>>>> NFS servers can have rebooted multiple times during that time span.  
>>>>> Each server reboot can result in the mount port changing, for example.
>>>>> /proc/mounts has the specific set of options that were the result of 
>>>>> negotiation during the mount process.  Those will work sometimes, but I 
>>>>> think those actually have a good chance of not working in some cases.
>>>>>
>>>>> If umount.nfs starts with /proc/mounts, how can it know which of "vers=" 
>>>>> and "proto=" and "port=" and "mountport=" were specified on the original 
>>>>> command line (and thus are required to make the mount work) and those which
>>>>> were negotiated by mount.nfs (and thus may have changed since the original mount)?
>>>> Well I don't believe either the proto= or vers= will change
>>>> over a server reboot since the values in /proc/mounts are the
>>>> were negotiated to...  I do agree both the "port=" and 
>>>> "mountport=" can go stale... So many be should just never use them... 
>>>
>>> Vers= won't change, which is why we can trust /proc/mounts to tell us what 
>>> NFS version to use for the umount.
>> proto= will not go stale either... 
> 
> That's not trivial.  Remember that proto= controls both the NFS protocol and 
> the mount protocol.
> 
> The NFS protocol will not go stale, but the server can easily change the 
> mount protocols it serves, and the clients are none-the-wiser until they 
> attempt another mount operation.
That was my point: the NFS protocol (which is the proto=) in /proc/mounts
will not go stale, and mountproto= can go stale, but that is very unlikely.

>>
>>> The problem is we can't tell whether mountproto and mountport in /proc/mounts
>>> was specified on the command line (say, to punch through a firewall) or 
>>> was negotiated by user space (and is thus safe to ignore and renegotiate).
>> We shouldn't care whether those options were specified or negotiated.
> 
> I disagree.
> 
> If these options were specified options, it's likely that they were used to get 
> through a firewall.  In that case, we can use these settings and expect 
> the UMNT to work as well as the MNT did.
And those specified options will be the ones found in /proc/mounts.

> 
> On the other hand, if these were negotiated options, there are some 
> common cases where the /proc/mounts options won't work. 
I agree they will not work, but only if the server is rebooted and
the server daemons are no longer listening on the same port.. 

> Your "workaround," if these don't work, is to fall back and retry the UMNT 
> by negotiating.  So: what mount options do you use to start the negotiation with?
I guess I was thinking just blindly try the options in /proc/mounts first,
since those will work unless the server has been rebooted and is listening
on a different port. If that fails I guess we start the negotiation from scratch.. 
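
Roughly, the flow I have in mind looks like the sketch below (Python, illustrative only).  send_umnt and negotiate are hypothetical stand-ins for whatever umount.nfs would actually call; this is not existing code.

```python
# Sketch only: try the options from /proc/mounts first, then renegotiate
# once if that attempt fails.  send_umnt(opts) returns True on success and
# negotiate() returns a fresh set of options; both are hypothetical.

def umnt_with_fallback(proc_opts, send_umnt, negotiate):
    """Return (succeeded, list of option sets that were tried)."""
    attempts = [proc_opts]
    if send_umnt(proc_opts):
        return True, attempts

    # The cached values did not work (say, the server rebooted and the
    # mount daemon moved to another port), so negotiate from scratch.
    fresh = negotiate()
    attempts.append(fresh)
    return send_umnt(fresh), attempts
```

Since UMNT is advisory, a failure on the second try is reported rather than treated as fatal.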
 
> 
> We don't need to try and fall back.  Start with the original command line 
> options and negotiate as you did for the original MNT request.  That means 
> you perform a single UMNT try.  My way is less complicated on the wire in 
> these cases, and has more consistent results.  What's more, it's what the 
> code already does today.
> 
>> The values in /proc/mounts are the ones that worked! So at one point 
>> in time we know all the values in /proc/mounts were valid (since the
>> entry exists). This is something that cannot be said about options
>> specified on the command line.
> 
> Dude, that makes no sense.  The options in /proc/mounts were derived from 
> the options on the command line.  So if the options in /proc/mounts worked,
> BY DEFINITION the command line options in /etc/mtab worked.
> 
> We really must start the negotiation from the original command line options.
> Going with the options in /proc/mounts first and then renegotiating if they
> fail is ass-backwards.
Well, it turns out that using /proc/mounts is not quite that "ass-backwards"
or senseless... 8-) because the current code uses /proc/mounts when /etc/mtab 
does not exist... and works just as expected... So I'm happy with that..

>>
>>>>> So, I'm OK with keeping umount.nfs around for the time being, but
>>>>> maybe I have to put my foot down and say we mustn't use /proc/mounts
>>>>> for anything but deciding whether the mount point is an NFSv4 mount.  
>>>>> I'm happy to volunteer code, and also happy to collaborate with you on a fix.
>>>>> I've already spent a lot of time poking at this and coding prototypes, 
>>>>> so I'm "invested."
>>>> Well, talking with the upstream maintainer of the mount command,
>>>> as soon as the new libmount makes an appearance, there is 
>>>> a very real possibility /etc/mtab will be going away... He
>>>> says it will be replaced with something like /var/run/mount/something
>>>>
>>>> So maybe we start looking into how to make /proc/mounts work.
>>>
>>> I agree that we should work towards unlinking our mount subcommands from 
>>> relying on /etc/mtab.  I don't think the impending presence of 
>>> libmount mandates the use of /proc/mounts, though.
>> True... All I'm trying to point out is the information in /proc/mounts and 
>> /etc/mtab can be equally as good and equally as bad at any point
>> in time. 
> 
> These are not equivalent pieces of information.
I think we can agree to disagree... ;-) Given the fact that we are
currently using /proc/mounts as a backup, I feel 
confident we could use it when /etc/mtab goes away.. 

> 
> The options in /proc/mounts worked at one point in time, but /etc/mtab 
> has the options that are probably used every time you do the mount.  
> They are basically copied from /etc/fstab or the command line.  So we know
> that, no matter what the server does, the /etc/mtab options are tested and
> known to allow the client to negotiate the correct settings.
> 
>> Now that there is a real possibility that /etc/mtab could be deprecated,
>> I think we should start looking into making the info in /proc/mounts 
>> work, since /proc/mounts is not going anywhere... 
> 
> Again, I agree /etc/mtab should be deprecated, but we must not use 
> /proc/mounts for this purpose.  Save the original mount options and 
> use them for the umount.  That way the negotiation behavior of the MNT 
> and the behavior of the UMNT follow exactly the same rules.
As I've said, we already do use /proc/mounts which is fine.. IMHO.. 

> 
> Let's just write these options to another place besides /etc/mtab, 
> and read them from that place during unmount.  The only change I'm 
> talking about is putting a second copy of these options on disk somewhere.
> 
I don't think I'm a fan of creating yet another non-scalable
list we need to maintain.. but that's for another day.. if/when
/etc/mtab goes away.. 

For now, I agree with you, let's keep using /etc/mtab to 
maintain the original options...  

>>> In /proc/mounts, the NFS-specific mount options aren't supposed to 
>>> change at all on a remount.  Only the generic mount options 
>>> ("sync", "ro", etc) should change.
>> Could you please point me to where the above rule is mandated...
>> I had no idea there were rules about what can and cannot be
>> changed in /proc/mounts... tia...
> 
> Go look at the kernel mount code in fs/nfs/super.c, and you will see that
> we don't allow any NFS options save a select few to change on a remount.
> nfs_compare_remount_data() requires that most of the important
> options, like rsize, transport, and server address, do not change. 
> But that's a red herring, I think.
Thanks for pointing this out...I'll take a look..

steved.



* Re: whither NFS umount?
  2010-10-15 20:08                                           ` Steve Dickson
@ 2010-10-18 15:18                                             ` Chuck Lever
  0 siblings, 0 replies; 36+ messages in thread
From: Chuck Lever @ 2010-10-18 15:18 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Jeff Layton, Trond Myklebust, Linux NFS Mailing List

Happy Monday...

On Oct 15, 2010, at 4:08 PM, Steve Dickson wrote:

> On 10/15/2010 12:00 PM, Chuck Lever wrote:
>> On Oct 15, 2010, at 9:11 AM, Steve Dickson wrote:
>> The options in /proc/mounts worked at one point in time, but /etc/mtab 
>> has the options that are probably used every time you do the mount.  
>> They are basically copied from /etc/fstab or the command line.  So we know
>> that, no matter what the server does, the /etc/mtab options are tested and
>> known to allow the client to negotiate the correct settings.
>> 
>>> Now that there is a real possibility that /etc/mtab could be deprecated,
>>> I think we should start looking into making the info in /proc/mounts 
>>> work, since /proc/mounts is not going anywhere... 
>> 
>> Again, I agree /etc/mtab should be deprecated, but we must not use 
>> /proc/mounts for this purpose.  Save the original mount options and 
>> use them for the umount.  That way the negotiation behavior of the MNT 
>> and the behavior of the UMNT follow exactly the same rules.
> As I've said, we already do use /proc/mounts which is fine.. IMHO.. 

The "fallback to /proc/mounts case" breaks UMNT in exactly the ways I've described.

umount.nfs will use /proc/mounts as a fallback because, without /etc/mtab, that's the only way it can match a mount point directory on the client with the file server name and export information.  However, the mount options can still be incorrect, just as I have described.  Thus this fallback is weak to worthless for sending UMNT (it will get the local umount(2) call right, but that's all that truly matters for the client, anyway).
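
To make that matching step concrete, here is a minimal sketch of the lookup, not the actual umount.nfs code. It takes text in /proc/mounts format and finds the server, export, and reported options for a given mount point; the function names are made up for illustration. Note that the options it returns are whatever the kernel reports, so it has no way to tell a specified mountport= from a negotiated one.

```python
import re


def _unescape(field):
    # The kernel writes space, tab, newline and backslash as \ooo octal escapes.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def find_nfs_mount(mounts_text, mount_point):
    """Return server, export and options for an NFS mount point, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, directory, fstype, optstr = fields[:4]
        if fstype not in ("nfs", "nfs4") or _unescape(directory) != mount_point:
            continue
        device = _unescape(device)
        sep = device.find(":/")  # also works for "[v6addr]:/export"
        if sep < 0:
            continue
        options = {}
        for opt in optstr.split(","):
            key, _, value = opt.partition("=")
            options[key] = value or None
        # Keep scanning: a later entry shadows earlier ones on the same directory.
        found = {"server": device[:sep], "export": device[sep + 1:], "options": options}
    return found
```

Calling find_nfs_mount(open("/proc/mounts").read(), "/mnt/nfs") would return a dictionary for an NFS mount at that (example) directory, or None if there is none.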

Using /proc/mounts as a substitute for /etc/mtab works fine for a vast majority of Linux file systems, because during an unmount, they don't care what the mount options were.  They just need to know what device is associated with the mount point being unmounted.

Unfortunately for us, NFS does care about those mount options, and uses /etc/mtab in a way that most other file systems do not.  Yet another reason why no-one else but us chickens cares that a remount can rewrite mount options in /etc/mtab at whim.

>> Let's just write these options to another place besides /etc/mtab, 
>> and read them from that place during unmount.  The only change I'm 
>> talking about is putting a second copy of these options on disk somewhere.
>> 
> I don't think I'm a fan of creating yet another non-scalable
> list we need to maintain.. but that's for another day.. if/when
> /etc/mtab goes away.. 
> 
> For now, I agree with you, let's keep using /etc/mtab to 
> maintain the original options...  

We have to do something now to address this remount bug.  I'm going to code up a patch that stores the mount command line somewhere under /var/run.

Plus, as you have said, /etc/mtab is destined to be removed; has been for some time.  We will have to do this at some point anyway.
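
To illustrate the idea of keeping the original options outside /etc/mtab, here is a minimal sketch under stated assumptions; it is not the patch itself. The state directory, the one-file-per-mount-point layout, the JSON format, and all function names are hypothetical. The only property that matters is that the record is written once at mount time and never rewritten by a remount.

```python
import hashlib
import json
import os


def _record_path(state_dir, mount_point):
    # Hash the mount point so any directory name maps to a safe file name.
    digest = hashlib.sha256(mount_point.encode("utf-8")).hexdigest()
    return os.path.join(state_dir, digest + ".json")


def save_original_options(state_dir, mount_point, options):
    """Record options at first mount; return False if a record already exists."""
    os.makedirs(state_dir, exist_ok=True)
    try:
        # Mode "x" refuses to overwrite, so a remount cannot clobber the record.
        with open(_record_path(state_dir, mount_point), "x") as f:
            json.dump({"mount_point": mount_point, "options": options}, f)
    except FileExistsError:
        return False
    return True


def load_original_options(state_dir, mount_point):
    """Return the recorded option string, or None if nothing was recorded."""
    try:
        with open(_record_path(state_dir, mount_point)) as f:
            return json.load(f)["options"]
    except FileNotFoundError:
        return None


def forget_mount(state_dir, mount_point):
    """Drop the record once the unmount has completed."""
    try:
        os.remove(_record_path(state_dir, mount_point))
    except FileNotFoundError:
        pass
```

In this sketch, the mount path would call save_original_options(), the umount path would call load_original_options() before negotiating the UMNT, and forget_mount() would run after a successful unmount.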

-- 
chuck[dot]lever[at]oracle[dot]com






end of thread, other threads:[~2010-10-18 15:19 UTC | newest]

Thread overview: 36+ messages
2010-10-12 16:29 whither NFS umount? Chuck Lever
2010-10-12 17:04 ` Trond Myklebust
     [not found]   ` <1286903046.24878.13.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2010-10-12 17:57     ` Chuck Lever
2010-10-12 19:18       ` Jeff Layton
2010-10-12 19:44         ` Trond Myklebust
2010-10-12 19:52           ` Jeff Layton
2010-10-12 19:59             ` Chuck Lever
2010-10-12 20:21             ` Trond Myklebust
2010-10-12 20:26               ` Jeff Layton
2010-10-12 20:34                 ` Chuck Lever
2010-10-12 20:50                   ` Jeff Layton
2010-10-12 21:19                     ` Chuck Lever
2010-10-13  1:00                       ` Jeff Layton
2010-10-13 17:40           ` Steve Dickson
2010-10-13 18:13             ` Jeff Layton
2010-10-13 18:45               ` Steve Dickson
     [not found]                 ` <4CB5FE65.3090409-AfCzQyP5zfLQT0dZR+AlfA@public.gmane.org>
2010-10-13 18:56                   ` Jeff Layton
2010-10-13 18:58                     ` Jeff Layton
     [not found]                     ` <20101013145601.468acc2a-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2010-10-13 19:31                       ` Steve Dickson
2010-10-13 20:47                         ` Chuck Lever
2010-10-13 23:19                           ` Steve Dickson
2010-10-14 15:29                             ` Chuck Lever
2010-10-14 18:27                               ` Steve Dickson
2010-10-14 19:13                                 ` Chuck Lever
2010-10-14 21:24                                   ` Steve Dickson
2010-10-14 22:22                                     ` Chuck Lever
2010-10-15 13:11                                       ` Steve Dickson
2010-10-15 13:41                                         ` Jeff Layton
2010-10-15 16:00                                         ` Chuck Lever
2010-10-15 20:08                                           ` Steve Dickson
2010-10-18 15:18                                             ` Chuck Lever
2010-10-13 18:18             ` Trond Myklebust
2010-10-13 19:28               ` Steve Dickson
2010-10-14 14:00                 ` J. Bruce Fields
2010-10-14 14:17                   ` Trond Myklebust
     [not found]                     ` <1287065841.3015.233.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2010-10-14 14:34                       ` J. Bruce Fields
