From mboxrd@z Thu Jan 1 00:00:00 1970
From: "J. Bruce Fields"
Subject: Re: [PATCH 4/4] nfsd: Pin to vfsmnt instead of mntget
Date: Fri, 15 May 2015 17:09:34 -0400
Message-ID: <20150515210934.GF29627@fieldses.org>
References: <554A149B.5060102@gmail.com> <554A154B.6040103@gmail.com> <20150508144031.6f0d3cda@notabene.brown> <20150508134744.GA23753@fieldses.org> <5550A9DF.1070908@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: NeilBrown , linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, "linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" , Al Viro , Trond Myklebust
To: Kinglong Mee
Return-path:
Content-Disposition: inline
In-Reply-To: <5550A9DF.1070908-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Sender: linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-fsdevel.vger.kernel.org

On Mon, May 11, 2015 at 09:08:47PM +0800, Kinglong Mee wrote:
> On 5/8/2015 9:47 PM, J. Bruce Fields wrote:
> > On Fri, May 08, 2015 at 02:40:31PM +1000, NeilBrown wrote:
> >> Thanks for this patch. It looks good!
> >>
> >> My only comment on the code is that I would really like to see a
> >> "path_get_pin()" and "path_put_unpin()" rather than open coding:
> >>
> >>> + dget(item->ek_path.dentry);
> >>> + pin_insert_group(&new->ek_pin, item->ek_path.mnt, NULL);
> >>
> >> and
> >>
> >>> + dput(key->ek_path.dentry);
> >>> + pin_remove(&key->ek_pin);
> >>
> >>
> >> But the question you raise is an important one: Exactly which filesystems
> >> should be allowed to be unmounted?
> >> This is a change in behaviour - is it one that people uniformly would want?
> >>
> >> The kernel doesn't currently know which file systems were explicitly listed
> >> in /etc/exports, and which were found by following a 'crossmnt'.
> >> It could guess and allow the unmounting of anything below a 'crossmnt', but I
> >> wouldn't be comfortable with that - it is error prone.
> >>
> >> mountd does know what is in /etc/exports, and could tell the kernel.
> >> For the expkey cache, we could always use path_get_pin.
> >> For the export cache (where flags are available) we could use path_get
> >> or path_get_pin depending on some new flag.
> >>
> >> I'm not really sure it is worth it. I would rather the filesystems could
> >> always be unmounted. But doing that could possibly break someone's work
> >> flow. Maybe.
> >>
> >> Or maybe I'm seeing problems where there aren't any.
> >>
> >> Anyone else have an opinion?
> >
> > The undisputed bug here was negative cache entries preventing unmount.
> > So most conservative might be just to purge negative entries.
>
> I'd like this,
> if the cache is valid, user should not be allowed to umount the filesystem.
>
> >
> > Otherwise, the only guarantees I think we've really had is that we won't
> > allow unmount if you hold any actual state on the filesystem (NLM locks,
> > NFSv4 locks, opens, or delegations).
>
> Those resources hold the reference of vfsmnt.
>
> >
> > If a filesystem is exported but no clients hold state on it, then it's
> > currently mostly chance whether the unmount succeeds or not. So we're
> > probably free to change the behavior in this case. I'd be inclined to
> > allow the unmount, but haven't thought this through carefully.
>
> If client mount a nfsserver succeed without holds state,
> nfs server umounts the exported filesystem,
> client also think the filesystem is valid, but it is umounted.

People do sometimes want that even when state's held. The case I've
seen is migration of individual exports (or sets of exports) on shared
block storage, using a floating IP--I think the sequence is: shut down
the new server, move the floating IP to the new server, then unexport
and unmount on the old server, then mount on the new server, export, and
restart the new server.

Or maybe they really just want to unmount something and don't mind
client applications erroring out.

--b.
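For reference, the helper pair NeilBrown asks for in the quoted review
might look roughly like the sketch below. This is a userspace model and
not kernel code: struct dentry, struct vfsmount, struct fs_pin,
dget(), dput(), pin_insert_group() and pin_remove() are reduced here to
stand-ins that only count references, and the signatures of
path_get_pin() and path_put_unpin() are assumptions, since the thread
names the helpers without defining them. It only shows how the two
open-coded pairs from the patch would be folded into one get/put pair.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel definitions: they only count. */
struct dentry   { int d_count; };                 /* dentry reference count */
struct vfsmount { int pins; };                    /* pins attached to mount */
struct fs_pin   { struct vfsmount *mnt; };        /* which mount is pinned  */
struct path     { struct vfsmount *mnt; struct dentry *dentry; };

/* Stub: take a dentry reference. */
static struct dentry *dget(struct dentry *d)
{
    if (d)
        d->d_count++;
    return d;
}

/* Stub: drop a dentry reference. */
static void dput(struct dentry *d)
{
    if (d)
        d->d_count--;
}

/* Stub: attach the pin to the mount; the group argument is ignored. */
static void pin_insert_group(struct fs_pin *pin, struct vfsmount *m, void *group)
{
    (void)group;
    pin->mnt = m;
    m->pins++;
}

/* Stub: detach the pin from whatever mount it is attached to. */
static void pin_remove(struct fs_pin *pin)
{
    if (pin->mnt) {
        pin->mnt->pins--;
        pin->mnt = NULL;
    }
}

/* Assumed signature: reference the dentry and pin (not mntget) the mount. */
static void path_get_pin(struct path *path, struct fs_pin *pin)
{
    dget(path->dentry);
    pin_insert_group(pin, path->mnt, NULL);
}

/* Assumed signature: undo path_get_pin(). */
static void path_put_unpin(struct path *path, struct fs_pin *pin)
{
    dput(path->dentry);
    pin_remove(pin);
}
```

With helpers of this shape, the expkey code in the patch would call
path_get_pin(&item->ek_path, &new->ek_pin) where it now open-codes
dget() plus pin_insert_group(), and path_put_unpin(&key->ek_path,
&key->ek_pin) where it open-codes dput() plus pin_remove().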