linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kinglong Mee <kinglongmee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
Cc: "J. Bruce Fields"
	<bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Trond Myklebust
	<trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
Subject: Re: [PATCH 4/4] nfsd: Pin to vfsmnt instead of mntget
Date: Wed, 13 May 2015 20:55:47 +0800	[thread overview]
Message-ID: <555349D3.1020903@gmail.com> (raw)
In-Reply-To: <555343CA.6010307-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On 5/13/2015 8:30 PM, Kinglong Mee wrote:
> On 5/13/2015 12:25 PM, NeilBrown wrote:
>> On Mon, 11 May 2015 21:08:47 +0800 Kinglong Mee <kinglongmee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>>> On 5/8/2015 9:47 PM, J. Bruce Fields wrote:
>>>> On Fri, May 08, 2015 at 02:40:31PM +1000, NeilBrown wrote:
>>>>> Thanks for this patch.  It looks good!
>>>>>
>>>>> My only comment on the code is that I would really like to see a
>>>>> "path_get_pin()" and "path_put_unpin()" rather than open coding:
>>>>>
>>>>>> +	dget(item->ek_path.dentry);
>>>>>> +	pin_insert_group(&new->ek_pin, item->ek_path.mnt, NULL);
>>>>>
>>>>> and 
>>>>>
>>>>>> +		dput(key->ek_path.dentry);
>>>>>> +		pin_remove(&key->ek_pin);
>>>>>
>>>>>
>>>>> But the question you raise is an important one:  Exactly which filesystems
>>>>> should be allowed to be unmounted?
>>>>> This is a change in behaviour - is it one that people uniformly would want?
>>>>>
>>>>> The kernel doesn't currently know which file systems were explicitly listed
>>>>> in /etc/exports, and which were found by following a 'crossmnt'.
>>>>> It could guess and allow the unmounting of anything below a 'crossmnt', but I
>>>>> wouldn't be comfortable with that - it is error prone.
>>>>>
>>>>> mountd does know what is in /etc/exports, and could tell the kernel.
>>>>> For the expkey cache, we could always use path_get_pin.
>>>>> For the export cache (where flags are available) we could use path_get
>>>>> or path_get_pin depending on some new flag.
>>>>>
>>>>> I'm not really sure it is worth it.  I would rather the filesystems could
>>>>> always be unmounted.  But doing that could possibly break someone's work
>>>>> flow.  Maybe.
>>>>>
>>>>> Or maybe I'm seeing problems where there aren't any.
>>>>>
>>>>> Anyone else have an opinion?
>>>>
>>>> The undisputed bug here was negative cache entries preventing unmount.
>>>> So most conservative might be just to purge negative entries.
>>>
>>> I'd like this,
>>> if the cache is valid, user should not be allowed to umount the filesystem.
>>>
>>>>
>>>> Otherwise, the only guarantees I think we've really had is that we won't
>>>> allow unmount if you hold any actual state on the filesystem (NLM locks,
>>>> NFSv4 locks, opens, or delegations).
>>>
>>> Those resources hold the reference of vfsmnt.
>>>
>>>>
>>>> If a filesystem is exported but no clients hold state on it, then it's
>>>> currently mostly chance whether the unmount succeeds or not.  So we're
>>>> probably free to change the behavior in this case.  I'd be inclined to
>>>> allow the unmount, but haven't thought this through carefully.
>>>
>>> If client mount a nfsserver succeed without holds state, 
>>> nfs server umounts the exported filesystem, 
>>> client also think the filesystem is valid, but it is umounted.
>>
>> This is no different from "exportfs -au" being run on the server, thus
>> unexporting the filesystem and making in unavailable to the client, even
>> though the client has it mounted.
> 
> No, I don't think so.
> If user using "exportfs -au" to flush caches, I think he known
> what the influence of he does, but an umount of filesystem, 
> maybe he doesn't known that contains flushing nfsd's exports cache.
> 
> For an using of nfsd exports, I'd like an error of an umount,
> because I don't realize the exports for nfsd.
> 
> I also think nfsd should allowing umount of unexported filesystem,
> because user has the right to umount it.

The following is a diff draft of umounting an unexported filesystem.

thanks,
Kinglong Mee

------------------------------------------------------------------------
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index f79521a..bcaa914 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -309,11 +309,17 @@ static void nfsd4_fslocs_free(struct nfsd4_fs_locations *fsloc)
 static void svc_export_put(struct kref *ref)
 {
 	struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
-	path_put(&exp->ex_path);
+
+	if (exp->ex_pin.kill) {
+		dput(exp->ex_path.dentry);
+		pin_remove(&exp->ex_pin);
+	} else
+		path_put(&exp->ex_path);
+
 	auth_domain_put(exp->ex_client);
 	nfsd4_fslocs_free(&exp->ex_fslocs);
 	kfree(exp->ex_uuid);
-	kfree(exp);
+	kfree_rcu(exp, rcu_head);
 }
 
 static void svc_export_request(struct cache_detail *cd,
@@ -699,6 +705,7 @@ static void svc_export_init(struct cache_head *cnew, struct cache_head *citem)
 	struct svc_export *new = container_of(cnew, struct svc_export, h);
 	struct svc_export *item = container_of(citem, struct svc_export, h);
 
+	init_fs_pin(&new->ex_pin, NULL);
 	kref_get(&item->ex_client->ref);
 	new->ex_client = item->ex_client;
 	new->ex_path = item->ex_path;
@@ -738,6 +745,24 @@ static void export_update(struct cache_head *cnew, struct cache_head *citem)
 	}
 }
 
+static void export_pin_kill(struct fs_pin *pin)
+{
+	struct svc_export *exp = container_of(pin, struct svc_export, ex_pin);
+	cache_force_expire(exp->cd, &exp->h);
+}
+
+static void export_update_negative(struct cache_head *cnew, struct cache_head *citem)
+{
+	struct svc_export *new = container_of(cnew, struct svc_export, h);
+
+	if (!test_bit(CACHE_NEGATIVE, &new->h.flags))
+		return ;
+
+	init_fs_pin(&new->ex_pin, export_pin_kill);
+	pin_insert_group(&new->ex_pin, new->ex_path.mnt, NULL);
+	mntput(new->ex_path.mnt);
+}
+
 static struct cache_head *svc_export_alloc(void)
 {
 	struct svc_export *i = kmalloc(sizeof(*i), GFP_KERNEL);
@@ -758,6 +783,7 @@ static struct cache_detail svc_export_cache_template = {
 	.match		= svc_export_match,
 	.init		= svc_export_init,
 	.update		= export_update,
+	.update_negative= export_update_negative,
 	.alloc		= svc_export_alloc,
 };
 
diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
index 1f52bfc..c764a8e 100644
--- a/fs/nfsd/export.h
+++ b/fs/nfsd/export.h
@@ -4,6 +4,7 @@
 #ifndef NFSD_EXPORT_H
 #define NFSD_EXPORT_H
 
+#include <linux/fs_pin.h>
 #include <linux/sunrpc/cache.h>
 #include <uapi/linux/nfsd/export.h>
 
@@ -46,6 +47,8 @@ struct exp_flavor_info {
 
 struct svc_export {
 	struct cache_head	h;
+	struct cache_detail	*cd;
+
 	struct auth_domain *	ex_client;
 	int			ex_flags;
 	struct path		ex_path;
@@ -58,7 +61,9 @@ struct svc_export {
 	struct exp_flavor_info	ex_flavors[MAX_SECINFO_LIST];
 	enum pnfs_layouttype	ex_layout_type;
 	struct nfsd4_deviceid_map *ex_devid_map;
-	struct cache_detail	*cd;
+
+	struct fs_pin		ex_pin;
+	struct rcu_head		rcu_head;
 };
 
 /* an "export key" (expkey) maps a filehandlefragement to an
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 437ddb6..39b31b5 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -101,6 +101,7 @@ struct cache_detail {
 	int			(*match)(struct cache_head *orig, struct cache_head *new);
 	void			(*init)(struct cache_head *orig, struct cache_head *new);
 	void			(*update)(struct cache_head *orig, struct cache_head *new);
+	void			(*update_negative)(struct cache_head *orig, struct cache_head *new);
 
 	/* fields below this comment are for internal use
 	 * and should not be touched by cache owners
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 2928aff..4a95dee 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -149,9 +149,11 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	if (!test_bit(CACHE_VALID, &old->flags)) {
 		write_lock(&detail->hash_lock);
 		if (!test_bit(CACHE_VALID, &old->flags)) {
-			if (test_bit(CACHE_NEGATIVE, &new->flags))
+			if (test_bit(CACHE_NEGATIVE, &new->flags)) {
 				set_bit(CACHE_NEGATIVE, &old->flags);
-			else
+				if (detail->update_negative)
+					detail->update_negative(old, new);
+			} else
 				detail->update(old, new);
 			cache_fresh_locked(old, new->expiry_time);
 			write_unlock(&detail->hash_lock);
@@ -171,9 +173,11 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	head = &detail->hash_table[hash];
 
 	write_lock(&detail->hash_lock);
-	if (test_bit(CACHE_NEGATIVE, &new->flags))
+	if (test_bit(CACHE_NEGATIVE, &new->flags)) {
 		set_bit(CACHE_NEGATIVE, &tmp->flags);
-	else
+		if (detail->update_negative)
+			detail->update_negative(old, new);
+	} else
 		detail->update(tmp, new);
 	tmp->next = *head;
 	*head = tmp;

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-05-13 12:55 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-06 13:18 [PATCH 0/4] NFSD: Pin to vfsmount instead of mntget for export cache Kinglong Mee
     [not found] ` <554A149B.5060102-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-05-06 13:19   ` [PATCH 1/4] fs_pin: Fix uninitialized value in fs_pin Kinglong Mee
2015-05-07 19:43     ` J. Bruce Fields
     [not found]       ` <20150507194335.GA16527-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-05-08  0:36         ` Kinglong Mee
2015-05-06 13:19   ` [PATCH 2/4] fs_pin: Export functions for specific filesystem Kinglong Mee
2015-05-06 13:20   ` [PATCH 3/4] sunrpc: New helper cache_force_expire for cache cleanup Kinglong Mee
2015-05-06 13:21   ` [PATCH 4/4] nfsd: Pin to vfsmnt instead of mntget Kinglong Mee
2015-05-08  4:40     ` NeilBrown
2015-05-08 13:47       ` J. Bruce Fields
2015-05-11 13:08         ` Kinglong Mee
     [not found]           ` <5550A9DF.1070908-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-05-13  4:25             ` NeilBrown
2015-05-13 12:30               ` Kinglong Mee
     [not found]                 ` <555343CA.6010307-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-05-13 12:55                   ` Kinglong Mee [this message]
     [not found]               ` <20150513142515.6bd881c8-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2015-05-15 21:11                 ` J. Bruce Fields
2015-05-15 23:23                   ` NeilBrown
2015-05-22 15:02                     ` Kinglong Mee
2015-05-22 16:03                       ` J. Bruce Fields
2015-05-15 21:09             ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=555349D3.1020903@gmail.com \
    --to=kinglongmee-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
    --cc=bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=neilb-l3A5Bk7waGM@public.gmane.org \
    --cc=trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org \
    --cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).