linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache
@ 2015-06-25 14:17 Kinglong Mee
  2015-06-25 14:19 ` [PATCH 02/10 v6] fs_pin: Export functions for specific filesystem Kinglong Mee
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:17 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: NeilBrown, Trond Myklebust, kinglongmee-Re5JQEeQqe8AvxtiuMwx3w

If there are mount points (not exported over NFS) under the pseudo root,
then once a client has accessed those entries under the root, nobody can
unmount those mount points until the export cache expires.

# cat /etc/exports
/nfs/xfs        *(rw,insecure,no_subtree_check,no_root_squash)
/nfs/pnfs       *(rw,insecure,no_subtree_check,no_root_squash)
# ll /nfs/
total 0
drwxr-xr-x. 3 root root 84 Apr 21 22:27 pnfs
drwxr-xr-x. 3 root root 84 Apr 21 22:27 test
drwxr-xr-x. 2 root root  6 Apr 20 22:01 xfs
# mount /dev/sde /nfs/test
# df
Filesystem                      1K-blocks    Used Available Use% Mounted on
......
/dev/sdd                          1038336   32944   1005392   4% /nfs/pnfs
/dev/sdc                         10475520   32928  10442592   1% /nfs/xfs
/dev/sde                           999320    1284    929224   1% /nfs/test
# mount -t nfs 127.0.0.1:/nfs/ /mnt
# ll /mnt/*/
/mnt/pnfs/:
total 0
-rw-r--r--. 1 root root 0 Apr 21 22:23 attr
drwxr-xr-x. 2 root root 6 Apr 21 22:19 tmp

/mnt/xfs/:
total 0
# umount /nfs/test/
umount: /nfs/test/: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)

This happens because nfsd's exports cache holds a reference on
the path (here /nfs/test/), so it can't be unmounted.

I don't think that is what users expect; they want to umount /nfs/test/.
Bruce thinks users should also be able to umount /nfs/pnfs/ and /nfs/xfs.

This patch set makes the nfsd exports cache pin to the vfsmount
instead of using mntget, so users can now umount any export's mount point.

v3, 
1. New helpers path_get_pin/path_put_unpin for path pin.
2. Use kzalloc for allocating memory.

v4, Thanks to Al Viro for his comments on the fs_pin logic.
1. Add a completion so that pin_kill waits until the reference drops to zero.
2. Add a work_struct so that pin_kill drops the reference indirectly.
3. Free svc_export/svc_expkey in pin_kill, not in svc_export_put/svc_expkey_put.
4. svc_export_put/svc_expkey_put go through the pin_kill logic.

v5,
Kill the fs_pin while holding a reference on the vfsmnt.

v6,
1. Revert the v5 change.
2. New helper legitimize_mntget() so that the nfsd exports/expkey cache
   can get a vfsmount from an fs_pin.
3. Clean up some of sunrpc's cache code.
4. Switch cache_detail from a singly linked list to list_head for
   cache_head.
5. New validate/invalidate functions for handling reference
   increases/decreases (nfsd exports/expkey use them to grab the
   reference on the mnt).
6. Delete the cache_head directly from cache_detail in pin_kill.

Right now,

When the reference count of a cache_head rises above 1, grab a reference
on the mnt once; when it drops back to 1, drop the reference on the mnt.

So after that,
when ref > 1, umount fails with -EBUSY.
When ref == 1, the entry is referenced only by the nfsd cache itself,
so the user can try to umount:
1. Before MNT_UMOUNT is set (protected by mount_lock), if the nfsd cache
   entry is referenced (ref > 1, legitimize_mntget), umount fails with -EBUSY.
2. After MNT_UMOUNT is set, if the nfsd cache entry is referenced (ref == 2),
   legitimize_mntget fails, the entry is set to CACHE_NEGATIVE,
   and the reference is dropped back to 1.
   pin_kill can then delete the entry and umount succeeds.
3. If nothing references the nfsd cache entry while unmounting,
   pin_kill can delete the entry and umount succeeds.
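
For illustration only, the three cases above can be modelled in userspace
roughly as follows. This is a sketch under assumptions, not code from the
series: model_mnt, model_entry, entry_get, entry_put and try_umount are
hypothetical names, and the locking (mount_lock) is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name here is hypothetical, not kernel API. */
struct model_mnt {
	int  count;      /* extra references taken on behalf of cache users */
	bool umounting;  /* stands in for MNT_UMOUNT */
};

struct model_entry {
	int  ref;        /* 1 means "held only by the cache itself" */
	bool negative;   /* stands in for CACHE_NEGATIVE */
	struct model_mnt *mnt;
};

/* Take a reference; the 1 -> 2 transition must also grab the mount. */
bool entry_get(struct model_entry *e)
{
	e->ref++;
	if (e->ref == 2) {
		if (e->mnt->umounting) {
			/* Models legitimize_mntget() failing after MNT_UMOUNT. */
			e->negative = true;
			e->ref--;
			return false;
		}
		e->mnt->count++;
	}
	return true;
}

/* Drop a reference; the 2 -> 1 transition releases the mount. */
void entry_put(struct model_entry *e)
{
	assert(e->ref > 1);
	e->ref--;
	if (e->ref == 1)
		e->mnt->count--;
}

/* Umount succeeds only while no cache user holds the mount. */
bool try_umount(struct model_mnt *m)
{
	if (m->count > 0)
		return false;	/* the real umount would fail with -EBUSY */
	m->umounting = true;
	return true;
}
```

In this model, try_umount() fails between entry_get() and entry_put()
(case 1), succeeds once ref is back to 1 (case 3), and a later entry_get()
marks the entry negative and leaves ref at 1 (case 2).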

Kinglong Mee (10):
  fs_pin: Initialize value for fs_pin explicitly
  fs_pin: Export functions for specific filesystem
  path: New helpers path_get_pin/path_put_unpin for path pin
  fs: New helper legitimize_mntget() for getting a legitimize mnt
  sunrpc: Store cache_detail in seq_file's private directly
  sunrpc/nfsd: Remove duplicate code by exports seq_operations functions
  sunrpc: Switch to using list_head instead single list
  sunrpc: New helper cache_delete_entry for deleting cache_head directly
  sunrpc: Support validate/invalidate for reference change in cache_detail
  nfsd: Allows user un-mounting filesystem where nfsd exports base on

 fs/fs_pin.c                       |   4 +
 fs/namei.c                        |  26 ++++
 fs/namespace.c                    |  19 +++
 fs/nfsd/export.c                  | 242 ++++++++++++++++++++++++--------------
 fs/nfsd/export.h                  |  26 +++-
 include/linux/fs_pin.h            |   6 +
 include/linux/mount.h             |   1 +
 include/linux/path.h              |   4 +
 include/linux/sunrpc/cache.h      |  21 +++-
 net/sunrpc/auth_gss/svcauth_gss.c |   2 +-
 net/sunrpc/cache.c                | 159 +++++++++++++++----------
 net/sunrpc/svcauth_unix.c         |   2 +-
 12 files changed, 357 insertions(+), 155 deletions(-)

-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [PATCH 01/10 v6] fs_pin: Initialize value for fs_pin explicitly
       [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-25 14:18   ` Kinglong Mee
  2015-06-25 14:25   ` [PATCH 05/10 v6] sunrpc: Store cache_detail in seq_file's private directly Kinglong Mee
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:18 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: NeilBrown, Trond Myklebust, kinglongmee-Re5JQEeQqe8AvxtiuMwx3w

Without explicit initialization, the done field of an fs_pin allocated
on the stack may contain a garbage value.

v3, v4, v5, v6
Add include guard macros to the header file.

Signed-off-by: Kinglong Mee <kinglongmee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 include/linux/fs_pin.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/fs_pin.h b/include/linux/fs_pin.h
index 3886b3b..0dde7b7 100644
--- a/include/linux/fs_pin.h
+++ b/include/linux/fs_pin.h
@@ -1,3 +1,6 @@
+#ifndef _LINUX_FS_PIN_H
+#define _LINUX_FS_PIN_H
+
 #include <linux/wait.h>
 
 struct fs_pin {
@@ -16,9 +19,12 @@ static inline void init_fs_pin(struct fs_pin *p, void (*kill)(struct fs_pin *))
 	INIT_HLIST_NODE(&p->s_list);
 	INIT_HLIST_NODE(&p->m_list);
 	p->kill = kill;
+	p->done = 0;
 }
 
 void pin_remove(struct fs_pin *);
 void pin_insert_group(struct fs_pin *, struct vfsmount *, struct hlist_head *);
 void pin_insert(struct fs_pin *, struct vfsmount *);
 void pin_kill(struct fs_pin *);
+
+#endif
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 02/10 v6] fs_pin: Export functions for specific filesystem
  2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
@ 2015-06-25 14:19 ` Kinglong Mee
  2015-06-25 14:19 ` [PATCH 03/10 v6] path: New helpers path_get_pin/path_put_unpin for path pin Kinglong Mee
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:19 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs, linux-fsdevel
  Cc: NeilBrown, Trond Myklebust, kinglongmee

Export these functions for other users that want to pin to a vfsmount,
e.g. nfsd's export cache.

v4, v5, v6
Also export pin_kill.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/fs_pin.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/fs_pin.c b/fs/fs_pin.c
index 611b540..a1a4eb2 100644
--- a/fs/fs_pin.c
+++ b/fs/fs_pin.c
@@ -17,6 +17,7 @@ void pin_remove(struct fs_pin *pin)
 	wake_up_locked(&pin->wait);
 	spin_unlock_irq(&pin->wait.lock);
 }
+EXPORT_SYMBOL(pin_remove);
 
 void pin_insert_group(struct fs_pin *pin, struct vfsmount *m, struct hlist_head *p)
 {
@@ -26,11 +27,13 @@ void pin_insert_group(struct fs_pin *pin, struct vfsmount *m, struct hlist_head
 	hlist_add_head(&pin->m_list, &real_mount(m)->mnt_pins);
 	spin_unlock(&pin_lock);
 }
+EXPORT_SYMBOL(pin_insert_group);
 
 void pin_insert(struct fs_pin *pin, struct vfsmount *m)
 {
 	pin_insert_group(pin, m, &m->mnt_sb->s_pins);
 }
+EXPORT_SYMBOL(pin_insert);
 
 void pin_kill(struct fs_pin *p)
 {
@@ -72,6 +75,7 @@ void pin_kill(struct fs_pin *p)
 	}
 	rcu_read_unlock();
 }
+EXPORT_SYMBOL(pin_kill);
 
 void mnt_pin_kill(struct mount *m)
 {
-- 
2.4.3


* [PATCH 03/10 v6] path: New helpers path_get_pin/path_put_unpin for path pin
  2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
  2015-06-25 14:19 ` [PATCH 02/10 v6] fs_pin: Export functions for specific filesystem Kinglong Mee
@ 2015-06-25 14:19 ` Kinglong Mee
  2015-06-25 14:21 ` [PATCH 04/10 v6] fs: New helper legitimize_mntget() for getting a legitimize mnt Kinglong Mee
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:19 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs, linux-fsdevel
  Cc: NeilBrown, Trond Myklebust, kinglongmee

Two helpers that let a filesystem pin to a vfsmnt instead of using mntget.

v4, v5, v6 same as v2.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/namei.c           | 26 ++++++++++++++++++++++++++
 include/linux/path.h |  4 ++++
 2 files changed, 30 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 4a8d998b..ac71c65 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -492,6 +492,32 @@ void path_put(const struct path *path)
 }
 EXPORT_SYMBOL(path_put);
 
+/**
+ * path_get_pin - get a reference to a path's dentry
+ *                and pin to path's vfsmnt
+ * @path: path to get the reference to
+ * @p: the fs_pin pin to vfsmnt
+ */
+void path_get_pin(struct path *path, struct fs_pin *p)
+{
+	dget(path->dentry);
+	pin_insert_group(p, path->mnt, NULL);
+}
+EXPORT_SYMBOL(path_get_pin);
+
+/**
+ * path_put_unpin - put a reference to a path's dentry
+ *                  and remove pin to path's vfsmnt
+ * @path: path to put the reference to
+ * @p: the fs_pin removed from vfsmnt
+ */
+void path_put_unpin(struct path *path, struct fs_pin *p)
+{
+	dput(path->dentry);
+	pin_remove(p);
+}
+EXPORT_SYMBOL(path_put_unpin);
+
 struct nameidata {
 	struct path	path;
 	struct qstr	last;
diff --git a/include/linux/path.h b/include/linux/path.h
index d137218..34599fb 100644
--- a/include/linux/path.h
+++ b/include/linux/path.h
@@ -3,6 +3,7 @@
 
 struct dentry;
 struct vfsmount;
+struct fs_pin;
 
 struct path {
 	struct vfsmount *mnt;
@@ -12,6 +13,9 @@ struct path {
 extern void path_get(const struct path *);
 extern void path_put(const struct path *);
 
+extern void path_get_pin(struct path *, struct fs_pin *);
+extern void path_put_unpin(struct path *, struct fs_pin *);
+
 static inline int path_equal(const struct path *path1, const struct path *path2)
 {
 	return path1->mnt == path2->mnt && path1->dentry == path2->dentry;
-- 
2.4.3


* [PATCH 04/10 v6] fs: New helper legitimize_mntget() for getting a legitimize mnt
  2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
  2015-06-25 14:19 ` [PATCH 02/10 v6] fs_pin: Export functions for specific filesystem Kinglong Mee
  2015-06-25 14:19 ` [PATCH 03/10 v6] path: New helpers path_get_pin/path_put_unpin for path pin Kinglong Mee
@ 2015-06-25 14:21 ` Kinglong Mee
  2015-06-25 14:27 ` [PATCH 06/10 v6] sunrpc/nfsd: Remove redundant code by exports seq_operations functions Kinglong Mee
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:21 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs, linux-fsdevel
  Cc: NeilBrown, Trond Myklebust, kinglongmee

New helper legitimize_mntget() takes a reference on a mnt only if none of
MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED is set on it; otherwise it
returns NULL.
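
As a rough illustration of the intended contract, here is a userspace
sketch. It is not the kernel code: sk_mount, sk_legitimize_get and the
SK_MNT_* bit values are placeholders made up for the example, and the real
helper does the check and the increment under mount_lock.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define SK_MNT_DOOMED       0x1u
#define SK_MNT_SYNC_UMOUNT  0x2u
#define SK_MNT_UMOUNT       0x4u

/* Hypothetical stand-in for struct mount / struct vfsmount. */
struct sk_mount {
	unsigned int flags;
	int count;
};

/* Return the mount with one more reference, or NULL if it is going away. */
struct sk_mount *sk_legitimize_get(struct sk_mount *m)
{
	if (m == NULL)
		return NULL;

	assert(m->count >= 0);
	/* Any of the three flags means the mount must not be pinned again. */
	if (m->flags & (SK_MNT_SYNC_UMOUNT | SK_MNT_UMOUNT | SK_MNT_DOOMED))
		return NULL;

	m->count++;
	return m;
}
```

A caller is expected to treat a NULL return as "the mount is being torn
down" and to give up on the cache entry rather than retry.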

v6, New one

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/namespace.c        | 19 +++++++++++++++++++
 include/linux/mount.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 1f4f9da..f31d165 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1142,6 +1142,25 @@ struct vfsmount *mntget(struct vfsmount *mnt)
 }
 EXPORT_SYMBOL(mntget);
 
+struct vfsmount *legitimize_mntget(struct vfsmount *vfsmnt)
+{
+	struct mount *mnt;
+
+	if (vfsmnt == NULL)
+		return NULL;
+
+	read_seqlock_excl(&mount_lock);
+	mnt = real_mount(vfsmnt);
+	if (vfsmnt->mnt_flags & (MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED))
+		vfsmnt = NULL;
+	else
+		mnt_add_count(mnt, 1);
+	read_sequnlock_excl(&mount_lock);
+
+	return vfsmnt;
+}
+EXPORT_SYMBOL(legitimize_mntget);
+
 struct vfsmount *mnt_clone_internal(struct path *path)
 {
 	struct mount *p;
diff --git a/include/linux/mount.h b/include/linux/mount.h
index f822c3c..8ae9dc0 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -79,6 +79,7 @@ extern void mnt_drop_write(struct vfsmount *mnt);
 extern void mnt_drop_write_file(struct file *file);
 extern void mntput(struct vfsmount *mnt);
 extern struct vfsmount *mntget(struct vfsmount *mnt);
+extern struct vfsmount *legitimize_mntget(struct vfsmount *vfsmnt);
 extern struct vfsmount *mnt_clone_internal(struct path *path);
 extern int __mnt_is_readonly(struct vfsmount *mnt);
 
-- 
2.4.3


* [PATCH 05/10 v6] sunrpc: Store cache_detail in seq_file's private directly
       [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-25 14:18   ` [PATCH 01/10 v6] fs_pin: Initialize value for fs_pin explicitly Kinglong Mee
@ 2015-06-25 14:25   ` Kinglong Mee
  2015-06-25 14:34   ` [PATCH 08/10] sunrpc: New helper cache_delete_entry for deleting cache_head directly Kinglong Mee
  2015-06-25 14:36   ` [PATCH 09/10 v6] sunrpc: Support validate/invalidate for reference change in cache_detail Kinglong Mee
  3 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:25 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: NeilBrown, Trond Myklebust, kinglongmee-Re5JQEeQqe8AvxtiuMwx3w

Cleanup.

Just store the cache_detail in seq_file's private field;
the separately allocated handle is redundant.

Signed-off-by: Kinglong Mee <kinglongmee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 net/sunrpc/cache.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 2928aff..edec603 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1270,18 +1270,13 @@ EXPORT_SYMBOL_GPL(qword_get);
  * get a header, then pass each real item in the cache
  */
 
-struct handle {
-	struct cache_detail *cd;
-};
-
 static void *c_start(struct seq_file *m, loff_t *pos)
 	__acquires(cd->hash_lock)
 {
 	loff_t n = *pos;
 	unsigned int hash, entry;
 	struct cache_head *ch;
-	struct cache_detail *cd = ((struct handle*)m->private)->cd;
-
+	struct cache_detail *cd = m->private;
 
 	read_lock(&cd->hash_lock);
 	if (!n--)
@@ -1308,7 +1303,7 @@ static void *c_next(struct seq_file *m, void *p, loff_t *pos)
 {
 	struct cache_head *ch = p;
 	int hash = (*pos >> 32);
-	struct cache_detail *cd = ((struct handle*)m->private)->cd;
+	struct cache_detail *cd = m->private;
 
 	if (p == SEQ_START_TOKEN)
 		hash = 0;
@@ -1334,14 +1329,14 @@ static void *c_next(struct seq_file *m, void *p, loff_t *pos)
 static void c_stop(struct seq_file *m, void *p)
 	__releases(cd->hash_lock)
 {
-	struct cache_detail *cd = ((struct handle*)m->private)->cd;
+	struct cache_detail *cd = m->private;
 	read_unlock(&cd->hash_lock);
 }
 
 static int c_show(struct seq_file *m, void *p)
 {
 	struct cache_head *cp = p;
-	struct cache_detail *cd = ((struct handle*)m->private)->cd;
+	struct cache_detail *cd = m->private;
 
 	if (p == SEQ_START_TOKEN)
 		return cd->cache_show(m, cd, NULL);
@@ -1373,24 +1368,27 @@ static const struct seq_operations cache_content_op = {
 static int content_open(struct inode *inode, struct file *file,
 			struct cache_detail *cd)
 {
-	struct handle *han;
+	struct seq_file *seq;
+	int err;
 
 	if (!cd || !try_module_get(cd->owner))
 		return -EACCES;
-	han = __seq_open_private(file, &cache_content_op, sizeof(*han));
-	if (han == NULL) {
+
+	err = seq_open(file, &cache_content_op);
+	if (err) {
 		module_put(cd->owner);
-		return -ENOMEM;
+		return err;
 	}
 
-	han->cd = cd;
+	seq = file->private_data;
+	seq->private = cd;
 	return 0;
 }
 
 static int content_release(struct inode *inode, struct file *file,
 		struct cache_detail *cd)
 {
-	int ret = seq_release_private(inode, file);
+	int ret = seq_release(inode, file);
 	module_put(cd->owner);
 	return ret;
 }
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 06/10 v6] sunrpc/nfsd: Remove redundant code by exports seq_operations functions
  2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
                   ` (2 preceding siblings ...)
  2015-06-25 14:21 ` [PATCH 04/10 v6] fs: New helper legitimize_mntget() for getting a legitimize mnt Kinglong Mee
@ 2015-06-25 14:27 ` Kinglong Mee
  2015-06-25 14:29 ` [PATCH 07/10 v6] sunrpc: Switch to using list_head instead single list Kinglong Mee
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:27 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs, linux-fsdevel
  Cc: NeilBrown, Trond Myklebust, kinglongmee

nfsd implements a set of seq_operations functions that duplicate those in
sunrpc's cache. Export sunrpc's versions and remove nfsd's redundant code.
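
The two sets of iterators can be shared because they encode the seq_file
position the same way: position 0 is the header token, and every later
position is the hash bucket in the upper 32 bits plus the entry index in
the lower 32 bits, offset by one. A userspace sketch of that arithmetic
(seq_slot, slot_to_pos and pos_to_slot are hypothetical names, not part of
the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; the arithmetic mirrors cache_seq_start(). */
struct seq_slot {
	uint32_t hash;   /* hash bucket index */
	uint32_t entry;  /* index of the entry within that bucket */
};

/* Position 0 is reserved for the header token, so entries start at 1. */
int64_t slot_to_pos(struct seq_slot s)
{
	uint64_t n = ((uint64_t)s.hash << 32) | s.entry;

	return (int64_t)(n + 1);
}

/* Inverse of slot_to_pos(); only valid for positions after the header. */
struct seq_slot pos_to_slot(int64_t pos)
{
	struct seq_slot s;
	int64_t n;

	assert(pos > 0);
	n = pos - 1;
	s.hash = (uint32_t)(n >> 32);
	s.entry = (uint32_t)(n & ((1LL << 32) - 1));
	return s;
}
```

With this encoding, bucket 3, entry 7 maps to position ((3 << 32) | 7) + 1,
and position 1 is the first entry of bucket 0.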

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/nfsd/export.c             | 73 ++------------------------------------------
 include/linux/sunrpc/cache.h |  5 +++
 net/sunrpc/cache.c           | 15 +++++----
 3 files changed, 17 insertions(+), 76 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 002d3a9..34a384c 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -1075,73 +1075,6 @@ exp_pseudoroot(struct svc_rqst *rqstp, struct svc_fh *fhp)
 	return rv;
 }
 
-/* Iterator */
-
-static void *e_start(struct seq_file *m, loff_t *pos)
-	__acquires(((struct cache_detail *)m->private)->hash_lock)
-{
-	loff_t n = *pos;
-	unsigned hash, export;
-	struct cache_head *ch;
-	struct cache_detail *cd = m->private;
-	struct cache_head **export_table = cd->hash_table;
-
-	read_lock(&cd->hash_lock);
-	if (!n--)
-		return SEQ_START_TOKEN;
-	hash = n >> 32;
-	export = n & ((1LL<<32) - 1);
-
-	
-	for (ch=export_table[hash]; ch; ch=ch->next)
-		if (!export--)
-			return ch;
-	n &= ~((1LL<<32) - 1);
-	do {
-		hash++;
-		n += 1LL<<32;
-	} while(hash < EXPORT_HASHMAX && export_table[hash]==NULL);
-	if (hash >= EXPORT_HASHMAX)
-		return NULL;
-	*pos = n+1;
-	return export_table[hash];
-}
-
-static void *e_next(struct seq_file *m, void *p, loff_t *pos)
-{
-	struct cache_head *ch = p;
-	int hash = (*pos >> 32);
-	struct cache_detail *cd = m->private;
-	struct cache_head **export_table = cd->hash_table;
-
-	if (p == SEQ_START_TOKEN)
-		hash = 0;
-	else if (ch->next == NULL) {
-		hash++;
-		*pos += 1LL<<32;
-	} else {
-		++*pos;
-		return ch->next;
-	}
-	*pos &= ~((1LL<<32) - 1);
-	while (hash < EXPORT_HASHMAX && export_table[hash] == NULL) {
-		hash++;
-		*pos += 1LL<<32;
-	}
-	if (hash >= EXPORT_HASHMAX)
-		return NULL;
-	++*pos;
-	return export_table[hash];
-}
-
-static void e_stop(struct seq_file *m, void *p)
-	__releases(((struct cache_detail *)m->private)->hash_lock)
-{
-	struct cache_detail *cd = m->private;
-
-	read_unlock(&cd->hash_lock);
-}
-
 static struct flags {
 	int flag;
 	char *name[2];
@@ -1270,9 +1203,9 @@ static int e_show(struct seq_file *m, void *p)
 }
 
 const struct seq_operations nfs_exports_op = {
-	.start	= e_start,
-	.next	= e_next,
-	.stop	= e_stop,
+	.start	= cache_seq_start,
+	.next	= cache_seq_next,
+	.stop	= cache_seq_stop,
 	.show	= e_show,
 };
 
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 437ddb6..04ee5a2 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -224,6 +224,11 @@ extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
 					umode_t, struct cache_detail *);
 extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
 
+/* Must store cache_detail in seq_file->private if using next three functions */
+extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
+extern void *cache_seq_next(struct seq_file *file, void *p, loff_t *pos);
+extern void cache_seq_stop(struct seq_file *file, void *p);
+
 extern void qword_add(char **bpp, int *lp, char *str);
 extern void qword_addhex(char **bpp, int *lp, char *buf, int blen);
 extern int qword_get(char **bpp, char *dest, int bufsize);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index edec603..673c2fa 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1270,7 +1270,7 @@ EXPORT_SYMBOL_GPL(qword_get);
  * get a header, then pass each real item in the cache
  */
 
-static void *c_start(struct seq_file *m, loff_t *pos)
+void *cache_seq_start(struct seq_file *m, loff_t *pos)
 	__acquires(cd->hash_lock)
 {
 	loff_t n = *pos;
@@ -1298,8 +1298,9 @@ static void *c_start(struct seq_file *m, loff_t *pos)
 	*pos = n+1;
 	return cd->hash_table[hash];
 }
+EXPORT_SYMBOL_GPL(cache_seq_start);
 
-static void *c_next(struct seq_file *m, void *p, loff_t *pos)
+void *cache_seq_next(struct seq_file *m, void *p, loff_t *pos)
 {
 	struct cache_head *ch = p;
 	int hash = (*pos >> 32);
@@ -1325,13 +1326,15 @@ static void *c_next(struct seq_file *m, void *p, loff_t *pos)
 	++*pos;
 	return cd->hash_table[hash];
 }
+EXPORT_SYMBOL_GPL(cache_seq_next);
 
-static void c_stop(struct seq_file *m, void *p)
+void cache_seq_stop(struct seq_file *m, void *p)
 	__releases(cd->hash_lock)
 {
 	struct cache_detail *cd = m->private;
 	read_unlock(&cd->hash_lock);
 }
+EXPORT_SYMBOL_GPL(cache_seq_stop);
 
 static int c_show(struct seq_file *m, void *p)
 {
@@ -1359,9 +1362,9 @@ static int c_show(struct seq_file *m, void *p)
 }
 
 static const struct seq_operations cache_content_op = {
-	.start	= c_start,
-	.next	= c_next,
-	.stop	= c_stop,
+	.start	= cache_seq_start,
+	.next	= cache_seq_next,
+	.stop	= cache_seq_stop,
 	.show	= c_show,
 };
 
-- 
2.4.3


* [PATCH 07/10 v6] sunrpc: Switch to using list_head instead single list
  2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
                   ` (3 preceding siblings ...)
  2015-06-25 14:27 ` [PATCH 06/10 v6] sunrpc/nfsd: Remove redundant code by exports seq_operations functions Kinglong Mee
@ 2015-06-25 14:29 ` Kinglong Mee
       [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-25 14:37 ` [PATCH 10/10 v6] nfsd: Allows user un-mounting filesystem where nfsd exports base on Kinglong Mee
  6 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:29 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs, linux-fsdevel
  Cc: NeilBrown, Trond Myklebust, kinglongmee

Switch cache_detail's hash chains of cache_head from a singly linked list
to list_head, which makes it possible to remove a cache_head entry
directly from the cache_detail.
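
To show why the doubly linked form helps, here is a minimal userspace
stand-in for list_head (struct node and the node_* functions are
hypothetical, written for this example only). An entry can unlink itself
using just its own pointers, without walking the bucket to find its
predecessor as the old ->next chain required.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

/* An initialised node (or empty head) points at itself. */
void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

int node_empty(const struct node *head)
{
	return head->next == head;
}

/* Insert n right after head, like list_add(). */
void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n without touching the head; same effect as list_del_init(). */
void node_del_init(struct node *n)
{
	assert(n->next != NULL && n->prev != NULL);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}
```

Re-initialising the node on removal keeps a second node_del_init() on the
same entry harmless, which matters when an entry may be removed from more
than one code path.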

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 include/linux/sunrpc/cache.h |  4 +--
 net/sunrpc/cache.c           | 74 ++++++++++++++++++++++++--------------------
 2 files changed, 43 insertions(+), 35 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 04ee5a2..ecc0ff6 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -46,7 +46,7 @@
  * 
  */
 struct cache_head {
-	struct cache_head * next;
+	struct list_head	cache_list;
 	time_t		expiry_time;	/* After time time, don't use the data */
 	time_t		last_refresh;   /* If CACHE_PENDING, this is when upcall 
 					 * was sent, else this is when update was received
@@ -73,7 +73,7 @@ struct cache_detail_pipefs {
 struct cache_detail {
 	struct module *		owner;
 	int			hash_size;
-	struct cache_head **	hash_table;
+	struct list_head *	hash_table;
 	rwlock_t		hash_lock;
 
 	atomic_t		inuse; /* active user-space update or lookup */
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 673c2fa..ad2155c 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -44,7 +44,7 @@ static void cache_revisit_request(struct cache_head *item);
 static void cache_init(struct cache_head *h)
 {
 	time_t now = seconds_since_boot();
-	h->next = NULL;
+	INIT_LIST_HEAD(&h->cache_list);
 	h->flags = 0;
 	kref_init(&h->ref);
 	h->expiry_time = now + CACHE_NEW_EXPIRY;
@@ -54,15 +54,16 @@ static void cache_init(struct cache_head *h)
 struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 				       struct cache_head *key, int hash)
 {
-	struct cache_head **head,  **hp;
 	struct cache_head *new = NULL, *freeme = NULL;
+	struct cache_head *tmp;
+	struct list_head *head, *pos, *tpos;
 
 	head = &detail->hash_table[hash];
 
 	read_lock(&detail->hash_lock);
 
-	for (hp=head; *hp != NULL ; hp = &(*hp)->next) {
-		struct cache_head *tmp = *hp;
+	list_for_each_safe(pos, tpos, head) {
+		tmp = list_entry(pos, struct cache_head, cache_list);
 		if (detail->match(tmp, key)) {
 			if (cache_is_expired(detail, tmp))
 				/* This entry is expired, we will discard it. */
@@ -88,12 +89,11 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 	write_lock(&detail->hash_lock);
 
 	/* check if entry appeared while we slept */
-	for (hp=head; *hp != NULL ; hp = &(*hp)->next) {
-		struct cache_head *tmp = *hp;
+	list_for_each_safe(pos, tpos, head) {
+		tmp = list_entry(pos, struct cache_head, cache_list);
 		if (detail->match(tmp, key)) {
 			if (cache_is_expired(detail, tmp)) {
-				*hp = tmp->next;
-				tmp->next = NULL;
+				list_del_init(&tmp->cache_list);
 				detail->entries --;
 				freeme = tmp;
 				break;
@@ -104,8 +104,8 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 			return tmp;
 		}
 	}
-	new->next = *head;
-	*head = new;
+
+	list_add(&new->cache_list, head);
 	detail->entries++;
 	cache_get(new);
 	write_unlock(&detail->hash_lock);
@@ -143,7 +143,6 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	 * If 'old' is not VALID, we update it directly,
 	 * otherwise we need to replace it
 	 */
-	struct cache_head **head;
 	struct cache_head *tmp;
 
 	if (!test_bit(CACHE_VALID, &old->flags)) {
@@ -168,15 +167,13 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	}
 	cache_init(tmp);
 	detail->init(tmp, old);
-	head = &detail->hash_table[hash];
 
 	write_lock(&detail->hash_lock);
 	if (test_bit(CACHE_NEGATIVE, &new->flags))
 		set_bit(CACHE_NEGATIVE, &tmp->flags);
 	else
 		detail->update(tmp, new);
-	tmp->next = *head;
-	*head = tmp;
+	list_add(&tmp->cache_list, &detail->hash_table[hash]);
 	detail->entries++;
 	cache_get(tmp);
 	cache_fresh_locked(tmp, new->expiry_time);
@@ -416,42 +413,44 @@ static int cache_clean(void)
 	/* find a non-empty bucket in the table */
 	while (current_detail &&
 	       current_index < current_detail->hash_size &&
-	       current_detail->hash_table[current_index] == NULL)
+	       list_empty(&current_detail->hash_table[current_index]))
 		current_index++;
 
 	/* find a cleanable entry in the bucket and clean it, or set to next bucket */
 
 	if (current_detail && current_index < current_detail->hash_size) {
-		struct cache_head *ch, **cp;
+		struct cache_head *ch = NULL, *putme = NULL;
+		struct list_head *head, *pos, *tpos;
 		struct cache_detail *d;
 
 		write_lock(&current_detail->hash_lock);
 
 		/* Ok, now to clean this strand */
 
-		cp = & current_detail->hash_table[current_index];
-		for (ch = *cp ; ch ; cp = & ch->next, ch = *cp) {
+		head = &current_detail->hash_table[current_index];
+		list_for_each_safe(pos, tpos, head) {
+			ch = list_entry(pos, struct cache_head, cache_list);
 			if (current_detail->nextcheck > ch->expiry_time)
 				current_detail->nextcheck = ch->expiry_time+1;
 			if (!cache_is_expired(current_detail, ch))
 				continue;
 
-			*cp = ch->next;
-			ch->next = NULL;
+			list_del_init(pos);
 			current_detail->entries--;
+			putme = ch;
 			rv = 1;
 			break;
 		}
 
 		write_unlock(&current_detail->hash_lock);
 		d = current_detail;
-		if (!ch)
+		if (!putme)
 			current_index ++;
 		spin_unlock(&cache_list_lock);
-		if (ch) {
-			set_bit(CACHE_CLEANED, &ch->flags);
-			cache_fresh_unlocked(ch, d);
-			cache_put(ch, d);
+		if (putme) {
+			set_bit(CACHE_CLEANED, &putme->flags);
+			cache_fresh_unlocked(putme, d);
+			cache_put(putme, d);
 		}
 	} else
 		spin_unlock(&cache_list_lock);
@@ -1277,6 +1276,7 @@ void *cache_seq_start(struct seq_file *m, loff_t *pos)
 	unsigned int hash, entry;
 	struct cache_head *ch;
 	struct cache_detail *cd = m->private;
+	struct list_head *ptr, *tptr;
 
 	read_lock(&cd->hash_lock);
 	if (!n--)
@@ -1284,19 +1284,22 @@ void *cache_seq_start(struct seq_file *m, loff_t *pos)
 	hash = n >> 32;
 	entry = n & ((1LL<<32) - 1);
 
-	for (ch=cd->hash_table[hash]; ch; ch=ch->next)
+	list_for_each_safe(ptr, tptr, &cd->hash_table[hash]) {
+		ch = list_entry(ptr, struct cache_head, cache_list);
 		if (!entry--)
 			return ch;
+	}
 	n &= ~((1LL<<32) - 1);
 	do {
 		hash++;
 		n += 1LL<<32;
 	} while(hash < cd->hash_size &&
-		cd->hash_table[hash]==NULL);
+		list_empty(&cd->hash_table[hash]));
 	if (hash >= cd->hash_size)
 		return NULL;
 	*pos = n+1;
-	return cd->hash_table[hash];
+	return list_first_entry_or_null(&cd->hash_table[hash],
+					struct cache_head, cache_list);
 }
 EXPORT_SYMBOL_GPL(cache_seq_start);
 
@@ -1308,23 +1311,24 @@ void *cache_seq_next(struct seq_file *m, void *p, loff_t *pos)
 
 	if (p == SEQ_START_TOKEN)
 		hash = 0;
-	else if (ch->next == NULL) {
+	else if (list_is_last(&ch->cache_list, &cd->hash_table[hash])) {
 		hash++;
 		*pos += 1LL<<32;
 	} else {
 		++*pos;
-		return ch->next;
+		return list_next_entry(ch, cache_list);
 	}
 	*pos &= ~((1LL<<32) - 1);
 	while (hash < cd->hash_size &&
-	       cd->hash_table[hash] == NULL) {
+	       list_empty(&cd->hash_table[hash])) {
 		hash++;
 		*pos += 1LL<<32;
 	}
 	if (hash >= cd->hash_size)
 		return NULL;
 	++*pos;
-	return cd->hash_table[hash];
+	return list_first_entry_or_null(&cd->hash_table[hash],
+					struct cache_head, cache_list);
 }
 EXPORT_SYMBOL_GPL(cache_seq_next);
 
@@ -1666,17 +1670,21 @@ EXPORT_SYMBOL_GPL(cache_unregister_net);
 struct cache_detail *cache_create_net(struct cache_detail *tmpl, struct net *net)
 {
 	struct cache_detail *cd;
+	int i;
 
 	cd = kmemdup(tmpl, sizeof(struct cache_detail), GFP_KERNEL);
 	if (cd == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	cd->hash_table = kzalloc(cd->hash_size * sizeof(struct cache_head *),
+	cd->hash_table = kzalloc(cd->hash_size * sizeof(struct list_head),
 				 GFP_KERNEL);
 	if (cd->hash_table == NULL) {
 		kfree(cd);
 		return ERR_PTR(-ENOMEM);
 	}
+
+	for (i = 0; i < cd->hash_size; i++)
+		INIT_LIST_HEAD(&cd->hash_table[i]);
 	cd->net = net;
 	return cd;
 }
-- 
2.4.3



* [PATCH 08/10] sunrpc: New helper cache_delete_entry for deleting cache_head directly
       [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-25 14:18   ` [PATCH 01/10 v6] fs_pin: Initialize value for fs_pin explicitly Kinglong Mee
  2015-06-25 14:25   ` [PATCH 05/10 v6] sunrpc: Store cache_detail in seq_file's private directly Kinglong Mee
@ 2015-06-25 14:34   ` Kinglong Mee
  2015-06-25 14:36   ` [PATCH 09/10 v6] sunrpc: Support validate/invalidate for reference change in cache_detail Kinglong Mee
  3 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:34 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: NeilBrown, Trond Myklebust, kinglongmee-Re5JQEeQqe8AvxtiuMwx3w

Add a new helper, cache_delete_entry(), for deleting a cache_head from
its cache_detail directly.

It will be used by pin_kill, so it first checks that the cache_detail is
still valid (still on cache_list) before deleting the entry.

Because pin_kill is not called often, the performance cost of that
check is acceptable.

Signed-off-by: Kinglong Mee <kinglongmee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 include/linux/sunrpc/cache.h |  1 +
 net/sunrpc/cache.c           | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index ecc0ff6..5a4b921 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -210,6 +210,7 @@ extern int cache_check(struct cache_detail *detail,
 		       struct cache_head *h, struct cache_req *rqstp);
 extern void cache_flush(void);
 extern void cache_purge(struct cache_detail *detail);
+extern void cache_delete_entry(struct cache_detail *cd, struct cache_head *h);
 #define NEVER (0x7FFFFFFF)
 extern void __init cache_initialize(void);
 extern int cache_register_net(struct cache_detail *cd, struct net *net);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index ad2155c..8a27483 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -458,6 +458,36 @@ static int cache_clean(void)
 	return rv;
 }
 
+void cache_delete_entry(struct cache_detail *detail, struct cache_head *h)
+{
+	struct cache_detail *tmp;
+
+	if (!detail || !h)
+		return;
+
+	spin_lock(&cache_list_lock);
+	list_for_each_entry(tmp, &cache_list, others) {
+		if (tmp == detail)
+			goto found;
+	}
+	spin_unlock(&cache_list_lock);
+	printk(KERN_WARNING "%s: Deleted cache detail %p\n", __func__, detail);
+	return ;
+
+found:
+	write_lock(&detail->hash_lock);
+
+	list_del_init(&h->cache_list);
+	detail->entries--;
+	set_bit(CACHE_CLEANED, &h->flags);
+
+	write_unlock(&detail->hash_lock);
+	spin_unlock(&cache_list_lock);
+
+	cache_put(h, detail);
+}
+EXPORT_SYMBOL_GPL(cache_delete_entry);
+
 /*
  * We want to regularly clean the cache, so we need to schedule some work ...
  */
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 09/10 v6] sunrpc: Support validate/invalidate for reference change in cache_detail
       [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-06-25 14:34   ` [PATCH 08/10] sunrpc: New helper cache_delete_entry for deleting cache_head directly Kinglong Mee
@ 2015-06-25 14:36   ` Kinglong Mee
  3 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:36 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
  Cc: NeilBrown, Trond Myklebust, kinglongmee-Re5JQEeQqe8AvxtiuMwx3w

Add validate/invalidate callbacks to cache_detail for reacting to
reference count changes (increase/decrease). Both are called before the
count actually changes.
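
A minimal user-space sketch of that ordering, not the kernel code:
struct head, struct detail, model_get() and model_put() are hypothetical
stand-ins for cache_head, cache_detail, cache_get() and cache_put(). It
only illustrates that a hook called before the change sees the old
count, so the 1 -> 2 step shows up as 1 in validate and the 2 -> 1 step
shows up as 2 in invalidate:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for cache_head and cache_detail. */
struct head {
	int ref;
};

struct detail {
	void (*validate)(struct head *h);
	void (*invalidate)(struct head *h);
};

/* Counts observed by the hooks, recorded for inspection. */
static int seen_by_validate;
static int seen_by_invalidate;

static void record_validate(struct head *h)
{
	seen_by_validate = h->ref;
}

static void record_invalidate(struct head *h)
{
	seen_by_invalidate = h->ref;
}

/* Hook first, then increment: the hook sees the old count. */
static void model_get(struct head *h, struct detail *cd)
{
	if (cd && cd->validate)
		cd->validate(h);
	h->ref++;
}

/* Hook first, then decrement: the hook sees the old count. */
static void model_put(struct head *h, struct detail *cd)
{
	if (cd && cd->invalidate)
		cd->invalidate(h);
	h->ref--;
}
```

Passing NULL for the detail, as the svcauth callers in this patch do,
skips the hook and only changes the count.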

Signed-off-by: Kinglong Mee <kinglongmee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 fs/nfsd/export.h                  |  2 +-
 include/linux/sunrpc/cache.h      | 11 ++++++++++-
 net/sunrpc/auth_gss/svcauth_gss.c |  2 +-
 net/sunrpc/cache.c                | 12 ++++++------
 net/sunrpc/svcauth_unix.c         |  2 +-
 5 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
index 1f52bfc..b559acf 100644
--- a/fs/nfsd/export.h
+++ b/fs/nfsd/export.h
@@ -105,7 +105,7 @@ static inline void exp_put(struct svc_export *exp)
 
 static inline struct svc_export *exp_get(struct svc_export *exp)
 {
-	cache_get(&exp->h);
+	cache_get(&exp->h, exp->cd);
 	return exp;
 }
 struct svc_export * rqst_exp_find(struct svc_rqst *, int, u32 *);
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 5a4b921..f77b2cd 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -101,6 +101,8 @@ struct cache_detail {
 	int			(*match)(struct cache_head *orig, struct cache_head *new);
 	void			(*init)(struct cache_head *orig, struct cache_head *new);
 	void			(*update)(struct cache_head *orig, struct cache_head *new);
+	void			(*validate)(struct cache_head *h);
+	void			(*invalidate)(struct cache_head *h);
 
 	/* fields below this comment are for internal use
 	 * and should not be touched by cache owners
@@ -185,8 +187,11 @@ sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h);
 
 extern void cache_clean_deferred(void *owner);
 
-static inline struct cache_head  *cache_get(struct cache_head *h)
+static inline struct cache_head *cache_get(struct cache_head *h, struct cache_detail *cd)
 {
+	if (cd && cd->validate)
+		cd->validate(h);
+
 	kref_get(&h->ref);
 	return h;
 }
@@ -197,6 +202,10 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
 	if (atomic_read(&h->ref.refcount) <= 2 &&
 	    h->expiry_time < cd->nextcheck)
 		cd->nextcheck = h->expiry_time;
+
+	if (cd->invalidate)
+		cd->invalidate(h);
+
 	kref_put(&h->ref, cd->cache_put);
 }
 
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 1095be9..ee1faa2 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1520,7 +1520,7 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
 			goto auth_err;
 		}
 		svcdata->rsci = rsci;
-		cache_get(&rsci->h);
+		cache_get(&rsci->h, NULL);
 		rqstp->rq_cred.cr_flavor = gss_svc_to_pseudoflavor(
 					rsci->mechctx->mech_type,
 					GSS_C_QOP_DEFAULT,
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 8a27483..cb7f3c0 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -68,7 +68,7 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 			if (cache_is_expired(detail, tmp))
 				/* This entry is expired, we will discard it. */
 				break;
-			cache_get(tmp);
+			cache_get(tmp, detail);
 			read_unlock(&detail->hash_lock);
 			return tmp;
 		}
@@ -98,7 +98,7 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 				freeme = tmp;
 				break;
 			}
-			cache_get(tmp);
+			cache_get(tmp, detail);
 			write_unlock(&detail->hash_lock);
 			cache_put(new, detail);
 			return tmp;
@@ -107,7 +107,7 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 
 	list_add(&new->cache_list, head);
 	detail->entries++;
-	cache_get(new);
+	cache_get(new, detail);
 	write_unlock(&detail->hash_lock);
 
 	if (freeme)
@@ -175,7 +175,7 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 		detail->update(tmp, new);
 	list_add(&tmp->cache_list, &detail->hash_table[hash]);
 	detail->entries++;
-	cache_get(tmp);
+	cache_get(tmp, detail);
 	cache_fresh_locked(tmp, new->expiry_time);
 	cache_fresh_locked(old, 0);
 	write_unlock(&detail->hash_lock);
@@ -1204,7 +1204,7 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
 	}
 
 	crq->q.reader = 0;
-	crq->item = cache_get(h);
+	crq->item = cache_get(h, detail);
 	crq->buf = buf;
 	crq->len = 0;
 	crq->readers = 0;
@@ -1382,7 +1382,7 @@ static int c_show(struct seq_file *m, void *p)
 		seq_printf(m, "# expiry=%ld refcnt=%d flags=%lx\n",
 			   convert_to_wallclock(cp->expiry_time),
 			   atomic_read(&cp->ref.refcount), cp->flags);
-	cache_get(cp);
+	cache_get(cp, cd);
 	if (cache_check(cd, cp, NULL))
 		/* cache_check does a cache_put on failure */
 		seq_printf(m, "# ");
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 621ca7b..ebba6b7 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -359,7 +359,7 @@ ip_map_cached_get(struct svc_xprt *xprt)
 				cache_put(&ipm->h, sn->ip_map_cache);
 				return NULL;
 			}
-			cache_get(&ipm->h);
+			cache_get(&ipm->h, NULL);
 		}
 		spin_unlock(&xprt->xpt_lock);
 	}
-- 
2.4.3


* [PATCH 10/10 v6] nfsd: Allows user un-mounting filesystem where nfsd exports base on
  2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
                   ` (5 preceding siblings ...)
       [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-25 14:37 ` Kinglong Mee
  2015-07-01  5:47   ` Al Viro
  6 siblings, 1 reply; 13+ messages in thread
From: Kinglong Mee @ 2015-06-25 14:37 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields, linux-nfs, linux-fsdevel
  Cc: NeilBrown, Trond Myklebust, kinglongmee

If there are mount points (not exported for NFS) under the pseudo root,
then once a client has operated on those entries under the root, nobody
can unmount those mount points until the export cache expires.

# cat /etc/exports
/nfs/xfs        *(rw,insecure,no_subtree_check,no_root_squash)
/nfs/pnfs       *(rw,insecure,no_subtree_check,no_root_squash)
# ll /nfs/
total 0
drwxr-xr-x. 3 root root 84 Apr 21 22:27 pnfs
drwxr-xr-x. 3 root root 84 Apr 21 22:27 test
drwxr-xr-x. 2 root root  6 Apr 20 22:01 xfs
# mount /dev/sde /nfs/test
# df
Filesystem                      1K-blocks    Used Available Use% Mounted on
......
/dev/sdd                          1038336   32944   1005392   4% /nfs/pnfs
/dev/sdc                         10475520   32928  10442592   1% /nfs/xfs
/dev/sde                           999320    1284    929224   1% /nfs/test
# mount -t nfs 127.0.0.1:/nfs/ /mnt
# ll /mnt/*/
/mnt/pnfs/:
total 0
-rw-r--r--. 1 root root 0 Apr 21 22:23 attr
drwxr-xr-x. 2 root root 6 Apr 21 22:19 tmp

/mnt/xfs/:
total 0
# umount /nfs/test/
umount: /nfs/test/: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)

This happens because nfsd's exports cache holds a reference to the
path (here /nfs/test/), so it can't be unmounted.

I don't think that's what users expect; they want to umount /nfs/test/.
Bruce thinks users should also be able to umount /nfs/pnfs/ and /nfs/xfs.

Also, use kzalloc instead of kmalloc for all memory allocation.
Thanks to Al Viro for his comments on the fs_pin logic.

v3,
1. use path_get_pin/path_put_unpin to pin the path
2. use kzalloc for memory allocation

v4,
1. add a completion so pin_kill can wait for the reference to drop to zero.
2. add a work_struct so pin_kill can drop the reference indirectly.
3. free svc_export/svc_expkey in pin_kill, not in svc_export_put/svc_expkey_put.
4. svc_export_put/svc_expkey_put go through the pin_kill logic.

v5, same as v4.

v6,
1. Pin vfsmnt to the mount point at first. When the reference count
   increases (==2), grab a reference to vfsmnt with mntget. When it
   decreases (==1), drop the reference to vfsmnt and leave the pin.
2. Delete cache_head directly from cache_detail.

Right now,
when the reference count of a cache_head rises above 1, a reference to
the mnt is grabbed once, and when it drops back to 1, that reference to
the mnt is dropped.

So after that,
when ref > 1, the user cannot umount the filesystem (-EBUSY).
When ref == 1, the entry is referenced only by the nfsd cache itself,
with no other reference. So the user can try umount:
1. before MNT_UMOUNT is set (protected by mount_lock), the nfsd cache
   entry is referenced (ref > 1, legitimize_mntget), so umount fails
   with -EBUSY.
2. after MNT_UMOUNT is set, the nfsd cache entry is referenced (ref == 2),
   legitimize_mntget fails and the entry is set to CACHE_NEGATIVE;
   the reference is then dropped and goes back to 1.
   So pin_kill can delete the entry and umount succeeds.
3. while umounting, there is no reference to the nfsd cache entry, so
   pin_kill can delete the entry and umount succeeds.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/nfsd/export.c | 169 +++++++++++++++++++++++++++++++++++++++++++++++++------
 fs/nfsd/export.h |  24 +++++++-
 2 files changed, 174 insertions(+), 19 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 34a384c..f7b1aa8 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -37,15 +37,23 @@
 #define	EXPKEY_HASHMAX		(1 << EXPKEY_HASHBITS)
 #define	EXPKEY_HASHMASK		(EXPKEY_HASHMAX -1)
 
+static void expkey_destroy(struct svc_expkey *key)
+{
+	auth_domain_put(key->ek_client);
+	kfree_rcu(key, rcu_head);
+}
+
 static void expkey_put(struct kref *ref)
 {
 	struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
 
 	if (test_bit(CACHE_VALID, &key->h.flags) &&
-	    !test_bit(CACHE_NEGATIVE, &key->h.flags))
-		path_put(&key->ek_path);
-	auth_domain_put(key->ek_client);
-	kfree(key);
+	    !test_bit(CACHE_NEGATIVE, &key->h.flags)) {
+		rcu_read_lock();
+		complete(&key->ek_done);
+		pin_kill(&key->ek_pin);
+	} else
+		expkey_destroy(key);
 }
 
 static void expkey_request(struct cache_detail *cd,
@@ -83,7 +91,7 @@ static int expkey_parse(struct cache_detail *cd, char *mesg, int mlen)
 		return -EINVAL;
 	mesg[mlen-1] = 0;
 
-	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
 	err = -ENOMEM;
 	if (!buf)
 		goto out;
@@ -119,6 +127,7 @@ static int expkey_parse(struct cache_detail *cd, char *mesg, int mlen)
 	if (key.h.expiry_time == 0)
 		goto out;
 
+	key.cd = cd;
 	key.ek_client = dom;	
 	key.ek_fsidtype = fsidtype;
 	memcpy(key.ek_fsid, buf, len);
@@ -210,6 +219,59 @@ static inline void expkey_init(struct cache_head *cnew,
 	new->ek_fsidtype = item->ek_fsidtype;
 
 	memcpy(new->ek_fsid, item->ek_fsid, sizeof(new->ek_fsid));
+	new->cd = item->cd;
+}
+
+static void expkey_validate(struct cache_head *h)
+{
+	struct svc_expkey *key = container_of(h, struct svc_expkey, h);
+
+	if (!test_bit(CACHE_VALID, &key->h.flags) ||
+	    test_bit(CACHE_NEGATIVE, &key->h.flags))
+		return;
+
+	if (atomic_read(&h->ref.refcount) == 1) {
+		mutex_lock(&key->ek_mutex);
+		if (legitimize_mntget(key->ek_path.mnt) == NULL) {
+			printk(KERN_WARNING "%s: Get mnt for %pd2 failed!\n",
+				__func__, key->ek_path.dentry);
+			set_bit(CACHE_NEGATIVE, &h->flags);
+		} else
+			key->ek_mnt_ref = true;
+		mutex_unlock(&key->ek_mutex);
+	}
+}
+
+static void expkey_invalidate(struct cache_head *h)
+{
+	struct svc_expkey *key = container_of(h, struct svc_expkey, h);
+
+	if (atomic_read(&h->ref.refcount) == 2) {
+		mutex_lock(&key->ek_mutex);
+		if (key->ek_mnt_ref) {
+			mntput(key->ek_path.mnt);
+			key->ek_mnt_ref = false;
+		}
+		mutex_unlock(&key->ek_mutex);
+	}
+}
+
+static void expkey_pin_kill(struct fs_pin *pin)
+{
+	struct svc_expkey *key = container_of(pin, struct svc_expkey, ek_pin);
+
+	if (!completion_done(&key->ek_done)) {
+		schedule_work(&key->ek_work);
+		wait_for_completion(&key->ek_done);
+	}
+	path_put_unpin(&key->ek_path, &key->ek_pin);
+	expkey_destroy(key);
+}
+
+static void expkey_close_work(struct work_struct *work)
+{
+	struct svc_expkey *key = container_of(work, struct svc_expkey, ek_work);
+	cache_delete_entry(key->cd, &key->h);
 }
 
 static inline void expkey_update(struct cache_head *cnew,
@@ -218,16 +280,20 @@ static inline void expkey_update(struct cache_head *cnew,
 	struct svc_expkey *new = container_of(cnew, struct svc_expkey, h);
 	struct svc_expkey *item = container_of(citem, struct svc_expkey, h);
 
+	init_fs_pin(&new->ek_pin, expkey_pin_kill);
 	new->ek_path = item->ek_path;
-	path_get(&item->ek_path);
+	path_get_pin(&new->ek_path, &new->ek_pin);
 }
 
 static struct cache_head *expkey_alloc(void)
 {
-	struct svc_expkey *i = kmalloc(sizeof(*i), GFP_KERNEL);
-	if (i)
+	struct svc_expkey *i = kzalloc(sizeof(*i), GFP_KERNEL);
+	if (i) {
+		INIT_WORK(&i->ek_work, expkey_close_work);
+		init_completion(&i->ek_done);
+		mutex_init(&i->ek_mutex);
 		return &i->h;
-	else
+	} else
 		return NULL;
 }
 
@@ -243,6 +309,8 @@ static struct cache_detail svc_expkey_cache_template = {
 	.init		= expkey_init,
 	.update       	= expkey_update,
 	.alloc		= expkey_alloc,
+	.validate	= expkey_validate,
+	.invalidate	= expkey_invalidate,
 };
 
 static int
@@ -306,14 +374,21 @@ static void nfsd4_fslocs_free(struct nfsd4_fs_locations *fsloc)
 	fsloc->locations = NULL;
 }
 
-static void svc_export_put(struct kref *ref)
+static void svc_export_destroy(struct svc_export *exp)
 {
-	struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
-	path_put(&exp->ex_path);
 	auth_domain_put(exp->ex_client);
 	nfsd4_fslocs_free(&exp->ex_fslocs);
 	kfree(exp->ex_uuid);
-	kfree(exp);
+	kfree_rcu(exp, rcu_head);
+}
+
+static void svc_export_put(struct kref *ref)
+{
+	struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
+
+	rcu_read_lock();
+	complete(&exp->ex_done);
+	pin_kill(&exp->ex_pin);
 }
 
 static void svc_export_request(struct cache_detail *cd,
@@ -520,7 +595,7 @@ static int svc_export_parse(struct cache_detail *cd, char *mesg, int mlen)
 		return -EINVAL;
 	mesg[mlen-1] = 0;
 
-	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
@@ -694,15 +769,67 @@ static int svc_export_match(struct cache_head *a, struct cache_head *b)
 		path_equal(&orig->ex_path, &new->ex_path);
 }
 
+static void export_validate(struct cache_head *h)
+{
+	struct svc_export *exp = container_of(h, struct svc_export, h);
+
+	if (test_bit(CACHE_NEGATIVE, &h->flags))
+		return;
+
+	if (atomic_read(&h->ref.refcount) == 1) {
+		mutex_lock(&exp->ex_mutex);
+		if (legitimize_mntget(exp->ex_path.mnt) == NULL) {
+			printk(KERN_WARNING "%s: Get mnt for %pd2 failed!\n",
+				__func__, exp->ex_path.dentry);
+			set_bit(CACHE_NEGATIVE, &h->flags);
+		} else
+			exp->ex_mnt_ref = true;
+		mutex_unlock(&exp->ex_mutex);
+	}
+}
+
+static void export_invalidate(struct cache_head *h)
+{
+	struct svc_export *exp = container_of(h, struct svc_export, h);
+
+	if (atomic_read(&h->ref.refcount) == 2) {
+		mutex_lock(&exp->ex_mutex);
+		if (exp->ex_mnt_ref) {
+			mntput(exp->ex_path.mnt);
+			exp->ex_mnt_ref = false;
+		}
+		mutex_unlock(&exp->ex_mutex);
+	}
+}
+
+static void export_pin_kill(struct fs_pin *pin)
+{
+	struct svc_export *exp = container_of(pin, struct svc_export, ex_pin);
+
+	if (!completion_done(&exp->ex_done)) {
+		schedule_work(&exp->ex_work);
+		wait_for_completion(&exp->ex_done);
+	}
+	path_put_unpin(&exp->ex_path, &exp->ex_pin);
+	svc_export_destroy(exp);
+}
+
+static void export_close_work(struct work_struct *work)
+{
+	struct svc_export *exp = container_of(work, struct svc_export, ex_work);
+	cache_delete_entry(exp->cd, &exp->h);
+}
+
 static void svc_export_init(struct cache_head *cnew, struct cache_head *citem)
 {
 	struct svc_export *new = container_of(cnew, struct svc_export, h);
 	struct svc_export *item = container_of(citem, struct svc_export, h);
 
+	init_fs_pin(&new->ex_pin, export_pin_kill);
 	kref_get(&item->ex_client->ref);
 	new->ex_client = item->ex_client;
 	new->ex_path = item->ex_path;
-	path_get(&item->ex_path);
+	path_get_pin(&new->ex_path, &new->ex_pin);
 	new->ex_fslocs.locations = NULL;
 	new->ex_fslocs.locations_count = 0;
 	new->ex_fslocs.migrated = 0;
@@ -740,10 +867,13 @@ static void export_update(struct cache_head *cnew, struct cache_head *citem)
 
 static struct cache_head *svc_export_alloc(void)
 {
-	struct svc_export *i = kmalloc(sizeof(*i), GFP_KERNEL);
-	if (i)
+	struct svc_export *i = kzalloc(sizeof(*i), GFP_KERNEL);
+	if (i) {
+		INIT_WORK(&i->ex_work, export_close_work);
+		init_completion(&i->ex_done);
+		mutex_init(&i->ex_mutex);
 		return &i->h;
-	else
+	} else
 		return NULL;
 }
 
@@ -759,6 +889,8 @@ static struct cache_detail svc_export_cache_template = {
 	.init		= svc_export_init,
 	.update		= export_update,
 	.alloc		= svc_export_alloc,
+	.validate	= export_validate,
+	.invalidate	= export_invalidate,
 };
 
 static int
@@ -809,6 +941,7 @@ exp_find_key(struct cache_detail *cd, struct auth_domain *clp, int fsid_type,
 	if (!clp)
 		return ERR_PTR(-ENOENT);
 
+	key.cd = cd;
 	key.ek_client = clp;
 	key.ek_fsidtype = fsid_type;
 	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
index b559acf..1b5c5f8 100644
--- a/fs/nfsd/export.h
+++ b/fs/nfsd/export.h
@@ -4,6 +4,7 @@
 #ifndef NFSD_EXPORT_H
 #define NFSD_EXPORT_H
 
+#include <linux/fs_pin.h>
 #include <linux/sunrpc/cache.h>
 #include <uapi/linux/nfsd/export.h>
 
@@ -46,6 +47,8 @@ struct exp_flavor_info {
 
 struct svc_export {
 	struct cache_head	h;
+	struct cache_detail	*cd;
+
 	struct auth_domain *	ex_client;
 	int			ex_flags;
 	struct path		ex_path;
@@ -58,7 +61,16 @@ struct svc_export {
 	struct exp_flavor_info	ex_flavors[MAX_SECINFO_LIST];
 	enum pnfs_layouttype	ex_layout_type;
 	struct nfsd4_deviceid_map *ex_devid_map;
-	struct cache_detail	*cd;
+
+	struct fs_pin		ex_pin;
+	struct rcu_head		rcu_head;
+
+	bool			ex_mnt_ref;
+	struct mutex		ex_mutex;
+
+	/* For cache_put and fs umounting window */
+	struct completion	ex_done;
+	struct work_struct	ex_work;
 };
 
 /* an "export key" (expkey) maps a filehandlefragement to an
@@ -67,12 +79,22 @@ struct svc_export {
  */
 struct svc_expkey {
 	struct cache_head	h;
+	struct cache_detail	*cd;
 
 	struct auth_domain *	ek_client;
 	int			ek_fsidtype;
 	u32			ek_fsid[6];
 
 	struct path		ek_path;
+	struct fs_pin		ek_pin;
+	struct rcu_head		rcu_head;
+
+	bool			ek_mnt_ref;
+	struct mutex		ek_mutex;
+
+	/* For cache_put and fs umounting window */
+	struct completion	ek_done;
+	struct work_struct	ek_work;
 };
 
 #define EX_ISSYNC(exp)		(!((exp)->ex_flags & NFSEXP_ASYNC))
-- 
2.4.3



* Re: [PATCH 10/10 v6] nfsd: Allows user un-mounting filesystem where nfsd exports base on
  2015-06-25 14:37 ` [PATCH 10/10 v6] nfsd: Allows user un-mounting filesystem where nfsd exports base on Kinglong Mee
@ 2015-07-01  5:47   ` Al Viro
       [not found]     ` <20150701054751.GB17109-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Al Viro @ 2015-07-01  5:47 UTC (permalink / raw)
  To: Kinglong Mee
  Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, NeilBrown, Trond Myklebust

On Thu, Jun 25, 2015 at 10:37:14PM +0800, Kinglong Mee wrote:
> +static void expkey_validate(struct cache_head *h)
> +{
> +	struct svc_expkey *key = container_of(h, struct svc_expkey, h);
> +
> +	if (!test_bit(CACHE_VALID, &key->h.flags) ||
> +	    test_bit(CACHE_NEGATIVE, &key->h.flags))
> +		return;
> +
> +	if (atomic_read(&h->ref.refcount) == 1) {
> +		mutex_lock(&key->ek_mutex);

... followed by kref_get(&h->ref) in caller

> +	if (atomic_read(&h->ref.refcount) == 2) {
> +		mutex_lock(&key->ek_mutex);

... followed by kref_put() in caller.

Suppose two threads call cache_get() at the same time.  Refcount is 1.
Depending on the timing you get either one or both grabbing vfsmount
references.  Whichever variant matches the one you want, there is no way
to tell one from another afterwards and they *do* differ in the resulting
vfsmount refcount changes.

Similar to that, suppose the refcount is 3 and two threads call cache_put()
at the same time.  If one of them gets through the entire thing (including
kref_put()) before the other gets to atomic_read(), you get the second
see refcount 2 and do that mntput().  If not, _nobody_ will ever see refcount
2 and mntput() is not done.
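
The second scenario can be replayed deterministically in user space.
This is only an illustrative model, not the kernel code: struct model,
put_check(), put_finish() and the two run_*() helpers are hypothetical
names, and plain ints stand in for the kref and the vfsmount reference:

```c
#include <assert.h>

/* Hypothetical model: plain ints stand in for the kernel counters. */
struct model {
	int ref;       /* cache_head reference count */
	int mnt_refs;  /* extra vfsmount references currently held */
};

/* Step 1 of the patched cache_put(): the test done in invalidate(). */
static int put_check(const struct model *m)
{
	return m->ref == 2;
}

/* Step 2: drop the mount reference if step 1 said so, then the kref. */
static void put_finish(struct model *m, int saw_two)
{
	if (saw_two)
		m->mnt_refs--;
	m->ref--;
}

/* Thread A completes its whole put before thread B starts. */
static struct model run_serial(void)
{
	struct model m = { .ref = 3, .mnt_refs = 1 };

	put_finish(&m, put_check(&m));  /* A sees 3 */
	put_finish(&m, put_check(&m));  /* B sees 2 and drops the mount ref */
	return m;
}

/* Both threads run the check before either one decrements. */
static struct model run_interleaved(void)
{
	struct model m = { .ref = 3, .mnt_refs = 1 };
	int a = put_check(&m);          /* A sees 3 */
	int b = put_check(&m);          /* B sees 3 as well */

	put_finish(&m, a);
	put_finish(&m, b);
	return m;                       /* ref is 1, mount ref never dropped */
}
```

Both runs end with ref == 1, but only the serial one releases the mount
reference; in the interleaved one it is leaked.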

How can that code possibly be correct?  This kind of splitting atomic_read
from increment/decrement (and slapping a sleeping operation in between,
no less) is basically never right.  Not unless you have everything serialized
on the outside and do not need the atomic in the first place, which doesn't
seem to be the case here.
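
One common way to avoid splitting the read from the update is to let
the atomic operation itself report the transition. The sketch below
uses C11 atomic_fetch_add()/atomic_fetch_sub(), which return the value
held before the update, so exactly one caller observes each 1 -> 2 or
2 -> 1 step. It is a user-space illustration with hypothetical names
(struct entry, entry_get(), entry_put()), not a proposed fix for this
patch, and it does not by itself order the side effect (taking or
dropping the mount reference) against concurrent callers:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a cache entry. */
struct entry {
	atomic_int ref;       /* entry reference count */
	atomic_int mnt_refs;  /* stand-in for references on the vfsmount */
};

/* Returns true if this call performed the 1 -> 2 transition. */
static bool entry_get(struct entry *e)
{
	/* The old value comes from the same atomic step as the increment. */
	int old = atomic_fetch_add(&e->ref, 1);

	if (old == 1) {
		atomic_fetch_add(&e->mnt_refs, 1);
		return true;
	}
	return false;
}

/* Returns true if this call performed the 2 -> 1 transition. */
static bool entry_put(struct entry *e)
{
	int old = atomic_fetch_sub(&e->ref, 1);

	if (old == 2) {
		atomic_fetch_sub(&e->mnt_refs, 1);
		return true;
	}
	return false;
}
```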


* Re: [PATCH 10/10 v6] nfsd: Allows user un-mounting filesystem where nfsd exports base on
       [not found]     ` <20150701054751.GB17109-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
@ 2015-07-02 15:17       ` Kinglong Mee
  0 siblings, 0 replies; 13+ messages in thread
From: Kinglong Mee @ 2015-07-02 15:17 UTC (permalink / raw)
  To: Al Viro
  Cc: J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, NeilBrown, Trond Myklebust,
	kinglongmee-Re5JQEeQqe8AvxtiuMwx3w

On 7/1/2015 13:47, Al Viro wrote:
> On Thu, Jun 25, 2015 at 10:37:14PM +0800, Kinglong Mee wrote:
>> +static void expkey_validate(struct cache_head *h)
>> +{
>> +	struct svc_expkey *key = container_of(h, struct svc_expkey, h);
>> +
>> +	if (!test_bit(CACHE_VALID, &key->h.flags) ||
>> +	    test_bit(CACHE_NEGATIVE, &key->h.flags))
>> +		return;
>> +
>> +	if (atomic_read(&h->ref.refcount) == 1) {
>> +		mutex_lock(&key->ek_mutex);
> 
> ... followed by kref_get(&h->ref) in caller

Got it.

> 
>> +	if (atomic_read(&h->ref.refcount) == 2) {
>> +		mutex_lock(&key->ek_mutex);
> 
> ... followed by kref_put() in caller.

No, it must run before kref_put(): if kref_put() drops the count to
zero, the structure is freed.

> 
> Suppose two threads call cache_get() at the same time.  Refcount is 1.
> Depending on the timing you get either one or both grabbing vfsmount
> references.  Whichever variant matches the one you want, there is no way
> to tell one from another afterwards and they *do* differ in the resulting
> vfsmount refcount changes.
> 
> Similar to that, suppose the refcount is 3 and two threads call cache_put()
> at the same time.  If one of them gets through the entire thing (including
> kref_put()) before the other gets to atomic_read(), you get the second
> see refcount 2 and do that mntput().  If not, _nobody_ will ever see refcount
> 2 and mntput() is not done.
> 
> How can that code possibly be correct?  This kind of splitting atomic_read
> from increment/decrement (and slapping a sleeping operation in between,
> no less) is basically never right.  Not unless you have everything serialized
> on the outside and do not need the atomic in the first place, which doesn't
> seem to be the case here.

To protect the reference count, maybe I will implement a pair of
get_ref/put_ref helpers to use in place of kref_get/kref_put.

+static void expkey_get_ref(struct cache_head *h)
+{
+       struct svc_expkey *key = container_of(h, struct svc_expkey, h);
+
+       mutex_lock(&key->ref_mutex);
+       kref_get(&h->ref);
+
+       if (!test_bit(CACHE_VALID, &key->h.flags) ||
+           test_bit(CACHE_NEGATIVE, &key->h.flags))
+               goto out;
+
+       if (atomic_read(&h->ref.refcount) == 2) {
+               if (legitimize_mntget(key->ek_path.mnt) == NULL) {
+                       printk(KERN_WARNING "%s: Get mnt for %pd2 failed!\n",
+                               __func__, key->ek_path.dentry);
+                       set_bit(CACHE_NEGATIVE, &h->flags);
+               } else
+                       key->ek_mnt_ref = true;
+       }
+out:
+       mutex_unlock(&key->ref_mutex);
+}
+
+static void expkey_put_ref(struct cache_head *h)
+{
+       struct svc_expkey *key = container_of(h, struct svc_expkey, h);
+
+       mutex_lock(&key->ref_mutex);
+       if (key->ek_mnt_ref && (atomic_read(&h->ref.refcount) == 2)) {
+               mntput(key->ek_path.mnt);
+               key->ek_mnt_ref = false;
+       }
+
+       if (unlikely(!atomic_dec_and_test(&h->ref.refcount))) {
+               mutex_unlock(&key->ref_mutex);
+               return ;
+       }
+
+       expkey_put(&h->ref);
+}
+

The code for the nfsd exports cache is similar to the expkey code.

thanks,
Kinglong Mee

end of thread, other threads:[~2015-07-02 15:17 UTC | newest]

Thread overview: 13+ messages
2015-06-25 14:17 [PATCH 00/10 v6] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
2015-06-25 14:19 ` [PATCH 02/10 v6] fs_pin: Export functions for specific filesystem Kinglong Mee
2015-06-25 14:19 ` [PATCH 03/10 v6] path: New helpers path_get_pin/path_put_unpin for path pin Kinglong Mee
2015-06-25 14:21 ` [PATCH 04/10 v6] fs: New helper legitimize_mntget() for getting a legitimize mnt Kinglong Mee
2015-06-25 14:27 ` [PATCH 06/10 v6] sunrpc/nfsd: Remove redundant code by exports seq_operations functions Kinglong Mee
2015-06-25 14:29 ` [PATCH 07/10 v6] sunrpc: Switch to using list_head instead single list Kinglong Mee
     [not found] ` <558C0D6A.9050104-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-06-25 14:18   ` [PATCH 01/10 v6] fs_pin: Initialize value for fs_pin explicitly Kinglong Mee
2015-06-25 14:25   ` [PATCH 05/10 v6] sunrpc: Store cache_detail in seq_file's private directly Kinglong Mee
2015-06-25 14:34   ` [PATCH 08/10] sunrpc: New helper cache_delete_entry for deleting cache_head directly Kinglong Mee
2015-06-25 14:36   ` [PATCH 09/10 v6] sunrpc: Support validate/invalidate for reference change in cache_detail Kinglong Mee
2015-06-25 14:37 ` [PATCH 10/10 v6] nfsd: Allows user un-mounting filesystem where nfsd exports base on Kinglong Mee
2015-07-01  5:47   ` Al Viro
     [not found]     ` <20150701054751.GB17109-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2015-07-02 15:17       ` Kinglong Mee
