linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v4 00/16] nfsd: open file caching
@ 2015-09-11 10:54 Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 01/16] locks: change tracepoint for generic_add_lease Jeff Layton
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

v4:
- squash some of the patches down into one patch to reduce churn
- close cached open files after unlink instead of before
- don't just close files after nfsd does an unlink, must do it
  after any vfs-layer unlink. Use fsnotify to handle that.
- use a SRCU notifier chain for setlease
- add patch to allow non-kthreads to do a fput_sync

v3:
- open files are now hashed on inode pointer instead of fh
- eliminate the recurring workqueue job in favor of shrinker/LRU and
  notifier from lease setting code
- have nfsv4 use the cache as well
- removal of raparms cache

v2:
- changelog cleanups and clarifications
- allow COMMIT to use cached open files
- tracepoints for nfsd_file cache
- proactively close open files prior to REMOVE, or a RENAME over a
  positive dentry

This is the fourth iteration of the open file cache patches for nfsd.
There are some rather major changes here vs. the earlier sets,
particularly in how we close cached open files when there is vfs-layer
activity (unlinks or setlease calls in particular).

For those seeing this for the first time, the main impetus here is to help
speed up NFSv3 I/O. nfsd will do an open+read/write+close for every READ
or WRITE RPC. This patchset allows us to cache those open files more or
less indefinitely, and close them out in response to certain vfs-layer
activity (unlinks and setlease attempts primarily).

The first few patches in the series make (small) changes to several
subsystems to enable the caching infrastructure. The 8th patch adds
the cache itself, and then the remaining patches hook the nfsd code
up to the cache. The final patch rips out the raparms cache since it's
no longer needed with these changes.

The most controversial piece here I think will be the 3rd patch in the
series which allows fput_sync to run in non-kthread context. This is
necessary to allow userland threads to close out cached nfsd files in
advance of setlease attempts. Al, I'd appreciate it if you could weigh
in on that one. I'm fine with adding more scary warnings to the comment
above it if you think it's warranted.

Jeff Layton (16):
  locks: change tracepoint for generic_add_lease
  list_lru: add list_lru_rotate
  fs: allow __fput_sync to be used by non-kthreads and in modules
  fsnotify: export several symbols
  locks: create a new notifier chain for lease attempts
  nfsd: move include of state.h from trace.c to trace.h
  sunrpc: add a new cache_detail operation for when a cache is flushed
  nfsd: add a new struct file caching facility to nfsd
  nfsd: hook up nfsd_write to the new nfsd_file cache
  nfsd: hook up nfsd_read to the nfsd_file cache
  nfsd: hook nfsd_commit up to the nfsd_file cache
  nfsd: convert nfs4_file->fi_fds array to use nfsd_files
  nfsd: have nfsd_test_lock use the nfsd_file cache
  nfsd: convert fi_deleg_file and ls_file fields to nfsd_file
  nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
  nfsd: rip out the raparms cache

 fs/file_table.c                 |  27 +-
 fs/locks.c                      |  37 +++
 fs/nfsd/Kconfig                 |   2 +
 fs/nfsd/Makefile                |   3 +-
 fs/nfsd/export.c                |  14 +
 fs/nfsd/filecache.c             | 552 ++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/filecache.h             |  37 +++
 fs/nfsd/nfs3proc.c              |   2 +-
 fs/nfsd/nfs4layouts.c           |  12 +-
 fs/nfsd/nfs4proc.c              |  32 +--
 fs/nfsd/nfs4state.c             | 174 ++++++-------
 fs/nfsd/nfs4xdr.c               |  16 +-
 fs/nfsd/nfsproc.c               |   2 +-
 fs/nfsd/nfssvc.c                |  16 +-
 fs/nfsd/state.h                 |  10 +-
 fs/nfsd/trace.c                 |   2 -
 fs/nfsd/trace.h                 | 129 ++++++++++
 fs/nfsd/vfs.c                   | 220 +++-------------
 fs/nfsd/vfs.h                   |   8 +-
 fs/nfsd/xdr4.h                  |  15 +-
 fs/notify/group.c               |   2 +
 fs/notify/mark.c                |   3 +
 include/linux/file.h            |   2 +-
 include/linux/fs.h              |   1 +
 include/linux/list_lru.h        |  13 +
 include/linux/sunrpc/cache.h    |   1 +
 include/trace/events/filelock.h |  38 ++-
 kernel/acct.c                   |   2 +-
 mm/list_lru.c                   |  15 ++
 net/sunrpc/cache.c              |   3 +
 30 files changed, 1027 insertions(+), 363 deletions(-)
 create mode 100644 fs/nfsd/filecache.c
 create mode 100644 fs/nfsd/filecache.h

-- 
2.4.3



* [PATCH v4 01/16] locks: change tracepoint for generic_add_lease
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 02/16] list_lru: add list_lru_rotate Jeff Layton
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

...add some more helpful info.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 include/trace/events/filelock.h | 38 +++++++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index a0d008070962..c72f2dc01d0b 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -81,15 +81,47 @@ DEFINE_EVENT(filelock_lease, break_lease_block, TP_PROTO(struct inode *inode, st
 DEFINE_EVENT(filelock_lease, break_lease_unblock, TP_PROTO(struct inode *inode, struct file_lock *fl),
 		TP_ARGS(inode, fl));
 
-DEFINE_EVENT(filelock_lease, generic_add_lease, TP_PROTO(struct inode *inode, struct file_lock *fl),
-		TP_ARGS(inode, fl));
-
 DEFINE_EVENT(filelock_lease, generic_delete_lease, TP_PROTO(struct inode *inode, struct file_lock *fl),
 		TP_ARGS(inode, fl));
 
 DEFINE_EVENT(filelock_lease, time_out_leases, TP_PROTO(struct inode *inode, struct file_lock *fl),
 		TP_ARGS(inode, fl));
 
+TRACE_EVENT(generic_add_lease,
+	TP_PROTO(struct inode *inode, struct file_lock *fl),
+
+	TP_ARGS(inode, fl),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, i_ino)
+		__field(int, wcount)
+		__field(int, dcount)
+		__field(int, icount)
+		__field(dev_t, s_dev)
+		__field(fl_owner_t, fl_owner)
+		__field(unsigned int, fl_flags)
+		__field(unsigned char, fl_type)
+	),
+
+	TP_fast_assign(
+		__entry->s_dev = inode->i_sb->s_dev;
+		__entry->i_ino = inode->i_ino;
+		__entry->wcount = atomic_read(&inode->i_writecount);
+		__entry->dcount = d_count(fl->fl_file->f_path.dentry);
+		__entry->icount = atomic_read(&inode->i_count);
+		__entry->fl_owner = fl ? fl->fl_owner : NULL;
+		__entry->fl_flags = fl ? fl->fl_flags : 0;
+		__entry->fl_type = fl ? fl->fl_type : 0;
+	),
+
+	TP_printk("dev=0x%x:0x%x ino=0x%lx wcount=%d dcount=%d icount=%d fl_owner=0x%p fl_flags=%s fl_type=%s",
+		MAJOR(__entry->s_dev), MINOR(__entry->s_dev),
+		__entry->i_ino, __entry->wcount, __entry->dcount,
+		__entry->icount, __entry->fl_owner,
+		show_fl_flags(__entry->fl_flags),
+		show_fl_type(__entry->fl_type))
+);
+
 #endif /* _TRACE_FILELOCK_H */
 
 /* This part must be outside protection */
-- 
2.4.3



* [PATCH v4 02/16] list_lru: add list_lru_rotate
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 01/16] locks: change tracepoint for generic_add_lease Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 03/16] fs: allow __fput_sync to be used by non-kthreads and in modules Jeff Layton
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel, Andrew Morton, linux-mm

Add a function that can move an entry to the MRU end of the list.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
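For readers unfamiliar with list_lru, the rotate operation can be sketched
in userspace roughly as below. This is only an illustration and not the
kernel code: struct node and the helper names are hypothetical stand-ins
for list_head, and the per-node spinlock and memcg lookup are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct node {
	struct node *prev, *next;
};

static void node_init(struct node *n)
{
	n->prev = n->next = n;
}

/* A node that points at itself is not on any list. */
static bool node_unlinked(const struct node *n)
{
	return n->next == n;
}

static void node_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Insert n just before head, i.e. at the tail (MRU end) of the list. */
static void list_add_tail_node(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Move n to the MRU end; do nothing if n is not on a list. */
static void lru_rotate(struct node *head, struct node *n)
{
	if (node_unlinked(n))
		return;
	node_unlink(n);
	list_add_tail_node(head, n);
}
```

The early return mirrors the list_empty() check in the patch: rotating an
entry that was never added (or was already isolated) must not put it back
on the LRU.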
 include/linux/list_lru.h | 13 +++++++++++++
 mm/list_lru.c            | 15 +++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 2a6b9947aaa3..4534b1b34d2d 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -96,6 +96,19 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item);
 bool list_lru_del(struct list_lru *lru, struct list_head *item);
 
 /**
+ * list_lru_rotate: rotate an element to the end of an lru list
+ * @list_lru: the lru pointer
+ * @item: the item to be rotated
+ *
+ * This function moves an entry to the end of an LRU list. Should be used when
+ * an entry that is on the LRU is used, and should be moved to the MRU end of
+ * the list. If the item is not on a list, then this function has no effect.
+ * The comments about an element already pertaining to a list are also valid
+ * for list_lru_rotate.
+ */
+void list_lru_rotate(struct list_lru *lru, struct list_head *item);
+
+/**
  * list_lru_count_one: return the number of objects currently held by @lru
  * @lru: the lru pointer.
  * @nid: the node id to count from.
diff --git a/mm/list_lru.c b/mm/list_lru.c
index e1da19fac1b3..66718c2a9a7b 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -130,6 +130,21 @@ bool list_lru_del(struct list_lru *lru, struct list_head *item)
 }
 EXPORT_SYMBOL_GPL(list_lru_del);
 
+void list_lru_rotate(struct list_lru *lru, struct list_head *item)
+{
+	int nid = page_to_nid(virt_to_page(item));
+	struct list_lru_node *nlru = &lru->node[nid];
+	struct list_lru_one *l;
+
+	spin_lock(&nlru->lock);
+	if (!list_empty(item)) {
+		l = list_lru_from_kmem(nlru, item);
+		list_move_tail(item, &l->list);
+	}
+	spin_unlock(&nlru->lock);
+}
+EXPORT_SYMBOL_GPL(list_lru_rotate);
+
 void list_lru_isolate(struct list_lru_one *list, struct list_head *item)
 {
 	list_del_init(item);
-- 
2.4.3


* [PATCH v4 03/16] fs: allow __fput_sync to be used by non-kthreads and in modules
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 01/16] locks: change tracepoint for generic_add_lease Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 02/16] list_lru: add list_lru_rotate Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
       [not found]   ` <1441968882-7851-4-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
  2015-09-11 10:54 ` [PATCH v4 04/16] fsnotify: export several symbols Jeff Layton
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel, Al Viro

We want nfsd to keep a cache of open files, but that would potentially
block userland callers from obtaining leases on them. To fix this,
we'll be adding a new notifier chain to the lease code that will call
back into nfsd on any attempt to set a FL_LEASE. nfsd can then close
any open files for that inode in advance of that.

The problem however is that since that notifier will run in normal
process context, the final __fput will be delayed a la task_work and we
are still unable to set a lease. What we need to do is to put the struct
file synchronously so that the __fput runs before returning from the
notifier call.

The comments over __fput_sync and the BUG_ON in there mandate that it
should only be used in kthread context, but I see no reason why that
should be so. As long as the caller avoids holding locks that may be
problematic, it should be OK to use it from normal process context as
well.

Remove the __ prefix and the BUG_ON from that function and update the
comments over it. Also export it so that it can be used from nfsd code,
and move the export of fput just below the function definition.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
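To make the difference between the deferred and synchronous final put
concrete, here is a small userspace sketch. Everything in it is
hypothetical (struct file_obj, put_deferred, put_sync, run_deferred); it
uses a plain counter rather than an atomic and only models the ordering
problem described above, not the real task_work machinery.

```c
#include <stddef.h>

struct file_obj {
	long refcount;
	int released;			/* set once the final release ran */
	struct file_obj *next_deferred;	/* link on the deferred queue */
};

static struct file_obj *deferred_head;

static void final_release(struct file_obj *f)
{
	f->released = 1;
}

/* Like fput(): on the last put, queue the release to run later. */
static void put_deferred(struct file_obj *f)
{
	if (--f->refcount == 0) {
		f->next_deferred = deferred_head;
		deferred_head = f;
	}
}

/* Like fput_sync(): on the last put, release before returning. */
static void put_sync(struct file_obj *f)
{
	if (--f->refcount == 0)
		final_release(f);
}

/* Stand-in for the point where deferred work finally runs. */
static void run_deferred(void)
{
	while (deferred_head) {
		struct file_obj *f = deferred_head;

		deferred_head = f->next_deferred;
		final_release(f);
	}
}
```

With put_deferred() the object is still unreleased when the caller
returns, which is the window in which a lease attempt would still see the
file as open. put_sync() closes that window at the cost of running the
release in the caller's context.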
 fs/file_table.c      | 27 ++++++++++++++-------------
 include/linux/file.h |  2 +-
 kernel/acct.c        |  2 +-
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index f4833af62eae..6769ed45c35f 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -280,25 +280,26 @@ void fput(struct file *file)
 			schedule_delayed_work(&delayed_fput_work, 1);
 	}
 }
+EXPORT_SYMBOL(fput);
 
 /*
- * synchronous analog of fput(); for kernel threads that might be needed
- * in some umount() (and thus can't use flush_delayed_fput() without
- * risking deadlocks), need to wait for completion of __fput() and know
- * for this specific struct file it won't involve anything that would
- * need them.  Use only if you really need it - at the very least,
- * don't blindly convert fput() by kernel thread to that.
+ * synchronous analog of fput(); this is necessary for tasks
+ * that might be needed in some umount() (and thus can't use
+ * flush_delayed_fput() without risking deadlocks), need to wait for
+ * completion of __fput() and know for this specific struct file it
+ * won't involve anything that would need them. It's also necessary
+ * for nfsd, which needs to be able to synchronously close files
+ * on which userspace programs are trying to set leases.
+ *
+ * Use only if you really need it - at the very least, don't blindly
+ * convert fput() to this.
  */
-void __fput_sync(struct file *file)
+void fput_sync(struct file *file)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
-		struct task_struct *task = current;
-		BUG_ON(!(task->flags & PF_KTHREAD));
+	if (atomic_long_dec_and_test(&file->f_count))
 		__fput(file);
-	}
 }
-
-EXPORT_SYMBOL(fput);
+EXPORT_SYMBOL(fput_sync);
 
 void put_filp(struct file *file)
 {
diff --git a/include/linux/file.h b/include/linux/file.h
index f87d30882a24..046a8c477b9a 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -71,6 +71,6 @@ extern void put_unused_fd(unsigned int fd);
 extern void fd_install(unsigned int fd, struct file *file);
 
 extern void flush_delayed_fput(void);
-extern void __fput_sync(struct file *);
+extern void fput_sync(struct file *);
 
 #endif /* __LINUX_FILE_H */
diff --git a/kernel/acct.c b/kernel/acct.c
index 74963d192c5d..b58300ebd819 100644
--- a/kernel/acct.c
+++ b/kernel/acct.c
@@ -183,7 +183,7 @@ static void close_work(struct work_struct *work)
 	struct file *file = acct->file;
 	if (file->f_op->flush)
 		file->f_op->flush(file, NULL);
-	__fput_sync(file);
+	fput_sync(file);
 	complete(&acct->done);
 }
 
-- 
2.4.3



* [PATCH v4 04/16] fsnotify: export several symbols
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
                   ` (2 preceding siblings ...)
  2015-09-11 10:54 ` [PATCH v4 03/16] fs: allow __fput_sync to be used by non-kthreads and in modules Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel, Eric Paris

With nfsd's new open-file caching infrastructure, we need a way to know
when unlinks occur so nfsd can close files that it may be holding open.
fsnotify fits the bill nicely, but the symbols aren't currently exported
to modules. Export some of its symbols so nfsd can use this
infrastructure.

Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/notify/group.c | 2 ++
 fs/notify/mark.c  | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/fs/notify/group.c b/fs/notify/group.c
index d16b62cb2854..295d08800126 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -81,6 +81,7 @@ void fsnotify_put_group(struct fsnotify_group *group)
 	if (atomic_dec_and_test(&group->refcnt))
 		fsnotify_final_destroy_group(group);
 }
+EXPORT_SYMBOL_GPL(fsnotify_put_group);
 
 /*
  * Create a new fsnotify_group and hold a reference for the group returned.
@@ -109,6 +110,7 @@ struct fsnotify_group *fsnotify_alloc_group(const struct fsnotify_ops *ops)
 
 	return group;
 }
+EXPORT_SYMBOL_GPL(fsnotify_alloc_group);
 
 int fsnotify_fasync(int fd, struct file *file, int on)
 {
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 3e594ce41010..0dedd692d032 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -184,6 +184,7 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 	fsnotify_destroy_mark_locked(mark, group);
 	mutex_unlock(&group->mark_mutex);
 }
+EXPORT_SYMBOL_GPL(fsnotify_destroy_mark);
 
 /*
  * Destroy all marks in the given list. The marks must be already detached from
@@ -371,6 +372,7 @@ int fsnotify_add_mark(struct fsnotify_mark *mark, struct fsnotify_group *group,
 	mutex_unlock(&group->mark_mutex);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(fsnotify_add_mark);
 
 /*
  * Given a list of marks, find the mark associated with given group. If found
@@ -440,6 +442,7 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
 	atomic_set(&mark->refcnt, 1);
 	mark->free_mark = free_mark;
 }
+EXPORT_SYMBOL_GPL(fsnotify_init_mark);
 
 static int fsnotify_mark_destroy(void *ignored)
 {
-- 
2.4.3



* [PATCH v4 05/16] locks: create a new notifier chain for lease attempts
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
@ 2015-09-11 10:54   ` Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 08/16] nfsd: add a new struct file caching facility to nfsd Jeff Layton
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

With the new file caching infrastructure in nfsd, we can end up holding
files open for an indefinite period of time, even when they are still
idle. This may prevent the kernel from handing out leases on the file,
which is something we don't want to block.

Fix this by running a SRCU notifier call chain on any lease attempt.
nfsd can then purge the cache for that inode before returning.

Since SRCU is only conditionally compiled in, we must only define the
new chain if it's enabled, and users of the chain must ensure that
SRCU is enabled.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
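The shape of the notification can be sketched in userspace as follows.
This is a simplified model under stated assumptions: a fixed-size array
stands in for the SRCU notifier chain, there is no locking, and the names
lease_chain_register, lease_chain_notify and cache_purge_cb are made up
for the example. Only F_UNLCK and F_RDLCK come from <fcntl.h>.

```c
#include <fcntl.h>
#include <stddef.h>

typedef void (*lease_cb)(long arg, void *lease);

#define MAX_SUBSCRIBERS 8

static lease_cb subscribers[MAX_SUBSCRIBERS];
static size_t nr_subscribers;

/* Add a callback to the chain; returns -1 if the chain is full. */
static int lease_chain_register(lease_cb cb)
{
	if (nr_subscribers == MAX_SUBSCRIBERS)
		return -1;
	subscribers[nr_subscribers++] = cb;
	return 0;
}

/* Call every subscriber, except when the lease is being removed. */
static void lease_chain_notify(long arg, void *lease)
{
	size_t i;

	if (arg == F_UNLCK)
		return;
	for (i = 0; i < nr_subscribers; i++)
		subscribers[i](arg, lease);
}

/* Example subscriber: counts how often it was asked to purge. */
static int purge_calls;

static void cache_purge_cb(long arg, void *lease)
{
	(void)arg;
	(void)lease;
	purge_calls++;
}
```

The point of the F_UNLCK check is that removing a lease cannot conflict
with a cached open file, so there is nothing for subscribers to do in
that case.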
 fs/locks.c         | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 38 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index 2a54c800a223..a2d5794d713a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -166,6 +166,7 @@ int lease_break_time = 45;
 DEFINE_STATIC_LGLOCK(file_lock_lglock);
 static DEFINE_PER_CPU(struct hlist_head, file_lock_list);
 
+
 /*
  * The blocked_hash is used to find POSIX lock loops for deadlock detection.
  * It is protected by blocked_lock_lock.
@@ -1780,6 +1781,40 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp,
 }
 EXPORT_SYMBOL(generic_setlease);
 
+#if IS_ENABLED(CONFIG_SRCU)
+/*
+ * Kernel subsystems can register to be notified on any attempt to set
+ * a new lease with the lease_notifier_chain. This is used by (e.g.) nfsd
+ * to close files that it may have cached when there is an attempt to set a
+ * conflicting lease.
+ */
+struct srcu_notifier_head lease_notifier_chain;
+EXPORT_SYMBOL_GPL(lease_notifier_chain);
+
+static inline void
+lease_notifier_chain_init(void)
+{
+	srcu_init_notifier_head(&lease_notifier_chain);
+}
+
+static inline void
+setlease_notifier(long arg, struct file_lock *lease)
+{
+	if (arg != F_UNLCK)
+		srcu_notifier_call_chain(&lease_notifier_chain, arg, lease);
+}
+#else /* !IS_ENABLED(CONFIG_SRCU) */
+static inline void
+lease_notifier_chain_init(void)
+{
+}
+
+static inline void
+setlease_notifier(long arg, struct file_lock *lease)
+{
+}
+#endif /* IS_ENABLED(CONFIG_SRCU) */
+
 /**
  * vfs_setlease        -       sets a lease on an open file
  * @filp:	file pointer
@@ -1800,6 +1835,7 @@ EXPORT_SYMBOL(generic_setlease);
 int
 vfs_setlease(struct file *filp, long arg, struct file_lock **lease, void **priv)
 {
+	setlease_notifier(arg, *lease);
 	if (filp->f_op->setlease)
 		return filp->f_op->setlease(filp, arg, lease, priv);
 	else
@@ -2696,6 +2732,7 @@ static int __init filelock_init(void)
 	for_each_possible_cpu(i)
 		INIT_HLIST_HEAD(per_cpu_ptr(&file_lock_list, i));
 
+	lease_notifier_chain_init();
 	return 0;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9a9d314f7b27..e5fcf56c5ba1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1041,6 +1041,7 @@ extern int fcntl_setlease(unsigned int fd, struct file *filp, long arg);
 extern int fcntl_getlease(struct file *filp);
 
 /* fs/locks.c */
+extern struct srcu_notifier_head	lease_notifier_chain;
 void locks_free_lock_context(struct file_lock_context *ctx);
 void locks_free_lock(struct file_lock *fl);
 extern void locks_init_lock(struct file_lock *);
-- 
2.4.3


* [PATCH v4 06/16] nfsd: move include of state.h from trace.c to trace.h
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
                   ` (4 preceding siblings ...)
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
@ 2015-09-11 10:54 ` Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 07/16] sunrpc: add a new cache_detail operation for when a cache is flushed Jeff Layton
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

Any file which includes trace.h will need to include state.h, even if
it isn't using any state tracepoints. Ensure that we include any
headers that might be needed in trace.h instead of relying on the
*.c files to have the right ones.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/nfsd/trace.c | 2 --
 fs/nfsd/trace.h | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/trace.c b/fs/nfsd/trace.c
index 82f89070594c..90967466a1e5 100644
--- a/fs/nfsd/trace.c
+++ b/fs/nfsd/trace.c
@@ -1,5 +1,3 @@
 
-#include "state.h"
-
 #define CREATE_TRACE_POINTS
 #include "trace.h"
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index c668520c344b..0befe762762b 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -9,6 +9,8 @@
 
 #include <linux/tracepoint.h>
 
+#include "state.h"
+
 DECLARE_EVENT_CLASS(nfsd_stateid_class,
 	TP_PROTO(stateid_t *stp),
 	TP_ARGS(stp),
-- 
2.4.3



* [PATCH v4 07/16] sunrpc: add a new cache_detail operation for when a cache is flushed
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
                   ` (5 preceding siblings ...)
  2015-09-11 10:54 ` [PATCH v4 06/16] nfsd: move include of state.h from trace.c to trace.h Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 12/16] nfsd: convert nfs4_file->fi_fds array to use nfsd_files Jeff Layton
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

When the exports table is changed, exportfs will usually write a new
time to the "flush" file in the nfsd.export cache procfile. This tells
the kernel to flush any entries that are older than that value.

This gives us a mechanism to tell whether an unexport might have
occurred. Add a new ->flush cache_detail operation that is called after
flushing the cache whenever someone writes to a "flush" file.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
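The pattern here is an optional hook in an operations struct, called
after the generic flush. A minimal userspace sketch, with hypothetical
names (struct cache_ops, cache_write_flush, purge_hook) and counters in
place of real work:

```c
#include <stddef.h>

struct cache_ops {
	void (*flush)(void);	/* optional; may be NULL */
};

static int entries_flushed;
static int hook_calls;

/* Stand-in for the generic flush of stale cache entries. */
static void flush_entries(void)
{
	entries_flushed++;
}

/* Flush first, then give the cache owner a chance to react. */
static void cache_write_flush(const struct cache_ops *ops)
{
	flush_entries();
	if (ops->flush)
		ops->flush();
}

/* Example hook, e.g. where a dependent cache would be purged. */
static void purge_hook(void)
{
	hook_calls++;
}
```

Caches that leave the hook unset behave exactly as before, which is why
the NULL check matters.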
 include/linux/sunrpc/cache.h | 1 +
 net/sunrpc/cache.c           | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 03d3b4c92d9f..d1c10a978bb2 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -98,6 +98,7 @@ struct cache_detail {
 					      int has_died);
 
 	struct cache_head *	(*alloc)(void);
+	void			(*flush)(void);
 	int			(*match)(struct cache_head *orig, struct cache_head *new);
 	void			(*init)(struct cache_head *orig, struct cache_head *new);
 	void			(*update)(struct cache_head *orig, struct cache_head *new);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 4a2340a54401..60da9aa2bdc5 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1451,6 +1451,9 @@ static ssize_t write_flush(struct file *file, const char __user *buf,
 	cd->nextcheck = seconds_since_boot();
 	cache_flush();
 
+	if (cd->flush)
+		cd->flush();
+
 	*ppos += count;
 	return count;
 }
-- 
2.4.3



* [PATCH v4 08/16] nfsd: add a new struct file caching facility to nfsd
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
  2015-09-11 10:54   ` [PATCH v4 05/16] locks: create a new notifier chain for lease attempts Jeff Layton
@ 2015-09-11 10:54   ` Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 09/16] nfsd: hook up nfsd_write to the new nfsd_file cache Jeff Layton
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

Currently, NFSv2/3 reads and writes have to open a file, do the read or
write and then close it again for each RPC. This is highly inefficient,
especially when the underlying filesystem has a relatively slow open
routine.

This patch adds a new open file cache to knfsd. Rather than doing an
open for each RPC, the read/write handlers can call into this cache to
see if there is one already there for the correct filehandle and
NFS_MAY_READ/WRITE flags.

If there isn't an entry, then we create a new one and attempt to
perform the open. If there is, then we wait for construction of the
entry to finish and return it if it is fully instantiated by then. If
it's not, then we attempt to take over construction.

Since the main goal is to speed up NFSv2/3 I/O, we don't want to
close these files on last put of these objects. We need to keep them
around for a little while since we never know when the next READ/WRITE
will come in.

Note that this patch just adds the infrastructure to allow the caching
of open files. Later patches will actually make nfsd use it.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
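The lookup-or-create path described above can be sketched in userspace
roughly like this. It is a single-threaded toy under several assumptions:
all names (struct cached_file, cache_acquire, cache_release, MAY_READ,
MAY_WRITE) are hypothetical, entries match only on an exact access mask,
the "open" is just a counter, and the wait for a concurrent constructor
is not modeled at all.

```c
#include <stdint.h>
#include <stdlib.h>

#define CACHE_BITS	8
#define CACHE_SIZE	(1u << CACHE_BITS)

#define MAY_READ	0x1u
#define MAY_WRITE	0x2u
#define MAY_MASK	(MAY_READ | MAY_WRITE)

struct cached_file {
	struct cached_file *next;	/* hash chain */
	const void *inode;		/* key: inode pointer */
	unsigned int may;		/* key: masked access mode */
	int refcount;
};

static struct cached_file *buckets[CACHE_SIZE];
static int nr_opens;			/* counts simulated opens */

static unsigned int hash_inode(const void *inode)
{
	uintptr_t v = (uintptr_t)inode;

	return (unsigned int)((v >> 4) ^ (v >> 12)) & (CACHE_SIZE - 1);
}

/* Return a referenced entry, creating and "opening" one on a miss. */
static struct cached_file *cache_acquire(const void *inode, unsigned int may)
{
	unsigned int h = hash_inode(inode);
	struct cached_file *cf;

	may &= MAY_MASK;
	for (cf = buckets[h]; cf; cf = cf->next) {
		if (cf->inode == inode && cf->may == may) {
			cf->refcount++;
			return cf;
		}
	}

	cf = calloc(1, sizeof(*cf));
	if (!cf)
		return NULL;
	cf->inode = inode;
	cf->may = may;
	cf->refcount = 2;	/* one for the hash table, one for the caller */
	nr_opens++;
	cf->next = buckets[h];
	buckets[h] = cf;
	return cf;
}

/* Drop the caller's reference; the hash table keeps the entry alive. */
static void cache_release(struct cached_file *cf)
{
	cf->refcount--;
}
```

Because the hash table holds its own reference, the last put by a caller
does not close the file. That is what lets the next READ or WRITE for the
same inode and mode skip the open.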
 fs/nfsd/Kconfig     |   2 +
 fs/nfsd/Makefile    |   3 +-
 fs/nfsd/export.c    |  14 ++
 fs/nfsd/filecache.c | 552 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/filecache.h |  37 ++++
 fs/nfsd/nfssvc.c    |  10 +-
 fs/nfsd/trace.h     | 127 ++++++++++++
 fs/nfsd/vfs.c       |   3 +-
 fs/nfsd/vfs.h       |   1 +
 9 files changed, 746 insertions(+), 3 deletions(-)
 create mode 100644 fs/nfsd/filecache.c
 create mode 100644 fs/nfsd/filecache.h

diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
index a0b77fc1bd39..95e0a91d41ef 100644
--- a/fs/nfsd/Kconfig
+++ b/fs/nfsd/Kconfig
@@ -6,6 +6,8 @@ config NFSD
 	select SUNRPC
 	select EXPORTFS
 	select NFS_ACL_SUPPORT if NFSD_V2_ACL
+	select SRCU
+	select FSNOTIFY
 	depends on MULTIUSER
 	help
 	  Choose Y here if you want to allow other computers to access
diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile
index 9a6028e120c6..8908bb467727 100644
--- a/fs/nfsd/Makefile
+++ b/fs/nfsd/Makefile
@@ -10,7 +10,8 @@ obj-$(CONFIG_NFSD)	+= nfsd.o
 nfsd-y			+= trace.o
 
 nfsd-y 			+= nfssvc.o nfsctl.o nfsproc.o nfsfh.o vfs.o \
-			   export.o auth.o lockd.o nfscache.o nfsxdr.o stats.o
+			   export.o auth.o lockd.o nfscache.o nfsxdr.o \
+			   stats.o filecache.o
 nfsd-$(CONFIG_NFSD_FAULT_INJECTION) += fault_inject.o
 nfsd-$(CONFIG_NFSD_V2_ACL) += nfs2acl.o
 nfsd-$(CONFIG_NFSD_V3)	+= nfs3proc.o nfs3xdr.o
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index b4d84b579f20..4b504edff121 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -21,6 +21,7 @@
 #include "nfsfh.h"
 #include "netns.h"
 #include "pnfs.h"
+#include "filecache.h"
 
 #define NFSDDBG_FACILITY	NFSDDBG_EXPORT
 
@@ -231,6 +232,18 @@ static struct cache_head *expkey_alloc(void)
 		return NULL;
 }
 
+static void
+expkey_flush(void)
+{
+	/*
+	 * Take the nfsd_mutex here to ensure that the file cache is not
+	 * destroyed while we're in the middle of flushing.
+	 */
+	mutex_lock(&nfsd_mutex);
+	nfsd_file_cache_purge();
+	mutex_unlock(&nfsd_mutex);
+}
+
 static struct cache_detail svc_expkey_cache_template = {
 	.owner		= THIS_MODULE,
 	.hash_size	= EXPKEY_HASHMAX,
@@ -243,6 +256,7 @@ static struct cache_detail svc_expkey_cache_template = {
 	.init		= expkey_init,
 	.update       	= expkey_update,
 	.alloc		= expkey_alloc,
+	.flush		= expkey_flush,
 };
 
 static int
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
new file mode 100644
index 000000000000..21ec7adbf58c
--- /dev/null
+++ b/fs/nfsd/filecache.c
@@ -0,0 +1,552 @@
+/*
+ * Open file cache.
+ *
+ * (c) 2015 - Jeff Layton <jeff.layton@primarydata.com>
+ */
+
+#include <linux/hash.h>
+#include <linux/slab.h>
+#include <linux/hash.h>
+#include <linux/file.h>
+#include <linux/sched.h>
+#include <linux/list_lru.h>
+#include <linux/fsnotify_backend.h>
+
+#include "vfs.h"
+#include "nfsd.h"
+#include "nfsfh.h"
+#include "filecache.h"
+#include "trace.h"
+
+#define NFSDDBG_FACILITY	NFSDDBG_FH
+
+/* hash table for nfsd_file */
+#define NFSD_FILE_HASH_BITS                   8
+#define NFSD_FILE_HASH_SIZE                  (1 << NFSD_FILE_HASH_BITS)
+
+/* We only care about NFSD_MAY_READ/WRITE for this cache */
+#define NFSD_FILE_MAY_MASK	(NFSD_MAY_READ|NFSD_MAY_WRITE)
+
+struct nfsd_fcache_bucket {
+	struct hlist_head	nfb_head;
+	spinlock_t		nfb_lock;
+};
+
+static struct nfsd_fcache_bucket	*nfsd_file_hashtbl;
+static struct list_lru			nfsd_file_lru;
+static struct fsnotify_group		*nfsd_file_fsnotify_group;
+
+/*
+ * The fsnotify_mark is embedded inside the nfsd_file and we don't want to
+ * explicitly free it. It'll be freed when the nfsd_file is and we always
+ * remove the mark from the inode before freeing it. So, this is a no-op.
+ */
+static void
+nfsd_file_free_mark(struct fsnotify_mark *mark)
+{
+}
+
+static struct nfsd_file *
+nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval)
+{
+	struct nfsd_file *nf;
+
+	/* FIXME: create a new slabcache for these? */
+	nf = kzalloc(sizeof(*nf), GFP_KERNEL);
+	if (nf) {
+		INIT_HLIST_NODE(&nf->nf_node);
+		INIT_LIST_HEAD(&nf->nf_lru);
+		nf->nf_inode = inode;
+		nf->nf_hashval = hashval;
+		atomic_set(&nf->nf_ref, 1);
+		nf->nf_may = NFSD_FILE_MAY_MASK & may;
+		if (may & NFSD_MAY_NOT_BREAK_LEASE) {
+			if (may & NFSD_MAY_WRITE)
+				__set_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags);
+			if (may & NFSD_MAY_READ)
+				__set_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
+		}
+		fsnotify_init_mark(&nf->nf_mark, nfsd_file_free_mark);
+		nf->nf_mark.mask = FS_ATTRIB|FS_DELETE_SELF;
+		trace_nfsd_file_alloc(nf);
+	}
+	return nf;
+}
+
+static void
+nfsd_file_put_final(struct nfsd_file *nf)
+{
+	trace_nfsd_file_put_final(nf);
+	fsnotify_destroy_mark(&nf->nf_mark, nfsd_file_fsnotify_group);
+	if (nf->nf_file)
+		fput(nf->nf_file);
+	kfree_rcu(nf, nf_rcu);
+}
+
+static void
+nfsd_file_put_final_sync(struct nfsd_file *nf)
+{
+	trace_nfsd_file_put_final(nf);
+	fsnotify_destroy_mark(&nf->nf_mark, nfsd_file_fsnotify_group);
+	if (nf->nf_file)
+		fput_sync(nf->nf_file);
+	kfree_rcu(nf, nf_rcu);
+}
+
+static bool
+nfsd_file_unhash(struct nfsd_file *nf)
+{
+	lockdep_assert_held(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+
+	trace_nfsd_file_unhash(nf);
+	if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
+		clear_bit(NFSD_FILE_HASHED, &nf->nf_flags);
+		hlist_del_rcu(&nf->nf_node);
+		list_lru_del(&nfsd_file_lru, &nf->nf_lru);
+		return true;
+	}
+	return false;
+}
+
+static void
+nfsd_file_unhash_and_release_locked(struct nfsd_file *nf, struct list_head *dispose)
+{
+	lockdep_assert_held(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+
+	trace_nfsd_file_unhash_and_release_locked(nf);
+	if (!nfsd_file_unhash(nf))
+		return;
+	if (!atomic_dec_and_test(&nf->nf_ref))
+		return;
+
+	list_add(&nf->nf_lru, dispose);
+}
+
+static void
+nfsd_file_unhash_and_release(struct nfsd_file *nf)
+{
+	bool destroy = false;
+
+	spin_lock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+	if (nfsd_file_unhash(nf))
+		destroy = atomic_dec_and_test(&nf->nf_ref);
+	spin_unlock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+	if (destroy)
+		nfsd_file_put_final_sync(nf);
+}
+
+void
+nfsd_file_put(struct nfsd_file *nf)
+{
+	trace_nfsd_file_put(nf);
+	list_lru_rotate(&nfsd_file_lru, &nf->nf_lru);
+	if (!atomic_dec_and_test(&nf->nf_ref))
+		return;
+
+	WARN_ON(test_bit(NFSD_FILE_HASHED, &nf->nf_flags));
+	nfsd_file_put_final(nf);
+}
+
+struct nfsd_file *
+nfsd_file_get(struct nfsd_file *nf)
+{
+	if (likely(atomic_inc_not_zero(&nf->nf_ref)))
+		return nf;
+	return NULL;
+}
+
+static void
+nfsd_file_dispose_list(struct list_head *dispose)
+{
+	struct nfsd_file *nf;
+
+	while (!list_empty(dispose)) {
+		nf = list_first_entry(dispose, struct nfsd_file, nf_lru);
+		list_del(&nf->nf_lru);
+		nfsd_file_put_final(nf);
+	}
+}
+
+static void
+nfsd_file_dispose_list_sync(struct list_head *dispose)
+{
+	struct nfsd_file *nf;
+
+	while (!list_empty(dispose)) {
+		nf = list_first_entry(dispose, struct nfsd_file, nf_lru);
+		list_del(&nf->nf_lru);
+		nfsd_file_put_final_sync(nf);
+	}
+}
+
+static enum lru_status
+nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
+		 spinlock_t *lock, void *arg)
+	__releases(lock)
+	__acquires(lock)
+{
+	struct nfsd_file *nf = list_entry(item, struct nfsd_file, nf_lru);
+	bool unhashed;
+
+	if (atomic_read(&nf->nf_ref) > 1)
+		return LRU_SKIP;
+
+	spin_unlock(lock);
+	spin_lock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+	unhashed = nfsd_file_unhash(nf);
+	spin_unlock(&nfsd_file_hashtbl[nf->nf_hashval].nfb_lock);
+	if (unhashed)
+		nfsd_file_put(nf);
+	spin_lock(lock);
+	return unhashed ? LRU_REMOVED_RETRY : LRU_RETRY;
+}
+
+static unsigned long
+nfsd_file_lru_count(struct shrinker *s, struct shrink_control *sc)
+{
+	return list_lru_count(&nfsd_file_lru);
+}
+
+static unsigned long
+nfsd_file_lru_scan(struct shrinker *s, struct shrink_control *sc)
+{
+	return list_lru_shrink_walk(&nfsd_file_lru, sc, nfsd_file_lru_cb, NULL);
+}
+
+static struct shrinker	nfsd_file_shrinker = {
+	.scan_objects = nfsd_file_lru_scan,
+	.count_objects = nfsd_file_lru_count,
+	.seeks = 1,
+};
+
+static int
+nfsd_file_lease_notifier_call(struct notifier_block *nb, unsigned long arg,
+			    void *data)
+{
+	struct file_lock *fl = data;
+
+	/* Only close files for F_SETLEASE leases */
+	if (fl->fl_flags & FL_LEASE)
+		nfsd_file_close_inode_sync(file_inode(fl->fl_file));
+	return 0;
+}
+
+static struct notifier_block nfsd_file_lease_notifier = {
+	.notifier_call = nfsd_file_lease_notifier_call,
+};
+
+static int
+nfsd_file_fsnotify_handle_event(struct fsnotify_group *group,
+				struct inode *inode,
+				struct fsnotify_mark *inode_mark,
+				struct fsnotify_mark *vfsmount_mark,
+				u32 mask, void *data, int data_type,
+				const unsigned char *file_name, u32 cookie)
+{
+	struct nfsd_file	*nf;
+
+	trace_nfsd_file_fsnotify_handle_event(inode, mask);
+
+	/* Should be no marks on non-regular files */
+	if (!S_ISREG(inode->i_mode)) {
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+
+	/* ...and we don't do anything with vfsmount marks */
+	BUG_ON(vfsmount_mark);
+
+	/* don't close files if this was not the last link */
+	if (mask & FS_ATTRIB) {
+		if (inode->i_nlink)
+			return 0;
+	}
+
+	/* FIXME: get container of mark, unhash and release it */
+	nf = container_of(inode_mark, struct nfsd_file, nf_mark);
+	nfsd_file_unhash_and_release(nf);
+	return 0;
+}
+
+
+static const struct fsnotify_ops nfsd_file_fsnotify_ops = {
+	.handle_event = nfsd_file_fsnotify_handle_event,
+};
+
+int
+nfsd_file_cache_init(void)
+{
+	int		ret = -ENOMEM;
+	unsigned int	i;
+
+	if (nfsd_file_hashtbl)
+		return 0;
+
+	nfsd_file_hashtbl = kcalloc(NFSD_FILE_HASH_SIZE,
+				sizeof(*nfsd_file_hashtbl), GFP_KERNEL);
+	if (!nfsd_file_hashtbl) {
+		pr_err("nfsd: unable to allocate nfsd_file_hashtbl\n");
+		goto out_err;
+	}
+
+	ret = list_lru_init(&nfsd_file_lru);
+	if (ret) {
+		pr_err("nfsd: failed to init nfsd_file_lru: %d\n", ret);
+		goto out_err;
+	}
+
+	ret = register_shrinker(&nfsd_file_shrinker);
+	if (ret) {
+		pr_err("nfsd: failed to register nfsd_file_shrinker: %d\n", ret);
+		goto out_lru;
+	}
+
+	ret = srcu_notifier_chain_register(&lease_notifier_chain,
+				&nfsd_file_lease_notifier);
+	if (ret) {
+		pr_err("nfsd: unable to register lease notifier: %d\n", ret);
+		goto out_shrinker;
+	}
+
+	nfsd_file_fsnotify_group = fsnotify_alloc_group(&nfsd_file_fsnotify_ops);
+	if (IS_ERR(nfsd_file_fsnotify_group)) {
+		ret = PTR_ERR(nfsd_file_fsnotify_group);
+		pr_err("nfsd: unable to create fsnotify group: %d\n", ret);
+		nfsd_file_fsnotify_group = NULL;
+		goto out_notifier;
+	}
+
+	for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
+		INIT_HLIST_HEAD(&nfsd_file_hashtbl[i].nfb_head);
+		spin_lock_init(&nfsd_file_hashtbl[i].nfb_lock);
+	}
+out:
+	return ret;
+out_notifier:
+	srcu_notifier_chain_unregister(&lease_notifier_chain,
+				&nfsd_file_lease_notifier);
+out_shrinker:
+	unregister_shrinker(&nfsd_file_shrinker);
+out_lru:
+	list_lru_destroy(&nfsd_file_lru);
+out_err:
+	kfree(nfsd_file_hashtbl);
+	nfsd_file_hashtbl = NULL;
+	goto out;
+}
+
+void
+nfsd_file_cache_purge(void)
+{
+	unsigned int		i;
+	struct nfsd_file	*nf;
+	LIST_HEAD(dispose);
+
+	if (!nfsd_file_hashtbl)
+		return;
+
+	for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) {
+		spin_lock(&nfsd_file_hashtbl[i].nfb_lock);
+		while (!hlist_empty(&nfsd_file_hashtbl[i].nfb_head)) {
+			nf = hlist_entry(nfsd_file_hashtbl[i].nfb_head.first,
+					 struct nfsd_file, nf_node);
+			nfsd_file_unhash_and_release_locked(nf, &dispose);
+		}
+		spin_unlock(&nfsd_file_hashtbl[i].nfb_lock);
+		nfsd_file_dispose_list(&dispose);
+	}
+}
+
+void
+nfsd_file_cache_shutdown(void)
+{
+	LIST_HEAD(dispose);
+
+	srcu_notifier_chain_unregister(&lease_notifier_chain,
+				&nfsd_file_lease_notifier);
+	unregister_shrinker(&nfsd_file_shrinker);
+	nfsd_file_cache_purge();
+	fsnotify_put_group(nfsd_file_fsnotify_group);
+	nfsd_file_fsnotify_group = NULL;
+	list_lru_destroy(&nfsd_file_lru);
+	kfree(nfsd_file_hashtbl);
+	nfsd_file_hashtbl = NULL;
+}
+
+/*
+ * Search nfsd_file_hashtbl[] for file. We hash on the inode pointer and match
+ * on the NFSD_MAY_READ/WRITE flags. If the file is open for r/w, then it's
+ * usable for either.
+ */
+static struct nfsd_file *
+nfsd_file_find_locked(struct inode *inode, unsigned int may_flags,
+			unsigned int hashval)
+{
+	struct nfsd_file *nf;
+	unsigned char need = may_flags & NFSD_FILE_MAY_MASK;
+
+	hlist_for_each_entry_rcu(nf, &nfsd_file_hashtbl[hashval].nfb_head,
+				 nf_node) {
+		if ((need & nf->nf_may) != need)
+			continue;
+		if (nf->nf_inode == inode)
+			return nfsd_file_get(nf);
+	}
+	return NULL;
+}
+
+/**
+ * nfsd_file_close_inode_sync - attempt to forcibly close a nfsd_file
+ * @inode: inode of the file to attempt to remove
+ *
+ * Walk the whole hash bucket, looking for any files that correspond to "inode".
+ * If any do, then unhash them and put the hashtable reference to them.
+ */
+void
+nfsd_file_close_inode_sync(struct inode *inode)
+{
+	struct nfsd_file	*nf;
+	struct hlist_node	*tmp;
+	unsigned int		hashval = (unsigned int)hash_ptr(inode, NFSD_FILE_HASH_BITS);
+	LIST_HEAD(dispose);
+
+	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	hlist_for_each_entry_safe(nf, tmp, &nfsd_file_hashtbl[hashval].nfb_head, nf_node) {
+		if (inode == nf->nf_inode)
+			nfsd_file_unhash_and_release_locked(nf, &dispose);
+	}
+	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	trace_nfsd_file_close_inode_sync(inode, hashval, !list_empty(&dispose));
+	nfsd_file_dispose_list_sync(&dispose);
+}
+
+__be32
+nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		  unsigned int may_flags, struct nfsd_file **pnf)
+{
+	__be32	status = nfs_ok;
+	struct nfsd_file *nf, *new = NULL;
+	struct inode *inode;
+	unsigned int hashval;
+
+	/* FIXME: skip this if fh_dentry is already set? */
+	status = fh_verify(rqstp, fhp, S_IFREG, may_flags|NFSD_MAY_OWNER_OVERRIDE);
+	if (status != nfs_ok)
+		return status;
+
+	inode = d_inode(fhp->fh_dentry);
+	hashval = (unsigned int)hash_ptr(inode, NFSD_FILE_HASH_BITS);
+retry:
+	rcu_read_lock();
+	nf = nfsd_file_find_locked(inode, may_flags, hashval);
+	rcu_read_unlock();
+	if (nf)
+		goto wait_for_construction;
+
+	if (!new) {
+		new = nfsd_file_alloc(inode, may_flags, hashval);
+		if (!new) {
+			trace_nfsd_file_acquire(hashval, inode, may_flags, NULL,
+						nfserr_jukebox);
+			return nfserr_jukebox;
+		}
+	}
+
+	spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock);
+	nf = nfsd_file_find_locked(inode, may_flags, hashval);
+	if (likely(nf == NULL)) {
+		/* Take reference for the hashtable */
+		atomic_inc(&new->nf_ref);
+		__set_bit(NFSD_FILE_HASHED, &new->nf_flags);
+		__set_bit(NFSD_FILE_PENDING, &new->nf_flags);
+		list_lru_add(&nfsd_file_lru, &new->nf_lru);
+		hlist_add_head_rcu(&new->nf_node,
+				&nfsd_file_hashtbl[hashval].nfb_head);
+		spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+
+		/* This should never fail since we set allow_dups to true */
+		WARN_ON_ONCE(fsnotify_add_mark(&new->nf_mark,
+			nfsd_file_fsnotify_group, inode, NULL, true));
+		nf = new;
+		new = NULL;
+		goto open_file;
+	}
+	spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock);
+
+wait_for_construction:
+	wait_on_bit(&nf->nf_flags, NFSD_FILE_PENDING, TASK_UNINTERRUPTIBLE);
+
+	/* Did construction of this file fail? */
+	if (!nf->nf_file) {
+		/*
+		 * We can only take over construction for this nfsd_file if the
+		 * MAY flags are equal. Otherwise, we put the reference and try
+		 * again.
+		 */
+		if ((may_flags & NFSD_FILE_MAY_MASK) != nf->nf_may) {
+			nfsd_file_put(nf);
+			goto retry;
+		}
+
+		/* try to take over construction for this file */
+		if (test_and_set_bit(NFSD_FILE_PENDING, &nf->nf_flags))
+			goto wait_for_construction;
+
+		/* sync up the BREAK_* flags with our may_flags */
+		if (may_flags & NFSD_MAY_NOT_BREAK_LEASE) {
+			if (may_flags & NFSD_MAY_WRITE)
+				set_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags);
+			if (may_flags & NFSD_MAY_READ)
+				set_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
+		} else {
+			clear_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags);
+			clear_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
+		}
+
+		goto open_file;
+	}
+
+	/*
+	 * We have a file that was opened in the context of another rqst. We
+	 * must check permissions. Since we're dealing with open files here,
+	 * we always want to set the OWNER_OVERRIDE bit.
+	 */
+	status = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
+					may_flags|NFSD_MAY_OWNER_OVERRIDE);
+
+	if (status == nfs_ok && !(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
+		bool write = (may_flags & NFSD_MAY_WRITE);
+
+		if (test_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags) ||
+		    (test_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags) && write)) {
+			status = nfserrno(nfsd_open_break_lease(
+					file_inode(nf->nf_file), may_flags));
+			if (status == nfs_ok) {
+				clear_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
+				if (write)
+					clear_bit(NFSD_FILE_BREAK_WRITE,
+						  &nf->nf_flags);
+			}
+		}
+	}
+out:
+	if (status == nfs_ok) {
+		*pnf = nf;
+	} else {
+		nfsd_file_put(nf);
+		nf = NULL;
+	}
+
+	if (new)
+		nfsd_file_put(new);
+
+	trace_nfsd_file_acquire(hashval, inode, may_flags, nf, status);
+	return status;
+open_file:
+	/* FIXME: should we abort opening if the link count goes to 0? */
+	status = nfsd_open(rqstp, fhp, S_IFREG, may_flags, &nf->nf_file);
+	clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags);
+	smp_mb__after_atomic();
+	wake_up_bit(&nf->nf_flags, NFSD_FILE_PENDING);
+	goto out;
+}
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
new file mode 100644
index 000000000000..5c871c3114f2
--- /dev/null
+++ b/fs/nfsd/filecache.h
@@ -0,0 +1,37 @@
+#ifndef _FS_NFSD_FILECACHE_H
+#define _FS_NFSD_FILECACHE_H
+
+#include <linux/fsnotify_backend.h>
+
+/*
+ * A representation of a file that has been opened by knfsd. These are hashed
+ * in the hashtable by inode pointer value. Note that this object doesn't
+ * hold a reference to the inode by itself, so the nf_inode pointer should
+ * never be dereferenced, only be used for comparison.
+ */
+struct nfsd_file {
+	struct hlist_node	nf_node;
+	struct list_head	nf_lru;
+	struct rcu_head		nf_rcu;
+	struct file		*nf_file;
+#define NFSD_FILE_HASHED	(0)
+#define NFSD_FILE_PENDING	(1)
+#define NFSD_FILE_BREAK_READ	(2)
+#define NFSD_FILE_BREAK_WRITE	(3)
+	unsigned long		nf_flags;
+	struct inode		*nf_inode;
+	unsigned int		nf_hashval;
+	atomic_t		nf_ref;
+	unsigned char		nf_may;
+	struct fsnotify_mark	nf_mark;
+};
+
+int nfsd_file_cache_init(void);
+void nfsd_file_cache_purge(void);
+void nfsd_file_cache_shutdown(void);
+void nfsd_file_put(struct nfsd_file *nf);
+struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
+void nfsd_file_close_inode_sync(struct inode *inode);
+__be32 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		  unsigned int may_flags, struct nfsd_file **nfp);
+#endif /* _FS_NFSD_FILECACHE_H */
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index ad4e2377dd63..d816bb3faa6e 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -22,6 +22,7 @@
 #include "cache.h"
 #include "vfs.h"
 #include "netns.h"
+#include "filecache.h"
 
 #define NFSDDBG_FACILITY	NFSDDBG_SVC
 
@@ -224,11 +225,17 @@ static int nfsd_startup_generic(int nrservs)
 	if (ret)
 		goto dec_users;
 
-	ret = nfs4_state_start();
+	ret = nfsd_file_cache_init();
 	if (ret)
 		goto out_racache;
+
+	ret = nfs4_state_start();
+	if (ret)
+		goto out_file_cache;
 	return 0;
 
+out_file_cache:
+	nfsd_file_cache_shutdown();
 out_racache:
 	nfsd_racache_shutdown();
 dec_users:
@@ -242,6 +249,7 @@ static void nfsd_shutdown_generic(void)
 		return;
 
 	nfs4_state_shutdown();
+	nfsd_file_cache_shutdown();
 	nfsd_racache_shutdown();
 }
 
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 0befe762762b..49f0d1f9c949 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -10,6 +10,8 @@
 #include <linux/tracepoint.h>
 
 #include "state.h"
+#include "filecache.h"
+#include "vfs.h"
 
 DECLARE_EVENT_CLASS(nfsd_stateid_class,
 	TP_PROTO(stateid_t *stp),
@@ -48,6 +50,131 @@ DEFINE_STATEID_EVENT(layout_recall_done);
 DEFINE_STATEID_EVENT(layout_recall_fail);
 DEFINE_STATEID_EVENT(layout_recall_release);
 
+#define show_nf_flags(val)						\
+	__print_flags(val, "|",						\
+		{ 1 << NFSD_FILE_HASHED,	"HASHED" },		\
+		{ 1 << NFSD_FILE_PENDING,	"PENDING" },		\
+		{ 1 << NFSD_FILE_BREAK_READ,	"BREAK_READ" },		\
+		{ 1 << NFSD_FILE_BREAK_WRITE,	"BREAK_WRITE" })
+
+/* FIXME: This should probably be fleshed out in the future. */
+#define show_nf_may(val)						\
+	__print_flags(val, "|",						\
+		{ NFSD_MAY_READ,		"READ" },		\
+		{ NFSD_MAY_WRITE,		"WRITE" },		\
+		{ NFSD_MAY_NOT_BREAK_LEASE,	"NOT_BREAK_LEASE" })
+
+DECLARE_EVENT_CLASS(nfsd_file_class,
+	TP_PROTO(struct nfsd_file *nf),
+	TP_ARGS(nf),
+	TP_STRUCT__entry(
+		__field(unsigned int, nf_hashval)
+		__field(void *, nf_inode)
+		__field(int, nf_ref)
+		__field(unsigned long, nf_flags)
+		__field(unsigned char, nf_may)
+		__field(struct file *, nf_file)
+	),
+	TP_fast_assign(
+		__entry->nf_hashval = nf->nf_hashval;
+		__entry->nf_inode = nf->nf_inode;
+		__entry->nf_ref = atomic_read(&nf->nf_ref);
+		__entry->nf_flags = nf->nf_flags;
+		__entry->nf_may = nf->nf_may;
+		__entry->nf_file = nf->nf_file;
+	),
+	TP_printk("hash=0x%x inode=0x%p ref=%d flags=%s may=%s file=%p",
+		__entry->nf_hashval,
+		__entry->nf_inode,
+		__entry->nf_ref,
+		show_nf_flags(__entry->nf_flags),
+		show_nf_may(__entry->nf_may),
+		__entry->nf_file)
+)
+
+#define DEFINE_NFSD_FILE_EVENT(name) \
+DEFINE_EVENT(nfsd_file_class, name, \
+	TP_PROTO(struct nfsd_file *nf), \
+	TP_ARGS(nf))
+
+DEFINE_NFSD_FILE_EVENT(nfsd_file_alloc);
+DEFINE_NFSD_FILE_EVENT(nfsd_file_put_final);
+DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash);
+DEFINE_NFSD_FILE_EVENT(nfsd_file_put);
+DEFINE_NFSD_FILE_EVENT(nfsd_file_unhash_and_release_locked);
+
+TRACE_EVENT(nfsd_file_acquire,
+	TP_PROTO(unsigned int hash, struct inode *inode,
+		 unsigned int may_flags, struct nfsd_file *nf,
+		 __be32 status),
+
+	TP_ARGS(hash, inode, may_flags, nf, status),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, hash)
+		__field(void *, inode)
+		__field(unsigned int, may_flags)
+		__field(int, nf_ref)
+		__field(unsigned long, nf_flags)
+		__field(unsigned char, nf_may)
+		__field(struct file *, nf_file)
+		__field(__be32, status)
+	),
+
+	TP_fast_assign(
+		__entry->hash = hash;
+		__entry->inode = inode;
+		__entry->may_flags = may_flags;
+		__entry->nf_ref = nf ? atomic_read(&nf->nf_ref) : 0;
+		__entry->nf_flags = nf ? nf->nf_flags : 0;
+		__entry->nf_may = nf ? nf->nf_may : 0;
+		__entry->nf_file = nf ? nf->nf_file : NULL;
+		__entry->status = status;
+	),
+
+	TP_printk("hash=0x%x inode=0x%p may_flags=%s ref=%d nf_flags=%s nf_may=%s nf_file=0x%p status=%u",
+			__entry->hash, __entry->inode,
+			show_nf_may(__entry->may_flags), __entry->nf_ref,
+			show_nf_flags(__entry->nf_flags),
+			show_nf_may(__entry->nf_may), __entry->nf_file,
+			be32_to_cpu(__entry->status))
+);
+
+TRACE_EVENT(nfsd_file_close_inode_sync,
+	TP_PROTO(struct inode *inode, unsigned int hash, int found),
+	TP_ARGS(inode, hash, found),
+	TP_STRUCT__entry(
+		__field(struct inode *, inode)
+		__field(unsigned int, hash)
+		__field(int, found)
+	),
+	TP_fast_assign(
+		__entry->inode = inode;
+		__entry->hash = hash;
+		__entry->found = found;
+	),
+	TP_printk("hash=0x%x inode=0x%p found=%d", __entry->hash,
+			__entry->inode, __entry->found)
+);
+
+TRACE_EVENT(nfsd_file_fsnotify_handle_event,
+	TP_PROTO(struct inode *inode, u32 mask),
+	TP_ARGS(inode, mask),
+	TP_STRUCT__entry(
+		__field(struct inode *, inode)
+		__field(unsigned int, nlink)
+		__field(umode_t, mode)
+		__field(u32, mask)
+	),
+	TP_fast_assign(
+		__entry->inode = inode;
+		__entry->nlink = inode->i_nlink;
+		__entry->mode = inode->i_mode;
+		__entry->mask = mask;
+	),
+	TP_printk("inode=0x%p nlink=%u mode=0%ho mask=0x%x", __entry->inode,
+			__entry->nlink, __entry->mode, __entry->mask)
+);
 #endif /* _NFSD_TRACE_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 2ea6c6a37364..4ce447e56155 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -623,7 +623,8 @@ nfsd_access(struct svc_rqst *rqstp, struct svc_fh *fhp, u32 *access, u32 *suppor
 }
 #endif /* CONFIG_NFSD_V3 */
 
-static int nfsd_open_break_lease(struct inode *inode, int access)
+int
+nfsd_open_break_lease(struct inode *inode, int access)
 {
 	unsigned int mode;
 
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index fee2451ae248..4bc127e0ca15 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -69,6 +69,7 @@ __be32		do_nfsd_create(struct svc_rqst *, struct svc_fh *,
 __be32		nfsd_commit(struct svc_rqst *, struct svc_fh *,
 				loff_t, unsigned long);
 #endif /* CONFIG_NFSD_V3 */
+int		nfsd_open_break_lease(struct inode *, int);
 __be32		nfsd_open(struct svc_rqst *, struct svc_fh *, umode_t,
 				int, struct file **);
 struct raparms;
-- 
2.4.3
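
For anyone reading nfsd_file_find_locked() in the patch above: a cached
entry is treated as usable when its open mode covers every read/write bit
the caller asks for, so a file opened r/w satisfies read-only and
write-only requests alike. A minimal userspace sketch of that test is
below. The MAY_* values and the may_satisfies() name are assumptions made
up for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the real NFSD_MAY_* constants live in fs/nfsd/vfs.h. */
#define MAY_READ	0x1u
#define MAY_WRITE	0x2u
#define MAY_OTHER	0x4u	/* stands in for any flag outside the cache mask */
#define MAY_MASK	(MAY_READ | MAY_WRITE)

/*
 * Hypothetical helper: true when a cached entry opened with "have"
 * covers every read/write bit requested in "want".
 */
static bool may_satisfies(unsigned int have, unsigned int want)
{
	unsigned int need = want & MAY_MASK;

	/* The cached mode is always stored pre-masked (compare nf_may). */
	assert((have & ~MAY_MASK) == 0);
	return (need & have) == need;
}
```

Bits outside the mask are ignored on the request side, which is why a
request carrying extra MAY flags can still hit an existing entry.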

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 09/16] nfsd: hook up nfsd_write to the new nfsd_file cache
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
  2015-09-11 10:54   ` [PATCH v4 05/16] locks: create a new notifier chain for lease attempts Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 08/16] nfsd: add a new struct file caching facility to nfsd Jeff Layton
@ 2015-09-11 10:54   ` Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 10/16] nfsd: hook up nfsd_read to the " Jeff Layton
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

Note that all callers currently pass in NULL for "file", so the
non-NULL branch was already dead code. Eliminate that parameter
and have nfsd_write use the file cache instead of dealing directly
with a filp.

Signed-off-by: Jeff Layton <jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
---
 fs/nfsd/nfs3proc.c |  2 +-
 fs/nfsd/nfsproc.c  |  2 +-
 fs/nfsd/vfs.c      | 33 +++++++++++----------------------
 fs/nfsd/vfs.h      |  2 +-
 4 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 7b755b7f785c..4e46ac511479 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -192,7 +192,7 @@ nfsd3_proc_write(struct svc_rqst *rqstp, struct nfsd3_writeargs *argp,
 
 	fh_copy(&resp->fh, &argp->fh);
 	resp->committed = argp->stable;
-	nfserr = nfsd_write(rqstp, &resp->fh, NULL,
+	nfserr = nfsd_write(rqstp, &resp->fh,
 				   argp->offset,
 				   rqstp->rq_vec, argp->vlen,
 				   &cnt,
diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index 4cd78ef4c95c..9893095cbee1 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -213,7 +213,7 @@ nfsd_proc_write(struct svc_rqst *rqstp, struct nfsd_writeargs *argp,
 		SVCFH_fmt(&argp->fh),
 		argp->len, argp->offset);
 
-	nfserr = nfsd_write(rqstp, fh_copy(&resp->fh, &argp->fh), NULL,
+	nfserr = nfsd_write(rqstp, fh_copy(&resp->fh, &argp->fh),
 				   argp->offset,
 				   rqstp->rq_vec, argp->vlen,
 			           &cnt,
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 4ce447e56155..0a9bffe5ba97 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -42,6 +42,7 @@
 
 #include "nfsd.h"
 #include "vfs.h"
+#include "filecache.h"
 
 #define NFSDDBG_FACILITY		NFSDDBG_FILEOP
 
@@ -1009,30 +1010,18 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
  * N.B. After this call fhp needs an fh_put
  */
 __be32
-nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
-		loff_t offset, struct kvec *vec, int vlen, unsigned long *cnt,
-		int *stablep)
+nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
+	   struct kvec *vec, int vlen, unsigned long *cnt, int *stablep)
 {
-	__be32			err = 0;
-
-	if (file) {
-		err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
-				NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE);
-		if (err)
-			goto out;
-		err = nfsd_vfs_write(rqstp, fhp, file, offset, vec, vlen, cnt,
-				stablep);
-	} else {
-		err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_WRITE, &file);
-		if (err)
-			goto out;
-
-		if (cnt)
-			err = nfsd_vfs_write(rqstp, fhp, file, offset, vec, vlen,
-					     cnt, stablep);
-		fput(file);
+	__be32			err;
+	struct nfsd_file	*nf;
+
+	err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_WRITE, &nf);
+	if (err == nfs_ok) {
+		err = nfsd_vfs_write(rqstp, fhp, nf->nf_file, offset, vec,
+					vlen, cnt, stablep);
+		nfsd_file_put(nf);
 	}
-out:
 	return err;
 }
 
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 4bc127e0ca15..303db66dca0a 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -79,7 +79,7 @@ __be32		nfsd_readv(struct file *, loff_t, struct kvec *, int,
 				unsigned long *);
 __be32 		nfsd_read(struct svc_rqst *, struct svc_fh *,
 				loff_t, struct kvec *, int, unsigned long *);
-__be32 		nfsd_write(struct svc_rqst *, struct svc_fh *,struct file *,
+__be32 		nfsd_write(struct svc_rqst *, struct svc_fh *,
 				loff_t, struct kvec *,int, unsigned long *, int *);
 __be32		nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				struct file *file, loff_t offset,
-- 
2.4.3
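
The acquire/put pairing in nfsd_write() above works because the cache
holds its own reference on each entry: the caller's put leaves the entry
cached, and the final free only happens once the entry has been unhashed
and the table's reference dropped as well. A single-threaded userspace
model of that lifecycle is sketched below. struct cached_file and the
cache_*() helpers are hypothetical names, and all locking, RCU and LRU
handling from filecache.c is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for struct nfsd_file. */
struct cached_file {
	int	ref;	/* models nf_ref */
	bool	hashed;	/* models the NFSD_FILE_HASHED bit */
};

/* Allocate an entry holding one reference for the caller and one for the table. */
static struct cached_file *cache_insert(void)
{
	struct cached_file *cf = calloc(1, sizeof(*cf));

	if (!cf)
		return NULL;
	cf->ref = 2;
	cf->hashed = true;
	return cf;
}

/* Drop one reference; returns true if it was the last and the entry was freed. */
static bool cache_put(struct cached_file *cf)
{
	assert(cf->ref > 0);
	if (--cf->ref > 0)
		return false;
	/* Mirrors the WARN_ON in nfsd_file_put(): the last put must not be hashed. */
	assert(!cf->hashed);
	free(cf);
	return true;
}

/* Remove the entry from the table and drop the table's reference. */
static bool cache_unhash_and_release(struct cached_file *cf)
{
	if (!cf->hashed)
		return false;
	cf->hashed = false;
	return cache_put(cf);
}
```

In this model a WRITE that finishes while the entry is still hashed just
takes the count from 2 back to 1; the free is deferred until something
(an unlink, a lease attempt, the shrinker) unhashes the entry.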


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 10/16] nfsd: hook up nfsd_read to the nfsd_file cache
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-09-11 10:54   ` [PATCH v4 09/16] nfsd: hook up nfsd_write to the new nfsd_file cache Jeff Layton
@ 2015-09-11 10:54   ` Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 11/16] nfsd: hook nfsd_commit up " Jeff Layton
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

Signed-off-by: Jeff Layton <jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
---
 fs/nfsd/vfs.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0a9bffe5ba97..af942ff4546f 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -987,20 +987,15 @@ out_nfserr:
 __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
 {
-	struct file *file;
-	struct raparms	*ra;
-	__be32 err;
-
-	err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
-	if (err)
-		return err;
-
-	ra = nfsd_init_raparms(file);
-	err = nfsd_vfs_read(rqstp, file, offset, vec, vlen, count);
-	if (ra)
-		nfsd_put_raparams(file, ra);
-	fput(file);
+	__be32			err;
+	struct nfsd_file	*nf;
 
+	err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf);
+	if (err == nfs_ok) {
+		err = nfsd_vfs_read(rqstp, nf->nf_file, offset, vec, vlen,
+					count);
+		nfsd_file_put(nf);
+	}
 	return err;
 }
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 11/16] nfsd: hook nfsd_commit up to the nfsd_file cache
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-09-11 10:54   ` [PATCH v4 10/16] nfsd: hook up nfsd_read to the " Jeff Layton
@ 2015-09-11 10:54   ` Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 13/16] nfsd: have nfsd_test_lock use " Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 15/16] nfsd: hook up nfs4_preprocess_stateid_op to " Jeff Layton
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

Use cached filps if possible instead of opening a new one every time.

Signed-off-by: Jeff Layton <jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
---
 fs/nfsd/vfs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index af942ff4546f..ca9dc84ca4b0 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1034,9 +1034,9 @@ __be32
 nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
                loff_t offset, unsigned long count)
 {
-	struct file	*file;
-	loff_t		end = LLONG_MAX;
-	__be32		err = nfserr_inval;
+	struct nfsd_file	*nf;
+	loff_t			end = LLONG_MAX;
+	__be32			err = nfserr_inval;
 
 	if (offset < 0)
 		goto out;
@@ -1046,12 +1046,12 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			goto out;
 	}
 
-	err = nfsd_open(rqstp, fhp, S_IFREG,
-			NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &file);
+	err = nfsd_file_acquire(rqstp, fhp,
+			NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf);
 	if (err)
 		goto out;
 	if (EX_ISSYNC(fhp->fh_export)) {
-		int err2 = vfs_fsync_range(file, offset, end, 0);
+		int err2 = vfs_fsync_range(nf->nf_file, offset, end, 0);
 
 		if (err2 != -EINVAL)
 			err = nfserrno(err2);
@@ -1059,7 +1059,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			err = nfserr_notsupp;
 	}
 
-	fput(file);
+	nfsd_file_put(nf);
 out:
 	return err;
 }
-- 
2.4.3
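
Since nfsd_commit() acquires with NFSD_MAY_NOT_BREAK_LEASE, it can create
a cache entry without any lease break having been done. The cache in
patch 08 records that in the BREAK_READ/BREAK_WRITE flags, and a later
acquirer that does want leases broken performs the deferred break. A
rough sketch of that decision is below; the WANT_* and PENDING_* names
and values are invented for illustration and do not match the kernel's
constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative request flags (stand-ins for NFSD_MAY_*). */
#define WANT_READ		0x1u
#define WANT_WRITE		0x2u
#define WANT_NO_BREAK		0x4u

/* Illustrative per-entry flags (stand-ins for NFSD_FILE_BREAK_*). */
#define PENDING_BREAK_READ	0x1u
#define PENDING_BREAK_WRITE	0x2u

/* Which lease breaks were skipped when the entry was first opened. */
static unsigned int pending_breaks_at_open(unsigned int want)
{
	unsigned int pending = 0;

	if (want & WANT_NO_BREAK) {
		if (want & WANT_WRITE)
			pending |= PENDING_BREAK_WRITE;
		if (want & WANT_READ)
			pending |= PENDING_BREAK_READ;
	}
	return pending;
}

/* Does a later acquirer have to break the lease before using the entry? */
static bool must_break_lease(unsigned int pending, unsigned int want)
{
	bool write = (want & WANT_WRITE) != 0;

	if (want & WANT_NO_BREAK)
		return false;
	return (pending & PENDING_BREAK_READ) ||
	       ((pending & PENDING_BREAK_WRITE) && write);
}
```

So a COMMIT followed by a WRITE on the same inode would, under these
assumptions, trigger the lease break on the WRITE rather than never.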


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 12/16] nfsd: convert nfs4_file->fi_fds array to use nfsd_files
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
                   ` (6 preceding siblings ...)
  2015-09-11 10:54 ` [PATCH v4 07/16] sunrpc: add a new cache_detail operation for when a cache is flushed Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 14/16] nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 16/16] nfsd: rip out the raparms cache Jeff Layton
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 23 ++++++++++++-----------
 fs/nfsd/state.h     |  2 +-
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0f1d5691b795..891c9153a5c6 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -49,6 +49,7 @@
 
 #include "netns.h"
 #include "pnfs.h"
+#include "filecache.h"
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
 
@@ -292,7 +293,7 @@ static struct file *
 __nfs4_get_fd(struct nfs4_file *f, int oflag)
 {
 	if (f->fi_fds[oflag])
-		return get_file(f->fi_fds[oflag]);
+		return get_file(f->fi_fds[oflag]->nf_file);
 	return NULL;
 }
 
@@ -449,17 +450,17 @@ static void __nfs4_file_put_access(struct nfs4_file *fp, int oflag)
 	might_lock(&fp->fi_lock);
 
 	if (atomic_dec_and_lock(&fp->fi_access[oflag], &fp->fi_lock)) {
-		struct file *f1 = NULL;
-		struct file *f2 = NULL;
+		struct nfsd_file *f1 = NULL;
+		struct nfsd_file *f2 = NULL;
 
 		swap(f1, fp->fi_fds[oflag]);
 		if (atomic_read(&fp->fi_access[1 - oflag]) == 0)
 			swap(f2, fp->fi_fds[O_RDWR]);
 		spin_unlock(&fp->fi_lock);
 		if (f1)
-			fput(f1);
+			nfsd_file_put(f1);
 		if (f2)
-			fput(f2);
+			nfsd_file_put(f2);
 	}
 }
 
@@ -3827,7 +3828,7 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 		struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp,
 		struct nfsd4_open *open)
 {
-	struct file *filp = NULL;
+	struct nfsd_file *nf = NULL;
 	__be32 status;
 	int oflag = nfs4_access_to_omode(open->op_share_access);
 	int access = nfs4_access_to_access(open->op_share_access);
@@ -3863,18 +3864,18 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
 
 	if (!fp->fi_fds[oflag]) {
 		spin_unlock(&fp->fi_lock);
-		status = nfsd_open(rqstp, cur_fh, S_IFREG, access, &filp);
+		status = nfsd_file_acquire(rqstp, cur_fh, access, &nf);
 		if (status)
 			goto out_put_access;
 		spin_lock(&fp->fi_lock);
 		if (!fp->fi_fds[oflag]) {
-			fp->fi_fds[oflag] = filp;
-			filp = NULL;
+			fp->fi_fds[oflag] = nf;
+			nf = NULL;
 		}
 	}
 	spin_unlock(&fp->fi_lock);
-	if (filp)
-		fput(filp);
+	if (nf)
+		nfsd_file_put(nf);
 
 	status = nfsd4_truncate(rqstp, cur_fh, open);
 	if (status)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 583ffc13cae2..70b3e51ba089 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -499,7 +499,7 @@ struct nfs4_file {
 	};
 	struct list_head	fi_clnt_odstate;
 	/* One each for O_RDONLY, O_WRONLY, O_RDWR: */
-	struct file *		fi_fds[3];
+	struct nfsd_file	*fi_fds[3];
 	/*
 	 * Each open or lock stateid contributes 0-4 to the counts
 	 * below depending on which bits are set in st_access_bitmap:
-- 
2.4.3



* [PATCH v4 13/16] nfsd: have nfsd_test_lock use the nfsd_file cache
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-09-11 10:54   ` [PATCH v4 11/16] nfsd: hook nfsd_commit up " Jeff Layton
@ 2015-09-11 10:54   ` Jeff Layton
  2015-09-11 10:54   ` [PATCH v4 15/16] nfsd: hook up nfs4_preprocess_stateid_op to " Jeff Layton
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields
  Cc: linux-nfs, linux-fsdevel

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/nfsd/nfs4state.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 891c9153a5c6..980383922f6a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5568,11 +5568,11 @@ out:
  */
 static __be32 nfsd_test_lock(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file_lock *lock)
 {
-	struct file *file;
-	__be32 err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
+	struct nfsd_file *nf;
+	__be32 err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf);
 	if (!err) {
-		err = nfserrno(vfs_test_lock(file, lock));
-		fput(file);
+		err = nfserrno(vfs_test_lock(nf->nf_file, lock));
+		nfsd_file_put(nf);
 	}
 	return err;
 }
-- 
2.4.3


* [PATCH v4 14/16] nfsd: convert fi_deleg_file and ls_file fields to nfsd_file
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
                   ` (7 preceding siblings ...)
  2015-09-11 10:54 ` [PATCH v4 12/16] nfsd: convert nfs4_file->fi_fds array to use nfsd_files Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
  2015-09-11 10:54 ` [PATCH v4 16/16] nfsd: rip out the raparms cache Jeff Layton
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

Have them keep an nfsd_file reference instead of a struct file.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/nfsd/nfs4layouts.c |  12 ++---
 fs/nfsd/nfs4state.c   | 131 ++++++++++++++++++++++++++------------------------
 fs/nfsd/state.h       |   6 +--
 3 files changed, 76 insertions(+), 73 deletions(-)
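
A minimal userspace sketch of the ownership pattern this conversion relies
on: holders keep a reference to a refcounted wrapper and reach the
underlying file through its nf_file pointer. Everything prefixed toy_ is
hypothetical and only models the idea; it is not the kernel's struct file
or the nfsd_file implementation from this series. In particular, this toy
closes the file on the last put, whereas the real cache may keep the file
open for longer.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct file (hypothetical, not the kernel structure). */
struct toy_file {
	int open;	/* 1 while the file is open, 0 once it is closed */
};

/* Toy stand-in for struct nfsd_file: a refcounted wrapper around a file. */
struct toy_nfsd_file {
	int nf_ref;			/* number of holders */
	struct toy_file *nf_file;	/* the wrapped open file */
};

/* "Open" the file and hand back a wrapper with a single reference. */
struct toy_nfsd_file *toy_nfsd_file_alloc(struct toy_file *f)
{
	struct toy_nfsd_file *nf = malloc(sizeof(*nf));

	if (!nf)
		return NULL;
	f->open = 1;
	nf->nf_ref = 1;
	nf->nf_file = f;
	return nf;
}

/* Take an extra reference, e.g. when a second stateid shares the file. */
struct toy_nfsd_file *toy_nfsd_file_get(struct toy_nfsd_file *nf)
{
	nf->nf_ref++;
	return nf;
}

/* Drop a reference; in this toy the last put closes the file. */
void toy_nfsd_file_put(struct toy_nfsd_file *nf)
{
	if (--nf->nf_ref == 0) {
		nf->nf_file->open = 0;
		free(nf);
	}
}
```

Two holders (say, a delegation and a layout stateid) can share one wrapper:
the file stays open after the first put and is only closed by the second.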

diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index ebf90e487c75..9b49876a9cc1 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -144,8 +144,8 @@ nfsd4_free_layout_stateid(struct nfs4_stid *stid)
 	list_del_init(&ls->ls_perfile);
 	spin_unlock(&fp->fi_lock);
 
-	vfs_setlease(ls->ls_file, F_UNLCK, NULL, (void **)&ls);
-	fput(ls->ls_file);
+	vfs_setlease(ls->ls_file->nf_file, F_UNLCK, NULL, (void **)&ls);
+	nfsd_file_put(ls->ls_file);
 
 	if (ls->ls_recalled)
 		atomic_dec(&ls->ls_stid.sc_file->fi_lo_recalls);
@@ -169,7 +169,7 @@ nfsd4_layout_setlease(struct nfs4_layout_stateid *ls)
 	fl->fl_end = OFFSET_MAX;
 	fl->fl_owner = ls;
 	fl->fl_pid = current->tgid;
-	fl->fl_file = ls->ls_file;
+	fl->fl_file = ls->ls_file->nf_file;
 
 	status = vfs_setlease(fl->fl_file, fl->fl_type, &fl, NULL);
 	if (status) {
@@ -206,13 +206,13 @@ nfsd4_alloc_layout_stateid(struct nfsd4_compound_state *cstate,
 			NFSPROC4_CLNT_CB_LAYOUT);
 
 	if (parent->sc_type == NFS4_DELEG_STID)
-		ls->ls_file = get_file(fp->fi_deleg_file);
+		ls->ls_file = nfsd_file_get(fp->fi_deleg_file);
 	else
 		ls->ls_file = find_any_file(fp);
 	BUG_ON(!ls->ls_file);
 
 	if (nfsd4_layout_setlease(ls)) {
-		fput(ls->ls_file);
+		nfsd_file_put(ls->ls_file);
 		put_nfs4_file(fp);
 		kmem_cache_free(nfs4_layout_stateid_cache, ls);
 		return NULL;
@@ -598,7 +598,7 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls)
 
 	argv[0] = "/sbin/nfsd-recall-failed";
 	argv[1] = addr_str;
-	argv[2] = ls->ls_file->f_path.mnt->mnt_sb->s_id;
+	argv[2] = ls->ls_file->nf_file->f_path.mnt->mnt_sb->s_id;
 	argv[3] = NULL;
 
 	error = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 980383922f6a..3616c735f256 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -289,18 +289,18 @@ put_nfs4_file(struct nfs4_file *fi)
 	}
 }
 
-static struct file *
+static struct nfsd_file *
 __nfs4_get_fd(struct nfs4_file *f, int oflag)
 {
 	if (f->fi_fds[oflag])
-		return get_file(f->fi_fds[oflag]->nf_file);
+		return nfsd_file_get(f->fi_fds[oflag]);
 	return NULL;
 }
 
-static struct file *
+static struct nfsd_file *
 find_writeable_file_locked(struct nfs4_file *f)
 {
-	struct file *ret;
+	struct nfsd_file *ret;
 
 	lockdep_assert_held(&f->fi_lock);
 
@@ -310,10 +310,10 @@ find_writeable_file_locked(struct nfs4_file *f)
 	return ret;
 }
 
-static struct file *
+static struct nfsd_file *
 find_writeable_file(struct nfs4_file *f)
 {
-	struct file *ret;
+	struct nfsd_file *ret;
 
 	spin_lock(&f->fi_lock);
 	ret = find_writeable_file_locked(f);
@@ -322,9 +322,10 @@ find_writeable_file(struct nfs4_file *f)
 	return ret;
 }
 
-static struct file *find_readable_file_locked(struct nfs4_file *f)
+static struct nfsd_file *
+find_readable_file_locked(struct nfs4_file *f)
 {
-	struct file *ret;
+	struct nfsd_file *ret;
 
 	lockdep_assert_held(&f->fi_lock);
 
@@ -334,10 +335,10 @@ static struct file *find_readable_file_locked(struct nfs4_file *f)
 	return ret;
 }
 
-static struct file *
+static struct nfsd_file *
 find_readable_file(struct nfs4_file *f)
 {
-	struct file *ret;
+	struct nfsd_file *ret;
 
 	spin_lock(&f->fi_lock);
 	ret = find_readable_file_locked(f);
@@ -346,10 +347,10 @@ find_readable_file(struct nfs4_file *f)
 	return ret;
 }
 
-struct file *
+struct nfsd_file *
 find_any_file(struct nfs4_file *f)
 {
-	struct file *ret;
+	struct nfsd_file *ret;
 
 	spin_lock(&f->fi_lock);
 	ret = __nfs4_get_fd(f, O_RDWR);
@@ -748,16 +749,16 @@ nfs4_put_stid(struct nfs4_stid *s)
 
 static void nfs4_put_deleg_lease(struct nfs4_file *fp)
 {
-	struct file *filp = NULL;
+	struct nfsd_file *nf = NULL;
 
 	spin_lock(&fp->fi_lock);
 	if (fp->fi_deleg_file && --fp->fi_delegees == 0)
-		swap(filp, fp->fi_deleg_file);
+		swap(nf, fp->fi_deleg_file);
 	spin_unlock(&fp->fi_lock);
 
-	if (filp) {
-		vfs_setlease(filp, F_UNLCK, NULL, (void **)&fp);
-		fput(filp);
+	if (nf) {
+		vfs_setlease(nf->nf_file, F_UNLCK, NULL, (void **)&fp);
+		nfsd_file_put(nf);
 	}
 }
 
@@ -1049,11 +1050,14 @@ static void nfs4_free_lock_stateid(struct nfs4_stid *stid)
 {
 	struct nfs4_ol_stateid *stp = openlockstateid(stid);
 	struct nfs4_lockowner *lo = lockowner(stp->st_stateowner);
-	struct file *file;
+	struct nfsd_file *nf;
 
-	file = find_any_file(stp->st_stid.sc_file);
-	if (file)
-		filp_close(file, (fl_owner_t)lo);
+	nf = find_any_file(stp->st_stid.sc_file);
+	if (nf) {
+		get_file(nf->nf_file);
+		filp_close(nf->nf_file, (fl_owner_t)lo);
+		nfsd_file_put(nf);
+	}
 	nfs4_free_ol_stateid(stid);
 }
 
@@ -3950,21 +3954,21 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 {
 	struct nfs4_file *fp = dp->dl_stid.sc_file;
 	struct file_lock *fl;
-	struct file *filp;
+	struct nfsd_file *nf;
 	int status = 0;
 
 	fl = nfs4_alloc_init_lease(fp, NFS4_OPEN_DELEGATE_READ);
 	if (!fl)
 		return -ENOMEM;
-	filp = find_readable_file(fp);
-	if (!filp) {
+	nf = find_readable_file(fp);
+	if (!nf) {
 		/* We should always have a readable file here */
 		WARN_ON_ONCE(1);
 		locks_free_lock(fl);
 		return -EBADF;
 	}
-	fl->fl_file = filp;
-	status = vfs_setlease(filp, fl->fl_type, &fl, NULL);
+	fl->fl_file = nf->nf_file;
+	status = vfs_setlease(nf->nf_file, fl->fl_type, &fl, NULL);
 	if (fl)
 		locks_free_lock(fl);
 	if (status)
@@ -3982,7 +3986,7 @@ static int nfs4_setlease(struct nfs4_delegation *dp)
 		hash_delegation_locked(dp, fp);
 		goto out_unlock;
 	}
-	fp->fi_deleg_file = filp;
+	fp->fi_deleg_file = nf;
 	fp->fi_delegees = 1;
 	hash_delegation_locked(dp, fp);
 	spin_unlock(&fp->fi_lock);
@@ -3992,7 +3996,7 @@ out_unlock:
 	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
 out_fput:
-	fput(filp);
+	nfsd_file_put(nf);
 	return status;
 }
 
@@ -4598,7 +4602,7 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 	return nfs_ok;
 }
 
-static struct file *
+static struct nfsd_file *
 nfs4_find_file(struct nfs4_stid *s, int flags)
 {
 	if (!s)
@@ -4608,7 +4612,7 @@ nfs4_find_file(struct nfs4_stid *s, int flags)
 	case NFS4_DELEG_STID:
 		if (WARN_ON_ONCE(!s->sc_file->fi_deleg_file))
 			return NULL;
-		return get_file(s->sc_file->fi_deleg_file);
+		return nfsd_file_get(s->sc_file->fi_deleg_file);
 	case NFS4_OPEN_STID:
 	case NFS4_LOCK_STID:
 		if (flags & RD_STATE)
@@ -4637,21 +4641,17 @@ nfs4_check_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfs4_stid *s,
 		struct file **filpp, bool *tmp_file, int flags)
 {
 	int acc = (flags & RD_STATE) ? NFSD_MAY_READ : NFSD_MAY_WRITE;
-	struct file *file;
+	struct nfsd_file *nf;
 	__be32 status;
 
-	file = nfs4_find_file(s, flags);
-	if (file) {
+	nf = nfs4_find_file(s, flags);
+	if (nf) {
 		status = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
 				acc | NFSD_MAY_OWNER_OVERRIDE);
-		if (status) {
-			fput(file);
-			return status;
-		}
-
-		*filpp = file;
+		if (status)
+			goto out;
 	} else {
-		status = nfsd_open(rqstp, fhp, S_IFREG, acc, filpp);
+		status = nfsd_file_acquire(rqstp, fhp, acc, &nf);
 		if (status)
 			return status;
 
@@ -4659,7 +4659,10 @@ nfs4_check_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfs4_stid *s,
 			*tmp_file = true;
 	}
 
-	return 0;
+	*filpp = get_file(nf->nf_file);
+out:
+	nfsd_file_put(nf);
+	return status;
 }
 
 /*
@@ -5388,7 +5391,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfs4_ol_stateid *lock_stp = NULL;
 	struct nfs4_ol_stateid *open_stp = NULL;
 	struct nfs4_file *fp;
-	struct file *filp = NULL;
+	struct nfsd_file *nf = NULL;
 	struct file_lock *file_lock = NULL;
 	struct file_lock *conflock = NULL;
 	__be32 status = 0;
@@ -5470,8 +5473,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		case NFS4_READ_LT:
 		case NFS4_READW_LT:
 			spin_lock(&fp->fi_lock);
-			filp = find_readable_file_locked(fp);
-			if (filp)
+			nf = find_readable_file_locked(fp);
+			if (nf)
 				get_lock_access(lock_stp, NFS4_SHARE_ACCESS_READ);
 			spin_unlock(&fp->fi_lock);
 			file_lock->fl_type = F_RDLCK;
@@ -5479,8 +5482,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		case NFS4_WRITE_LT:
 		case NFS4_WRITEW_LT:
 			spin_lock(&fp->fi_lock);
-			filp = find_writeable_file_locked(fp);
-			if (filp)
+			nf = find_writeable_file_locked(fp);
+			if (nf)
 				get_lock_access(lock_stp, NFS4_SHARE_ACCESS_WRITE);
 			spin_unlock(&fp->fi_lock);
 			file_lock->fl_type = F_WRLCK;
@@ -5489,14 +5492,14 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 			status = nfserr_inval;
 		goto out;
 	}
-	if (!filp) {
+	if (!nf) {
 		status = nfserr_openmode;
 		goto out;
 	}
 
 	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(&lock_sop->lo_owner));
 	file_lock->fl_pid = current->tgid;
-	file_lock->fl_file = filp;
+	file_lock->fl_file = nf->nf_file;
 	file_lock->fl_flags = FL_POSIX;
 	file_lock->fl_lmops = &nfsd_posix_mng_ops;
 	file_lock->fl_start = lock->lk_offset;
@@ -5510,7 +5513,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		goto out;
 	}
 
-	err = vfs_lock_file(filp, F_SETLK, file_lock, conflock);
+	err = vfs_lock_file(nf->nf_file, F_SETLK, file_lock, conflock);
 	switch (-err) {
 	case 0: /* success! */
 		update_stateid(&lock_stp->st_stid.sc_stateid);
@@ -5532,8 +5535,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		break;
 	}
 out:
-	if (filp)
-		fput(filp);
+	if (nf)
+		nfsd_file_put(nf);
 	if (lock_stp) {
 		/* Bump seqid manually if the 4.0 replay owner is openowner */
 		if (cstate->replay_owner &&
@@ -5658,7 +5661,7 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	    struct nfsd4_locku *locku)
 {
 	struct nfs4_ol_stateid *stp;
-	struct file *filp = NULL;
+	struct nfsd_file *nf;
 	struct file_lock *file_lock = NULL;
 	__be32 status;
 	int err;
@@ -5676,8 +5679,8 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 					&stp, nn);
 	if (status)
 		goto out;
-	filp = find_any_file(stp->st_stid.sc_file);
-	if (!filp) {
+	nf = find_any_file(stp->st_stid.sc_file);
+	if (!nf) {
 		status = nfserr_lock_range;
 		goto put_stateid;
 	}
@@ -5685,13 +5688,13 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if (!file_lock) {
 		dprintk("NFSD: %s: unable to allocate lock!\n", __func__);
 		status = nfserr_jukebox;
-		goto fput;
+		goto put_file;
 	}
 
 	file_lock->fl_type = F_UNLCK;
 	file_lock->fl_owner = (fl_owner_t)lockowner(nfs4_get_stateowner(stp->st_stateowner));
 	file_lock->fl_pid = current->tgid;
-	file_lock->fl_file = filp;
+	file_lock->fl_file = nf->nf_file;
 	file_lock->fl_flags = FL_POSIX;
 	file_lock->fl_lmops = &nfsd_posix_mng_ops;
 	file_lock->fl_start = locku->lu_offset;
@@ -5700,15 +5703,15 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 						locku->lu_length);
 	nfs4_transform_lock_offset(file_lock);
 
-	err = vfs_lock_file(filp, F_SETLK, file_lock, NULL);
+	err = vfs_lock_file(nf->nf_file, F_SETLK, file_lock, NULL);
 	if (err) {
 		dprintk("NFSD: nfs4_locku: vfs_lock_file failed!\n");
 		goto out_nfserr;
 	}
 	update_stateid(&stp->st_stid.sc_stateid);
 	memcpy(&locku->lu_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
-fput:
-	fput(filp);
+put_file:
+	nfsd_file_put(nf);
 put_stateid:
 	nfs4_put_stid(&stp->st_stid);
 out:
@@ -5719,7 +5722,7 @@ out:
 
 out_nfserr:
 	status = nfserrno(err);
-	goto fput;
+	goto put_file;
 }
 
 /*
@@ -5732,17 +5735,17 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 {
 	struct file_lock *fl;
 	int status = false;
-	struct file *filp = find_any_file(fp);
+	struct nfsd_file *nf = find_any_file(fp);
 	struct inode *inode;
 	struct file_lock_context *flctx;
 
-	if (!filp) {
+	if (!nf) {
 		/* Any valid lock stateid should have some sort of access */
 		WARN_ON_ONCE(1);
 		return status;
 	}
 
-	inode = file_inode(filp);
+	inode = file_inode(nf->nf_file);
 	flctx = inode->i_flctx;
 
 	if (flctx && !list_empty_careful(&flctx->flc_posix)) {
@@ -5755,7 +5758,7 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
 		}
 		spin_unlock(&flctx->flc_lock);
 	}
-	fput(filp);
+	nfsd_file_put(nf);
 	return status;
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 70b3e51ba089..8a317de773b9 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -509,7 +509,7 @@ struct nfs4_file {
 	 */
 	atomic_t		fi_access[2];
 	u32			fi_share_deny;
-	struct file		*fi_deleg_file;
+	struct nfsd_file	*fi_deleg_file;
 	int			fi_delegees;
 	struct knfsd_fh		fi_fhandle;
 	bool			fi_had_conflict;
@@ -557,7 +557,7 @@ struct nfs4_layout_stateid {
 	spinlock_t			ls_lock;
 	struct list_head		ls_layouts;
 	u32				ls_layout_type;
-	struct file			*ls_file;
+	struct nfsd_file		*ls_file;
 	struct nfsd4_callback		ls_recall;
 	stateid_t			ls_recall_sid;
 	bool				ls_recalled;
@@ -620,7 +620,7 @@ static inline void get_nfs4_file(struct nfs4_file *fi)
 {
 	atomic_inc(&fi->fi_ref);
 }
-struct file *find_any_file(struct nfs4_file *f);
+struct nfsd_file *find_any_file(struct nfs4_file *f);
 
 /* grace period management */
 void nfsd4_end_grace(struct nfsd_net *nn);
-- 
2.4.3



* [PATCH v4 15/16] nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
       [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-09-11 10:54   ` [PATCH v4 13/16] nfsd: have nfsd_test_lock use " Jeff Layton
@ 2015-09-11 10:54   ` Jeff Layton
  6 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields
  Cc: linux-nfs, linux-fsdevel

Have nfs4_preprocess_stateid_op pass back an nfsd_file instead of a filp.
Since we now presume that the struct file will be persistent in most
cases, we can stop fiddling with the raparms in the read code. That in
turn makes the rd_tmp_file field and the tmp_file argument unnecessary,
so remove them.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/nfsd/nfs4proc.c  | 32 ++++++++++++++++----------------
 fs/nfsd/nfs4state.c | 24 ++++++++++--------------
 fs/nfsd/nfs4xdr.c   | 16 +++++-----------
 fs/nfsd/state.h     |  2 +-
 fs/nfsd/xdr4.h      | 15 +++++++--------
 5 files changed, 39 insertions(+), 50 deletions(-)
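
A rough userspace sketch of why a persistent open file makes a separate
readahead-parameter cache redundant: if the same file object is handed
back on every request, per-file state such as a readahead hint survives
between requests on its own. All toy_ names, the fixed-size table and the
ra_next field are assumptions made for illustration; this is not the
nfsd_file cache code and it does no locking or eviction.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct file; ra_next models per-file readahead state. */
struct toy_file {
	long ra_next;	/* offset at which the next sequential read starts */
};

/* Toy cache entry, loosely modelled on the idea of an nfsd_file. */
struct toy_nfsd_file {
	int nf_ref;			/* 0 means the slot is unused */
	unsigned long nf_ino;		/* lookup key: toy inode number */
	struct toy_file file;		/* storage for the "open" file */
	struct toy_file *nf_file;	/* what callers dereference */
};

#define TOY_CACHE_SLOTS 4
struct toy_nfsd_file toy_cache[TOY_CACHE_SLOTS];
int toy_opens;	/* counts how often a file really had to be opened */

/* Look the inode up in the cache; open it only on a miss. */
int toy_nfsd_file_acquire(unsigned long ino, struct toy_nfsd_file **nfp)
{
	struct toy_nfsd_file *free_slot = NULL;
	size_t i;

	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		struct toy_nfsd_file *nf = &toy_cache[i];

		if (nf->nf_ref > 0 && nf->nf_ino == ino) {
			nf->nf_ref++;		/* hit: reuse the open file */
			*nfp = nf;
			return 0;
		}
		if (nf->nf_ref == 0 && !free_slot)
			free_slot = nf;
	}
	if (!free_slot)
		return -1;			/* toy cache is full */

	toy_opens++;				/* miss: "open" the file */
	free_slot->nf_ino = ino;
	free_slot->file.ra_next = 0;
	free_slot->nf_file = &free_slot->file;
	free_slot->nf_ref = 2;			/* one for the cache, one for the caller */
	*nfp = free_slot;
	return 0;
}

/* Drop the caller's reference; the cache keeps its own. */
void toy_nfsd_file_put(struct toy_nfsd_file *nf)
{
	nf->nf_ref--;
}

/* Returns 1 if the read continues where the previous one ended, 0 if not, -1 on error. */
long toy_read(unsigned long ino, long offset, long len)
{
	struct toy_nfsd_file *nf = NULL;
	long sequential;

	if (toy_nfsd_file_acquire(ino, &nf))
		return -1;
	sequential = (nf->nf_file->ra_next == offset);
	nf->nf_file->ra_next = offset + len;
	toy_nfsd_file_put(nf);
	return sequential;
}
```

Back-to-back sequential reads of the same inode are recognised as
sequential without any side table, and the file is opened only once.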

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 4ce6b97b31ad..0f8cd2458a58 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -758,7 +758,7 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 {
 	__be32 status;
 
-	read->rd_filp = NULL;
+	read->rd_nf = NULL;
 	if (read->rd_offset >= OFFSET_MAX)
 		return nfserr_inval;
 
@@ -775,7 +775,7 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	/* check stateid */
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &read->rd_stateid,
-			RD_STATE, &read->rd_filp, &read->rd_tmp_file);
+			RD_STATE, &read->rd_nf);
 	if (status) {
 		dprintk("NFSD: nfsd4_read: couldn't process stateid!\n");
 		goto out;
@@ -921,7 +921,7 @@ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	if (setattr->sa_iattr.ia_valid & ATTR_SIZE) {
 		status = nfs4_preprocess_stateid_op(rqstp, cstate,
-			&setattr->sa_stateid, WR_STATE, NULL, NULL);
+			&setattr->sa_stateid, WR_STATE, NULL);
 		if (status) {
 			dprintk("NFSD: nfsd4_setattr: couldn't process stateid!\n");
 			return status;
@@ -977,7 +977,7 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	    struct nfsd4_write *write)
 {
 	stateid_t *stateid = &write->wr_stateid;
-	struct file *filp = NULL;
+	struct nfsd_file *nf = NULL;
 	__be32 status = nfs_ok;
 	unsigned long cnt;
 	int nvecs;
@@ -986,7 +986,7 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		return nfserr_inval;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, stateid, WR_STATE,
-			&filp, NULL);
+			&nf);
 	if (status) {
 		dprintk("NFSD: nfsd4_write: couldn't process stateid!\n");
 		return status;
@@ -999,10 +999,10 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	nvecs = fill_in_write_vector(rqstp->rq_vec, write);
 	WARN_ON_ONCE(nvecs > ARRAY_SIZE(rqstp->rq_vec));
 
-	status = nfsd_vfs_write(rqstp, &cstate->current_fh, filp,
+	status = nfsd_vfs_write(rqstp, &cstate->current_fh, nf->nf_file,
 				write->wr_offset, rqstp->rq_vec, nvecs, &cnt,
 				&write->wr_how_written);
-	fput(filp);
+	nfsd_file_put(nf);
 
 	write->wr_bytes_written = cnt;
 
@@ -1014,21 +1014,21 @@ nfsd4_fallocate(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		struct nfsd4_fallocate *fallocate, int flags)
 {
 	__be32 status = nfserr_notsupp;
-	struct file *file;
+	struct nfsd_file *nf;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate,
 					    &fallocate->falloc_stateid,
-					    WR_STATE, &file, NULL);
+					    WR_STATE, &nf);
 	if (status != nfs_ok) {
 		dprintk("NFSD: nfsd4_fallocate: couldn't process stateid!\n");
 		return status;
 	}
 
-	status = nfsd4_vfs_fallocate(rqstp, &cstate->current_fh, file,
+	status = nfsd4_vfs_fallocate(rqstp, &cstate->current_fh, nf->nf_file,
 				     fallocate->falloc_offset,
 				     fallocate->falloc_length,
 				     flags);
-	fput(file);
+	nfsd_file_put(nf);
 	return status;
 }
 
@@ -1053,11 +1053,11 @@ nfsd4_seek(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 {
 	int whence;
 	__be32 status;
-	struct file *file;
+	struct nfsd_file *nf;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate,
 					    &seek->seek_stateid,
-					    RD_STATE, &file, NULL);
+					    RD_STATE, &nf);
 	if (status) {
 		dprintk("NFSD: nfsd4_seek: couldn't process stateid!\n");
 		return status;
@@ -1079,14 +1079,14 @@ nfsd4_seek(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 * Note:  This call does change file->f_pos, but nothing in NFSD
 	 *        should ever file->f_pos.
 	 */
-	seek->seek_pos = vfs_llseek(file, seek->seek_offset, whence);
+	seek->seek_pos = vfs_llseek(nf->nf_file, seek->seek_offset, whence);
 	if (seek->seek_pos < 0)
 		status = nfserrno(seek->seek_pos);
-	else if (seek->seek_pos >= i_size_read(file_inode(file)))
+	else if (seek->seek_pos >= i_size_read(file_inode(nf->nf_file)))
 		seek->seek_eof = true;
 
 out:
-	fput(file);
+	nfsd_file_put(nf);
 	return status;
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 3616c735f256..eb2f7f2d73e5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4638,7 +4638,7 @@ nfs4_check_olstateid(struct svc_fh *fhp, struct nfs4_ol_stateid *ols, int flags)
 
 static __be32
 nfs4_check_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfs4_stid *s,
-		struct file **filpp, bool *tmp_file, int flags)
+		struct nfsd_file **nfp, int flags)
 {
 	int acc = (flags & RD_STATE) ? NFSD_MAY_READ : NFSD_MAY_WRITE;
 	struct nfsd_file *nf;
@@ -4648,20 +4648,18 @@ nfs4_check_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfs4_stid *s,
 	if (nf) {
 		status = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
 				acc | NFSD_MAY_OWNER_OVERRIDE);
-		if (status)
+		if (status) {
+			nfsd_file_put(nf);
 			goto out;
+		}
 	} else {
 		status = nfsd_file_acquire(rqstp, fhp, acc, &nf);
 		if (status)
 			return status;
-
-		if (tmp_file)
-			*tmp_file = true;
 	}
 
-	*filpp = get_file(nf->nf_file);
+	*nfp = nf;
 out:
-	nfsd_file_put(nf);
 	return status;
 }
 
@@ -4671,7 +4669,7 @@ out:
 __be32
 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, stateid_t *stateid,
-		int flags, struct file **filpp, bool *tmp_file)
+		int flags, struct nfsd_file **nfp)
 {
 	struct svc_fh *fhp = &cstate->current_fh;
 	struct inode *ino = d_inode(fhp->fh_dentry);
@@ -4680,10 +4678,8 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 	struct nfs4_stid *s = NULL;
 	__be32 status;
 
-	if (filpp)
-		*filpp = NULL;
-	if (tmp_file)
-		*tmp_file = false;
+	if (nfp)
+		*nfp = NULL;
 
 	if (grace_disallows_io(net, ino))
 		return nfserr_grace;
@@ -4720,8 +4716,8 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 	status = nfs4_check_fh(fhp, s);
 
 done:
-	if (!status && filpp)
-		status = nfs4_check_file(rqstp, fhp, s, filpp, tmp_file, flags);
+	if (status == nfs_ok && nfp)
+		status = nfs4_check_file(rqstp, fhp, s, nfp, flags);
 out:
 	if (s)
 		nfs4_put_stid(s);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 325521ce389a..92e5e8f884d0 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -49,6 +49,7 @@
 #include "cache.h"
 #include "netns.h"
 #include "pnfs.h"
+#include "filecache.h"
 
 #ifdef CONFIG_NFSD_V4_SECURITY_LABEL
 #include <linux/security.h>
@@ -3460,14 +3461,14 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
 {
 	unsigned long maxcount;
 	struct xdr_stream *xdr = &resp->xdr;
-	struct file *file = read->rd_filp;
+	struct file *file;
 	int starting_len = xdr->buf->len;
-	struct raparms *ra = NULL;
 	__be32 *p;
 
 	if (nfserr)
 		goto out;
 
+	file = read->rd_nf->nf_file;
 	p = xdr_reserve_space(xdr, 8); /* eof flag and byte count */
 	if (!p) {
 		WARN_ON_ONCE(test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags));
@@ -3487,24 +3488,17 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
 			 (xdr->buf->buflen - xdr->buf->len));
 	maxcount = min_t(unsigned long, maxcount, read->rd_length);
 
-	if (read->rd_tmp_file)
-		ra = nfsd_init_raparms(file);
-
 	if (file->f_op->splice_read &&
 	    test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags))
 		nfserr = nfsd4_encode_splice_read(resp, read, file, maxcount);
 	else
 		nfserr = nfsd4_encode_readv(resp, read, file, maxcount);
 
-	if (ra)
-		nfsd_put_raparams(file, ra);
-
 	if (nfserr)
 		xdr_truncate_encode(xdr, starting_len);
-
 out:
-	if (file)
-		fput(file);
+	if (read->rd_nf)
+		nfsd_file_put(read->rd_nf);
 	return nfserr;
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 8a317de773b9..cf7e27199507 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -585,7 +585,7 @@ struct nfsd_net;
 
 extern __be32 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, stateid_t *stateid,
-		int flags, struct file **filp, bool *tmp_file);
+		int flags, struct nfsd_file **filp);
 __be32 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		     stateid_t *stateid, unsigned char typemask,
 		     struct nfs4_stid **s, struct nfsd_net *nn);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 9f991007a578..ea016fb24675 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -268,15 +268,14 @@ struct nfsd4_open_downgrade {
 
 
 struct nfsd4_read {
-	stateid_t	rd_stateid;         /* request */
-	u64		rd_offset;          /* request */
-	u32		rd_length;          /* request */
-	int		rd_vlen;
-	struct file     *rd_filp;
-	bool		rd_tmp_file;
+	stateid_t		rd_stateid;         /* request */
+	u64			rd_offset;          /* request */
+	u32			rd_length;          /* request */
+	int			rd_vlen;
+	struct nfsd_file	*rd_nf;
 	
-	struct svc_rqst *rd_rqstp;          /* response */
-	struct svc_fh * rd_fhp;             /* response */
+	struct svc_rqst		*rd_rqstp;          /* response */
+	struct svc_fh		*rd_fhp;             /* response */
 };
 
 struct nfsd4_readdir {
-- 
2.4.3


* [PATCH v4 16/16] nfsd: rip out the raparms cache
  2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
                   ` (8 preceding siblings ...)
  2015-09-11 10:54 ` [PATCH v4 14/16] nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Jeff Layton
@ 2015-09-11 10:54 ` Jeff Layton
  9 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 10:54 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, linux-fsdevel

Nothing uses it anymore.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/nfsd/nfssvc.c |  14 +-----
 fs/nfsd/vfs.c    | 147 -------------------------------------------------------
 fs/nfsd/vfs.h    |   5 --
 3 files changed, 1 insertion(+), 165 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index d816bb3faa6e..d1034d119afb 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -216,18 +216,9 @@ static int nfsd_startup_generic(int nrservs)
 	if (nfsd_users++)
 		return 0;
 
-	/*
-	 * Readahead param cache - will no-op if it already exists.
-	 * (Note therefore results will be suboptimal if number of
-	 * threads is modified after nfsd start.)
-	 */
-	ret = nfsd_racache_init(2*nrservs);
-	if (ret)
-		goto dec_users;
-
 	ret = nfsd_file_cache_init();
 	if (ret)
-		goto out_racache;
+		goto dec_users;
 
 	ret = nfs4_state_start();
 	if (ret)
@@ -236,8 +227,6 @@ static int nfsd_startup_generic(int nrservs)
 
 out_file_cache:
 	nfsd_file_cache_shutdown();
-out_racache:
-	nfsd_racache_shutdown();
 dec_users:
 	nfsd_users--;
 	return ret;
@@ -250,7 +239,6 @@ static void nfsd_shutdown_generic(void)
 
 	nfs4_state_shutdown();
 	nfsd_file_cache_shutdown();
-	nfsd_racache_shutdown();
 }
 
 static bool nfsd_needs_lockd(void)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ca9dc84ca4b0..38f57891e411 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -46,34 +46,6 @@
 
 #define NFSDDBG_FACILITY		NFSDDBG_FILEOP
 
-
-/*
- * This is a cache of readahead params that help us choose the proper
- * readahead strategy. Initially, we set all readahead parameters to 0
- * and let the VFS handle things.
- * If you increase the number of cached files very much, you'll need to
- * add a hash table here.
- */
-struct raparms {
-	struct raparms		*p_next;
-	unsigned int		p_count;
-	ino_t			p_ino;
-	dev_t			p_dev;
-	int			p_set;
-	struct file_ra_state	p_ra;
-	unsigned int		p_hindex;
-};
-
-struct raparm_hbucket {
-	struct raparms		*pb_head;
-	spinlock_t		pb_lock;
-} ____cacheline_aligned_in_smp;
-
-#define RAPARM_HASH_BITS	4
-#define RAPARM_HASH_SIZE	(1<<RAPARM_HASH_BITS)
-#define RAPARM_HASH_MASK	(RAPARM_HASH_SIZE-1)
-static struct raparm_hbucket	raparm_hash[RAPARM_HASH_SIZE];
-
 /* 
  * Called from nfsd_lookup and encode_dirent. Check if we have crossed 
  * a mount point.
@@ -728,65 +700,6 @@ out:
 	return err;
 }
 
-struct raparms *
-nfsd_init_raparms(struct file *file)
-{
-	struct inode *inode = file_inode(file);
-	dev_t dev = inode->i_sb->s_dev;
-	ino_t ino = inode->i_ino;
-	struct raparms	*ra, **rap, **frap = NULL;
-	int depth = 0;
-	unsigned int hash;
-	struct raparm_hbucket *rab;
-
-	hash = jhash_2words(dev, ino, 0xfeedbeef) & RAPARM_HASH_MASK;
-	rab = &raparm_hash[hash];
-
-	spin_lock(&rab->pb_lock);
-	for (rap = &rab->pb_head; (ra = *rap); rap = &ra->p_next) {
-		if (ra->p_ino == ino && ra->p_dev == dev)
-			goto found;
-		depth++;
-		if (ra->p_count == 0)
-			frap = rap;
-	}
-	depth = nfsdstats.ra_size;
-	if (!frap) {	
-		spin_unlock(&rab->pb_lock);
-		return NULL;
-	}
-	rap = frap;
-	ra = *frap;
-	ra->p_dev = dev;
-	ra->p_ino = ino;
-	ra->p_set = 0;
-	ra->p_hindex = hash;
-found:
-	if (rap != &rab->pb_head) {
-		*rap = ra->p_next;
-		ra->p_next   = rab->pb_head;
-		rab->pb_head = ra;
-	}
-	ra->p_count++;
-	nfsdstats.ra_depth[depth*10/nfsdstats.ra_size]++;
-	spin_unlock(&rab->pb_lock);
-
-	if (ra->p_set)
-		file->f_ra = ra->p_ra;
-	return ra;
-}
-
-void nfsd_put_raparams(struct file *file, struct raparms *ra)
-{
-	struct raparm_hbucket *rab = &raparm_hash[ra->p_hindex];
-
-	spin_lock(&rab->pb_lock);
-	ra->p_ra = file->f_ra;
-	ra->p_set = 1;
-	ra->p_count--;
-	spin_unlock(&rab->pb_lock);
-}
-
 /*
  * Grab and keep cached pages associated with a file in the svc_rqst
  * so that they can be passed to the network sendmsg/sendpage routines
@@ -1996,63 +1909,3 @@ nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp,
 
 	return err? nfserrno(err) : 0;
 }
-
-void
-nfsd_racache_shutdown(void)
-{
-	struct raparms *raparm, *last_raparm;
-	unsigned int i;
-
-	dprintk("nfsd: freeing readahead buffers.\n");
-
-	for (i = 0; i < RAPARM_HASH_SIZE; i++) {
-		raparm = raparm_hash[i].pb_head;
-		while(raparm) {
-			last_raparm = raparm;
-			raparm = raparm->p_next;
-			kfree(last_raparm);
-		}
-		raparm_hash[i].pb_head = NULL;
-	}
-}
-/*
- * Initialize readahead param cache
- */
-int
-nfsd_racache_init(int cache_size)
-{
-	int	i;
-	int	j = 0;
-	int	nperbucket;
-	struct raparms **raparm = NULL;
-
-
-	if (raparm_hash[0].pb_head)
-		return 0;
-	nperbucket = DIV_ROUND_UP(cache_size, RAPARM_HASH_SIZE);
-	nperbucket = max(2, nperbucket);
-	cache_size = nperbucket * RAPARM_HASH_SIZE;
-
-	dprintk("nfsd: allocating %d readahead buffers.\n", cache_size);
-
-	for (i = 0; i < RAPARM_HASH_SIZE; i++) {
-		spin_lock_init(&raparm_hash[i].pb_lock);
-
-		raparm = &raparm_hash[i].pb_head;
-		for (j = 0; j < nperbucket; j++) {
-			*raparm = kzalloc(sizeof(struct raparms), GFP_KERNEL);
-			if (!*raparm)
-				goto out_nomem;
-			raparm = &(*raparm)->p_next;
-		}
-		*raparm = NULL;
-	}
-
-	nfsdstats.ra_size = cache_size;
-	return 0;
-
-out_nomem:
-	dprintk("nfsd: kmalloc failed, freeing readahead buffers\n");
-	nfsd_racache_shutdown();
-	return -ENOMEM;
-}
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 303db66dca0a..1014bd4b212f 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -39,8 +39,6 @@
 typedef int (*nfsd_filldir_t)(void *, const char *, int, loff_t, u64, unsigned);
 
 /* nfsd/vfs.c */
-int		nfsd_racache_init(int);
-void		nfsd_racache_shutdown(void);
 int		nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
 		                struct svc_export **expp);
 __be32		nfsd_lookup(struct svc_rqst *, struct svc_fh *,
@@ -105,9 +103,6 @@ __be32		nfsd_statfs(struct svc_rqst *, struct svc_fh *,
 __be32		nfsd_permission(struct svc_rqst *, struct svc_export *,
 				struct dentry *, int);
 
-struct raparms *nfsd_init_raparms(struct file *file);
-void		nfsd_put_raparams(struct file *file, struct raparms *ra);
-
 static inline int fh_want_write(struct svc_fh *fh)
 {
 	int ret = mnt_want_write(fh->fh_export->ex_path.mnt);
-- 
2.4.3



* Re: [PATCH v4 03/16] fs: allow __fput_sync to be used by non-kthreads and in modules
       [not found]   ` <1441968882-7851-4-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
@ 2015-09-11 14:00     ` Al Viro
       [not found]       ` <20150911140049.GN22011-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2015-09-11 14:00 UTC (permalink / raw)
  To: Jeff Layton
  Cc: bfields-uC3wQj2KruNg9hUCZPvPmw, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Fri, Sep 11, 2015 at 06:54:29AM -0400, Jeff Layton wrote:
> We want nfsd to keep a cache of open files, but that would potentially
> block userland callers from obtaining leases on them. To fix this,
> we'll be adding a new notifier chain to the lease code that will call
> back into nfsd on any attempt to set a FL_LEASE. nfsd can then close
> any open files for that inode in advance of that.
> 
> The problem however is that since that notifier will run in normal
> process context, the final __fput will be delayed a'la task_work and we
> are still unable to set a lease. What we need to do is to put the struct
> file synchronously so that the __fput runs before returning from the
> notifier call.
> 
> The comments over __fput_sync and the BUG_ON in there mandate that it
> should only be used in kthread context, but I see no reason why that
> should be so. As long as the caller avoids holding locks that may be
> problematic, it should be OK to use it from normal process context as
> well.
> 
> Remove the __ prefix and the BUG_ON from that function and update the
> comments over it. Also export it so that it can be used from nfsd code,
> and move the export of fput just below the function definition.

I really don't like that.
	a) how deep in kernel stack will that thing run?
	b) what locking environment is expected in your case?

And opening it for use by any random driver that just feels like e.g.
using it to go parse its config over there in /lib/we/are/special/wank.conf
with 5Kb worth of kernel stack already eaten is a really bad idea.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v4 03/16] fs: allow __fput_sync to be used by non-kthreads and in modules
       [not found]       ` <20150911140049.GN22011-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
@ 2015-09-11 14:38         ` Jeff Layton
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff Layton @ 2015-09-11 14:38 UTC (permalink / raw)
  To: Al Viro
  Cc: bfields-uC3wQj2KruNg9hUCZPvPmw, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Fri, 11 Sep 2015 15:00:49 +0100
Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org> wrote:

> On Fri, Sep 11, 2015 at 06:54:29AM -0400, Jeff Layton wrote:
> > We want nfsd to keep a cache of open files, but that would potentially
> > block userland callers from obtaining leases on them. To fix this,
> > we'll be adding a new notifier chain to the lease code that will call
> > back into nfsd on any attempt to set a FL_LEASE. nfsd can then close
> > any open files for that inode in advance of that.
> > 
> > The problem however is that since that notifier will run in normal
> > process context, the final __fput will be delayed a'la task_work and we
> > are still unable to set a lease. What we need to do is to put the struct
> > file synchronously so that the __fput runs before returning from the
> > notifier call.
> > 
> > The comments over __fput_sync and the BUG_ON in there mandate that it
> > should only be used in kthread context, but I see no reason why that
> > should be so. As long as the caller avoids holding locks that may be
> > problematic, it should be OK to use it from normal process context as
> > well.
> > 
> > Remove the __ prefix and the BUG_ON from that function and update the
> > comments over it. Also export it so that it can be used from nfsd code,
> > and move the export of fput just below the function definition.
> 
> I really don't like that.
> 	a) how deep in kernel stack will that thing run?
> 	b) what locking environment is expected in your case?
> 
> And opening it for use by any random driver that just feels like e.g.
> using it to go parse its config over there in /lib/we/are/special/wank.conf
> with 5Kb worth of kernel stack already eaten is a really bad idea.


Not too deep in our case, and with no real locking held aside from an
SRCU lock. Basically we're going to have an SRCU notifier chain that
will run from vfs_setlease. When nfsd is running, that will call back
into the nfsd code, which will scan the hash for open files for the
inode, then unhash and release them (synchronously). If they're being
held open in the cache but are otherwise idle, that's enough to allow a
lease to be acquired.
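
To make that flow concrete, here is a rough userspace model of it. This
is only a sketch and not the code from the patch set: struct
cached_file, HASH_SIZE, cache_add(), cache_put_sync() and
lease_notifier() are all hypothetical names, the "inode" is just a
number, and there is no locking or SRCU. It only shows the shape of
"scan the bucket, unhash matching entries, drop the cache's reference
synchronously".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HASH_SIZE 8	/* hypothetical; not taken from the patch set */

/* Stand-in for a cached open file; "ino" plays the role of the inode. */
struct cached_file {
	unsigned long ino;
	int refcount;		/* 1 == held only by the cache (idle) */
	struct cached_file *next;
};

static struct cached_file *hash_tbl[HASH_SIZE];
static int nr_released;		/* counts final, synchronous releases */

static unsigned int hash_ino(unsigned long ino)
{
	return ino % HASH_SIZE;
}

/* Add an idle entry: the cache holds the only reference. */
static struct cached_file *cache_add(unsigned long ino)
{
	struct cached_file *cf = calloc(1, sizeof(*cf));

	if (!cf)
		return NULL;
	cf->ino = ino;
	cf->refcount = 1;
	cf->next = hash_tbl[hash_ino(ino)];
	hash_tbl[hash_ino(ino)] = cf;
	return cf;
}

/* Drop a reference; the final put frees immediately, with no deferral. */
static void cache_put_sync(struct cached_file *cf)
{
	assert(cf->refcount > 0);
	if (--cf->refcount == 0) {
		free(cf);
		nr_released++;
	}
}

/*
 * Model of the notifier callback: unhash every entry for @ino and drop
 * the cache's reference. Returns the number of entries that were still
 * referenced by someone else, i.e. not idle.
 */
static int lease_notifier(unsigned long ino)
{
	struct cached_file **pp = &hash_tbl[hash_ino(ino)];
	struct cached_file *cf;
	int busy = 0;

	while ((cf = *pp) != NULL) {
		if (cf->ino != ino) {
			pp = &cf->next;
			continue;
		}
		*pp = cf->next;		/* unhash */
		cf->next = NULL;
		if (cf->refcount > 1)
			busy++;
		cache_put_sync(cf);	/* drop the cache's reference */
	}
	return busy;
}
```

In this model an idle entry is gone by the time lease_notifier()
returns, which is the property the lease code needs. An entry that is
still in use is unhashed but survives until its last user puts it.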

That said, I'm not thrilled with it either. There are some
alternatives:

1) we could just call task_work_run after the fput, but that seems
scary if (e.g.) some random interrupt walks in and queues up some
task_work.

2) we could add a "delayed_fput(file)" that adds the file to the
delayed_fput_list, even when being run from normal process context.
Then we could just flush_delayed_fput() afterward. More context
switching, but that should be relatively safe I'd think.
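
As a rough illustration of option 2, here is a userspace model of
"queue the final put, then flush before returning". It is a sketch
under stated assumptions, not kernel code: struct file_model,
put_deferred(), flush_deferred() and final_release() are hypothetical
stand-ins for struct file, the proposed delayed_fput(),
flush_delayed_fput() and __fput() respectively, and there is no
locking or workqueue here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file. */
struct file_model {
	int refcount;
	int closed;		/* set once the final release has run */
	struct file_model *next_deferred;
};

static struct file_model *deferred_list;

/* Final release work; in the kernel this would be __fput(). */
static void final_release(struct file_model *f)
{
	f->closed = 1;
}

/* Drop a reference; if it was the last one, queue the release. */
static void put_deferred(struct file_model *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		f->next_deferred = deferred_list;
		deferred_list = f;
	}
}

/* Run all pending releases; returns how many were processed. */
static int flush_deferred(void)
{
	int n = 0;

	while (deferred_list) {
		struct file_model *f = deferred_list;

		deferred_list = f->next_deferred;
		f->next_deferred = NULL;
		final_release(f);
		n++;
	}
	return n;
}
```

The point of the model is the ordering: after put_deferred() the file
is not yet closed, and it is only guaranteed closed once
flush_deferred() has returned, so the notifier would have to flush
before returning to the lease code.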

-- 
Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>


Thread overview: 19+ messages
2015-09-11 10:54 [PATCH v4 00/16] nfsd: open file caching Jeff Layton
2015-09-11 10:54 ` [PATCH v4 01/16] locks: change tracepoint for generic_add_lease Jeff Layton
2015-09-11 10:54 ` [PATCH v4 02/16] list_lru: add list_lru_rotate Jeff Layton
2015-09-11 10:54 ` [PATCH v4 03/16] fs: allow __fput_sync to be used by non-kthreads and in modules Jeff Layton
     [not found]   ` <1441968882-7851-4-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2015-09-11 14:00     ` Al Viro
     [not found]       ` <20150911140049.GN22011-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2015-09-11 14:38         ` Jeff Layton
2015-09-11 10:54 ` [PATCH v4 04/16] fsnotify: export several symbols Jeff Layton
     [not found] ` <1441968882-7851-1-git-send-email-jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2015-09-11 10:54   ` [PATCH v4 05/16] locks: create a new notifier chain for lease attempts Jeff Layton
2015-09-11 10:54   ` [PATCH v4 08/16] nfsd: add a new struct file caching facility to nfsd Jeff Layton
2015-09-11 10:54   ` [PATCH v4 09/16] nfsd: hook up nfsd_write to the new nfsd_file cache Jeff Layton
2015-09-11 10:54   ` [PATCH v4 10/16] nfsd: hook up nfsd_read to the " Jeff Layton
2015-09-11 10:54   ` [PATCH v4 11/16] nfsd: hook nfsd_commit up " Jeff Layton
2015-09-11 10:54   ` [PATCH v4 13/16] nfsd: have nfsd_test_lock use " Jeff Layton
2015-09-11 10:54   ` [PATCH v4 15/16] nfsd: hook up nfs4_preprocess_stateid_op to " Jeff Layton
2015-09-11 10:54 ` [PATCH v4 06/16] nfsd: move include of state.h from trace.c to trace.h Jeff Layton
2015-09-11 10:54 ` [PATCH v4 07/16] sunrpc: add a new cache_detail operation for when a cache is flushed Jeff Layton
2015-09-11 10:54 ` [PATCH v4 12/16] nfsd: convert nfs4_file->fi_fds array to use nfsd_files Jeff Layton
2015-09-11 10:54 ` [PATCH v4 14/16] nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Jeff Layton
2015-09-11 10:54 ` [PATCH v4 16/16] nfsd: rip out the raparms cache Jeff Layton
