All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/67] fscache: Rewrite index API and management system
@ 2021-10-18 14:50 David Howells
  2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
                   ` (70 more replies)
  0 siblings, 71 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:50 UTC (permalink / raw)
  To: linux-cachefs
  Cc: ceph-devel, linux-afs, Anna Schumaker, linux-nfs,
	Kent Overstreet, linux-mm, Matthew Wilcox, linux-fsdevel,
	Dave Wysochanski, Marc Dionne, Trond Myklebust, Shyam Prasad N,
	Eric Van Hensbergen, v9fs-developer, linux-cifs,
	Latchesar Ionkov, Jeff Layton, Steve French, Al Viro,
	Dominique Martinet, Matthew Wilcox (Oracle),
	Ilya Dryomov, dhowells, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Jeff Layton, Matthew Wilcox,
	Alexander Viro, Omar Sandoval, Linus Torvalds, linux-afs,
	linux-nfs, linux-cifs, ceph-devel, v9fs-developer, linux-fsdevel,
	linux-kernel


Here's a set of patches that rewrites and simplifies the fscache index API
to remove the complex operation scheduling and object state machine in
favour of something much smaller and simpler.  It is built on top of the
set of patches that removes the old API[1].

The operation scheduling API was intended to handle sequencing of cache
operations, which were all required (where possible) to run asynchronously
in parallel with the operations being done by the network filesystem, while
allowing the cache to be brought online and offline and interrupt service
with invalidation.

However, with the advent of the tmpfile capacity in the VFS, an opportunity
arises to do invalidation much more easily, without having to wait for I/O
that's actually in progress: Cachefiles can simply cut over its file
pointer for the backing object attached to a cookie and abandon the
in-progress I/O, dismissing it upon completion.

Future work there would involve using Omar Sandoval's vfs_link() with
AT_LINK_REPLACE[2] to allow an extant file to be displaced by a new hard
link from a tmpfile as currently I have to unlink the old file first.

These patches can also simplify the object state handling as I/O operations
to the cache don't all have to be brought to a stop in order to invalidate
a file.  To that end, and with an eye on to writing a new backing cache
model in the future, I've taken the opportunity to simplify the indexing
structure.

I've separated the index cookie concept from the file cookie concept by
type now.  The former is now called a "volume cookie" (struct
fscache_volume) and there is a container of file cookies.  There are then
just the two levels.  All the index cookieage is collapsed into a single
volume cookie, and this has a single printable string as a key.  For
instance, an AFS volume would have a key of something like
"afs,example.com,1000555", combining the filesystem name, cell name and
volume ID.  This is freeform, but must not have '/' chars in it.

I've also eliminated all pointers back from fscache into the network
filesystem.  This required the duplication of a little bit of data in the
cookie (cookie key, coherency data and file size), but it's not actually
that much.  This gets rid of problems with making sure we keep netfs data
structures around so that the cache can access them.

I have changed afs throughout the patch series, but I also have patches for
9p, nfs and cifs.  Jeff Layton is handling ceph support.


BITS THAT MAY BE CONTROVERSIAL
==============================

There are some bits I've added that may be controversial:

 (1) I've provided a flag, S_KERNEL_FILE, that cachefiles uses to check if
     a files is already being used by some other kernel service (e.g. a
     duplicate cachefiles cache in the same directory) and reject it if it
     is.  This isn't entirely necessary, but it helps prevent accidental
     data corruption.

     I don't want to use S_SWAPFILE as that has other effects, but quite
     possibly swapon() should set S_KERNEL_FILE too.

     Note that it doesn't prevent userspace from interfering, though
     perhaps it should.

 (2) Cachefiles wants to keep the backing file for a cookie open whilst we
     might need to write to it from network filesystem writeback.  The
     problem is that the network filesystem unuses its cookie when its file
     is closed, and so we have nothing pinning the cachefiles file open and
     it will get closed automatically after a short time to avoid
     EMFILE/ENFILE problems.

     Reopening the cache file, however, is a problem if this is being done
     due to writeback triggered by exit().  Some filesystems will oops if
     we try to open a file in that context because they want to access
     current->fs or suchlike.

     To get around this, I added the following:

     (A) An inode flag, I_PINNING_FSCACHE_WB, to be set on a network
     	 filesystem inode to indicate that we have a usage count on the
     	 cookie caching that inode.

     (B) A flag in struct writeback_control, unpinned_fscache_wb, that is
     	 set when __writeback_single_inode() clears the last dirty page
     	 from i_pages - at which point it clears I_PINNING_FSCACHE_WB and
     	 sets this flag.

	 This has to be done here so that clearing I_PINNING_FSCACHE_WB can
	 be done atomically with the check of PAGECACHE_TAG_DIRTY that
	 clears I_DIRTY_PAGES.

     (C) A function, fscache_set_page_dirty(), which if it is not set, sets
     	 I_PINNING_FSCACHE_WB and calls fscache_use_cookie() to pin the
     	 cache resources.

     (D) A function, fscache_unpin_writeback(), to be called by
     	 ->write_inode() to unuse the cookie.

     (E) A function, fscache_clear_inode_writeback(), to be called when the
     	 inode is evicted, before clear_inode() is called.  This cleans up
     	 any lingering I_PINNING_FSCACHE_WB.

     The network filesystem can then use these tools to make sure that
     fscache_write_to_cache() can write locally modified data to the cache
     as well as to the server.

     For the future, I'm working on write helpers for netfs lib that should
     allow this facility to be removed by keeping track of the dirty
     regions separately - but that's incomplete at the moment and is also
     going to be affected by folios, one way or another, since it deals
     with pages.


These patches can be found also on:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-rewrite-indexing

David

Link: https://lore.kernel.org/r/163363935000.1980952.15279841414072653108.stgit@warthog.procyon.org.uk [1]
Link: https://lore.kernel.org/r/cover.1580251857.git.osandov@fb.com/ [2]

---
Dave Wysochanski (3):
      NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie
      NFS: Convert fscache_enable_cookie and fscache_disable_cookie
      NFS: Convert fscache invalidation and update aux_data and i_size

David Howells (63):
      mm: Stop filemap_read() from grabbing a superfluous page
      vfs: Provide S_KERNEL_FILE inode flag
      vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
      afs: Handle len being extending over page end in write_begin/write_end
      afs: Fix afs_write_end() to handle len > page size
      nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite
      fscache: Remove the netfs data from the cookie
      fscache: Remove struct fscache_cookie_def
      fscache: Remove store_limit* from struct fscache_object
      fscache: Remove fscache_check_consistency()
      fscache: Remove fscache_attr_changed()
      fscache: Remove obsolete stats
      fscache: Remove old I/O tracepoints
      fscache: Temporarily disable fscache_invalidate()
      fscache: Disable fscache_begin_operation()
      fscache: Remove the I/O operation manager
      fscache: Rename fscache_cookie_{get,put,see}()
      cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag
      cachefiles: Don't set an xattr on the root of the cache
      cachefiles: Remove some redundant checks on unsigned values
      cachefiles: Prevent inode from going away when burying a dentry
      cachefiles: Simplify the pathwalk and save the filename for an object
      cachefiles: trace: Improve the lookup tracepoint
      cachefiles: Remove separate backer dentry from cachefiles_object
      cachefiles: Fold fscache_object into cachefiles_object
      cachefiles: Change to storing file* rather than dentry*
      cachefiles: trace: Log coherency checks
      cachefiles: Trace truncations
      cachefiles: Trace read and write operations
      cachefiles: Round the cachefile size up to DIO block size
      cachefiles: Don't use XATTR_ flags with vfs_setxattr()
      fscache: Replace the object management state machine
      cachefiles: Trace decisions in cachefiles_prepare_read()
      cachefiles: Make cachefiles_write_prepare() check for space
      fscache: Automatically close a file that's been unused for a while
      fscache: Add stats for the cookie commit LRU
      fscache: Move fscache_update_cookie() complete inline
      fscache: Remove more obsolete stats
      fscache: Note the object size during invalidation
      vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
      afs: Render cache cookie key as big endian
      cachefiles: Use tmpfile/link
      fscache: Rewrite invalidation
      cachefiles: Simplify the file lookup/creation/check code
      fscache: Provide resize operation
      cachefiles: Put more information in the xattr attached to the cache file
      fscache: Implement "will_modify" parameter on fscache_use_cookie()
      fscache: Add support for writing to the cache
      fscache: Make fscache_clear_page_bits() conditional on cookie
      fscache: Make fscache_write_to_cache() conditional on cookie
      afs: Copy local writes to the cache when writing to the server
      afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr
      afs: Add O_DIRECT read support
      afs: Skip truncation on the server of data we haven't written yet
      afs: Make afs_write_begin() return the THP subpage
      cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ
      nfs: Convert to new fscache volume/cookie API
      9p: Use fscache indexing rewrite and reenable caching
      9p: Copy local writes to the cache when writing to the server
      netfs: Display the netfs inode number in the netfs_read tracepoint
      cachefiles: Add tracepoints to log errors from ops on the backing fs
      cachefiles: Add error injection support
      cifs: Support fscache indexing rewrite (untested)

Jeff Layton (1):
      fscache: disable cookie when doing an invalidation for DIO write


 fs/9p/cache.c                     |  184 +----
 fs/9p/cache.h                     |   23 +-
 fs/9p/v9fs.c                      |   14 +-
 fs/9p/v9fs.h                      |   13 +-
 fs/9p/vfs_addr.c                  |   55 +-
 fs/9p/vfs_dir.c                   |   13 +-
 fs/9p/vfs_file.c                  |    7 +-
 fs/9p/vfs_inode.c                 |   24 +-
 fs/9p/vfs_inode_dotl.c            |    3 +-
 fs/9p/vfs_super.c                 |    3 +
 fs/afs/Makefile                   |    3 -
 fs/afs/cache.c                    |   68 --
 fs/afs/cell.c                     |   12 -
 fs/afs/file.c                     |   83 +-
 fs/afs/fsclient.c                 |   18 +-
 fs/afs/inode.c                    |  101 ++-
 fs/afs/internal.h                 |   36 +-
 fs/afs/main.c                     |   14 -
 fs/afs/super.c                    |    1 +
 fs/afs/volume.c                   |   15 +-
 fs/afs/write.c                    |  170 +++-
 fs/afs/yfsclient.c                |   12 +-
 fs/cachefiles/Kconfig             |    8 +
 fs/cachefiles/Makefile            |    3 +
 fs/cachefiles/bind.c              |  186 +++--
 fs/cachefiles/daemon.c            |   20 +-
 fs/cachefiles/error_inject.c      |   46 ++
 fs/cachefiles/interface.c         |  660 +++++++--------
 fs/cachefiles/internal.h          |  191 +++--
 fs/cachefiles/io.c                |  310 +++++--
 fs/cachefiles/key.c               |  203 +++--
 fs/cachefiles/main.c              |   20 +-
 fs/cachefiles/namei.c             |  978 ++++++++++------------
 fs/cachefiles/volume.c            |  128 +++
 fs/cachefiles/xattr.c             |  367 +++------
 fs/ceph/Kconfig                   |    2 +-
 fs/cifs/Makefile                  |    2 +-
 fs/cifs/cache.c                   |  105 ---
 fs/cifs/cifsfs.c                  |   11 +-
 fs/cifs/cifsglob.h                |    5 +-
 fs/cifs/connect.c                 |    3 -
 fs/cifs/file.c                    |   37 +-
 fs/cifs/fscache.c                 |  201 ++---
 fs/cifs/fscache.h                 |   51 +-
 fs/cifs/inode.c                   |   18 +-
 fs/fs-writeback.c                 |    8 +
 fs/fscache/Kconfig                |    4 +
 fs/fscache/Makefile               |    6 +-
 fs/fscache/cache.c                |  541 ++++++-------
 fs/fscache/cookie.c               | 1262 ++++++++++++++---------------
 fs/fscache/fsdef.c                |   98 ---
 fs/fscache/internal.h             |  213 +----
 fs/fscache/io.c                   |  405 ++++++---
 fs/fscache/main.c                 |  134 +--
 fs/fscache/netfs.c                |   74 --
 fs/fscache/object.c               | 1123 -------------------------
 fs/fscache/operation.c            |  633 ---------------
 fs/fscache/page.c                 |   84 --
 fs/fscache/proc.c                 |   43 +-
 fs/fscache/stats.c                |  202 ++---
 fs/fscache/volume.c               |  449 ++++++++++
 fs/netfs/read_helper.c            |    2 +-
 fs/nfs/Makefile                   |    2 +-
 fs/nfs/client.c                   |    4 -
 fs/nfs/direct.c                   |    2 +
 fs/nfs/file.c                     |    7 +-
 fs/nfs/fscache-index.c            |  114 ---
 fs/nfs/fscache.c                  |  264 ++----
 fs/nfs/fscache.h                  |   89 +-
 fs/nfs/inode.c                    |   11 +-
 fs/nfs/super.c                    |    7 +-
 fs/nfs/write.c                    |    1 +
 include/linux/fs.h                |    4 +
 include/linux/fscache-cache.h     |  463 +++--------
 include/linux/fscache.h           |  626 +++++++-------
 include/linux/netfs.h             |    4 +-
 include/linux/nfs_fs_sb.h         |    9 +-
 include/linux/writeback.h         |    1 +
 include/trace/events/cachefiles.h |  483 ++++++++---
 include/trace/events/fscache.h    |  631 +++++++--------
 include/trace/events/netfs.h      |    5 +-
 81 files changed, 5140 insertions(+), 7295 deletions(-)
 delete mode 100644 fs/afs/cache.c
 create mode 100644 fs/cachefiles/error_inject.c
 create mode 100644 fs/cachefiles/volume.c
 delete mode 100644 fs/cifs/cache.c
 delete mode 100644 fs/fscache/fsdef.c
 delete mode 100644 fs/fscache/netfs.c
 delete mode 100644 fs/fscache/object.c
 delete mode 100644 fs/fscache/operation.c
 create mode 100644 fs/fscache/volume.c
 delete mode 100644 fs/nfs/fscache-index.c



^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
@ 2021-10-18 14:50 ` David Howells
  2021-10-19 17:13   ` Jeff Layton
                     ` (2 more replies)
  2021-10-18 14:50 ` [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag David Howells
                   ` (69 subsequent siblings)
  70 siblings, 3 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:50 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Kent Overstreet, Matthew Wilcox (Oracle),
	linux-mm, linux-fsdevel, dhowells, Trond Myklebust,
	Anna Schumaker, Steve French, Dominique Martinet, Jeff Layton,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-nfs, linux-cifs, ceph-devel, v9fs-developer,
	linux-fsdevel, linux-kernel

Under some circumstances, filemap_read() will allocate sufficient pages to
read to the end of the file, call readahead/readpages on them and copy the
data over - and then it will allocate another page at the EOF and call
readpage on that and then ignore it.  This is unnecessary and a waste of
time and resources.

filemap_read() *does* check for this, but only after it has already done
the allocation and I/O.  Fix this by checking before calling
filemap_get_pages() also.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
cc: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/160588481358.3465195.16552616179674485179.stgit@warthog.procyon.org.uk/
---

 mm/filemap.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index dae481293b5d..c0cdc44c844e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2625,6 +2625,10 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
 			iocb->ki_flags |= IOCB_NOWAIT;
 
+		isize = i_size_read(inode);
+		if (unlikely(iocb->ki_pos >= isize))
+			goto put_pages;
+
 		error = filemap_get_pages(iocb, iter, &pvec);
 		if (error < 0)
 			break;



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
  2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
@ 2021-10-18 14:50 ` David Howells
  2021-10-19 18:10   ` Jeff Layton
  2021-10-19 19:02   ` David Howells
  2021-10-18 14:51 ` [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
                   ` (68 subsequent siblings)
  70 siblings, 2 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:50 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Provide an S_KERNEL_FILE inode flag that a kernel service, e.g. cachefiles,
can set to ward off other kernel services and drivers (including itself)
from using files it is actively using.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/fs.h |    1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e7a633353fd2..197493507744 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2250,6 +2250,7 @@ struct super_operations {
 #define S_ENCRYPTED	(1 << 14) /* Encrypted file (using fs/crypto/) */
 #define S_CASEFOLD	(1 << 15) /* Casefolded file */
 #define S_VERITY	(1 << 16) /* Verity file (using fs/verity/) */
+#define S_KERNEL_FILE	(1 << 17) /* File is in use by the kernel (eg. fs/cachefiles) */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
  2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
  2021-10-18 14:50 ` [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag David Howells
@ 2021-10-18 14:51 ` David Howells
  2021-10-19 17:22   ` Jeff Layton
  2021-10-20  9:49   ` David Howells
  2021-10-18 14:51 ` [PATCH 04/67] afs: Handle len being extending over page end in write_begin/write_end David Howells
                   ` (67 subsequent siblings)
  70 siblings, 2 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:51 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Use an inode flag, I_PINNING_FSCACHE_WB, to indicate that a cookie is
pinned in use by that inode for the purposes of writeback.

Pinning is necessary because the in-use pin from the open file is released
before the writeback takes place, but if the resources aren't pinned, the
dirty data can't be written to the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fs-writeback.c         |    8 ++++++++
 include/linux/fs.h        |    3 +++
 include/linux/fscache.h   |    1 +
 include/linux/writeback.h |    1 +
 4 files changed, 13 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 81ec192ce067..f3122831c4fe 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1666,6 +1666,13 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 
 	if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
 		inode->i_state |= I_DIRTY_PAGES;
+	else if (unlikely(inode->i_state & I_PINNING_FSCACHE_WB)) {
+		if (!(inode->i_state & I_DIRTY_PAGES)) {
+			inode->i_state &= ~I_PINNING_FSCACHE_WB;
+			wbc->unpinned_fscache_wb = true;
+			dirty |= I_PINNING_FSCACHE_WB; /* Cause write_inode */
+		}
+	}
 
 	spin_unlock(&inode->i_lock);
 
@@ -1675,6 +1682,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 		if (ret == 0)
 			ret = err;
 	}
+	wbc->unpinned_fscache_wb = false;
 	trace_writeback_single_inode(inode, wbc, nr_to_write);
 	return ret;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 197493507744..336739fed3e9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2420,6 +2420,8 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
  *			Used to detect that mark_inode_dirty() should not move
  * 			inode between dirty lists.
  *
+ * I_PINNING_FSCACHE_WB	Inode is pinning an fscache object for writeback.
+ *
  * Q: What is the difference between I_WILL_FREE and I_FREEING?
  */
 #define I_DIRTY_SYNC		(1 << 0)
@@ -2442,6 +2444,7 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
 #define I_CREATING		(1 << 15)
 #define I_DONTCACHE		(1 << 16)
 #define I_SYNC_QUEUED		(1 << 17)
+#define I_PINNING_FSCACHE_WB	(1 << 18)
 
 #define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
 #define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES)
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 01558d155799..ba4878b56717 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -19,6 +19,7 @@
 #include <linux/pagemap.h>
 #include <linux/pagevec.h>
 #include <linux/list_bl.h>
+#include <linux/writeback.h>
 #include <linux/netfs.h>
 
 #if defined(CONFIG_FSCACHE) || defined(CONFIG_FSCACHE_MODULE)
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d1f65adf6a26..2fda288600d3 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -69,6 +69,7 @@ struct writeback_control {
 	unsigned for_reclaim:1;		/* Invoked from the page allocator */
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
+	unsigned unpinned_fscache_wb:1;	/* Cleared I_PINNING_FSCACHE_WB */
 
 	/*
 	 * When writeback IOs are bounced through async layers, only the



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 04/67] afs: Handle len being extending over page end in write_begin/write_end
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (2 preceding siblings ...)
  2021-10-18 14:51 ` [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
@ 2021-10-18 14:51 ` David Howells
  2021-10-18 14:51 ` [PATCH 05/67] afs: Fix afs_write_end() to handle len > page size David Howells
                   ` (66 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:51 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Matthew Wilcox (Oracle),
	Jeff Layton, linux-afs, dhowells, Trond Myklebust,
	Anna Schumaker, Steve French, Dominique Martinet, Jeff Layton,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-nfs, linux-cifs, ceph-devel, v9fs-developer,
	linux-fsdevel, linux-kernel

With transparent huge pages, in the future, write_begin() and write_end()
may be passed a length parameter that, in combination with the offset into
the page, exceeds the length of that page.  This allows
grab_cache_page_write_begin() to better choose the size of THP to allocate.

Fix afs's functions to handle this by trimming the length as needed after
the page has been allocated.

[Removed the now-unnecessary index var; spotted by kernel test robot]

Fixes: e1b1240c1ff5 ("netfs: Add write_begin helper")
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/162367681795.460125.11729955608839747375.stgit@warthog.procyon.org.uk/ # v1
---

 fs/afs/write.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index f24370f5c774..c09830c9dc43 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -25,7 +25,8 @@ int afs_set_page_dirty(struct page *page)
 }
 
 /*
- * prepare to perform part of a write to a page
+ * Prepare to perform part of a write to a page.  Note that len may extend
+ * beyond the end of the page.
  */
 int afs_write_begin(struct file *file, struct address_space *mapping,
 		    loff_t pos, unsigned len, unsigned flags,
@@ -36,7 +37,6 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 	unsigned long priv;
 	unsigned f, from;
 	unsigned t, to;
-	pgoff_t index;
 	int ret;
 
 	_enter("{%llx:%llu},%llx,%x",
@@ -51,8 +51,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 	if (ret < 0)
 		return ret;
 
-	index = page->index;
-	from = pos - index * PAGE_SIZE;
+	from = offset_in_thp(page, pos);
+	len = min_t(size_t, len, thp_size(page) - from);
 	to = from + len;
 
 try_again:
@@ -103,7 +103,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 }
 
 /*
- * finalise part of a write to a page
+ * Finalise part of a write to a page.  Note that len may extend beyond the end
+ * of the page.
  */
 int afs_write_end(struct file *file, struct address_space *mapping,
 		  loff_t pos, unsigned len, unsigned copied,
@@ -111,7 +112,7 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 {
 	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
 	unsigned long priv;
-	unsigned int f, from = pos & (thp_size(page) - 1);
+	unsigned int f, from = offset_in_thp(page, pos);
 	unsigned int t, to = from + copied;
 	loff_t i_size, maybe_i_size;
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 05/67] afs: Fix afs_write_end() to handle len > page size
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (3 preceding siblings ...)
  2021-10-18 14:51 ` [PATCH 04/67] afs: Handle len being extending over page end in write_begin/write_end David Howells
@ 2021-10-18 14:51 ` David Howells
  2021-10-18 14:51 ` [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite David Howells
                   ` (65 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:51 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Jeff Layton, Jeff Layton, Al Viro, Matthew Wilcox, linux-afs,
	dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

It is possible for the len argument to afs_write_end() to overrun the end
of the page (len is used to key the size of the page in afs_write_start()
when compound pages become a regular thing).

Fix afs_write_end() to correctly trim the write length so that it doesn't
exceed the end of the page.

Fixes: 3003bbd0697b ("afs: Use the netfs_write_begin() helper")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/162367682522.460125.5652091227576721609.stgit@warthog.procyon.org.uk/ # v1
---

 fs/afs/write.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index c09830c9dc43..e1cb19cb6314 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -119,6 +119,7 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 	_enter("{%llx:%llu},{%lx}",
 	       vnode->fid.vid, vnode->fid.vnode, page->index);
 
+	len = min_t(size_t, len, thp_size(page) - from);
 	if (!PageUptodate(page)) {
 		if (copied < len) {
 			copied = 0;



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (4 preceding siblings ...)
  2021-10-18 14:51 ` [PATCH 05/67] afs: Fix afs_write_end() to handle len > page size David Howells
@ 2021-10-18 14:51 ` David Howells
  2021-10-19 17:50   ` Jeff Layton
  2021-10-20 10:57   ` David Howells
  2021-10-18 14:52 ` [PATCH 07/67] fscache: Remove the netfs data from the cookie David Howells
                   ` (64 subsequent siblings)
  70 siblings, 2 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:51 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Dave Wysochanski, Trond Myklebust, Anna Schumaker, linux-nfs,
	Jeff Layton, Ilya Dryomov, ceph-devel, Steve French, linux-cifs,
	Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	v9fs-developer, dhowells, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Jeff Layton, Matthew Wilcox,
	Alexander Viro, Omar Sandoval, Linus Torvalds, linux-afs,
	linux-nfs, linux-cifs, ceph-devel, v9fs-developer, linux-fsdevel,
	linux-kernel

Temporarily disable the use of fscache by the various Linux network
filesystems, apart from afs, so that the fscache core can be rewritten.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: ceph-devel@vger.kernel.org
cc: Steve French <sfrench@samba.org>
cc: linux-cifs@vger.kernel.org
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: v9fs-developer@lists.sourceforge.net
---

 fs/9p/Kconfig      |    2 +-
 fs/ceph/Kconfig    |    2 +-
 fs/cifs/Kconfig    |    2 +-
 fs/fscache/Kconfig |    4 ++++
 fs/nfs/Kconfig     |    2 +-
 5 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/9p/Kconfig b/fs/9p/Kconfig
index d7bc93447c85..b11c15c30bac 100644
--- a/fs/9p/Kconfig
+++ b/fs/9p/Kconfig
@@ -14,7 +14,7 @@ config 9P_FS
 if 9P_FS
 config 9P_FSCACHE
 	bool "Enable 9P client caching support"
-	depends on 9P_FS=m && FSCACHE || 9P_FS=y && FSCACHE=y
+	depends on 9P_FS=m && FSCACHE_OLD || 9P_FS=y && FSCACHE_OLD=y
 	help
 	  Choose Y here to enable persistent, read-only local
 	  caching support for 9p clients using FS-Cache
diff --git a/fs/ceph/Kconfig b/fs/ceph/Kconfig
index 94df854147d3..77ad452337ee 100644
--- a/fs/ceph/Kconfig
+++ b/fs/ceph/Kconfig
@@ -21,7 +21,7 @@ config CEPH_FS
 if CEPH_FS
 config CEPH_FSCACHE
 	bool "Enable Ceph client caching support"
-	depends on CEPH_FS=m && FSCACHE || CEPH_FS=y && FSCACHE=y
+	depends on CEPH_FS=m && FSCACHE_OLD || CEPH_FS=y && FSCACHE_OLD=y
 	help
 	  Choose Y here to enable persistent, read-only local
 	  caching support for Ceph clients using FS-Cache
diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
index 3b7e3b9e4fd2..c5477abbcff0 100644
--- a/fs/cifs/Kconfig
+++ b/fs/cifs/Kconfig
@@ -188,7 +188,7 @@ config CIFS_SMB_DIRECT
 
 config CIFS_FSCACHE
 	bool "Provide CIFS client caching support"
-	depends on CIFS=m && FSCACHE || CIFS=y && FSCACHE=y
+	depends on CIFS=m && FSCACHE_OLD || CIFS=y && FSCACHE_OLD=y
 	help
 	  Makes CIFS FS-Cache capable. Say Y here if you want your CIFS data
 	  to be cached locally on disk through the general filesystem cache
diff --git a/fs/fscache/Kconfig b/fs/fscache/Kconfig
index b313a978ae0a..7850de3bdee0 100644
--- a/fs/fscache/Kconfig
+++ b/fs/fscache/Kconfig
@@ -38,3 +38,7 @@ config FSCACHE_DEBUG
 	  enabled by setting bits in /sys/modules/fscache/parameter/debug.
 
 	  See Documentation/filesystems/caching/fscache.rst for more information.
+
+config FSCACHE_OLD
+	bool
+	default n
diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index 14a72224b657..a8b73c90aa00 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -170,7 +170,7 @@ config ROOT_NFS
 
 config NFS_FSCACHE
 	bool "Provide NFS client caching support"
-	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
+	depends on NFS_FS=m && FSCACHE_OLD || NFS_FS=y && FSCACHE_OLD=y
 	help
 	  Say Y here if you want NFS data to be cached locally on disc through
 	  the general filesystem cache manager



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 07/67] fscache: Remove the netfs data from the cookie
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (5 preceding siblings ...)
  2021-10-18 14:51 ` [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite David Howells
@ 2021-10-18 14:52 ` David Howells
  2021-10-19 18:05   ` Jeff Layton
  2021-10-18 14:52 ` [PATCH 08/67] fscache: Remove struct fscache_cookie_def David Howells
                   ` (63 subsequent siblings)
  70 siblings, 1 reply; 87+ messages in thread
From: David Howells @ 2021-10-18 14:52 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove the netfs data pointer from the cookie so that we don't refer back
to the netfs state and don't need accessors to manage this.  Keep the
information we do need (of which there's not actually a lot) in the cookie
which we can keep hold of if the netfs state goes away.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cache.c                |   39 --------
 fs/afs/cell.c                 |    3 -
 fs/afs/inode.c                |    3 -
 fs/afs/volume.c               |    4 -
 fs/cachefiles/interface.c     |  104 +++------------------
 fs/cachefiles/internal.h      |   20 +---
 fs/cachefiles/namei.c         |   10 +-
 fs/cachefiles/xattr.c         |  202 +++++++----------------------------------
 fs/fscache/cache.c            |   20 +---
 fs/fscache/cookie.c           |   33 +++----
 fs/fscache/fsdef.c            |   37 --------
 fs/fscache/internal.h         |   29 ++++--
 fs/fscache/netfs.c            |    3 -
 fs/fscache/object.c           |   49 ----------
 include/linux/fscache-cache.h |   32 +++++-
 include/linux/fscache.h       |   39 +-------
 16 files changed, 136 insertions(+), 491 deletions(-)

diff --git a/fs/afs/cache.c b/fs/afs/cache.c
index 037af93e3aba..9b2de3014dbf 100644
--- a/fs/afs/cache.c
+++ b/fs/afs/cache.c
@@ -8,11 +8,6 @@
 #include <linux/sched.h>
 #include "internal.h"
 
-static enum fscache_checkaux afs_vnode_cache_check_aux(void *cookie_netfs_data,
-						       const void *buffer,
-						       uint16_t buflen,
-						       loff_t object_size);
-
 struct fscache_netfs afs_cache_netfs = {
 	.name			= "afs",
 	.version		= 2,
@@ -31,38 +26,4 @@ struct fscache_cookie_def afs_volume_cache_index_def = {
 struct fscache_cookie_def afs_vnode_cache_index_def = {
 	.name		= "AFS.vnode",
 	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
-	.check_aux	= afs_vnode_cache_check_aux,
 };
-
-/*
- * check that the auxiliary data indicates that the entry is still valid
- */
-static enum fscache_checkaux afs_vnode_cache_check_aux(void *cookie_netfs_data,
-						       const void *buffer,
-						       uint16_t buflen,
-						       loff_t object_size)
-{
-	struct afs_vnode *vnode = cookie_netfs_data;
-	struct afs_vnode_cache_aux aux;
-
-	_enter("{%llx,%x,%llx},%p,%u",
-	       vnode->fid.vnode, vnode->fid.unique, vnode->status.data_version,
-	       buffer, buflen);
-
-	memcpy(&aux, buffer, sizeof(aux));
-
-	/* check the size of the data is what we're expecting */
-	if (buflen != sizeof(aux)) {
-		_leave(" = OBSOLETE [len %hx != %zx]", buflen, sizeof(aux));
-		return FSCACHE_CHECKAUX_OBSOLETE;
-	}
-
-	if (vnode->status.data_version != aux.data_version) {
-		_leave(" = OBSOLETE [vers %llx != %llx]",
-		       aux.data_version, vnode->status.data_version);
-		return FSCACHE_CHECKAUX_OBSOLETE;
-	}
-
-	_leave(" = SUCCESS");
-	return FSCACHE_CHECKAUX_OKAY;
-}
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index d88407fb9bc0..416f9bd638a5 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -683,9 +683,10 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell)
 #ifdef CONFIG_AFS_FSCACHE
 	cell->cache = fscache_acquire_cookie(afs_cache_netfs.primary_index,
 					     &afs_cell_cache_index_def,
+					     NULL,
 					     cell->name, strlen(cell->name),
 					     NULL, 0,
-					     cell, 0, true);
+					     0, true);
 #endif
 	ret = afs_proc_cell_setup(cell);
 	if (ret < 0)
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 8fcffea2daf5..3b696ac7c05a 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -432,9 +432,10 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
 
 	vnode->cache = fscache_acquire_cookie(vnode->volume->cache,
 					      &afs_vnode_cache_index_def,
+					      NULL,
 					      &key, sizeof(key),
 					      &aux, sizeof(aux),
-					      vnode, vnode->status.size, true);
+					      vnode->status.size, true);
 #endif
 }
 
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index f84194b791d3..9ca246ab1a86 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -273,9 +273,9 @@ void afs_activate_volume(struct afs_volume *volume)
 #ifdef CONFIG_AFS_FSCACHE
 	volume->cache = fscache_acquire_cookie(volume->cell->cache,
 					       &afs_volume_cache_index_def,
+					       NULL,
 					       &volume->vid, sizeof(volume->vid),
-					       NULL, 0,
-					       volume, 0, true);
+					       NULL, 0, 0, true);
 #endif
 }
 
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 83671488a323..6c48e81deccc 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -7,13 +7,9 @@
 
 #include <linux/slab.h>
 #include <linux/mount.h>
+#include <linux/xattr.h>
 #include "internal.h"
 
-struct cachefiles_lookup_data {
-	struct cachefiles_xattr	*auxdata;	/* auxiliary data */
-	char			*key;		/* key path */
-};
-
 static int cachefiles_attr_changed(struct fscache_object *_object);
 
 /*
@@ -23,11 +19,9 @@ static struct fscache_object *cachefiles_alloc_object(
 	struct fscache_cache *_cache,
 	struct fscache_cookie *cookie)
 {
-	struct cachefiles_lookup_data *lookup_data;
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
-	struct cachefiles_xattr *auxdata;
-	unsigned keylen, auxlen;
+	unsigned keylen;
 	void *buffer, *p;
 	char *key;
 
@@ -35,10 +29,6 @@ static struct fscache_object *cachefiles_alloc_object(
 
 	_enter("{%s},%x,", cache->cache.identifier, cookie->debug_id);
 
-	lookup_data = kmalloc(sizeof(*lookup_data), cachefiles_gfp);
-	if (!lookup_data)
-		goto nomem_lookup_data;
-
 	/* create a new object record and a temporary leaf image */
 	object = kmem_cache_alloc(cachefiles_object_jar, cachefiles_gfp);
 	if (!object)
@@ -62,10 +52,7 @@ static struct fscache_object *cachefiles_alloc_object(
 		goto nomem_buffer;
 
 	keylen = cookie->key_len;
-	if (keylen <= sizeof(cookie->inline_key))
-		p = cookie->inline_key;
-	else
-		p = cookie->key;
+	p = fscache_get_key(cookie);
 	memcpy(buffer + 2, p, keylen);
 
 	*(uint16_t *)buffer = keylen;
@@ -75,28 +62,13 @@ static struct fscache_object *cachefiles_alloc_object(
 
 	/* turn the raw key into something that can work with as a filename */
 	key = cachefiles_cook_key(buffer, keylen + 2, object->type);
+	kfree(buffer);
 	if (!key)
 		goto nomem_key;
 
-	/* get hold of the auxiliary data and prepend the object type */
-	auxdata = buffer;
-	auxlen = cookie->aux_len;
-	if (auxlen) {
-		if (auxlen <= sizeof(cookie->inline_aux))
-			p = cookie->inline_aux;
-		else
-			p = cookie->aux;
-		memcpy(auxdata->data, p, auxlen);
-	}
-
-	auxdata->len = auxlen + 1;
-	auxdata->type = cookie->type;
-
-	lookup_data->auxdata = auxdata;
-	lookup_data->key = key;
-	object->lookup_data = lookup_data;
+	object->lookup_key = key;
 
-	_leave(" = %x [%p]", object->fscache.debug_id, lookup_data);
+	_leave(" = %x [%s]", object->fscache.debug_id, key);
 	return &object->fscache;
 
 nomem_key:
@@ -106,8 +78,6 @@ static struct fscache_object *cachefiles_alloc_object(
 	kmem_cache_free(cachefiles_object_jar, object);
 	fscache_object_destroyed(&cache->cache);
 nomem_object:
-	kfree(lookup_data);
-nomem_lookup_data:
 	_leave(" = -ENOMEM");
 	return ERR_PTR(-ENOMEM);
 }
@@ -118,7 +88,6 @@ static struct fscache_object *cachefiles_alloc_object(
  */
 static int cachefiles_lookup_object(struct fscache_object *_object)
 {
-	struct cachefiles_lookup_data *lookup_data;
 	struct cachefiles_object *parent, *object;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
@@ -130,15 +99,12 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
 	parent = container_of(_object->parent,
 			      struct cachefiles_object, fscache);
 	object = container_of(_object, struct cachefiles_object, fscache);
-	lookup_data = object->lookup_data;
 
-	ASSERTCMP(lookup_data, !=, NULL);
+	ASSERTCMP(object->lookup_key, !=, NULL);
 
 	/* look up the key, creating any missing bits */
 	cachefiles_begin_secure(cache, &saved_cred);
-	ret = cachefiles_walk_to_object(parent, object,
-					lookup_data->key,
-					lookup_data->auxdata);
+	ret = cachefiles_walk_to_object(parent, object, object->lookup_key);
 	cachefiles_end_secure(cache, saved_cred);
 
 	/* polish off by setting the attributes of non-index files */
@@ -165,14 +131,10 @@ static void cachefiles_lookup_complete(struct fscache_object *_object)
 
 	object = container_of(_object, struct cachefiles_object, fscache);
 
-	_enter("{OBJ%x,%p}", object->fscache.debug_id, object->lookup_data);
+	_enter("{OBJ%x}", object->fscache.debug_id);
 
-	if (object->lookup_data) {
-		kfree(object->lookup_data->key);
-		kfree(object->lookup_data->auxdata);
-		kfree(object->lookup_data);
-		object->lookup_data = NULL;
-	}
+	kfree(object->lookup_key);
+	object->lookup_key = NULL;
 }
 
 /*
@@ -204,12 +166,8 @@ struct fscache_object *cachefiles_grab_object(struct fscache_object *_object,
 static void cachefiles_update_object(struct fscache_object *_object)
 {
 	struct cachefiles_object *object;
-	struct cachefiles_xattr *auxdata;
 	struct cachefiles_cache *cache;
-	struct fscache_cookie *cookie;
 	const struct cred *saved_cred;
-	const void *aux;
-	unsigned auxlen;
 
 	_enter("{OBJ%x}", _object->debug_id);
 
@@ -217,40 +175,9 @@ static void cachefiles_update_object(struct fscache_object *_object)
 	cache = container_of(object->fscache.cache, struct cachefiles_cache,
 			     cache);
 
-	if (!fscache_use_cookie(_object)) {
-		_leave(" [relinq]");
-		return;
-	}
-
-	cookie = object->fscache.cookie;
-	auxlen = cookie->aux_len;
-
-	if (!auxlen) {
-		fscache_unuse_cookie(_object);
-		_leave(" [no aux]");
-		return;
-	}
-
-	auxdata = kmalloc(2 + auxlen + 3, cachefiles_gfp);
-	if (!auxdata) {
-		fscache_unuse_cookie(_object);
-		_leave(" [nomem]");
-		return;
-	}
-
-	aux = (auxlen <= sizeof(cookie->inline_aux)) ?
-		cookie->inline_aux : cookie->aux;
-
-	memcpy(auxdata->data, aux, auxlen);
-	fscache_unuse_cookie(_object);
-
-	auxdata->len = auxlen + 1;
-	auxdata->type = cookie->type;
-
 	cachefiles_begin_secure(cache, &saved_cred);
-	cachefiles_update_object_xattr(object, auxdata);
+	cachefiles_set_object_xattr(object, XATTR_REPLACE);
 	cachefiles_end_secure(cache, saved_cred);
-	kfree(auxdata);
 	_leave("");
 }
 
@@ -354,12 +281,7 @@ void cachefiles_put_object(struct fscache_object *_object,
 		ASSERTCMP(object->fscache.n_ops, ==, 0);
 		ASSERTCMP(object->fscache.n_children, ==, 0);
 
-		if (object->lookup_data) {
-			kfree(object->lookup_data->key);
-			kfree(object->lookup_data->auxdata);
-			kfree(object->lookup_data);
-			object->lookup_data = NULL;
-		}
+		kfree(object->lookup_key);
 
 		cache = object->fscache.cache;
 		fscache_object_destroy(&object->fscache);
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index de982f4f513f..f6b85c370935 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -34,7 +34,7 @@ extern unsigned cachefiles_debug;
  */
 struct cachefiles_object {
 	struct fscache_object		fscache;	/* fscache handle */
-	struct cachefiles_lookup_data	*lookup_data;	/* cached lookup data */
+	char				*lookup_key;	/* key to look up */
 	struct dentry			*dentry;	/* the file/dir representing this object */
 	struct dentry			*backer;	/* backing file */
 	loff_t				i_size;		/* object size */
@@ -88,15 +88,6 @@ struct cachefiles_cache {
 	char				*tag;		/* cache binding tag */
 };
 
-/*
- * auxiliary data xattr buffer
- */
-struct cachefiles_xattr {
-	uint16_t			len;
-	uint8_t				type;
-	uint8_t				data[];
-};
-
 #include <trace/events/cachefiles.h>
 
 /*
@@ -145,8 +136,7 @@ extern int cachefiles_delete_object(struct cachefiles_cache *cache,
 				    struct cachefiles_object *object);
 extern int cachefiles_walk_to_object(struct cachefiles_object *parent,
 				     struct cachefiles_object *object,
-				     const char *key,
-				     struct cachefiles_xattr *auxdata);
+				     const char *key);
 extern struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 					       struct dentry *dir,
 					       const char *name);
@@ -188,12 +178,8 @@ static inline void cachefiles_end_secure(struct cachefiles_cache *cache,
  */
 extern int cachefiles_check_object_type(struct cachefiles_object *object);
 extern int cachefiles_set_object_xattr(struct cachefiles_object *object,
-				       struct cachefiles_xattr *auxdata);
-extern int cachefiles_update_object_xattr(struct cachefiles_object *object,
-					  struct cachefiles_xattr *auxdata);
+				       unsigned int xattr_flags);
 extern int cachefiles_check_auxdata(struct cachefiles_object *object);
-extern int cachefiles_check_object_xattr(struct cachefiles_object *object,
-					 struct cachefiles_xattr *auxdata);
 extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 					  struct dentry *dentry);
 
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index a9aca5ab5970..dc5e1e48c0a8 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -45,11 +45,10 @@ void __cachefiles_printk_object(struct cachefiles_object *object,
 	spin_lock(&object->fscache.lock);
 	cookie = object->fscache.cookie;
 	if (cookie) {
-		pr_err("%scookie=%x [pr=%x nd=%p fl=%lx]\n",
+		pr_err("%scookie=%x [pr=%x fl=%lx]\n",
 		       prefix,
 		       cookie->debug_id,
 		       cookie->parent ? cookie->parent->debug_id : 0,
-		       cookie->netfs_data,
 		       cookie->flags);
 		pr_err("%skey=[%u] '", prefix, cookie->key_len);
 		k = (cookie->key_len <= sizeof(cookie->inline_key)) ?
@@ -487,8 +486,7 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
  */
 int cachefiles_walk_to_object(struct cachefiles_object *parent,
 			      struct cachefiles_object *object,
-			      const char *key,
-			      struct cachefiles_xattr *auxdata)
+			      const char *key)
 {
 	struct cachefiles_cache *cache;
 	struct dentry *dir, *next = NULL;
@@ -636,7 +634,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	if (!object->new) {
 		_debug("validate '%pd'", next);
 
-		ret = cachefiles_check_object_xattr(object, auxdata);
+		ret = cachefiles_check_auxdata(object);
 		if (ret == -ESTALE) {
 			/* delete the object (the deleter drops the directory
 			 * mutex) */
@@ -671,7 +669,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 
 	if (object->new) {
 		/* attach data to a newly constructed terminal object */
-		ret = cachefiles_set_object_xattr(object, auxdata);
+		ret = cachefiles_set_object_xattr(object, XATTR_CREATE);
 		if (ret < 0)
 			goto check_error;
 	} else {
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 9e82de668595..c99952404932 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -15,6 +15,11 @@
 #include <linux/slab.h>
 #include "internal.h"
 
+struct cachefiles_xattr {
+	uint8_t				type;
+	uint8_t				data[];
+} __packed;
+
 static const char cachefiles_xattr_cache[] =
 	XATTR_USER_PREFIX "CacheFiles.cache";
 
@@ -98,54 +103,35 @@ int cachefiles_check_object_type(struct cachefiles_object *object)
  * set the state xattr on a cache file
  */
 int cachefiles_set_object_xattr(struct cachefiles_object *object,
-				struct cachefiles_xattr *auxdata)
-{
-	struct dentry *dentry = object->dentry;
-	int ret;
-
-	ASSERT(dentry);
-
-	_enter("%p,#%d", object, auxdata->len);
-
-	/* attempt to install the cache metadata directly */
-	_debug("SET #%u", auxdata->len);
-
-	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->fscache.cookie->flags);
-	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
-			   &auxdata->type, auxdata->len, XATTR_CREATE);
-	if (ret < 0 && ret != -ENOMEM)
-		cachefiles_io_error_obj(
-			object,
-			"Failed to set xattr with error %d", ret);
-
-	_leave(" = %d", ret);
-	return ret;
-}
-
-/*
- * update the state xattr on a cache file
- */
-int cachefiles_update_object_xattr(struct cachefiles_object *object,
-				   struct cachefiles_xattr *auxdata)
+				unsigned int xattr_flags)
 {
+	struct cachefiles_xattr *buf;
 	struct dentry *dentry = object->dentry;
+	unsigned int len = object->fscache.cookie->aux_len;
 	int ret;
 
 	if (!dentry)
 		return -ESTALE;
 
-	_enter("%x,#%d", object->fscache.debug_id, auxdata->len);
+	_enter("%x,#%d", object->fscache.debug_id, len);
 
-	/* attempt to install the cache metadata directly */
-	_debug("SET #%u", auxdata->len);
+	buf = kmalloc(sizeof(struct cachefiles_xattr) + len, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	buf->type = object->fscache.cookie->def->type;
+	if (len > 0)
+		memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len);
 
 	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->fscache.cookie->flags);
 	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
-			   &auxdata->type, auxdata->len, XATTR_REPLACE);
+			   buf, sizeof(struct cachefiles_xattr) + len,
+			   xattr_flags);
+	kfree(buf);
 	if (ret < 0 && ret != -ENOMEM)
 		cachefiles_io_error_obj(
 			object,
-			"Failed to update xattr with error %d", ret);
+			"Failed to set xattr with error %d", ret);
 
 	_leave(" = %d", ret);
 	return ret;
@@ -156,148 +142,30 @@ int cachefiles_update_object_xattr(struct cachefiles_object *object,
  */
 int cachefiles_check_auxdata(struct cachefiles_object *object)
 {
-	struct cachefiles_xattr *auxbuf;
-	enum fscache_checkaux validity;
-	struct dentry *dentry = object->dentry;
-	ssize_t xlen;
-	int ret;
-
-	ASSERT(dentry);
-	ASSERT(d_backing_inode(dentry));
-	ASSERT(object->fscache.cookie->def->check_aux);
-
-	auxbuf = kmalloc(sizeof(struct cachefiles_xattr) + 512, GFP_KERNEL);
-	if (!auxbuf)
-		return -ENOMEM;
-
-	xlen = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
-			    &auxbuf->type, 512 + 1);
-	ret = -ESTALE;
-	if (xlen < 1 ||
-	    auxbuf->type != object->fscache.cookie->def->type)
-		goto error;
-
-	xlen--;
-	validity = fscache_check_aux(&object->fscache, &auxbuf->data, xlen,
-				     i_size_read(d_backing_inode(dentry)));
-	if (validity != FSCACHE_CHECKAUX_OKAY)
-		goto error;
-
-	ret = 0;
-error:
-	kfree(auxbuf);
-	return ret;
-}
-
-/*
- * check the state xattr on a cache file
- * - return -ESTALE if the object should be deleted
- */
-int cachefiles_check_object_xattr(struct cachefiles_object *object,
-				  struct cachefiles_xattr *auxdata)
-{
-	struct cachefiles_xattr *auxbuf;
+	struct cachefiles_xattr *buf;
 	struct dentry *dentry = object->dentry;
-	int ret;
-
-	_enter("%p,#%d", object, auxdata->len);
+	unsigned int len = object->fscache.cookie->aux_len, tlen;
+	const void *p = fscache_get_aux(object->fscache.cookie);
+	ssize_t ret;
 
 	ASSERT(dentry);
 	ASSERT(d_backing_inode(dentry));
 
-	auxbuf = kmalloc(sizeof(struct cachefiles_xattr) + 512, cachefiles_gfp);
-	if (!auxbuf) {
-		_leave(" = -ENOMEM");
+	tlen = sizeof(struct cachefiles_xattr) + len;
+	buf = kmalloc(tlen, GFP_KERNEL);
+	if (!buf)
 		return -ENOMEM;
-	}
-
-	/* read the current type label */
-	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
-			   &auxbuf->type, 512 + 1);
-	if (ret < 0) {
-		if (ret == -ENODATA)
-			goto stale; /* no attribute - power went off
-				     * mid-cull? */
-
-		if (ret == -ERANGE)
-			goto bad_type_length;
 
-		cachefiles_io_error_obj(object,
-					"Can't read xattr on %lu (err %d)",
-					d_backing_inode(dentry)->i_ino, -ret);
-		goto error;
-	}
-
-	/* check the on-disk object */
-	if (ret < 1)
-		goto bad_type_length;
-
-	if (auxbuf->type != auxdata->type)
-		goto stale;
-
-	auxbuf->len = ret;
-
-	/* consult the netfs */
-	if (object->fscache.cookie->def->check_aux) {
-		enum fscache_checkaux result;
-		unsigned int dlen;
-
-		dlen = auxbuf->len - 1;
-
-		_debug("checkaux %s #%u",
-		       object->fscache.cookie->def->name, dlen);
-
-		result = fscache_check_aux(&object->fscache,
-					   &auxbuf->data, dlen,
-					   i_size_read(d_backing_inode(dentry)));
-
-		switch (result) {
-			/* entry okay as is */
-		case FSCACHE_CHECKAUX_OKAY:
-			goto okay;
-
-			/* entry requires update */
-		case FSCACHE_CHECKAUX_NEEDS_UPDATE:
-			break;
-
-			/* entry requires deletion */
-		case FSCACHE_CHECKAUX_OBSOLETE:
-			goto stale;
-
-		default:
-			BUG();
-		}
-
-		/* update the current label */
-		ret = vfs_setxattr(&init_user_ns, dentry,
-				   cachefiles_xattr_cache, &auxdata->type,
-				   auxdata->len, XATTR_REPLACE);
-		if (ret < 0) {
-			cachefiles_io_error_obj(object,
-						"Can't update xattr on %lu"
-						" (error %d)",
-						d_backing_inode(dentry)->i_ino, -ret);
-			goto error;
-		}
-	}
-
-okay:
-	ret = 0;
+	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
+	if (ret == tlen &&
+	    buf->type == object->fscache.cookie->def->type &&
+	    memcmp(buf->data, p, len) == 0)
+		ret = 0;
+	else
+		ret = -ESTALE;
 
-error:
-	kfree(auxbuf);
-	_leave(" = %d", ret);
+	kfree(buf);
 	return ret;
-
-bad_type_length:
-	pr_err("Cache object %lu xattr length incorrect\n",
-	       d_backing_inode(dentry)->i_ino);
-	ret = -EIO;
-	goto error;
-
-stale:
-	ret = -ESTALE;
-	goto error;
 }
 
 /*
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index cfa60c2faf68..efcdb40267d6 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -30,6 +30,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
 	list_for_each_entry(tag, &fscache_cache_tag_list, link) {
 		if (strcmp(tag->name, name) == 0) {
 			atomic_inc(&tag->usage);
+			refcount_inc(&tag->ref);
 			up_read(&fscache_addremove_sem);
 			return tag;
 		}
@@ -44,6 +45,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
 		return ERR_PTR(-ENOMEM);
 
 	atomic_set(&xtag->usage, 1);
+	refcount_set(&xtag->ref, 1);
 	strcpy(xtag->name, name);
 
 	/* write lock, search again and add if still not present */
@@ -52,6 +54,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
 	list_for_each_entry(tag, &fscache_cache_tag_list, link) {
 		if (strcmp(tag->name, name) == 0) {
 			atomic_inc(&tag->usage);
+			refcount_inc(&tag->ref);
 			up_write(&fscache_addremove_sem);
 			kfree(xtag);
 			return tag;
@@ -64,7 +67,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
 }
 
 /*
- * release a reference to a cache tag
+ * Unuse a cache tag
  */
 void __fscache_release_cache_tag(struct fscache_cache_tag *tag)
 {
@@ -77,8 +80,7 @@ void __fscache_release_cache_tag(struct fscache_cache_tag *tag)
 			tag = NULL;
 
 		up_write(&fscache_addremove_sem);
-
-		kfree(tag);
+		fscache_put_cache_tag(tag);
 	}
 }
 
@@ -130,20 +132,10 @@ struct fscache_cache *fscache_select_cache_for_object(
 
 	spin_unlock(&cookie->lock);
 
-	if (!cookie->def->select_cache)
-		goto no_preference;
-
-	/* ask the netfs for its preference */
-	tag = cookie->def->select_cache(cookie->parent->netfs_data,
-					cookie->netfs_data);
+	tag = cookie->preferred_cache;
 	if (!tag)
 		goto no_preference;
 
-	if (tag == ERR_PTR(-ENOMEM)) {
-		_leave(" = NULL [nomem tag]");
-		return NULL;
-	}
-
 	if (!tag->cache) {
 		_leave(" = NULL [unbacked tag]");
 		return NULL;
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 8a850c3d0775..a772d4b6cacd 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -43,11 +43,10 @@ static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 	       cookie->flags,
 	       atomic_read(&cookie->n_children),
 	       atomic_read(&cookie->n_active));
-	pr_err("%c-cookie d=%p{%s} n=%p\n",
+	pr_err("%c-cookie d=%p{%s}\n",
 	       prefix,
 	       cookie->def,
-	       cookie->def ? cookie->def->name : "?",
-	       cookie->netfs_data);
+	       cookie->def ? cookie->def->name : "?");
 
 	o = READ_ONCE(cookie->backing_objects.first);
 	if (o) {
@@ -74,6 +73,7 @@ void fscache_free_cookie(struct fscache_cookie *cookie)
 			kfree(cookie->aux);
 		if (cookie->key_len > sizeof(cookie->inline_key))
 			kfree(cookie->key);
+		fscache_put_cache_tag(cookie->preferred_cache);
 		kmem_cache_free(fscache_cookie_jar, cookie);
 	}
 }
@@ -138,9 +138,9 @@ static atomic_t fscache_cookie_debug_id = ATOMIC_INIT(1);
 struct fscache_cookie *fscache_alloc_cookie(
 	struct fscache_cookie *parent,
 	const struct fscache_cookie_def *def,
+	struct fscache_cache_tag *preferred_cache,
 	const void *index_key, size_t index_key_len,
 	const void *aux_data, size_t aux_data_len,
-	void *netfs_data,
 	loff_t object_size)
 {
 	struct fscache_cookie *cookie;
@@ -175,7 +175,9 @@ struct fscache_cookie *fscache_alloc_cookie(
 
 	cookie->def		= def;
 	cookie->parent		= parent;
-	cookie->netfs_data	= netfs_data;
+
+	cookie->preferred_cache	= fscache_get_cache_tag(preferred_cache);
+
 	cookie->flags		= (1 << FSCACHE_COOKIE_NO_DATA_YET);
 	cookie->type		= def->type;
 	spin_lock_init(&cookie->lock);
@@ -240,7 +242,6 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
  *   - the top level index cookie for each netfs is stored in the fscache_netfs
  *     struct upon registration
  * - def points to the definition
- * - the netfs_data will be passed to the functions pointed to in *def
  * - all attached caches will be searched to see if they contain this object
  * - index objects aren't stored on disk until there's a dependent file that
  *   needs storing
@@ -252,9 +253,9 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
 struct fscache_cookie *__fscache_acquire_cookie(
 	struct fscache_cookie *parent,
 	const struct fscache_cookie_def *def,
+	struct fscache_cache_tag *preferred_cache,
 	const void *index_key, size_t index_key_len,
 	const void *aux_data, size_t aux_data_len,
-	void *netfs_data,
 	loff_t object_size,
 	bool enable)
 {
@@ -262,9 +263,9 @@ struct fscache_cookie *__fscache_acquire_cookie(
 
 	BUG_ON(!def);
 
-	_enter("{%s},{%s},%p,%u",
+	_enter("{%s},{%s},%u",
 	       parent ? (char *) parent->def->name : "<no-parent>",
-	       def->name, netfs_data, enable);
+	       def->name, enable);
 
 	if (!index_key || !index_key_len || index_key_len > 255 || aux_data_len > 255)
 		return NULL;
@@ -288,10 +289,10 @@ struct fscache_cookie *__fscache_acquire_cookie(
 	BUG_ON(def->type == FSCACHE_COOKIE_TYPE_INDEX &&
 	       parent->type != FSCACHE_COOKIE_TYPE_INDEX);
 
-	candidate = fscache_alloc_cookie(parent, def,
+	candidate = fscache_alloc_cookie(parent, def, preferred_cache,
 					 index_key, index_key_len,
 					 aux_data, aux_data_len,
-					 netfs_data, object_size);
+					 object_size);
 	if (!candidate) {
 		fscache_stat(&fscache_n_acquires_oom);
 		_leave(" [ENOMEM]");
@@ -812,7 +813,6 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
 	__fscache_disable_cookie(cookie, aux_data, retire);
 
 	/* Clear pointers back to the netfs */
-	cookie->netfs_data	= NULL;
 	cookie->def		= NULL;
 
 	if (cookie->parent) {
@@ -978,8 +978,8 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
 
 	if (v == &fscache_cookies) {
 		seq_puts(m,
-			 "COOKIE   PARENT   USAGE CHILD ACT TY FL  DEF              NETFS_DATA\n"
-			 "======== ======== ===== ===== === == === ================ ==========\n"
+			 "COOKIE   PARENT   USAGE CHILD ACT TY FL  DEF             \n"
+			 "======== ======== ===== ===== === == === ================\n"
 			 );
 		return 0;
 	}
@@ -1001,7 +1001,7 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
 	}
 
 	seq_printf(m,
-		   "%08x %08x %5u %5u %3u %s %03lx %-16s %px",
+		   "%08x %08x %5u %5u %3u %s %03lx %-16s",
 		   cookie->debug_id,
 		   cookie->parent ? cookie->parent->debug_id : 0,
 		   refcount_read(&cookie->ref),
@@ -1009,8 +1009,7 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
 		   atomic_read(&cookie->n_active),
 		   type,
 		   cookie->flags,
-		   cookie->def->name,
-		   cookie->netfs_data);
+		   cookie->def->name);
 
 	keylen = cookie->key_len;
 	auxlen = cookie->aux_len;
diff --git a/fs/fscache/fsdef.c b/fs/fscache/fsdef.c
index 0402673c680e..111946ba9ce1 100644
--- a/fs/fscache/fsdef.c
+++ b/fs/fscache/fsdef.c
@@ -9,12 +9,6 @@
 #include <linux/module.h>
 #include "internal.h"
 
-static
-enum fscache_checkaux fscache_fsdef_netfs_check_aux(void *cookie_netfs_data,
-						    const void *data,
-						    uint16_t datalen,
-						    loff_t object_size);
-
 /*
  * The root index is owned by FS-Cache itself.
  *
@@ -64,35 +58,4 @@ EXPORT_SYMBOL(fscache_fsdef_index);
 struct fscache_cookie_def fscache_fsdef_netfs_def = {
 	.name		= "FSDEF.netfs",
 	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-	.check_aux	= fscache_fsdef_netfs_check_aux,
 };
-
-/*
- * check that the index structure version number stored in the auxiliary data
- * matches the one the netfs gave us
- */
-static enum fscache_checkaux fscache_fsdef_netfs_check_aux(
-	void *cookie_netfs_data,
-	const void *data,
-	uint16_t datalen,
-	loff_t object_size)
-{
-	struct fscache_netfs *netfs = cookie_netfs_data;
-	uint32_t version;
-
-	_enter("{%s},,%hu", netfs->name, datalen);
-
-	if (datalen != sizeof(version)) {
-		_leave(" = OBSOLETE [dl=%d v=%zu]", datalen, sizeof(version));
-		return FSCACHE_CHECKAUX_OBSOLETE;
-	}
-
-	memcpy(&version, data, sizeof(version));
-	if (version != netfs->version) {
-		_leave(" = OBSOLETE [ver=%x net=%x]", version, netfs->version);
-		return FSCACHE_CHECKAUX_OBSOLETE;
-	}
-
-	_leave(" = OKAY");
-	return FSCACHE_CHECKAUX_OKAY;
-}
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 6eb3f51d7275..29f21fbd5b5d 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -24,6 +24,7 @@
 
 #define pr_fmt(fmt) "FS-Cache: " fmt
 
+#include <linux/slab.h>
 #include <linux/fscache-cache.h>
 #include <trace/events/fscache.h>
 #include <linux/sched.h>
@@ -41,6 +42,20 @@ extern struct rw_semaphore fscache_addremove_sem;
 extern struct fscache_cache *fscache_select_cache_for_object(
 	struct fscache_cookie *);
 
+static inline
+struct fscache_cache_tag *fscache_get_cache_tag(struct fscache_cache_tag *tag)
+{
+	if (tag)
+		refcount_inc(&tag->ref);
+	return tag;
+}
+
+static inline void fscache_put_cache_tag(struct fscache_cache_tag *tag)
+{
+	if (tag && refcount_dec_and_test(&tag->ref))
+		kfree(tag);
+}
+
 /*
  * cookie.c
  */
@@ -50,9 +65,10 @@ extern const struct seq_operations fscache_cookies_seq_ops;
 extern void fscache_free_cookie(struct fscache_cookie *);
 extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
 						   const struct fscache_cookie_def *,
+						   struct fscache_cache_tag *,
 						   const void *, size_t,
 						   const void *, size_t,
-						   void *, loff_t);
+						   loff_t);
 extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *);
 extern struct fscache_cookie *fscache_cookie_get(struct fscache_cookie *,
 						 enum fscache_cookie_trace);
@@ -270,16 +286,9 @@ static inline void fscache_raise_event(struct fscache_object *object,
 static inline
 void fscache_update_aux(struct fscache_cookie *cookie, const void *aux_data)
 {
-	void *p;
-
-	if (!aux_data)
-		return;
-	if (cookie->aux_len <= sizeof(cookie->inline_aux))
-		p = cookie->inline_aux;
-	else
-		p = cookie->aux;
+	void *p = fscache_get_aux(cookie);
 
-	if (memcmp(p, aux_data, cookie->aux_len) != 0) {
+	if (p && memcmp(p, aux_data, cookie->aux_len) != 0) {
 		memcpy(p, aux_data, cookie->aux_len);
 		set_bit(FSCACHE_COOKIE_AUX_UPDATED, &cookie->flags);
 	}
diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
index d6bdb7b5e723..b8db06804876 100644
--- a/fs/fscache/netfs.c
+++ b/fs/fscache/netfs.c
@@ -22,9 +22,10 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
 	/* allocate a cookie for the primary index */
 	candidate = fscache_alloc_cookie(&fscache_fsdef_index,
 					 &fscache_fsdef_netfs_def,
+					 NULL,
 					 netfs->name, strlen(netfs->name),
 					 &netfs->version, sizeof(netfs->version),
-					 netfs, 0);
+					 0);
 	if (!candidate) {
 		_leave(" = -ENOMEM");
 		return -ENOMEM;
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 86ad941726f7..cdcf6720d748 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -901,55 +901,6 @@ static void fscache_dequeue_object(struct fscache_object *object)
 	_leave("");
 }
 
-/**
- * fscache_check_aux - Ask the netfs whether an object on disk is still valid
- * @object: The object to ask about
- * @data: The auxiliary data for the object
- * @datalen: The size of the auxiliary data
- * @object_size: The size of the object according to the server.
- *
- * This function consults the netfs about the coherency state of an object.
- * The caller must be holding a ref on cookie->n_active (held by
- * fscache_look_up_object() on behalf of the cache backend during object lookup
- * and creation).
- */
-enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
-					const void *data, uint16_t datalen,
-					loff_t object_size)
-{
-	enum fscache_checkaux result;
-
-	if (!object->cookie->def->check_aux) {
-		fscache_stat(&fscache_n_checkaux_none);
-		return FSCACHE_CHECKAUX_OKAY;
-	}
-
-	result = object->cookie->def->check_aux(object->cookie->netfs_data,
-						data, datalen, object_size);
-	switch (result) {
-		/* entry okay as is */
-	case FSCACHE_CHECKAUX_OKAY:
-		fscache_stat(&fscache_n_checkaux_okay);
-		break;
-
-		/* entry requires update */
-	case FSCACHE_CHECKAUX_NEEDS_UPDATE:
-		fscache_stat(&fscache_n_checkaux_update);
-		break;
-
-		/* entry requires deletion */
-	case FSCACHE_CHECKAUX_OBSOLETE:
-		fscache_stat(&fscache_n_checkaux_obsolete);
-		break;
-
-	default:
-		BUG();
-	}
-
-	return result;
-}
-EXPORT_SYMBOL(fscache_check_aux);
-
 /*
  * Asynchronously invalidate an object.
  */
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 5e610f9a524c..264b94dad5be 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -45,8 +45,9 @@ struct fscache_cache_tag {
 	struct fscache_cache	*cache;		/* cache referred to by this tag */
 	unsigned long		flags;
 #define FSCACHE_TAG_RESERVED	0		/* T if tag is reserved for a cache */
-	atomic_t		usage;
-	char			name[];	/* tag name */
+	atomic_t		usage;		/* Number of using netfs's */
+	refcount_t		ref;		/* Reference count on structure */
+	char			name[];		/* tag name */
 };
 
 /*
@@ -415,11 +416,6 @@ extern void fscache_io_error(struct fscache_cache *cache);
 
 extern bool fscache_object_sleep_till_congested(signed long *timeoutp);
 
-extern enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
-					       const void *data,
-					       uint16_t datalen,
-					       loff_t object_size);
-
 extern void fscache_object_retrying_stale(struct fscache_object *object);
 
 enum fscache_why_object_killed {
@@ -431,4 +427,26 @@ enum fscache_why_object_killed {
 extern void fscache_object_mark_killed(struct fscache_object *object,
 				       enum fscache_why_object_killed why);
 
+/*
+ * Find the key on a cookie.
+ */
+static inline void *fscache_get_key(struct fscache_cookie *cookie)
+{
+	if (cookie->key_len <= sizeof(cookie->inline_key))
+		return cookie->inline_key;
+	else
+		return cookie->key;
+}
+
+/*
+ * Find the auxiliary data on a cookie.
+ */
+static inline void *fscache_get_aux(struct fscache_cookie *cookie)
+{
+	if (cookie->aux_len <= sizeof(cookie->inline_aux))
+		return cookie->inline_aux;
+	else
+		return cookie->aux;
+}
+
 #endif /* _LINUX_FSCACHE_CACHE_H */
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index ba4878b56717..cedab654dc79 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -38,13 +38,6 @@ struct fscache_cookie;
 struct fscache_netfs;
 struct netfs_read_request;
 
-/* result of index entry consultation */
-enum fscache_checkaux {
-	FSCACHE_CHECKAUX_OKAY,		/* entry okay as is */
-	FSCACHE_CHECKAUX_NEEDS_UPDATE,	/* entry requires update */
-	FSCACHE_CHECKAUX_OBSOLETE,	/* entry requires deletion */
-};
-
 /*
  * fscache cookie definition
  */
@@ -56,26 +49,6 @@ struct fscache_cookie_def {
 	uint8_t type;
 #define FSCACHE_COOKIE_TYPE_INDEX	0
 #define FSCACHE_COOKIE_TYPE_DATAFILE	1
-
-	/* select the cache into which to insert an entry in this index
-	 * - optional
-	 * - should return a cache identifier or NULL to cause the cache to be
-	 *   inherited from the parent if possible or the first cache picked
-	 *   for a non-index file if not
-	 */
-	struct fscache_cache_tag *(*select_cache)(
-		const void *parent_netfs_data,
-		const void *cookie_netfs_data);
-
-	/* consult the netfs about the state of an object
-	 * - this function can be absent if the index carries no state data
-	 * - the netfs data from the cookie being used as the target is
-	 *   presented, as is the auxiliary data and the object size
-	 */
-	enum fscache_checkaux (*check_aux)(void *cookie_netfs_data,
-					   const void *data,
-					   uint16_t datalen,
-					   loff_t object_size);
 };
 
 /*
@@ -105,9 +78,9 @@ struct fscache_cookie {
 	struct hlist_head		backing_objects; /* object(s) backing this file/index */
 	const struct fscache_cookie_def	*def;		/* definition */
 	struct fscache_cookie		*parent;	/* parent of this entry */
+	struct fscache_cache_tag	*preferred_cache; /* The preferred cache or NULL */
 	struct hlist_bl_node		hash_link;	/* Link in hash table */
 	struct list_head		proc_link;	/* Link in proc list */
-	void				*netfs_data;	/* back pointer to netfs */
 
 	unsigned long			flags;
 #define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */
@@ -156,9 +129,10 @@ extern void __fscache_release_cache_tag(struct fscache_cache_tag *);
 extern struct fscache_cookie *__fscache_acquire_cookie(
 	struct fscache_cookie *,
 	const struct fscache_cookie_def *,
+	struct fscache_cache_tag *,
 	const void *, size_t,
 	const void *, size_t,
-	void *, loff_t, bool);
+	loff_t, bool);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, const void *, bool);
 extern int __fscache_check_consistency(struct fscache_cookie *, const void *);
 extern void __fscache_update_cookie(struct fscache_cookie *, const void *);
@@ -252,6 +226,7 @@ void fscache_release_cache_tag(struct fscache_cache_tag *tag)
  * fscache_acquire_cookie - Acquire a cookie to represent a cache object
  * @parent: The cookie that's to be the parent of this one
  * @def: A description of the cache object, including callback operations
+ * @preferred_cache: The cache to use (or NULL)
  * @index_key: The index key for this cookie
  * @index_key_len: Size of the index key
  * @aux_data: The auxiliary data for the cookie (may be NULL)
@@ -272,19 +247,19 @@ static inline
 struct fscache_cookie *fscache_acquire_cookie(
 	struct fscache_cookie *parent,
 	const struct fscache_cookie_def *def,
+	struct fscache_cache_tag *preferred_cache,
 	const void *index_key,
 	size_t index_key_len,
 	const void *aux_data,
 	size_t aux_data_len,
-	void *netfs_data,
 	loff_t object_size,
 	bool enable)
 {
 	if (fscache_cookie_valid(parent) && fscache_cookie_enabled(parent))
-		return __fscache_acquire_cookie(parent, def,
+		return __fscache_acquire_cookie(parent, def, preferred_cache,
 						index_key, index_key_len,
 						aux_data, aux_data_len,
-						netfs_data, object_size, enable);
+						object_size, enable);
 	else
 		return NULL;
 }



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 08/67] fscache: Remove struct fscache_cookie_def
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (6 preceding siblings ...)
  2021-10-18 14:52 ` [PATCH 07/67] fscache: Remove the netfs data from the cookie David Howells
@ 2021-10-18 14:52 ` David Howells
  2021-10-18 14:52 ` [PATCH 09/67] fscache: Remove store_limit* from struct fscache_object David Howells
                   ` (62 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:52 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove the cookie definition structure so that there's no pointers from
that back into the network filesystem.  All of the method pointers that
were in there have been removed anyway.  Any remaining information is
stashed in the cookie.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cache.c                 |   15 ------------
 fs/afs/cell.c                  |    4 ++-
 fs/afs/inode.c                 |   15 +++++++-----
 fs/afs/internal.h              |    7 -----
 fs/afs/volume.c                |    5 ++--
 fs/cachefiles/interface.c      |    4 ++-
 fs/cachefiles/xattr.c          |    6 ++---
 fs/fscache/cookie.c            |   51 ++++++++++++++++++----------------------
 fs/fscache/fsdef.c             |   17 +------------
 fs/fscache/internal.h          |    5 ++--
 fs/fscache/io.c                |    2 +-
 fs/fscache/netfs.c             |    5 ++--
 fs/fscache/object.c            |    2 +-
 fs/fscache/page.c              |    2 +-
 include/linux/fscache.h        |   37 +++++++++++++++--------------
 include/trace/events/fscache.h |    2 +-
 16 files changed, 73 insertions(+), 106 deletions(-)

diff --git a/fs/afs/cache.c b/fs/afs/cache.c
index 9b2de3014dbf..0ee9ede6fc67 100644
--- a/fs/afs/cache.c
+++ b/fs/afs/cache.c
@@ -12,18 +12,3 @@ struct fscache_netfs afs_cache_netfs = {
 	.name			= "afs",
 	.version		= 2,
 };
-
-struct fscache_cookie_def afs_cell_cache_index_def = {
-	.name		= "AFS.cell",
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
-struct fscache_cookie_def afs_volume_cache_index_def = {
-	.name		= "AFS.volume",
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
-struct fscache_cookie_def afs_vnode_cache_index_def = {
-	.name		= "AFS.vnode",
-	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
-};
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index 416f9bd638a5..ebf140317232 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -682,7 +682,9 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell)
 
 #ifdef CONFIG_AFS_FSCACHE
 	cell->cache = fscache_acquire_cookie(afs_cache_netfs.primary_index,
-					     &afs_cell_cache_index_def,
+					     FSCACHE_COOKIE_TYPE_INDEX,
+					     "AFS.cell",
+					     0,
 					     NULL,
 					     cell->name, strlen(cell->name),
 					     NULL, 0,
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 3b696ac7c05a..f761c9a5067f 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -430,12 +430,15 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
 	key.vnode_id_ext[1]	= vnode->fid.vnode_hi;
 	aux.data_version	= vnode->status.data_version;
 
-	vnode->cache = fscache_acquire_cookie(vnode->volume->cache,
-					      &afs_vnode_cache_index_def,
-					      NULL,
-					      &key, sizeof(key),
-					      &aux, sizeof(aux),
-					      vnode->status.size, true);
+	vnode->cache = fscache_acquire_cookie(
+		vnode->volume->cache,
+		FSCACHE_COOKIE_TYPE_DATAFILE,
+		"AFS.vnode",
+		vnode->status.type == AFS_FTYPE_FILE ? 0 : FSCACHE_ADV_SINGLE_CHUNK,
+		NULL,
+		&key, sizeof(key),
+		&aux, sizeof(aux),
+		vnode->status.size, true);
 #endif
 }
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 0ad97a8fc0d4..bdcc677338fb 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -962,13 +962,6 @@ extern void afs_merge_fs_addr6(struct afs_addr_list *, __be32 *, u16);
  */
 #ifdef CONFIG_AFS_FSCACHE
 extern struct fscache_netfs afs_cache_netfs;
-extern struct fscache_cookie_def afs_cell_cache_index_def;
-extern struct fscache_cookie_def afs_volume_cache_index_def;
-extern struct fscache_cookie_def afs_vnode_cache_index_def;
-#else
-#define afs_cell_cache_index_def	(*(struct fscache_cookie_def *) NULL)
-#define afs_volume_cache_index_def	(*(struct fscache_cookie_def *) NULL)
-#define afs_vnode_cache_index_def	(*(struct fscache_cookie_def *) NULL)
 #endif
 
 /*
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 9ca246ab1a86..5eaaa762fbd9 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -272,8 +272,9 @@ void afs_activate_volume(struct afs_volume *volume)
 {
 #ifdef CONFIG_AFS_FSCACHE
 	volume->cache = fscache_acquire_cookie(volume->cell->cache,
-					       &afs_volume_cache_index_def,
-					       NULL,
+					       FSCACHE_COOKIE_TYPE_INDEX,
+					       "AFS.vol",
+					       0, NULL,
 					       &volume->vid, sizeof(volume->vid),
 					       NULL, 0, 0, true);
 #endif
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 6c48e81deccc..4de22ab53d73 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -41,7 +41,7 @@ static struct fscache_object *cachefiles_alloc_object(
 
 	fscache_object_init(&object->fscache, cookie, &cache->cache);
 
-	object->type = cookie->def->type;
+	object->type = cookie->type;
 
 	/* get hold of the raw key
 	 * - stick the length on the front and leave space on the back for the
@@ -109,7 +109,7 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
 
 	/* polish off by setting the attributes of non-index files */
 	if (ret == 0 &&
-	    object->fscache.cookie->def->type != FSCACHE_COOKIE_TYPE_INDEX)
+	    object->fscache.cookie->type != FSCACHE_COOKIE_TYPE_INDEX)
 		cachefiles_attr_changed(&object->fscache);
 
 	if (ret < 0 && ret != -ETIMEDOUT) {
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index c99952404932..bcc4a2dfe1e8 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -39,7 +39,7 @@ int cachefiles_check_object_type(struct cachefiles_object *object)
 	if (!object->fscache.cookie)
 		strcpy(type, "C3");
 	else
-		snprintf(type, 3, "%02x", object->fscache.cookie->def->type);
+		snprintf(type, 3, "%02x", object->fscache.cookie->type);
 
 	_enter("%x{%s}", object->fscache.debug_id, type);
 
@@ -119,7 +119,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 	if (!buf)
 		return -ENOMEM;
 
-	buf->type = object->fscache.cookie->def->type;
+	buf->type = object->fscache.cookie->type;
 	if (len > 0)
 		memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len);
 
@@ -158,7 +158,7 @@ int cachefiles_check_auxdata(struct cachefiles_object *object)
 
 	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
 	if (ret == tlen &&
-	    buf->type == object->fscache.cookie->def->type &&
+	    buf->type == object->fscache.cookie->type &&
 	    memcmp(buf->data, p, len) == 0)
 		ret = 0;
 	else
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index a772d4b6cacd..63e4a7db5804 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -43,10 +43,9 @@ static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 	       cookie->flags,
 	       atomic_read(&cookie->n_children),
 	       atomic_read(&cookie->n_active));
-	pr_err("%c-cookie d=%p{%s}\n",
+	pr_err("%c-cookie d=%s\n",
 	       prefix,
-	       cookie->def,
-	       cookie->def ? cookie->def->name : "?");
+	       cookie->type_name);
 
 	o = READ_ONCE(cookie->backing_objects.first);
 	if (o) {
@@ -137,7 +136,9 @@ static atomic_t fscache_cookie_debug_id = ATOMIC_INIT(1);
  */
 struct fscache_cookie *fscache_alloc_cookie(
 	struct fscache_cookie *parent,
-	const struct fscache_cookie_def *def,
+	enum fscache_cookie_type type,
+	const char *type_name,
+	u8 advice,
 	struct fscache_cache_tag *preferred_cache,
 	const void *index_key, size_t index_key_len,
 	const void *aux_data, size_t aux_data_len,
@@ -150,8 +151,11 @@ struct fscache_cookie *fscache_alloc_cookie(
 	if (!cookie)
 		return NULL;
 
+	cookie->type = type;
+	cookie->advice = advice;
 	cookie->key_len = index_key_len;
 	cookie->aux_len = aux_data_len;
+	strlcpy(cookie->type_name, type_name, sizeof(cookie->type_name));
 
 	if (fscache_set_key(cookie, index_key, index_key_len) < 0)
 		goto nomem;
@@ -173,13 +177,10 @@ struct fscache_cookie *fscache_alloc_cookie(
 	 */
 	atomic_set(&cookie->n_active, 1);
 
-	cookie->def		= def;
 	cookie->parent		= parent;
-
 	cookie->preferred_cache	= fscache_get_cache_tag(preferred_cache);
 
 	cookie->flags		= (1 << FSCACHE_COOKIE_NO_DATA_YET);
-	cookie->type		= def->type;
 	spin_lock_init(&cookie->lock);
 	INIT_HLIST_HEAD(&cookie->backing_objects);
 
@@ -241,7 +242,6 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
  * - parent specifies the parent object
  *   - the top level index cookie for each netfs is stored in the fscache_netfs
  *     struct upon registration
- * - def points to the definition
  * - all attached caches will be searched to see if they contain this object
  * - index objects aren't stored on disk until there's a dependent file that
  *   needs storing
@@ -252,7 +252,9 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
  */
 struct fscache_cookie *__fscache_acquire_cookie(
 	struct fscache_cookie *parent,
-	const struct fscache_cookie_def *def,
+	enum fscache_cookie_type type,
+	const char *type_name,
+	u8 advice,
 	struct fscache_cache_tag *preferred_cache,
 	const void *index_key, size_t index_key_len,
 	const void *aux_data, size_t aux_data_len,
@@ -261,11 +263,8 @@ struct fscache_cookie *__fscache_acquire_cookie(
 {
 	struct fscache_cookie *candidate, *cookie;
 
-	BUG_ON(!def);
-
 	_enter("{%s},{%s},%u",
-	       parent ? (char *) parent->def->name : "<no-parent>",
-	       def->name, enable);
+	       parent ? parent->type_name : "<no-parent>", type_name, enable);
 
 	if (!index_key || !index_key_len || index_key_len > 255 || aux_data_len > 255)
 		return NULL;
@@ -284,12 +283,11 @@ struct fscache_cookie *__fscache_acquire_cookie(
 	}
 
 	/* validate the definition */
-	BUG_ON(!def->name[0]);
-
-	BUG_ON(def->type == FSCACHE_COOKIE_TYPE_INDEX &&
+	BUG_ON(type == FSCACHE_COOKIE_TYPE_INDEX &&
 	       parent->type != FSCACHE_COOKIE_TYPE_INDEX);
 
-	candidate = fscache_alloc_cookie(parent, def, preferred_cache,
+	candidate = fscache_alloc_cookie(parent, type, type_name, advice,
+					 preferred_cache,
 					 index_key, index_key_len,
 					 aux_data, aux_data_len,
 					 object_size);
@@ -483,7 +481,7 @@ static int fscache_alloc_object(struct fscache_cache *cache,
 	struct fscache_object *object;
 	int ret;
 
-	_enter("%s,%x{%s}", cache->tag->name, cookie->debug_id, cookie->def->name);
+	_enter("%s,%x{%s}", cache->tag->name, cookie->debug_id, cookie->type_name);
 
 	spin_lock(&cookie->lock);
 	hlist_for_each_entry(object, &cookie->backing_objects,
@@ -510,7 +508,7 @@ static int fscache_alloc_object(struct fscache_cache *cache,
 	object->debug_id = atomic_inc_return(&fscache_object_debug_id);
 
 	_debug("ALLOC OBJ%x: %s {%lx}",
-	       object->debug_id, cookie->def->name, object->events);
+	       object->debug_id, cookie->type_name, object->events);
 
 	ret = fscache_alloc_object(cache, cookie->parent);
 	if (ret < 0)
@@ -558,7 +556,7 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
 	struct fscache_cache *cache = object->cache;
 	int ret;
 
-	_enter("{%s},{OBJ%x}", cookie->def->name, object->debug_id);
+	_enter("{%s},{OBJ%x}", cookie->type_name, object->debug_id);
 
 	ASSERTCMP(object->cookie, ==, cookie);
 
@@ -618,7 +616,7 @@ void __fscache_invalidate(struct fscache_cookie *cookie)
 {
 	struct fscache_object *object;
 
-	_enter("{%s}", cookie->def->name);
+	_enter("{%s}", cookie->type_name);
 
 	fscache_stat(&fscache_n_invalidates);
 
@@ -683,7 +681,7 @@ void __fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data
 		return;
 	}
 
-	_enter("{%s}", cookie->def->name);
+	_enter("{%s}", cookie->type_name);
 
 	spin_lock(&cookie->lock);
 
@@ -722,7 +720,7 @@ void __fscache_disable_cookie(struct fscache_cookie *cookie,
 
 	if (atomic_read(&cookie->n_children) != 0) {
 		pr_err("Cookie '%s' still has children\n",
-		       cookie->def->name);
+		       cookie->type_name);
 		BUG();
 	}
 
@@ -801,7 +799,7 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
 	}
 
 	_enter("%x{%s,%d},%d",
-	       cookie->debug_id, cookie->def->name,
+	       cookie->debug_id, cookie->type_name,
 	       atomic_read(&cookie->n_active), retire);
 
 	trace_fscache_relinquish(cookie, retire);
@@ -812,9 +810,6 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
 
 	__fscache_disable_cookie(cookie, aux_data, retire);
 
-	/* Clear pointers back to the netfs */
-	cookie->def		= NULL;
-
 	if (cookie->parent) {
 		ASSERTCMP(refcount_read(&cookie->parent->ref), >, 0);
 		ASSERTCMP(atomic_read(&cookie->parent->n_children), >, 0);
@@ -1009,7 +1004,7 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
 		   atomic_read(&cookie->n_active),
 		   type,
 		   cookie->flags,
-		   cookie->def->name);
+		   cookie->type_name);
 
 	keylen = cookie->key_len;
 	auxlen = cookie->aux_len;
diff --git a/fs/fscache/fsdef.c b/fs/fscache/fsdef.c
index 111946ba9ce1..15312f15848b 100644
--- a/fs/fscache/fsdef.c
+++ b/fs/fscache/fsdef.c
@@ -33,29 +33,14 @@
  * cache.  It can create whatever objects it likes in that index, including
  * further indices.
  */
-static struct fscache_cookie_def fscache_fsdef_index_def = {
-	.name		= ".FS-Cache",
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
 struct fscache_cookie fscache_fsdef_index = {
 	.debug_id	= 1,
 	.ref		= REFCOUNT_INIT(1),
 	.n_active	= ATOMIC_INIT(1),
 	.lock		= __SPIN_LOCK_UNLOCKED(fscache_fsdef_index.lock),
 	.backing_objects = HLIST_HEAD_INIT,
-	.def		= &fscache_fsdef_index_def,
+	.type_name	= ".fscach",
 	.flags		= 1 << FSCACHE_COOKIE_ENABLED,
 	.type		= FSCACHE_COOKIE_TYPE_INDEX,
 };
 EXPORT_SYMBOL(fscache_fsdef_index);
-
-/*
- * Definition of an entry in the root index.  Each entry is an index, keyed to
- * a specific netfs and only applicable to a particular version of the index
- * structure used by that netfs.
- */
-struct fscache_cookie_def fscache_fsdef_netfs_def = {
-	.name		= "FSDEF.netfs",
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 29f21fbd5b5d..bff83b14398b 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -64,7 +64,9 @@ extern const struct seq_operations fscache_cookies_seq_ops;
 
 extern void fscache_free_cookie(struct fscache_cookie *);
 extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
-						   const struct fscache_cookie_def *,
+						   enum fscache_cookie_type,
+						   const char *,
+						   u8,
 						   struct fscache_cache_tag *,
 						   const void *, size_t,
 						   const void *, size_t,
@@ -86,7 +88,6 @@ static inline void fscache_cookie_see(struct fscache_cookie *cookie,
  * fsdef.c
  */
 extern struct fscache_cookie fscache_fsdef_index;
-extern struct fscache_cookie_def fscache_fsdef_netfs_def;
 
 /*
  * main.c
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index e633808ba813..41f56a11b1a9 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -51,7 +51,7 @@ int __fscache_begin_operation(struct netfs_cache_resources *cres,
 		return -ENOBUFS;
 	}
 
-	ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
+	ASSERTCMP(cookie->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
 
 	if (fscache_wait_for_deferred_lookup(cookie) < 0)
 		return -ERESTARTSYS;
diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
index b8db06804876..8b0f303a7715 100644
--- a/fs/fscache/netfs.c
+++ b/fs/fscache/netfs.c
@@ -21,8 +21,9 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
 
 	/* allocate a cookie for the primary index */
 	candidate = fscache_alloc_cookie(&fscache_fsdef_index,
-					 &fscache_fsdef_netfs_def,
-					 NULL,
+					 FSCACHE_COOKIE_TYPE_INDEX,
+					 ".netfs",
+					 0, NULL,
 					 netfs->name, strlen(netfs->name),
 					 &netfs->version, sizeof(netfs->version),
 					 0);
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index cdcf6720d748..1a061df7cdcd 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -469,7 +469,7 @@ static const struct fscache_state *fscache_look_up_object(struct fscache_object
 	}
 
 	_debug("LOOKUP \"%s\" in \"%s\"",
-	       cookie->def->name, object->cache->tag->name);
+	       cookie->type_name, object->cache->tag->name);
 
 	fscache_stat(&fscache_n_object_lookups);
 	fscache_stat(&fscache_n_cop_lookup_object);
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 3fd6a2b45fed..2883d661f972 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -50,7 +50,7 @@ int __fscache_attr_changed(struct fscache_cookie *cookie)
 
 	_enter("%p", cookie);
 
-	ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
+	ASSERTCMP(cookie->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
 
 	fscache_stat(&fscache_n_attr_changed);
 
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index cedab654dc79..166e0f0a6e44 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -32,25 +32,18 @@
 #define fscache_resources_valid(cres) (false)
 #endif
 
-struct pagevec;
 struct fscache_cache_tag;
 struct fscache_cookie;
 struct fscache_netfs;
 struct netfs_read_request;
 
-/*
- * fscache cookie definition
- */
-struct fscache_cookie_def {
-	/* name of cookie type */
-	char name[16];
-
-	/* cookie type */
-	uint8_t type;
-#define FSCACHE_COOKIE_TYPE_INDEX	0
-#define FSCACHE_COOKIE_TYPE_DATAFILE	1
+enum fscache_cookie_type {
+	FSCACHE_COOKIE_TYPE_INDEX,
+	FSCACHE_COOKIE_TYPE_DATAFILE,
 };
 
+#define FSCACHE_ADV_SINGLE_CHUNK	0x01 /* The object is a single chunk of data */
+
 /*
  * fscache cached network filesystem type
  * - name, version and ops must be filled in before registration
@@ -76,11 +69,11 @@ struct fscache_cookie {
 	unsigned int			debug_id;
 	spinlock_t			lock;
 	struct hlist_head		backing_objects; /* object(s) backing this file/index */
-	const struct fscache_cookie_def	*def;		/* definition */
 	struct fscache_cookie		*parent;	/* parent of this entry */
 	struct fscache_cache_tag	*preferred_cache; /* The preferred cache or NULL */
 	struct hlist_bl_node		hash_link;	/* Link in hash table */
 	struct list_head		proc_link;	/* Link in proc list */
+	char				type_name[8];	/* Cookie type name */
 
 	unsigned long			flags;
 #define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */
@@ -94,7 +87,8 @@ struct fscache_cookie {
 #define FSCACHE_COOKIE_ACQUIRED		9	/* T if cookie is in use */
 #define FSCACHE_COOKIE_RELINQUISHING	10	/* T if cookie is being relinquished */
 
-	u8				type;		/* Type of object */
+	enum fscache_cookie_type	type:8;
+	u8				advice;		/* FSCACHE_COOKIE_ADV_* */
 	u8				key_len;	/* Length of index key */
 	u8				aux_len;	/* Length of auxiliary data */
 	u32				key_hash;	/* Hash of parent, type, key, len */
@@ -128,7 +122,9 @@ extern void __fscache_release_cache_tag(struct fscache_cache_tag *);
 
 extern struct fscache_cookie *__fscache_acquire_cookie(
 	struct fscache_cookie *,
-	const struct fscache_cookie_def *,
+	enum fscache_cookie_type,
+	const char *,
+	u8,
 	struct fscache_cache_tag *,
 	const void *, size_t,
 	const void *, size_t,
@@ -225,7 +221,9 @@ void fscache_release_cache_tag(struct fscache_cache_tag *tag)
 /**
  * fscache_acquire_cookie - Acquire a cookie to represent a cache object
  * @parent: The cookie that's to be the parent of this one
- * @def: A description of the cache object, including callback operations
+ * @type: Type of the cookie
+ * @type_name: Name of cookie type (max 7 chars)
+ * @advice: Advice flags (FSCACHE_COOKIE_ADV_*)
  * @preferred_cache: The cache to use (or NULL)
  * @index_key: The index key for this cookie
  * @index_key_len: Size of the index key
@@ -246,7 +244,9 @@ void fscache_release_cache_tag(struct fscache_cache_tag *tag)
 static inline
 struct fscache_cookie *fscache_acquire_cookie(
 	struct fscache_cookie *parent,
-	const struct fscache_cookie_def *def,
+	enum fscache_cookie_type type,
+	const char *type_name,
+	u8 advice,
 	struct fscache_cache_tag *preferred_cache,
 	const void *index_key,
 	size_t index_key_len,
@@ -256,7 +256,8 @@ struct fscache_cookie *fscache_acquire_cookie(
 	bool enable)
 {
 	if (fscache_cookie_valid(parent) && fscache_cookie_enabled(parent))
-		return __fscache_acquire_cookie(parent, def, preferred_cache,
+		return __fscache_acquire_cookie(parent, type, type_name, advice,
+						preferred_cache,
 						index_key, index_key_len,
 						aux_data, aux_data_len,
 						object_size, enable);
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 446392f5ba83..ccff379db5e0 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -223,7 +223,7 @@ TRACE_EVENT(fscache_acquire,
 		    __entry->p_ref		= refcount_read(&cookie->parent->ref);
 		    __entry->p_n_children	= atomic_read(&cookie->parent->n_children);
 		    __entry->p_flags		= cookie->parent->flags;
-		    memcpy(__entry->name, cookie->def->name, 8);
+		    memcpy(__entry->name, cookie->type_name, 8);
 		    __entry->name[7]		= 0;
 			   ),
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 09/67] fscache: Remove store_limit* from struct fscache_object
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (7 preceding siblings ...)
  2021-10-18 14:52 ` [PATCH 08/67] fscache: Remove struct fscache_cookie_def David Howells
@ 2021-10-18 14:52 ` David Howells
  2021-10-18 14:53 ` [PATCH 10/67] fscache: Remove fscache_check_consistency() David Howells
                   ` (61 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:52 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove the store_limit values from struct fscache_object and store the
object size in the cookie.  The netfs can update this at will, and we don't
want to call back into the netfs to fetch it.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c     |   10 ++--------
 fs/fscache/cookie.c           |   14 ++++++--------
 fs/fscache/object.c           |    2 --
 include/linux/fscache-cache.h |   22 ----------------------
 include/linux/fscache.h       |    1 +
 5 files changed, 9 insertions(+), 40 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 4de22ab53d73..48f7de947a16 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -359,7 +359,7 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 	loff_t oi_size;
 	int ret;
 
-	ni_size = _object->store_limit_l;
+	ni_size = _object->cookie->object_size;
 
 	_enter("{OBJ%x},[%llu]",
 	       _object->debug_id, (unsigned long long) ni_size);
@@ -376,8 +376,6 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 
 	ASSERT(d_is_reg(object->backer));
 
-	fscache_set_store_limit(&object->fscache, ni_size);
-
 	oi_size = i_size_read(d_backing_inode(object->backer));
 	if (oi_size == ni_size)
 		return 0;
@@ -406,7 +404,6 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 	cachefiles_end_secure(cache, saved_cred);
 
 	if (ret == -EIO) {
-		fscache_set_store_limit(&object->fscache, 0);
 		cachefiles_io_error_obj(object, "Size set failed");
 		ret = -ENOBUFS;
 	}
@@ -431,7 +428,7 @@ static void cachefiles_invalidate_object(struct fscache_operation *op)
 	cache = container_of(object->fscache.cache,
 			     struct cachefiles_cache, cache);
 
-	ni_size = op->object->store_limit_l;
+	ni_size = op->object->cookie->object_size;
 
 	_enter("{OBJ%x},[%llu]",
 	       op->object->debug_id, (unsigned long long)ni_size);
@@ -439,8 +436,6 @@ static void cachefiles_invalidate_object(struct fscache_operation *op)
 	if (object->backer) {
 		ASSERT(d_is_reg(object->backer));
 
-		fscache_set_store_limit(&object->fscache, ni_size);
-
 		path.dentry = object->backer;
 		path.mnt = cache->mnt;
 
@@ -451,7 +446,6 @@ static void cachefiles_invalidate_object(struct fscache_operation *op)
 		cachefiles_end_secure(cache, saved_cred);
 
 		if (ret != 0) {
-			fscache_set_store_limit(&object->fscache, 0);
 			if (ret == -EIO)
 				cachefiles_io_error_obj(object,
 							"Invalidate failed");
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 63e4a7db5804..02998ec9f7c9 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -22,8 +22,7 @@ static struct hlist_bl_head fscache_cookie_hash[1 << fscache_cookie_hash_shift];
 static LIST_HEAD(fscache_cookies);
 static DEFINE_RWLOCK(fscache_cookies_lock);
 
-static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie,
-					    loff_t object_size);
+static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie);
 static int fscache_alloc_object(struct fscache_cache *cache,
 				struct fscache_cookie *cookie);
 static int fscache_attach_object(struct fscache_cookie *cookie,
@@ -155,6 +154,7 @@ struct fscache_cookie *fscache_alloc_cookie(
 	cookie->advice = advice;
 	cookie->key_len = index_key_len;
 	cookie->aux_len = aux_data_len;
+	cookie->object_size = object_size;
 	strlcpy(cookie->type_name, type_name, sizeof(cookie->type_name));
 
 	if (fscache_set_key(cookie, index_key, index_key_len) < 0)
@@ -326,7 +326,7 @@ struct fscache_cookie *__fscache_acquire_cookie(
 		 * - we create indices on disk when we need them as an index
 		 * may exist in multiple caches */
 		if (cookie->type != FSCACHE_COOKIE_TYPE_INDEX) {
-			if (fscache_acquire_non_index_cookie(cookie, object_size) == 0) {
+			if (fscache_acquire_non_index_cookie(cookie) == 0) {
 				set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
 			} else {
 				atomic_dec(&parent->n_children);
@@ -365,6 +365,7 @@ void __fscache_enable_cookie(struct fscache_cookie *cookie,
 	wait_on_bit_lock(&cookie->flags, FSCACHE_COOKIE_ENABLEMENT_LOCK,
 			 TASK_UNINTERRUPTIBLE);
 
+	cookie->object_size = object_size;
 	fscache_update_aux(cookie, aux_data);
 
 	if (test_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags))
@@ -376,7 +377,7 @@ void __fscache_enable_cookie(struct fscache_cookie *cookie,
 		/* Wait for outstanding disablement to complete */
 		__fscache_wait_on_invalidate(cookie);
 
-		if (fscache_acquire_non_index_cookie(cookie, object_size) == 0)
+		if (fscache_acquire_non_index_cookie(cookie) == 0)
 			set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
 	} else {
 		set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
@@ -393,8 +394,7 @@ EXPORT_SYMBOL(__fscache_enable_cookie);
  * - this must make sure the index chain is instantiated and instantiate the
  *   object representation too
  */
-static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie,
-					    loff_t object_size)
+static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
 {
 	struct fscache_object *object;
 	struct fscache_cache *cache;
@@ -445,8 +445,6 @@ static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie,
 	object = hlist_entry(cookie->backing_objects.first,
 			     struct fscache_object, cookie_link);
 
-	fscache_set_store_limit(object, object_size);
-
 	/* initiate the process of looking up all the objects in the chain
 	 * (done by fscache_initialise_object()) */
 	fscache_raise_event(object, FSCACHE_OBJECT_EV_NEW_CHILD);
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 1a061df7cdcd..3fb5a1a6c131 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -315,8 +315,6 @@ void fscache_object_init(struct fscache_object *object,
 	object->n_children = 0;
 	object->n_ops = object->n_in_progress = object->n_exclusive = 0;
 	object->events = 0;
-	object->store_limit = 0;
-	object->store_limit_l = 0;
 	object->cache = cache;
 	object->cookie = cookie;
 	fscache_cookie_get(cookie, fscache_cookie_get_attach_object);
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 264b94dad5be..15cd2f04c9e6 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -275,8 +275,6 @@ struct fscache_object {
 	struct list_head	dependents;	/* FIFO of dependent objects */
 	struct list_head	dep_link;	/* link in parent's dependents list */
 	struct list_head	pending_ops;	/* unstarted operations on this object */
-	pgoff_t			store_limit;	/* current storage limit */
-	loff_t			store_limit_l;	/* current storage limit */
 };
 
 extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
@@ -337,26 +335,6 @@ static inline void fscache_object_lookup_error(struct fscache_object *object)
 	set_bit(FSCACHE_OBJECT_EV_ERROR, &object->events);
 }
 
-/**
- * fscache_set_store_limit - Set the maximum size to be stored in an object
- * @object: The object to set the maximum on
- * @i_size: The limit to set in bytes
- *
- * Set the maximum size an object is permitted to reach, implying the highest
- * byte that may be written.  Intended to be called by the attr_changed() op.
- *
- * See Documentation/filesystems/caching/backend-api.rst for a complete
- * description.
- */
-static inline
-void fscache_set_store_limit(struct fscache_object *object, loff_t i_size)
-{
-	object->store_limit_l = i_size;
-	object->store_limit = i_size >> PAGE_SHIFT;
-	if (i_size & ~PAGE_MASK)
-		object->store_limit++;
-}
-
 static inline void __fscache_use_cookie(struct fscache_cookie *cookie)
 {
 	atomic_inc(&cookie->n_active);
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 166e0f0a6e44..98685bd7204b 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -74,6 +74,7 @@ struct fscache_cookie {
 	struct hlist_bl_node		hash_link;	/* Link in hash table */
 	struct list_head		proc_link;	/* Link in proc list */
 	char				type_name[8];	/* Cookie type name */
+	loff_t				object_size;	/* Size of the netfs object */
 
 	unsigned long			flags;
 #define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 10/67] fscache: Remove fscache_check_consistency()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (8 preceding siblings ...)
  2021-10-18 14:52 ` [PATCH 09/67] fscache: Remove store_limit* from struct fscache_object David Howells
@ 2021-10-18 14:53 ` David Howells
  2021-10-18 14:53 ` [PATCH 11/67] fscache: Remove fscache_attr_changed() David Howells
                   ` (60 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:53 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove fscache_check_consistency() as that allows the netfs to pry into the
inner working of the cache - and what's in the cookie should be taken as
consistent with the disk (possibly lazily).

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c     |   26 -------------
 fs/fscache/cookie.c           |   79 -----------------------------------------
 include/linux/fscache-cache.h |    4 --
 include/linux/fscache.h       |   23 ------------
 4 files changed, 132 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 48f7de947a16..460dd190dad5 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -320,31 +320,6 @@ static void cachefiles_sync_cache(struct fscache_cache *_cache)
 				    ret);
 }
 
-/*
- * check if the backing cache is updated to FS-Cache
- * - called by FS-Cache when evaluates if need to invalidate the cache
- */
-static int cachefiles_check_consistency(struct fscache_operation *op)
-{
-	struct cachefiles_object *object;
-	struct cachefiles_cache *cache;
-	const struct cred *saved_cred;
-	int ret;
-
-	_enter("{OBJ%x}", op->object->debug_id);
-
-	object = container_of(op->object, struct cachefiles_object, fscache);
-	cache = container_of(object->fscache.cache,
-			     struct cachefiles_cache, cache);
-
-	cachefiles_begin_secure(cache, &saved_cred);
-	ret = cachefiles_check_auxdata(object);
-	cachefiles_end_secure(cache, saved_cred);
-
-	_leave(" = %d", ret);
-	return ret;
-}
-
 /*
  * notification the attributes on an object have changed
  * - called with reads/writes excluded by FS-Cache
@@ -468,6 +443,5 @@ const struct fscache_cache_ops cachefiles_cache_ops = {
 	.put_object		= cachefiles_put_object,
 	.sync_cache		= cachefiles_sync_cache,
 	.attr_changed		= cachefiles_attr_changed,
-	.check_consistency	= cachefiles_check_consistency,
 	.begin_operation	= cachefiles_begin_operation,
 };
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 02998ec9f7c9..78938ea6ad1a 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -880,85 +880,6 @@ struct fscache_cookie *fscache_cookie_get(struct fscache_cookie *cookie,
 	return cookie;
 }
 
-/*
- * check the consistency between the netfs inode and the backing cache
- *
- * NOTE: it only serves no-index type
- */
-int __fscache_check_consistency(struct fscache_cookie *cookie,
-				const void *aux_data)
-{
-	struct fscache_operation *op;
-	struct fscache_object *object;
-	bool wake_cookie = false;
-	int ret;
-
-	_enter("%p,", cookie);
-
-	ASSERTCMP(cookie->type, ==, FSCACHE_COOKIE_TYPE_DATAFILE);
-
-	if (fscache_wait_for_deferred_lookup(cookie) < 0)
-		return -ERESTARTSYS;
-
-	if (hlist_empty(&cookie->backing_objects))
-		return 0;
-
-	op = kzalloc(sizeof(*op), GFP_NOIO | __GFP_NOMEMALLOC | __GFP_NORETRY);
-	if (!op)
-		return -ENOMEM;
-
-	fscache_operation_init(cookie, op, NULL, NULL, NULL);
-	op->flags = FSCACHE_OP_MYTHREAD |
-		(1 << FSCACHE_OP_WAITING) |
-		(1 << FSCACHE_OP_UNUSE_COOKIE);
-	trace_fscache_page_op(cookie, NULL, op, fscache_page_op_check_consistency);
-
-	spin_lock(&cookie->lock);
-
-	fscache_update_aux(cookie, aux_data);
-
-	if (!fscache_cookie_enabled(cookie) ||
-	    hlist_empty(&cookie->backing_objects))
-		goto inconsistent;
-	object = hlist_entry(cookie->backing_objects.first,
-			     struct fscache_object, cookie_link);
-	if (test_bit(FSCACHE_IOERROR, &object->cache->flags))
-		goto inconsistent;
-
-	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
-
-	__fscache_use_cookie(cookie);
-	if (fscache_submit_op(object, op) < 0)
-		goto submit_failed;
-
-	/* the work queue now carries its own ref on the object */
-	spin_unlock(&cookie->lock);
-
-	ret = fscache_wait_for_operation_activation(object, op, NULL, NULL);
-	if (ret == 0) {
-		/* ask the cache to honour the operation */
-		ret = object->cache->ops->check_consistency(op);
-		fscache_op_complete(op, false);
-	} else if (ret == -ENOBUFS) {
-		ret = 0;
-	}
-
-	fscache_put_operation(op);
-	_leave(" = %d", ret);
-	return ret;
-
-submit_failed:
-	wake_cookie = __fscache_unuse_cookie(cookie);
-inconsistent:
-	spin_unlock(&cookie->lock);
-	if (wake_cookie)
-		__fscache_wake_unused_cookie(cookie);
-	kfree(op);
-	_leave(" = -ESTALE");
-	return -ESTALE;
-}
-EXPORT_SYMBOL(__fscache_check_consistency);
-
 /*
  * Generate a list of extant cookies in /proc/fs/fscache/cookies
  */
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 15cd2f04c9e6..15683b7876ff 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -167,10 +167,6 @@ struct fscache_cache_ops {
 	/* unpin an object in the cache */
 	void (*unpin_object)(struct fscache_object *object);
 
-	/* check the consistency between the backing cache and the FS-Cache
-	 * cookie */
-	int (*check_consistency)(struct fscache_operation *op);
-
 	/* store the updated auxiliary data on an object */
 	void (*update_object)(struct fscache_object *object);
 
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 98685bd7204b..75ca4fda93d5 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -131,7 +131,6 @@ extern struct fscache_cookie *__fscache_acquire_cookie(
 	const void *, size_t,
 	loff_t, bool);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, const void *, bool);
-extern int __fscache_check_consistency(struct fscache_cookie *, const void *);
 extern void __fscache_update_cookie(struct fscache_cookie *, const void *);
 extern int __fscache_attr_changed(struct fscache_cookie *);
 extern void __fscache_invalidate(struct fscache_cookie *);
@@ -290,28 +289,6 @@ void fscache_relinquish_cookie(struct fscache_cookie *cookie,
 		__fscache_relinquish_cookie(cookie, aux_data, retire);
 }
 
-/**
- * fscache_check_consistency - Request validation of a cache's auxiliary data
- * @cookie: The cookie representing the cache object
- * @aux_data: The updated auxiliary data for the cookie (may be NULL)
- *
- * Request an consistency check from fscache, which passes the request to the
- * backing cache.  The auxiliary data on the cookie will be updated first if
- * @aux_data is set.
- *
- * Returns 0 if consistent and -ESTALE if inconsistent.  May also
- * return -ENOMEM and -ERESTARTSYS.
- */
-static inline
-int fscache_check_consistency(struct fscache_cookie *cookie,
-			      const void *aux_data)
-{
-	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
-		return __fscache_check_consistency(cookie, aux_data);
-	else
-		return 0;
-}
-
 /**
  * fscache_update_cookie - Request that a cache object be updated
  * @cookie: The cookie representing the cache object



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 11/67] fscache: Remove fscache_attr_changed()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (9 preceding siblings ...)
  2021-10-18 14:53 ` [PATCH 10/67] fscache: Remove fscache_check_consistency() David Howells
@ 2021-10-18 14:53 ` David Howells
  2021-10-18 14:54 ` [PATCH 12/67] fscache: Remove obsolete stats David Howells
                   ` (59 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:53 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove fscache_attr_changed() as it's unused.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c     |   13 ++----
 fs/fscache/page.c             |   84 -----------------------------------------
 include/linux/fscache-cache.h |    4 --
 include/linux/fscache.h       |   21 ----------
 4 files changed, 5 insertions(+), 117 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 460dd190dad5..4a813a490ffe 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -10,7 +10,7 @@
 #include <linux/xattr.h>
 #include "internal.h"
 
-static int cachefiles_attr_changed(struct fscache_object *_object);
+static int cachefiles_attr_changed(struct cachefiles_object *object);
 
 /*
  * allocate an object record for a cookie lookup and prepare the lookup data
@@ -110,7 +110,7 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
 	/* polish off by setting the attributes of non-index files */
 	if (ret == 0 &&
 	    object->fscache.cookie->type != FSCACHE_COOKIE_TYPE_INDEX)
-		cachefiles_attr_changed(&object->fscache);
+		cachefiles_attr_changed(object);
 
 	if (ret < 0 && ret != -ETIMEDOUT) {
 		if (ret != -ENOBUFS)
@@ -324,9 +324,8 @@ static void cachefiles_sync_cache(struct fscache_cache *_cache)
  * notification the attributes on an object have changed
  * - called with reads/writes excluded by FS-Cache
  */
-static int cachefiles_attr_changed(struct fscache_object *_object)
+static int cachefiles_attr_changed(struct cachefiles_object *object)
 {
-	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 	struct iattr newattrs;
@@ -334,12 +333,11 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 	loff_t oi_size;
 	int ret;
 
-	ni_size = _object->cookie->object_size;
+	ni_size = object->fscache.cookie->object_size;
 
 	_enter("{OBJ%x},[%llu]",
-	       _object->debug_id, (unsigned long long) ni_size);
+	       object->fscache.debug_id, (unsigned long long) ni_size);
 
-	object = container_of(_object, struct cachefiles_object, fscache);
 	cache = container_of(object->fscache.cache,
 			     struct cachefiles_cache, cache);
 
@@ -442,6 +440,5 @@ const struct fscache_cache_ops cachefiles_cache_ops = {
 	.drop_object		= cachefiles_drop_object,
 	.put_object		= cachefiles_put_object,
 	.sync_cache		= cachefiles_sync_cache,
-	.attr_changed		= cachefiles_attr_changed,
 	.begin_operation	= cachefiles_begin_operation,
 };
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 2883d661f972..1d86c8a2a8c4 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -13,90 +13,6 @@
 #include <linux/slab.h>
 #include "internal.h"
 
-/*
- * actually apply the changed attributes to a cache object
- */
-static void fscache_attr_changed_op(struct fscache_operation *op)
-{
-	struct fscache_object *object = op->object;
-	int ret;
-
-	_enter("{OBJ%x OP%x}", object->debug_id, op->debug_id);
-
-	fscache_stat(&fscache_n_attr_changed_calls);
-
-	if (fscache_object_is_active(object)) {
-		fscache_stat(&fscache_n_cop_attr_changed);
-		ret = object->cache->ops->attr_changed(object);
-		fscache_stat_d(&fscache_n_cop_attr_changed);
-		if (ret < 0)
-			fscache_abort_object(object);
-		fscache_op_complete(op, ret < 0);
-	} else {
-		fscache_op_complete(op, true);
-	}
-
-	_leave("");
-}
-
-/*
- * notification that the attributes on an object have changed
- */
-int __fscache_attr_changed(struct fscache_cookie *cookie)
-{
-	struct fscache_operation *op;
-	struct fscache_object *object;
-	bool wake_cookie = false;
-
-	_enter("%p", cookie);
-
-	ASSERTCMP(cookie->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
-
-	fscache_stat(&fscache_n_attr_changed);
-
-	op = kzalloc(sizeof(*op), GFP_KERNEL);
-	if (!op) {
-		fscache_stat(&fscache_n_attr_changed_nomem);
-		_leave(" = -ENOMEM");
-		return -ENOMEM;
-	}
-
-	fscache_operation_init(cookie, op, fscache_attr_changed_op, NULL, NULL);
-	trace_fscache_page_op(cookie, NULL, op, fscache_page_op_attr_changed);
-	op->flags = FSCACHE_OP_ASYNC |
-		(1 << FSCACHE_OP_EXCLUSIVE) |
-		(1 << FSCACHE_OP_UNUSE_COOKIE);
-
-	spin_lock(&cookie->lock);
-
-	if (!fscache_cookie_enabled(cookie) ||
-	    hlist_empty(&cookie->backing_objects))
-		goto nobufs;
-	object = hlist_entry(cookie->backing_objects.first,
-			     struct fscache_object, cookie_link);
-
-	__fscache_use_cookie(cookie);
-	if (fscache_submit_exclusive_op(object, op) < 0)
-		goto nobufs_dec;
-	spin_unlock(&cookie->lock);
-	fscache_stat(&fscache_n_attr_changed_ok);
-	fscache_put_operation(op);
-	_leave(" = 0");
-	return 0;
-
-nobufs_dec:
-	wake_cookie = __fscache_unuse_cookie(cookie);
-nobufs:
-	spin_unlock(&cookie->lock);
-	fscache_put_operation(op);
-	if (wake_cookie)
-		__fscache_wake_unused_cookie(cookie);
-	fscache_stat(&fscache_n_attr_changed_nobufs);
-	_leave(" = %d", -ENOBUFS);
-	return -ENOBUFS;
-}
-EXPORT_SYMBOL(__fscache_attr_changed);
-
 /*
  * wait for a deferred lookup to complete
  */
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 15683b7876ff..d0c6c09bb5a1 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -184,10 +184,6 @@ struct fscache_cache_ops {
 	/* sync a cache */
 	void (*sync_cache)(struct fscache_cache *cache);
 
-	/* notification that the attributes of a non-index object (such as
-	 * i_size) have changed */
-	int (*attr_changed)(struct fscache_object *object);
-
 	/* reserve space for an object's data and associated metadata */
 	int (*reserve_space)(struct fscache_object *object, loff_t i_size);
 
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 75ca4fda93d5..1dba014e848f 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -132,7 +132,6 @@ extern struct fscache_cookie *__fscache_acquire_cookie(
 	loff_t, bool);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, const void *, bool);
 extern void __fscache_update_cookie(struct fscache_cookie *, const void *);
-extern int __fscache_attr_changed(struct fscache_cookie *);
 extern void __fscache_invalidate(struct fscache_cookie *);
 extern void __fscache_wait_on_invalidate(struct fscache_cookie *);
 #ifdef FSCACHE_USE_NEW_IO_API
@@ -337,26 +336,6 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie)
 {
 }
 
-/**
- * fscache_attr_changed - Notify cache that an object's attributes changed
- * @cookie: The cookie representing the cache object
- *
- * Send a notification to the cache indicating that an object's attributes have
- * changed.  This includes the data size.  These attributes will be obtained
- * through the get_attr() cookie definition op.
- *
- * See Documentation/filesystems/caching/netfs-api.rst for a complete
- * description.
- */
-static inline
-int fscache_attr_changed(struct fscache_cookie *cookie)
-{
-	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
-		return __fscache_attr_changed(cookie);
-	else
-		return -ENOBUFS;
-}
-
 /**
  * fscache_invalidate - Notify cache that an object needs invalidation
  * @cookie: The cookie representing the cache object



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 12/67] fscache: Remove obsolete stats
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (10 preceding siblings ...)
  2021-10-18 14:53 ` [PATCH 11/67] fscache: Remove fscache_attr_changed() David Howells
@ 2021-10-18 14:54 ` David Howells
  2021-10-18 14:54 ` [PATCH 13/67] fscache: Remove old I/O tracepoints David Howells
                   ` (58 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:54 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove a bunch of now-unused fscache stats counters that were obsoleted by
the removal of the old I/O routines.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/internal.h |   15 ---------------
 fs/fscache/stats.c    |   27 +--------------------------
 2 files changed, 1 insertion(+), 41 deletions(-)

diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index bff83b14398b..43d64f6bbf8b 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -149,9 +149,6 @@ extern void fscache_proc_cleanup(void);
  * stats.c
  */
 #ifdef CONFIG_FSCACHE_STATS
-extern atomic_t fscache_n_ops_processed[FSCACHE_MAX_THREADS];
-extern atomic_t fscache_n_objs_processed[FSCACHE_MAX_THREADS];
-
 extern atomic_t fscache_n_op_pend;
 extern atomic_t fscache_n_op_run;
 extern atomic_t fscache_n_op_enqueue;
@@ -162,12 +159,6 @@ extern atomic_t fscache_n_op_gc;
 extern atomic_t fscache_n_op_cancelled;
 extern atomic_t fscache_n_op_rejected;
 
-extern atomic_t fscache_n_attr_changed;
-extern atomic_t fscache_n_attr_changed_ok;
-extern atomic_t fscache_n_attr_changed_nobufs;
-extern atomic_t fscache_n_attr_changed_nomem;
-extern atomic_t fscache_n_attr_changed_calls;
-
 extern atomic_t fscache_n_retrievals;
 extern atomic_t fscache_n_retrievals_ok;
 extern atomic_t fscache_n_retrievals_wait;
@@ -205,7 +196,6 @@ extern atomic_t fscache_n_updates_run;
 
 extern atomic_t fscache_n_relinquishes;
 extern atomic_t fscache_n_relinquishes_null;
-extern atomic_t fscache_n_relinquishes_waitcrt;
 extern atomic_t fscache_n_relinquishes_retire;
 
 extern atomic_t fscache_n_cookie_index;
@@ -222,11 +212,6 @@ extern atomic_t fscache_n_object_created;
 extern atomic_t fscache_n_object_avail;
 extern atomic_t fscache_n_object_dead;
 
-extern atomic_t fscache_n_checkaux_none;
-extern atomic_t fscache_n_checkaux_okay;
-extern atomic_t fscache_n_checkaux_update;
-extern atomic_t fscache_n_checkaux_obsolete;
-
 extern atomic_t fscache_n_cop_alloc_object;
 extern atomic_t fscache_n_cop_lookup_object;
 extern atomic_t fscache_n_cop_lookup_complete;
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 2449aa459140..cb9dd0a93e0d 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -24,12 +24,6 @@ atomic_t fscache_n_op_gc;
 atomic_t fscache_n_op_cancelled;
 atomic_t fscache_n_op_rejected;
 
-atomic_t fscache_n_attr_changed;
-atomic_t fscache_n_attr_changed_ok;
-atomic_t fscache_n_attr_changed_nobufs;
-atomic_t fscache_n_attr_changed_nomem;
-atomic_t fscache_n_attr_changed_calls;
-
 atomic_t fscache_n_retrievals;
 atomic_t fscache_n_retrievals_ok;
 atomic_t fscache_n_retrievals_wait;
@@ -67,7 +61,6 @@ atomic_t fscache_n_updates_run;
 
 atomic_t fscache_n_relinquishes;
 atomic_t fscache_n_relinquishes_null;
-atomic_t fscache_n_relinquishes_waitcrt;
 atomic_t fscache_n_relinquishes_retire;
 
 atomic_t fscache_n_cookie_index;
@@ -84,11 +77,6 @@ atomic_t fscache_n_object_created;
 atomic_t fscache_n_object_avail;
 atomic_t fscache_n_object_dead;
 
-atomic_t fscache_n_checkaux_none;
-atomic_t fscache_n_checkaux_okay;
-atomic_t fscache_n_checkaux_update;
-atomic_t fscache_n_checkaux_obsolete;
-
 atomic_t fscache_n_cop_alloc_object;
 atomic_t fscache_n_cop_lookup_object;
 atomic_t fscache_n_cop_lookup_complete;
@@ -122,11 +110,6 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_object_no_alloc),
 		   atomic_read(&fscache_n_object_avail),
 		   atomic_read(&fscache_n_object_dead));
-	seq_printf(m, "ChkAux : non=%u ok=%u upd=%u obs=%u\n",
-		   atomic_read(&fscache_n_checkaux_none),
-		   atomic_read(&fscache_n_checkaux_okay),
-		   atomic_read(&fscache_n_checkaux_update),
-		   atomic_read(&fscache_n_checkaux_obsolete));
 
 	seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u nbf=%u"
 		   " oom=%u\n",
@@ -153,19 +136,11 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_updates_null),
 		   atomic_read(&fscache_n_updates_run));
 
-	seq_printf(m, "Relinqs: n=%u nul=%u wcr=%u rtr=%u\n",
+	seq_printf(m, "Relinqs: n=%u nul=%u rtr=%u\n",
 		   atomic_read(&fscache_n_relinquishes),
 		   atomic_read(&fscache_n_relinquishes_null),
-		   atomic_read(&fscache_n_relinquishes_waitcrt),
 		   atomic_read(&fscache_n_relinquishes_retire));
 
-	seq_printf(m, "AttrChg: n=%u ok=%u nbf=%u oom=%u run=%u\n",
-		   atomic_read(&fscache_n_attr_changed),
-		   atomic_read(&fscache_n_attr_changed_ok),
-		   atomic_read(&fscache_n_attr_changed_nobufs),
-		   atomic_read(&fscache_n_attr_changed_nomem),
-		   atomic_read(&fscache_n_attr_changed_calls));
-
 	seq_printf(m, "Retrvls: n=%u ok=%u wt=%u nod=%u nbf=%u"
 		   " int=%u oom=%u\n",
 		   atomic_read(&fscache_n_retrievals),



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 13/67] fscache: Remove old I/O tracepoints
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (11 preceding siblings ...)
  2021-10-18 14:54 ` [PATCH 12/67] fscache: Remove obsolete stats David Howells
@ 2021-10-18 14:54 ` David Howells
  2021-10-18 14:54 ` [PATCH 14/67] fscache: Temporarily disable fscache_invalidate() David Howells
                   ` (57 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:54 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove now-(mostly)-unused fscache tracepoints that have been obsoleted by
the removal of the old I/O code.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/io.c                |    2 
 fs/fscache/object.c            |    1 
 include/trace/events/fscache.h |  195 ----------------------------------------
 3 files changed, 198 deletions(-)

diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 41f56a11b1a9..7ac34c2e45fe 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -65,8 +65,6 @@ int __fscache_begin_operation(struct netfs_cache_resources *cres,
 		(1UL << FSCACHE_OP_WAITING) |
 		(1UL << FSCACHE_OP_UNUSE_COOKIE);
 
-	trace_fscache_page_op(cookie, NULL, op, fscache_page_op_retr_multi);
-
 	spin_lock(&cookie->lock);
 
 	if (!fscache_cookie_enabled(cookie) ||
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 3fb5a1a6c131..7aa1f90d978b 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -933,7 +933,6 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 	op->flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_EXCLUSIVE) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
-	trace_fscache_page_op(cookie, NULL, op, fscache_page_op_invalidate);
 
 	spin_lock(&cookie->lock);
 	if (fscache_submit_exclusive_op(object, op) < 0)
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index ccff379db5e0..90d956ef1c6e 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -33,24 +33,6 @@ enum fscache_cookie_trace {
 	fscache_cookie_put_parent,
 };
 
-enum fscache_page_trace {
-	fscache_page_cached,
-	fscache_page_inval,
-	fscache_page_maybe_release,
-	fscache_page_radix_clear_store,
-	fscache_page_radix_delete,
-	fscache_page_radix_insert,
-	fscache_page_radix_pend2store,
-	fscache_page_radix_set_pend,
-	fscache_page_uncache,
-	fscache_page_write,
-	fscache_page_write_end,
-	fscache_page_write_end_pend,
-	fscache_page_write_end_noc,
-	fscache_page_write_wait,
-	fscache_page_trace__nr
-};
-
 enum fscache_op_trace {
 	fscache_op_cancel,
 	fscache_op_cancel_all,
@@ -69,17 +51,6 @@ enum fscache_op_trace {
 	fscache_op_trace__nr
 };
 
-enum fscache_page_op_trace {
-	fscache_page_op_alloc_one,
-	fscache_page_op_attr_changed,
-	fscache_page_op_check_consistency,
-	fscache_page_op_invalidate,
-	fscache_page_op_retr_multi,
-	fscache_page_op_retr_one,
-	fscache_page_op_write_one,
-	fscache_page_op_trace__nr
-};
-
 #endif
 
 /*
@@ -98,22 +69,6 @@ enum fscache_page_op_trace {
 	EM(fscache_cookie_put_object,		"PUT obj")		\
 	E_(fscache_cookie_put_parent,		"PUT prn")
 
-#define fscache_page_traces						\
-	EM(fscache_page_cached,			"Cached ")		\
-	EM(fscache_page_inval,			"InvalPg")		\
-	EM(fscache_page_maybe_release,		"MayRels")		\
-	EM(fscache_page_uncache,		"Uncache")		\
-	EM(fscache_page_radix_clear_store,	"RxCStr ")		\
-	EM(fscache_page_radix_delete,		"RxDel  ")		\
-	EM(fscache_page_radix_insert,		"RxIns  ")		\
-	EM(fscache_page_radix_pend2store,	"RxP2S  ")		\
-	EM(fscache_page_radix_set_pend,		"RxSPend ")		\
-	EM(fscache_page_write,			"WritePg")		\
-	EM(fscache_page_write_end,		"EndPgWr")		\
-	EM(fscache_page_write_end_pend,		"EndPgWP")		\
-	EM(fscache_page_write_end_noc,		"EndPgNC")		\
-	E_(fscache_page_write_wait,		"WtOnWrt")
-
 #define fscache_op_traces						\
 	EM(fscache_op_cancel,			"Cancel1")		\
 	EM(fscache_op_cancel_all,		"CancelA")		\
@@ -130,15 +85,6 @@ enum fscache_page_op_trace {
 	EM(fscache_op_submit_ex,		"SubmitX")		\
 	E_(fscache_op_work,			"Work   ")
 
-#define fscache_page_op_traces						\
-	EM(fscache_page_op_alloc_one,		"Alloc1 ")		\
-	EM(fscache_page_op_attr_changed,	"AttrChg")		\
-	EM(fscache_page_op_check_consistency,	"CheckCn")		\
-	EM(fscache_page_op_invalidate,		"Inval  ")		\
-	EM(fscache_page_op_retr_multi,		"RetrMul")		\
-	EM(fscache_page_op_retr_one,		"Retr1  ")		\
-	E_(fscache_page_op_write_one,		"Write1 ")
-
 /*
  * Export enum symbols via userspace.
  */
@@ -353,70 +299,6 @@ TRACE_EVENT(fscache_osm,
 		      __entry->event_num)
 	    );
 
-TRACE_EVENT(fscache_page,
-	    TP_PROTO(struct fscache_cookie *cookie, struct page *page,
-		     enum fscache_page_trace why),
-
-	    TP_ARGS(cookie, page, why),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(pgoff_t,			page		)
-		    __field(enum fscache_page_trace,	why		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-		    __entry->page		= page->index;
-		    __entry->why		= why;
-			   ),
-
-	    TP_printk("c=%08x %s pg=%lx",
-		      __entry->cookie,
-		      __print_symbolic(__entry->why, fscache_page_traces),
-		      __entry->page)
-	    );
-
-TRACE_EVENT(fscache_check_page,
-	    TP_PROTO(struct fscache_cookie *cookie, struct page *page,
-		     void *val, int n),
-
-	    TP_ARGS(cookie, page, val, n),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(void *,			page		)
-		    __field(void *,			val		)
-		    __field(int,			n		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-		    __entry->page		= page;
-		    __entry->val		= val;
-		    __entry->n			= n;
-			   ),
-
-	    TP_printk("c=%08x pg=%p val=%p n=%d",
-		      __entry->cookie, __entry->page, __entry->val, __entry->n)
-	    );
-
-TRACE_EVENT(fscache_wake_cookie,
-	    TP_PROTO(struct fscache_cookie *cookie),
-
-	    TP_ARGS(cookie),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-			   ),
-
-	    TP_printk("c=%08x", __entry->cookie)
-	    );
-
 TRACE_EVENT(fscache_op,
 	    TP_PROTO(struct fscache_cookie *cookie, struct fscache_operation *op,
 		     enum fscache_op_trace why),
@@ -440,83 +322,6 @@ TRACE_EVENT(fscache_op,
 		      __print_symbolic(__entry->why, fscache_op_traces))
 	    );
 
-TRACE_EVENT(fscache_page_op,
-	    TP_PROTO(struct fscache_cookie *cookie, struct page *page,
-		     struct fscache_operation *op, enum fscache_page_op_trace what),
-
-	    TP_ARGS(cookie, page, op, what),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		op		)
-		    __field(pgoff_t,			page		)
-		    __field(enum fscache_page_op_trace,	what		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-		    __entry->page		= page ? page->index : 0;
-		    __entry->op			= op->debug_id;
-		    __entry->what		= what;
-			   ),
-
-	    TP_printk("c=%08x %s pg=%lx op=%08x",
-		      __entry->cookie,
-		      __print_symbolic(__entry->what, fscache_page_op_traces),
-		      __entry->page, __entry->op)
-	    );
-
-TRACE_EVENT(fscache_wrote_page,
-	    TP_PROTO(struct fscache_cookie *cookie, struct page *page,
-		     struct fscache_operation *op, int ret),
-
-	    TP_ARGS(cookie, page, op, ret),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		op		)
-		    __field(pgoff_t,			page		)
-		    __field(int,			ret		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-		    __entry->page		= page->index;
-		    __entry->op			= op->debug_id;
-		    __entry->ret		= ret;
-			   ),
-
-	    TP_printk("c=%08x pg=%lx op=%08x ret=%d",
-		      __entry->cookie, __entry->page, __entry->op, __entry->ret)
-	    );
-
-TRACE_EVENT(fscache_gang_lookup,
-	    TP_PROTO(struct fscache_cookie *cookie, struct fscache_operation *op,
-		     void **results, int n, pgoff_t store_limit),
-
-	    TP_ARGS(cookie, op, results, n, store_limit),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		op		)
-		    __field(pgoff_t,			results0	)
-		    __field(int,			n		)
-		    __field(pgoff_t,			store_limit	)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-		    __entry->op			= op->debug_id;
-		    __entry->results0		= results[0] ? ((struct page *)results[0])->index : (pgoff_t)-1;
-		    __entry->n			= n;
-		    __entry->store_limit	= store_limit;
-			   ),
-
-	    TP_printk("c=%08x op=%08x r0=%lx n=%d sl=%lx",
-		      __entry->cookie, __entry->op, __entry->results0, __entry->n,
-		      __entry->store_limit)
-	    );
-
 #endif /* _TRACE_FSCACHE_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 14/67] fscache: Temporarily disable fscache_invalidate()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (12 preceding siblings ...)
  2021-10-18 14:54 ` [PATCH 13/67] fscache: Remove old I/O tracepoints David Howells
@ 2021-10-18 14:54 ` David Howells
  2021-10-18 14:54 ` [PATCH 15/67] fscache: Disable fscache_begin_operation() David Howells
                   ` (56 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:54 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Temporarily disable the fscache side of fscache_invalidate() so that the
operation managing code can be removed.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c     |    9 ++---
 fs/fscache/cookie.c           |   23 +-----------
 fs/fscache/object.c           |   78 +----------------------------------------
 include/linux/fscache-cache.h |    2 +
 4 files changed, 7 insertions(+), 105 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 4a813a490ffe..a84adf638737 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -388,7 +388,7 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 /*
  * Invalidate an object
  */
-static void cachefiles_invalidate_object(struct fscache_operation *op)
+static void cachefiles_invalidate_object(struct fscache_object *_object)
 {
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
@@ -397,14 +397,14 @@ static void cachefiles_invalidate_object(struct fscache_operation *op)
 	uint64_t ni_size;
 	int ret;
 
-	object = container_of(op->object, struct cachefiles_object, fscache);
+	object = container_of(_object, struct cachefiles_object, fscache);
 	cache = container_of(object->fscache.cache,
 			     struct cachefiles_cache, cache);
 
-	ni_size = op->object->cookie->object_size;
+	ni_size = object->fscache.cookie->object_size;
 
 	_enter("{OBJ%x},[%llu]",
-	       op->object->debug_id, (unsigned long long)ni_size);
+	       object->fscache.debug_id, (unsigned long long)ni_size);
 
 	if (object->backer) {
 		ASSERT(d_is_reg(object->backer));
@@ -425,7 +425,6 @@ static void cachefiles_invalidate_object(struct fscache_operation *op)
 		}
 	}
 
-	fscache_op_complete(op, true);
 	_leave("");
 }
 
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 78938ea6ad1a..1a7372a1d1aa 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -612,8 +612,6 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
  */
 void __fscache_invalidate(struct fscache_cookie *cookie)
 {
-	struct fscache_object *object;
-
 	_enter("{%s}", cookie->type_name);
 
 	fscache_stat(&fscache_n_invalidates);
@@ -625,26 +623,7 @@ void __fscache_invalidate(struct fscache_cookie *cookie)
 	 */
 	ASSERTCMP(cookie->type, ==, FSCACHE_COOKIE_TYPE_DATAFILE);
 
-	/* If there's an object, we tell the object state machine to handle the
-	 * invalidation on our behalf, otherwise there's nothing to do.
-	 */
-	if (!hlist_empty(&cookie->backing_objects)) {
-		spin_lock(&cookie->lock);
-
-		if (fscache_cookie_enabled(cookie) &&
-		    !hlist_empty(&cookie->backing_objects) &&
-		    !test_and_set_bit(FSCACHE_COOKIE_INVALIDATING,
-				      &cookie->flags)) {
-			object = hlist_entry(cookie->backing_objects.first,
-					     struct fscache_object,
-					     cookie_link);
-			if (fscache_object_is_live(object))
-				fscache_raise_event(
-					object, FSCACHE_OBJECT_EV_INVALIDATE);
-		}
-
-		spin_unlock(&cookie->lock);
-	}
+	/* TODO: Do invalidation */
 
 	_leave("");
 }
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 7aa1f90d978b..4941b961c079 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -899,86 +899,10 @@ static void fscache_dequeue_object(struct fscache_object *object)
 	_leave("");
 }
 
-/*
- * Asynchronously invalidate an object.
- */
-static const struct fscache_state *_fscache_invalidate_object(struct fscache_object *object,
-							      int event)
-{
-	struct fscache_operation *op;
-	struct fscache_cookie *cookie = object->cookie;
-
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	/* We're going to need the cookie.  If the cookie is not available then
-	 * retire the object instead.
-	 */
-	if (!fscache_use_cookie(object)) {
-		set_bit(FSCACHE_OBJECT_RETIRED, &object->flags);
-		_leave(" [no cookie]");
-		return transit_to(KILL_OBJECT);
-	}
-
-	/* Reject any new read/write ops and abort any that are pending. */
-	clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
-	fscache_cancel_all_ops(object);
-
-	/* Now we have to wait for in-progress reads and writes */
-	op = kzalloc(sizeof(*op), GFP_KERNEL);
-	if (!op)
-		goto nomem;
-
-	fscache_operation_init(cookie, op, object->cache->ops->invalidate_object,
-			       NULL, NULL);
-	op->flags = FSCACHE_OP_ASYNC |
-		(1 << FSCACHE_OP_EXCLUSIVE) |
-		(1 << FSCACHE_OP_UNUSE_COOKIE);
-
-	spin_lock(&cookie->lock);
-	if (fscache_submit_exclusive_op(object, op) < 0)
-		goto submit_op_failed;
-	spin_unlock(&cookie->lock);
-	fscache_put_operation(op);
-
-	/* Once we've completed the invalidation, we know there will be no data
-	 * stored in the cache and thus we can reinstate the data-check-skip
-	 * optimisation.
-	 */
-	set_bit(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
-
-	/* We can allow read and write requests to come in once again.  They'll
-	 * queue up behind our exclusive invalidation operation.
-	 */
-	if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
-	_leave(" [ok]");
-	return transit_to(UPDATE_OBJECT);
-
-nomem:
-	fscache_mark_object_dead(object);
-	fscache_unuse_cookie(object);
-	_leave(" [ENOMEM]");
-	return transit_to(KILL_OBJECT);
-
-submit_op_failed:
-	fscache_mark_object_dead(object);
-	spin_unlock(&cookie->lock);
-	fscache_unuse_cookie(object);
-	kfree(op);
-	_leave(" [EIO]");
-	return transit_to(KILL_OBJECT);
-}
-
 static const struct fscache_state *fscache_invalidate_object(struct fscache_object *object,
 							     int event)
 {
-	const struct fscache_state *s;
-
-	fscache_stat(&fscache_n_invalidates_run);
-	fscache_stat(&fscache_n_cop_invalidate_object);
-	s = _fscache_invalidate_object(object, event);
-	fscache_stat_d(&fscache_n_cop_invalidate_object);
-	return s;
+	return transit_to(UPDATE_OBJECT);
 }
 
 /*
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index d0c6c09bb5a1..436456d4ff64 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -171,7 +171,7 @@ struct fscache_cache_ops {
 	void (*update_object)(struct fscache_object *object);
 
 	/* Invalidate an object */
-	void (*invalidate_object)(struct fscache_operation *op);
+	void (*invalidate_object)(struct fscache_object *object);
 
 	/* discard the resources pinned by an object and effect retirement if
 	 * necessary */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 15/67] fscache: Disable fscache_begin_operation()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (13 preceding siblings ...)
  2021-10-18 14:54 ` [PATCH 14/67] fscache: Temporarily disable fscache_invalidate() David Howells
@ 2021-10-18 14:54 ` David Howells
  2021-10-18 14:54 ` [PATCH 16/67] fscache: Remove the I/O operation manager David Howells
                   ` (55 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:54 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Disable fscache_begin_operation() so that the operation manager can be
removed and replaced.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/io.c |   13 ++++++++++++-
 fs/fscache/io.c    |    2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 5ead97de4bb7..4cc57be88f37 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -271,6 +271,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subrequest *subreq,
 						      loff_t i_size)
 {
+#if 0
 	struct fscache_operation *op = subreq->rreq->cache_resources.cache_priv;
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
@@ -335,6 +336,9 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 out:
 	cachefiles_end_secure(cache, saved_cred);
 	return ret;
+#endif
+	return subreq->start >= i_size ?
+		NETFS_FILL_WITH_ZEROES : NETFS_DOWNLOAD_FROM_SERVER;
 }
 
 /*
@@ -359,6 +363,7 @@ static int cachefiles_prepare_write(struct netfs_cache_resources *cres,
 static int cachefiles_prepare_fallback_write(struct netfs_cache_resources *cres,
 					     pgoff_t index)
 {
+#if 0
 	struct fscache_operation *op = cres->cache_priv;
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
@@ -369,6 +374,8 @@ static int cachefiles_prepare_fallback_write(struct netfs_cache_resources *cres,
 	cache = container_of(object->fscache.cache,
 			     struct cachefiles_cache, cache);
 	return cachefiles_has_space(cache, 0, 1);
+#endif
+	return -ENOBUFS;
 }
 
 /*
@@ -376,6 +383,7 @@ static int cachefiles_prepare_fallback_write(struct netfs_cache_resources *cres,
  */
 static void cachefiles_end_operation(struct netfs_cache_resources *cres)
 {
+#if 0
 	struct fscache_operation *op = cres->cache_priv;
 	struct file *file = cres->cache_priv2;
 
@@ -387,8 +395,8 @@ static void cachefiles_end_operation(struct netfs_cache_resources *cres)
 		fscache_op_complete(op, false);
 		fscache_put_operation(op);
 	}
-
 	_leave("");
+#endif
 }
 
 static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
@@ -406,6 +414,7 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 int cachefiles_begin_operation(struct netfs_cache_resources *cres,
 			       struct fscache_operation *op)
 {
+#if 0
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	struct path path;
@@ -441,5 +450,7 @@ int cachefiles_begin_operation(struct netfs_cache_resources *cres,
 
 error_file:
 	fput(file);
+#endif
+	cres->ops = &cachefiles_netfs_cache_ops;
 	return -EIO;
 }
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 7ac34c2e45fe..2547892a6064 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -31,6 +31,7 @@ int __fscache_begin_operation(struct netfs_cache_resources *cres,
 			      struct fscache_cookie *cookie,
 			      bool for_write)
 {
+#if 0
 	struct fscache_operation *op;
 	struct fscache_object *object;
 	bool wake_cookie = false;
@@ -144,6 +145,7 @@ int __fscache_begin_operation(struct netfs_cache_resources *cres,
 		fscache_stat(&fscache_n_stores_nobufs);
 	else
 		fscache_stat(&fscache_n_retrievals_nobufs);
+#endif
 	_leave(" = -ENOBUFS");
 	return -ENOBUFS;
 }



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 16/67] fscache: Remove the I/O operation manager
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (14 preceding siblings ...)
  2021-10-18 14:54 ` [PATCH 15/67] fscache: Disable fscache_begin_operation() David Howells
@ 2021-10-18 14:54 ` David Howells
  2021-10-18 14:55 ` [PATCH 17/67] fscache: Rename fscache_cookie_{get,put,see}() David Howells
                   ` (54 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:54 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove the fscache I/O operation manager.  Getting operation-operation
interactions and object-operation interactions correct has proven really
difficult; furthermore, the operations are being replaced with kiocb-driven
stuff on the cache front.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/internal.h       |    3 
 fs/cachefiles/io.c             |    3 
 fs/cachefiles/namei.c          |    5 
 fs/fscache/Makefile            |    4 
 fs/fscache/cache.c             |    3 
 fs/fscache/cookie.c            |    1 
 fs/fscache/internal.h          |   22 -
 fs/fscache/object.c            |   24 --
 fs/fscache/operation.c         |  633 ----------------------------------------
 include/linux/fscache-cache.h  |   74 -----
 include/trace/events/fscache.h |   57 ----
 11 files changed, 7 insertions(+), 822 deletions(-)
 delete mode 100644 fs/fscache/operation.c

diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index f6b85c370935..fbe43e1aa517 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -150,8 +150,7 @@ extern int cachefiles_check_in_use(struct cachefiles_cache *cache,
 /*
  * rdwr2.c
  */
-extern int cachefiles_begin_operation(struct netfs_cache_resources *,
-				      struct fscache_operation *);
+extern int cachefiles_begin_operation(struct netfs_cache_resources *);
 
 /*
  * security.c
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 4cc57be88f37..534c81a05918 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -411,8 +411,7 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 /*
  * Open the cache file when beginning a cache operation.
  */
-int cachefiles_begin_operation(struct netfs_cache_resources *cres,
-			       struct fscache_operation *op)
+int cachefiles_begin_operation(struct netfs_cache_resources *cres)
 {
 #if 0
 	struct cachefiles_object *object;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index dc5e1e48c0a8..548aac544293 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -36,9 +36,8 @@ void __cachefiles_printk_object(struct cachefiles_object *object,
 	       prefix, object->fscache.state->name,
 	       object->fscache.flags, work_busy(&object->fscache.work),
 	       object->fscache.events, object->fscache.event_mask);
-	pr_err("%sops=%u inp=%u exc=%u\n",
-	       prefix, object->fscache.n_ops, object->fscache.n_in_progress,
-	       object->fscache.n_exclusive);
+	pr_err("%sops=%u\n",
+	       prefix, object->fscache.n_ops);
 	pr_err("%sparent=%x\n",
 	       prefix, object->fscache.parent ? object->fscache.parent->debug_id : 0);
 
diff --git a/fs/fscache/Makefile b/fs/fscache/Makefile
index 03a871d689bb..14dfdce1c045 100644
--- a/fs/fscache/Makefile
+++ b/fs/fscache/Makefile
@@ -10,9 +10,7 @@ fscache-y := \
 	io.o \
 	main.o \
 	netfs.o \
-	object.o \
-	operation.o \
-	page.o
+	object.o
 
 fscache-$(CONFIG_PROC_FS) += proc.o
 fscache-$(CONFIG_FSCACHE_STATS) += stats.o
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index efcdb40267d6..efcd834d6279 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -182,12 +182,9 @@ void fscache_init_cache(struct fscache_cache *cache,
 	vsnprintf(cache->identifier, sizeof(cache->identifier), idfmt, va);
 	va_end(va);
 
-	INIT_WORK(&cache->op_gc, fscache_operation_gc);
 	INIT_LIST_HEAD(&cache->link);
 	INIT_LIST_HEAD(&cache->object_list);
-	INIT_LIST_HEAD(&cache->op_gc_list);
 	spin_lock_init(&cache->object_list_lock);
-	spin_lock_init(&cache->op_gc_list_lock);
 }
 EXPORT_SYMBOL(fscache_init_cache);
 
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 1a7372a1d1aa..62f3a4f75bad 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -722,7 +722,6 @@ void __fscache_disable_cookie(struct fscache_cookie *cookie,
 		hlist_for_each_entry(object, &cookie->backing_objects, cookie_link) {
 			if (invalidate)
 				set_bit(FSCACHE_OBJECT_RETIRED, &object->flags);
-			clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
 			fscache_raise_event(object, FSCACHE_OBJECT_EV_KILL);
 		}
 	} else {
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 43d64f6bbf8b..b8da3fe2d6f1 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -112,28 +112,6 @@ static inline bool fscache_object_congested(void)
  */
 extern void fscache_enqueue_object(struct fscache_object *);
 
-/*
- * operation.c
- */
-extern int fscache_submit_exclusive_op(struct fscache_object *,
-				       struct fscache_operation *);
-extern int fscache_submit_op(struct fscache_object *,
-			     struct fscache_operation *);
-extern int fscache_cancel_op(struct fscache_operation *, bool);
-extern void fscache_cancel_all_ops(struct fscache_object *);
-extern void fscache_abort_object(struct fscache_object *);
-extern void fscache_start_operations(struct fscache_object *);
-extern void fscache_operation_gc(struct work_struct *);
-
-/*
- * page.c
- */
-extern int fscache_wait_for_deferred_lookup(struct fscache_cookie *);
-extern int fscache_wait_for_operation_activation(struct fscache_object *,
-						 struct fscache_operation *,
-						 atomic_t *,
-						 atomic_t *);
-
 /*
  * proc.c
  */
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 4941b961c079..c7fbdcf3e987 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -311,9 +311,8 @@ void fscache_object_init(struct fscache_object *object,
 	INIT_WORK(&object->work, fscache_object_work_func);
 	INIT_LIST_HEAD(&object->dependents);
 	INIT_LIST_HEAD(&object->dep_link);
-	INIT_LIST_HEAD(&object->pending_ops);
 	object->n_children = 0;
-	object->n_ops = object->n_in_progress = object->n_exclusive = 0;
+	object->n_ops = 0;
 	object->events = 0;
 	object->cache = cache;
 	object->cookie = cookie;
@@ -574,14 +573,6 @@ static const struct fscache_state *fscache_object_available(struct fscache_objec
 	spin_lock(&object->lock);
 
 	fscache_done_parent_op(object);
-	if (object->n_in_progress == 0) {
-		if (object->n_ops > 0) {
-			ASSERTCMP(object->n_ops, >=, object->n_obj_ops);
-			fscache_start_operations(object);
-		} else {
-			ASSERT(list_empty(&object->pending_ops));
-		}
-	}
 	spin_unlock(&object->lock);
 
 	fscache_stat(&fscache_n_cop_lookup_complete);
@@ -647,24 +638,11 @@ static const struct fscache_state *fscache_kill_object(struct fscache_object *ob
 	fscache_mark_object_dead(object);
 	object->oob_event_mask = 0;
 
-	if (test_bit(FSCACHE_OBJECT_RETIRED, &object->flags)) {
-		/* Reject any new read/write ops and abort any that are pending. */
-		clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
-		fscache_cancel_all_ops(object);
-	}
-
 	if (list_empty(&object->dependents) &&
 	    object->n_ops == 0 &&
 	    object->n_children == 0)
 		return transit_to(DROP_OBJECT);
 
-	if (object->n_in_progress == 0) {
-		spin_lock(&object->lock);
-		if (object->n_ops > 0 && object->n_in_progress == 0)
-			fscache_start_operations(object);
-		spin_unlock(&object->lock);
-	}
-
 	if (!list_empty(&object->dependents))
 		return transit_to(KILL_DEPENDENTS);
 
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
deleted file mode 100644
index e002cdfaf3cc..000000000000
--- a/fs/fscache/operation.c
+++ /dev/null
@@ -1,633 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* FS-Cache worker operation management routines
- *
- * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- *
- * See Documentation/filesystems/caching/operations.rst
- */
-
-#define FSCACHE_DEBUG_LEVEL OPERATION
-#include <linux/module.h>
-#include <linux/seq_file.h>
-#include <linux/slab.h>
-#include "internal.h"
-
-atomic_t fscache_op_debug_id;
-EXPORT_SYMBOL(fscache_op_debug_id);
-
-static void fscache_operation_dummy_cancel(struct fscache_operation *op)
-{
-}
-
-/**
- * fscache_operation_init - Do basic initialisation of an operation
- * @cookie: The cookie to operate on
- * @op: The operation to initialise
- * @processor: The function to perform the operation
- * @cancel: A function to handle operation cancellation
- * @release: The release function to assign
- *
- * Do basic initialisation of an operation.  The caller must still set flags,
- * object and processor if needed.
- */
-void fscache_operation_init(struct fscache_cookie *cookie,
-			    struct fscache_operation *op,
-			    fscache_operation_processor_t processor,
-			    fscache_operation_cancel_t cancel,
-			    fscache_operation_release_t release)
-{
-	INIT_WORK(&op->work, fscache_op_work_func);
-	atomic_set(&op->usage, 1);
-	op->state = FSCACHE_OP_ST_INITIALISED;
-	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
-	op->processor = processor;
-	op->cancel = cancel ?: fscache_operation_dummy_cancel;
-	op->release = release;
-	INIT_LIST_HEAD(&op->pend_link);
-	fscache_stat(&fscache_n_op_initialised);
-	trace_fscache_op(cookie, op, fscache_op_init);
-}
-EXPORT_SYMBOL(fscache_operation_init);
-
-/**
- * fscache_enqueue_operation - Enqueue an operation for processing
- * @op: The operation to enqueue
- *
- * Enqueue an operation for processing by the FS-Cache thread pool.
- *
- * This will get its own ref on the object.
- */
-void fscache_enqueue_operation(struct fscache_operation *op)
-{
-	struct fscache_cookie *cookie = op->object->cookie;
-	
-	_enter("{OBJ%x OP%x,%u}",
-	       op->object->debug_id, op->debug_id, atomic_read(&op->usage));
-
-	ASSERT(list_empty(&op->pend_link));
-	ASSERT(op->processor != NULL);
-	ASSERT(fscache_object_is_available(op->object));
-	ASSERTCMP(atomic_read(&op->usage), >, 0);
-	ASSERTIFCMP(op->state != FSCACHE_OP_ST_IN_PROGRESS,
-		    op->state, ==,  FSCACHE_OP_ST_CANCELLED);
-
-	fscache_stat(&fscache_n_op_enqueue);
-	switch (op->flags & FSCACHE_OP_TYPE) {
-	case FSCACHE_OP_ASYNC:
-		trace_fscache_op(cookie, op, fscache_op_enqueue_async);
-		_debug("queue async");
-		atomic_inc(&op->usage);
-		if (!queue_work(fscache_op_wq, &op->work))
-			fscache_put_operation(op);
-		break;
-	case FSCACHE_OP_MYTHREAD:
-		trace_fscache_op(cookie, op, fscache_op_enqueue_mythread);
-		_debug("queue for caller's attention");
-		break;
-	default:
-		pr_err("Unexpected op type %lx", op->flags);
-		BUG();
-		break;
-	}
-}
-EXPORT_SYMBOL(fscache_enqueue_operation);
-
-/*
- * start an op running
- */
-static void fscache_run_op(struct fscache_object *object,
-			   struct fscache_operation *op)
-{
-	ASSERTCMP(op->state, ==, FSCACHE_OP_ST_PENDING);
-
-	op->state = FSCACHE_OP_ST_IN_PROGRESS;
-	object->n_in_progress++;
-	if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
-		wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
-	if (op->processor)
-		fscache_enqueue_operation(op);
-	else
-		trace_fscache_op(object->cookie, op, fscache_op_run);
-	fscache_stat(&fscache_n_op_run);
-}
-
-/*
- * report an unexpected submission
- */
-static void fscache_report_unexpected_submission(struct fscache_object *object,
-						 struct fscache_operation *op,
-						 const struct fscache_state *ostate)
-{
-	static bool once_only;
-	struct fscache_operation *p;
-	unsigned n;
-
-	if (once_only)
-		return;
-	once_only = true;
-
-	kdebug("unexpected submission OP%x [OBJ%x %s]",
-	       op->debug_id, object->debug_id, object->state->name);
-	kdebug("objstate=%s [%s]", object->state->name, ostate->name);
-	kdebug("objflags=%lx", object->flags);
-	kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
-	kdebug("ops=%u inp=%u exc=%u",
-	       object->n_ops, object->n_in_progress, object->n_exclusive);
-
-	if (!list_empty(&object->pending_ops)) {
-		n = 0;
-		list_for_each_entry(p, &object->pending_ops, pend_link) {
-			ASSERTCMP(p->object, ==, object);
-			kdebug("%p %p", op->processor, op->release);
-			n++;
-		}
-
-		kdebug("n=%u", n);
-	}
-
-	dump_stack();
-}
-
-/*
- * submit an exclusive operation for an object
- * - other ops are excluded from running simultaneously with this one
- * - this gets any extra refs it needs on an op
- */
-int fscache_submit_exclusive_op(struct fscache_object *object,
-				struct fscache_operation *op)
-{
-	const struct fscache_state *ostate;
-	unsigned long flags;
-	int ret;
-
-	_enter("{OBJ%x OP%x},", object->debug_id, op->debug_id);
-
-	trace_fscache_op(object->cookie, op, fscache_op_submit_ex);
-
-	ASSERTCMP(op->state, ==, FSCACHE_OP_ST_INITIALISED);
-	ASSERTCMP(atomic_read(&op->usage), >, 0);
-
-	spin_lock(&object->lock);
-	ASSERTCMP(object->n_ops, >=, object->n_in_progress);
-	ASSERTCMP(object->n_ops, >=, object->n_exclusive);
-	ASSERT(list_empty(&op->pend_link));
-
-	ostate = object->state;
-	smp_rmb();
-
-	op->state = FSCACHE_OP_ST_PENDING;
-	flags = READ_ONCE(object->flags);
-	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
-		fscache_stat(&fscache_n_op_rejected);
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	} else if (unlikely(fscache_cache_is_broken(object))) {
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -EIO;
-	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
-		op->object = object;
-		object->n_ops++;
-		object->n_exclusive++;	/* reads and writes must wait */
-
-		if (object->n_in_progress > 0) {
-			atomic_inc(&op->usage);
-			list_add_tail(&op->pend_link, &object->pending_ops);
-			fscache_stat(&fscache_n_op_pend);
-		} else if (!list_empty(&object->pending_ops)) {
-			atomic_inc(&op->usage);
-			list_add_tail(&op->pend_link, &object->pending_ops);
-			fscache_stat(&fscache_n_op_pend);
-			fscache_start_operations(object);
-		} else {
-			ASSERTCMP(object->n_in_progress, ==, 0);
-			fscache_run_op(object, op);
-		}
-
-		/* need to issue a new write op after this */
-		clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
-		ret = 0;
-	} else if (flags & BIT(FSCACHE_OBJECT_IS_LOOKED_UP)) {
-		op->object = object;
-		object->n_ops++;
-		object->n_exclusive++;	/* reads and writes must wait */
-		atomic_inc(&op->usage);
-		list_add_tail(&op->pend_link, &object->pending_ops);
-		fscache_stat(&fscache_n_op_pend);
-		ret = 0;
-	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	} else {
-		fscache_report_unexpected_submission(object, op, ostate);
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	}
-
-	spin_unlock(&object->lock);
-	return ret;
-}
-
-/*
- * submit an operation for an object
- * - objects may be submitted only in the following states:
- *   - during object creation (write ops may be submitted)
- *   - whilst the object is active
- *   - after an I/O error incurred in one of the two above states (op rejected)
- * - this gets any extra refs it needs on an op
- */
-int fscache_submit_op(struct fscache_object *object,
-		      struct fscache_operation *op)
-{
-	const struct fscache_state *ostate;
-	unsigned long flags;
-	int ret;
-
-	_enter("{OBJ%x OP%x},{%u}",
-	       object->debug_id, op->debug_id, atomic_read(&op->usage));
-
-	trace_fscache_op(object->cookie, op, fscache_op_submit);
-
-	ASSERTCMP(op->state, ==, FSCACHE_OP_ST_INITIALISED);
-	ASSERTCMP(atomic_read(&op->usage), >, 0);
-
-	spin_lock(&object->lock);
-	ASSERTCMP(object->n_ops, >=, object->n_in_progress);
-	ASSERTCMP(object->n_ops, >=, object->n_exclusive);
-	ASSERT(list_empty(&op->pend_link));
-
-	ostate = object->state;
-	smp_rmb();
-
-	op->state = FSCACHE_OP_ST_PENDING;
-	flags = READ_ONCE(object->flags);
-	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
-		fscache_stat(&fscache_n_op_rejected);
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	} else if (unlikely(fscache_cache_is_broken(object))) {
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -EIO;
-	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
-		op->object = object;
-		object->n_ops++;
-
-		if (object->n_exclusive > 0) {
-			atomic_inc(&op->usage);
-			list_add_tail(&op->pend_link, &object->pending_ops);
-			fscache_stat(&fscache_n_op_pend);
-		} else if (!list_empty(&object->pending_ops)) {
-			atomic_inc(&op->usage);
-			list_add_tail(&op->pend_link, &object->pending_ops);
-			fscache_stat(&fscache_n_op_pend);
-			fscache_start_operations(object);
-		} else {
-			ASSERTCMP(object->n_exclusive, ==, 0);
-			fscache_run_op(object, op);
-		}
-		ret = 0;
-	} else if (flags & BIT(FSCACHE_OBJECT_IS_LOOKED_UP)) {
-		op->object = object;
-		object->n_ops++;
-		atomic_inc(&op->usage);
-		list_add_tail(&op->pend_link, &object->pending_ops);
-		fscache_stat(&fscache_n_op_pend);
-		ret = 0;
-	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	} else {
-		fscache_report_unexpected_submission(object, op, ostate);
-		ASSERT(!fscache_object_is_active(object));
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	}
-
-	spin_unlock(&object->lock);
-	return ret;
-}
-
-/*
- * queue an object for withdrawal on error, aborting all following asynchronous
- * operations
- */
-void fscache_abort_object(struct fscache_object *object)
-{
-	_enter("{OBJ%x}", object->debug_id);
-
-	fscache_raise_event(object, FSCACHE_OBJECT_EV_ERROR);
-}
-
-/*
- * Jump start the operation processing on an object.  The caller must hold
- * object->lock.
- */
-void fscache_start_operations(struct fscache_object *object)
-{
-	struct fscache_operation *op;
-	bool stop = false;
-
-	while (!list_empty(&object->pending_ops) && !stop) {
-		op = list_entry(object->pending_ops.next,
-				struct fscache_operation, pend_link);
-
-		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags)) {
-			if (object->n_in_progress > 0)
-				break;
-			stop = true;
-		}
-		list_del_init(&op->pend_link);
-		fscache_run_op(object, op);
-
-		/* the pending queue was holding a ref on the object */
-		fscache_put_operation(op);
-	}
-
-	ASSERTCMP(object->n_in_progress, <=, object->n_ops);
-
-	_debug("woke %d ops on OBJ%x",
-	       object->n_in_progress, object->debug_id);
-}
-
-/*
- * cancel an operation that's pending on an object
- */
-int fscache_cancel_op(struct fscache_operation *op,
-		      bool cancel_in_progress_op)
-{
-	struct fscache_object *object = op->object;
-	bool put = false;
-	int ret;
-
-	_enter("OBJ%x OP%x}", op->object->debug_id, op->debug_id);
-
-	trace_fscache_op(object->cookie, op, fscache_op_cancel);
-
-	ASSERTCMP(op->state, >=, FSCACHE_OP_ST_PENDING);
-	ASSERTCMP(op->state, !=, FSCACHE_OP_ST_CANCELLED);
-	ASSERTCMP(atomic_read(&op->usage), >, 0);
-
-	spin_lock(&object->lock);
-
-	ret = -EBUSY;
-	if (op->state == FSCACHE_OP_ST_PENDING) {
-		ASSERT(!list_empty(&op->pend_link));
-		list_del_init(&op->pend_link);
-		put = true;
-
-		fscache_stat(&fscache_n_op_cancelled);
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
-			object->n_exclusive--;
-		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
-			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
-		ret = 0;
-	} else if (op->state == FSCACHE_OP_ST_IN_PROGRESS && cancel_in_progress_op) {
-		ASSERTCMP(object->n_in_progress, >, 0);
-		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
-			object->n_exclusive--;
-		object->n_in_progress--;
-		if (object->n_in_progress == 0)
-			fscache_start_operations(object);
-
-		fscache_stat(&fscache_n_op_cancelled);
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
-			object->n_exclusive--;
-		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
-			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
-		ret = 0;
-	}
-
-	if (put)
-		fscache_put_operation(op);
-	spin_unlock(&object->lock);
-	_leave(" = %d", ret);
-	return ret;
-}
-
-/*
- * Cancel all pending operations on an object
- */
-void fscache_cancel_all_ops(struct fscache_object *object)
-{
-	struct fscache_operation *op;
-
-	_enter("OBJ%x", object->debug_id);
-
-	spin_lock(&object->lock);
-
-	while (!list_empty(&object->pending_ops)) {
-		op = list_entry(object->pending_ops.next,
-				struct fscache_operation, pend_link);
-		fscache_stat(&fscache_n_op_cancelled);
-		list_del_init(&op->pend_link);
-
-		trace_fscache_op(object->cookie, op, fscache_op_cancel_all);
-
-		ASSERTCMP(op->state, ==, FSCACHE_OP_ST_PENDING);
-		op->cancel(op);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-
-		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
-			object->n_exclusive--;
-		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
-			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
-		fscache_put_operation(op);
-		cond_resched_lock(&object->lock);
-	}
-
-	spin_unlock(&object->lock);
-	_leave("");
-}
-
-/*
- * Record the completion or cancellation of an in-progress operation.
- */
-void fscache_op_complete(struct fscache_operation *op, bool cancelled)
-{
-	struct fscache_object *object = op->object;
-
-	_enter("OBJ%x", object->debug_id);
-
-	ASSERTCMP(op->state, ==, FSCACHE_OP_ST_IN_PROGRESS);
-	ASSERTCMP(object->n_in_progress, >, 0);
-	ASSERTIFCMP(test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags),
-		    object->n_exclusive, >, 0);
-	ASSERTIFCMP(test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags),
-		    object->n_in_progress, ==, 1);
-
-	spin_lock(&object->lock);
-
-	if (!cancelled) {
-		trace_fscache_op(object->cookie, op, fscache_op_completed);
-		op->state = FSCACHE_OP_ST_COMPLETE;
-	} else {
-		op->cancel(op);
-		trace_fscache_op(object->cookie, op, fscache_op_cancelled);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-	}
-
-	if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
-		object->n_exclusive--;
-	object->n_in_progress--;
-	if (object->n_in_progress == 0)
-		fscache_start_operations(object);
-
-	spin_unlock(&object->lock);
-	_leave("");
-}
-EXPORT_SYMBOL(fscache_op_complete);
-
-/*
- * release an operation
- * - queues pending ops if this is the last in-progress op
- */
-void fscache_put_operation(struct fscache_operation *op)
-{
-	struct fscache_object *object;
-	struct fscache_cache *cache;
-
-	_enter("{OBJ%x OP%x,%d}",
-	       op->object ? op->object->debug_id : 0,
-	       op->debug_id, atomic_read(&op->usage));
-
-	ASSERTCMP(atomic_read(&op->usage), >, 0);
-
-	if (!atomic_dec_and_test(&op->usage))
-		return;
-
-	trace_fscache_op(op->object ? op->object->cookie : NULL, op, fscache_op_put);
-
-	_debug("PUT OP");
-	ASSERTIFCMP(op->state != FSCACHE_OP_ST_INITIALISED &&
-		    op->state != FSCACHE_OP_ST_COMPLETE,
-		    op->state, ==, FSCACHE_OP_ST_CANCELLED);
-
-	fscache_stat(&fscache_n_op_release);
-
-	if (op->release) {
-		op->release(op);
-		op->release = NULL;
-	}
-	op->state = FSCACHE_OP_ST_DEAD;
-
-	object = op->object;
-	if (likely(object)) {
-		if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags))
-			atomic_dec(&object->n_reads);
-		if (test_bit(FSCACHE_OP_UNUSE_COOKIE, &op->flags))
-			fscache_unuse_cookie(object);
-
-		/* now... we may get called with the object spinlock held, so we
-		 * complete the cleanup here only if we can immediately acquire the
-		 * lock, and defer it otherwise */
-		if (!spin_trylock(&object->lock)) {
-			_debug("defer put");
-			fscache_stat(&fscache_n_op_deferred_release);
-
-			cache = object->cache;
-			spin_lock(&cache->op_gc_list_lock);
-			list_add_tail(&op->pend_link, &cache->op_gc_list);
-			spin_unlock(&cache->op_gc_list_lock);
-			schedule_work(&cache->op_gc);
-			_leave(" [defer]");
-			return;
-		}
-
-		ASSERTCMP(object->n_ops, >, 0);
-		object->n_ops--;
-		if (object->n_ops == 0)
-			fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
-
-		spin_unlock(&object->lock);
-	}
-
-	kfree(op);
-	_leave(" [done]");
-}
-EXPORT_SYMBOL(fscache_put_operation);
-
-/*
- * garbage collect operations that have had their release deferred
- */
-void fscache_operation_gc(struct work_struct *work)
-{
-	struct fscache_operation *op;
-	struct fscache_object *object;
-	struct fscache_cache *cache =
-		container_of(work, struct fscache_cache, op_gc);
-	int count = 0;
-
-	_enter("");
-
-	do {
-		spin_lock(&cache->op_gc_list_lock);
-		if (list_empty(&cache->op_gc_list)) {
-			spin_unlock(&cache->op_gc_list_lock);
-			break;
-		}
-
-		op = list_entry(cache->op_gc_list.next,
-				struct fscache_operation, pend_link);
-		list_del(&op->pend_link);
-		spin_unlock(&cache->op_gc_list_lock);
-
-		object = op->object;
-		trace_fscache_op(object->cookie, op, fscache_op_gc);
-
-		spin_lock(&object->lock);
-
-		_debug("GC DEFERRED REL OBJ%x OP%x",
-		       object->debug_id, op->debug_id);
-		fscache_stat(&fscache_n_op_gc);
-
-		ASSERTCMP(atomic_read(&op->usage), ==, 0);
-		ASSERTCMP(op->state, ==, FSCACHE_OP_ST_DEAD);
-
-		ASSERTCMP(object->n_ops, >, 0);
-		object->n_ops--;
-		if (object->n_ops == 0)
-			fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
-
-		spin_unlock(&object->lock);
-		kfree(op);
-
-	} while (count++ < 20);
-
-	if (!list_empty(&cache->op_gc_list))
-		schedule_work(&cache->op_gc);
-
-	_leave("");
-}
-
-/*
- * execute an operation using fs_op_wq to provide processing context -
- * the caller holds a ref to this object, so we don't need to hold one
- */
-void fscache_op_work_func(struct work_struct *work)
-{
-	struct fscache_operation *op =
-		container_of(work, struct fscache_operation, work);
-
-	_enter("{OBJ%x OP%x,%d}",
-	       op->object->debug_id, op->debug_id, atomic_read(&op->usage));
-
-	trace_fscache_op(op->object->cookie, op, fscache_op_work);
-
-	ASSERT(op->processor != NULL);
-	op->processor(op);
-	fscache_put_operation(op);
-
-	_leave("");
-}
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 436456d4ff64..c5d454f340c6 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -23,7 +23,6 @@
 struct fscache_cache;
 struct fscache_cache_ops;
 struct fscache_object;
-struct fscache_operation;
 
 enum fscache_obj_ref_trace {
 	fscache_obj_get_add_to_deps,
@@ -62,11 +61,8 @@ struct fscache_cache {
 	char			identifier[36];	/* cache label */
 
 	/* node management */
-	struct work_struct	op_gc;		/* operation garbage collector */
 	struct list_head	object_list;	/* list of data/index objects */
-	struct list_head	op_gc_list;	/* list of ops to be deleted */
 	spinlock_t		object_list_lock;
-	spinlock_t		op_gc_list_lock;
 	atomic_t		object_count;	/* no. of live objects in this cache */
 	struct fscache_object	*fsdef;		/* object for the fsdef index */
 	unsigned long		flags;
@@ -76,68 +72,6 @@ struct fscache_cache {
 
 extern wait_queue_head_t fscache_cache_cleared_wq;
 
-/*
- * operation to be applied to a cache object
- * - retrieval initiation operations are done in the context of the process
- *   that issued them, and not in an async thread pool
- */
-typedef void (*fscache_operation_release_t)(struct fscache_operation *op);
-typedef void (*fscache_operation_processor_t)(struct fscache_operation *op);
-typedef void (*fscache_operation_cancel_t)(struct fscache_operation *op);
-
-enum fscache_operation_state {
-	FSCACHE_OP_ST_BLANK,		/* Op is not yet submitted */
-	FSCACHE_OP_ST_INITIALISED,	/* Op is initialised */
-	FSCACHE_OP_ST_PENDING,		/* Op is blocked from running */
-	FSCACHE_OP_ST_IN_PROGRESS,	/* Op is in progress */
-	FSCACHE_OP_ST_COMPLETE,		/* Op is complete */
-	FSCACHE_OP_ST_CANCELLED,	/* Op has been cancelled */
-	FSCACHE_OP_ST_DEAD		/* Op is now dead */
-};
-
-struct fscache_operation {
-	struct work_struct	work;		/* record for async ops */
-	struct list_head	pend_link;	/* link in object->pending_ops */
-	struct fscache_object	*object;	/* object to be operated upon */
-
-	unsigned long		flags;
-#define FSCACHE_OP_TYPE		0x000f	/* operation type */
-#define FSCACHE_OP_ASYNC	0x0001	/* - async op, processor may sleep for disk */
-#define FSCACHE_OP_MYTHREAD	0x0002	/* - processing is done be issuing thread, not pool */
-#define FSCACHE_OP_WAITING	4	/* cleared when op is woken */
-#define FSCACHE_OP_EXCLUSIVE	5	/* exclusive op, other ops must wait */
-#define FSCACHE_OP_DEC_READ_CNT	6	/* decrement object->n_reads on destruction */
-#define FSCACHE_OP_UNUSE_COOKIE	7	/* call fscache_unuse_cookie() on completion */
-#define FSCACHE_OP_KEEP_FLAGS	0x00f0	/* flags to keep when repurposing an op */
-
-	enum fscache_operation_state state;
-	atomic_t		usage;
-	unsigned		debug_id;	/* debugging ID */
-
-	/* operation processor callback
-	 * - can be NULL if FSCACHE_OP_WAITING is going to be used to perform
-	 *   the op in a non-pool thread */
-	fscache_operation_processor_t processor;
-
-	/* Operation cancellation cleanup (optional) */
-	fscache_operation_cancel_t cancel;
-
-	/* operation releaser */
-	fscache_operation_release_t release;
-};
-
-extern atomic_t fscache_op_debug_id;
-extern void fscache_op_work_func(struct work_struct *work);
-
-extern void fscache_enqueue_operation(struct fscache_operation *);
-extern void fscache_op_complete(struct fscache_operation *, bool);
-extern void fscache_put_operation(struct fscache_operation *);
-extern void fscache_operation_init(struct fscache_cookie *,
-				   struct fscache_operation *,
-				   fscache_operation_processor_t,
-				   fscache_operation_cancel_t,
-				   fscache_operation_release_t);
-
 /*
  * cache operations
  */
@@ -188,8 +122,7 @@ struct fscache_cache_ops {
 	int (*reserve_space)(struct fscache_object *object, loff_t i_size);
 
 	/* Begin an operation for the netfs lib */
-	int (*begin_operation)(struct netfs_cache_resources *cres,
-			       struct fscache_operation *op);
+	int (*begin_operation)(struct netfs_cache_resources *cres);
 };
 
 extern struct fscache_cookie fscache_fsdef_index;
@@ -236,9 +169,6 @@ struct fscache_object {
 	int			n_children;	/* number of child objects */
 	int			n_ops;		/* number of extant ops on object */
 	int			n_obj_ops;	/* number of object ops outstanding on object */
-	int			n_in_progress;	/* number of ops in progress */
-	int			n_exclusive;	/* number of exclusive ops queued or in progress */
-	atomic_t		n_reads;	/* number of read ops in progress */
 	spinlock_t		lock;		/* state and operations lock */
 
 	unsigned long		lookup_jif;	/* time at which lookup started */
@@ -249,7 +179,6 @@ struct fscache_object {
 
 	unsigned long		flags;
 #define FSCACHE_OBJECT_LOCK		0	/* T if object is busy being processed */
-#define FSCACHE_OBJECT_PENDING_WRITE	1	/* T if object has pending write */
 #define FSCACHE_OBJECT_WAITING		2	/* T if object is waiting on its parent */
 #define FSCACHE_OBJECT_IS_LIVE		3	/* T if object is not withdrawn or relinquished */
 #define FSCACHE_OBJECT_IS_LOOKED_UP	4	/* T if object has been looked up */
@@ -266,7 +195,6 @@ struct fscache_object {
 	struct work_struct	work;		/* attention scheduling record */
 	struct list_head	dependents;	/* FIFO of dependent objects */
 	struct list_head	dep_link;	/* link in parent's dependents list */
-	struct list_head	pending_ops;	/* unstarted operations on this object */
 };
 
 extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 90d956ef1c6e..eac561d3ac51 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -33,24 +33,6 @@ enum fscache_cookie_trace {
 	fscache_cookie_put_parent,
 };
 
-enum fscache_op_trace {
-	fscache_op_cancel,
-	fscache_op_cancel_all,
-	fscache_op_cancelled,
-	fscache_op_completed,
-	fscache_op_enqueue_async,
-	fscache_op_enqueue_mythread,
-	fscache_op_gc,
-	fscache_op_init,
-	fscache_op_put,
-	fscache_op_run,
-	fscache_op_signal,
-	fscache_op_submit,
-	fscache_op_submit_ex,
-	fscache_op_work,
-	fscache_op_trace__nr
-};
-
 #endif
 
 /*
@@ -69,22 +51,6 @@ enum fscache_op_trace {
 	EM(fscache_cookie_put_object,		"PUT obj")		\
 	E_(fscache_cookie_put_parent,		"PUT prn")
 
-#define fscache_op_traces						\
-	EM(fscache_op_cancel,			"Cancel1")		\
-	EM(fscache_op_cancel_all,		"CancelA")		\
-	EM(fscache_op_cancelled,		"Canclld")		\
-	EM(fscache_op_completed,		"Complet")		\
-	EM(fscache_op_enqueue_async,		"EnqAsyn")		\
-	EM(fscache_op_enqueue_mythread,		"EnqMyTh")		\
-	EM(fscache_op_gc,			"GC     ")		\
-	EM(fscache_op_init,			"Init   ")		\
-	EM(fscache_op_put,			"Put    ")		\
-	EM(fscache_op_run,			"Run    ")		\
-	EM(fscache_op_signal,			"Signal ")		\
-	EM(fscache_op_submit,			"Submit ")		\
-	EM(fscache_op_submit_ex,		"SubmitX")		\
-	E_(fscache_op_work,			"Work   ")
-
 /*
  * Export enum symbols via userspace.
  */
@@ -299,29 +265,6 @@ TRACE_EVENT(fscache_osm,
 		      __entry->event_num)
 	    );
 
-TRACE_EVENT(fscache_op,
-	    TP_PROTO(struct fscache_cookie *cookie, struct fscache_operation *op,
-		     enum fscache_op_trace why),
-
-	    TP_ARGS(cookie, op, why),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		op		)
-		    __field(enum fscache_op_trace,	why		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= cookie ? cookie->debug_id : 0;
-		    __entry->op			= op->debug_id;
-		    __entry->why		= why;
-			   ),
-
-	    TP_printk("c=%08x op=%08x %s",
-		      __entry->cookie, __entry->op,
-		      __print_symbolic(__entry->why, fscache_op_traces))
-	    );
-
 #endif /* _TRACE_FSCACHE_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 17/67] fscache: Rename fscache_cookie_{get,put,see}()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (15 preceding siblings ...)
  2021-10-18 14:54 ` [PATCH 16/67] fscache: Remove the I/O operation manager David Howells
@ 2021-10-18 14:55 ` David Howells
  2021-10-18 14:55 ` [PATCH 18/67] cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag David Howells
                   ` (53 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:55 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Rename fscache_cookie_{get,put,see}() to fscache_{get,put,see}_cookie() and
make them available to cache backend modules.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c           |   14 ++++++++------
 fs/fscache/internal.h         |    6 +-----
 fs/fscache/netfs.c            |    4 ++--
 fs/fscache/object.c           |    4 ++--
 include/linux/fscache-cache.h |    6 ++++++
 5 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 62f3a4f75bad..0dc27f82e910 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -215,7 +215,7 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
 	}
 
 	__set_bit(FSCACHE_COOKIE_ACQUIRED, &candidate->flags);
-	fscache_cookie_get(candidate->parent, fscache_cookie_get_acquire_parent);
+	fscache_get_cookie(candidate->parent, fscache_cookie_get_acquire_parent);
 	atomic_inc(&candidate->parent->n_children);
 	hlist_bl_add_head(&candidate->hash_link, h);
 	hlist_bl_unlock(h);
@@ -232,7 +232,7 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
 		return NULL;
 	}
 
-	fscache_cookie_get(cursor, fscache_cookie_get_reacquire);
+	fscache_get_cookie(cursor, fscache_cookie_get_reacquire);
 	hlist_bl_unlock(h);
 	return cursor;
 }
@@ -330,7 +330,7 @@ struct fscache_cookie *__fscache_acquire_cookie(
 				set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
 			} else {
 				atomic_dec(&parent->n_children);
-				fscache_cookie_put(cookie,
+				fscache_put_cookie(cookie,
 						   fscache_cookie_put_acquire_nobufs);
 				fscache_stat(&fscache_n_acquires_nobufs);
 				_leave(" = NULL");
@@ -793,7 +793,7 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
 	}
 
 	/* Dispose of the netfs's link to the cookie */
-	fscache_cookie_put(cookie, fscache_cookie_put_relinquish);
+	fscache_put_cookie(cookie, fscache_cookie_put_relinquish);
 
 	_leave("");
 }
@@ -818,7 +818,7 @@ static void fscache_unhash_cookie(struct fscache_cookie *cookie)
 /*
  * Drop a reference to a cookie.
  */
-void fscache_cookie_put(struct fscache_cookie *cookie,
+void fscache_put_cookie(struct fscache_cookie *cookie,
 			enum fscache_cookie_trace where)
 {
 	struct fscache_cookie *parent;
@@ -844,11 +844,12 @@ void fscache_cookie_put(struct fscache_cookie *cookie,
 
 	_leave("");
 }
+EXPORT_SYMBOL(fscache_put_cookie);
 
 /*
  * Get a reference to a cookie.
  */
-struct fscache_cookie *fscache_cookie_get(struct fscache_cookie *cookie,
+struct fscache_cookie *fscache_get_cookie(struct fscache_cookie *cookie,
 					  enum fscache_cookie_trace where)
 {
 	int ref;
@@ -857,6 +858,7 @@ struct fscache_cookie *fscache_cookie_get(struct fscache_cookie *cookie,
 	trace_fscache_cookie(cookie->debug_id, ref + 1, where);
 	return cookie;
 }
+EXPORT_SYMBOL(fscache_get_cookie);
 
 /*
  * Generate a list of extant cookies in /proc/fs/fscache/cookies
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index b8da3fe2d6f1..e78ca3151e41 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -72,12 +72,8 @@ extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
 						   const void *, size_t,
 						   loff_t);
 extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *);
-extern struct fscache_cookie *fscache_cookie_get(struct fscache_cookie *,
-						 enum fscache_cookie_trace);
-extern void fscache_cookie_put(struct fscache_cookie *,
-			       enum fscache_cookie_trace);
 
-static inline void fscache_cookie_see(struct fscache_cookie *cookie,
+static inline void fscache_see_cookie(struct fscache_cookie *cookie,
 				      enum fscache_cookie_trace where)
 {
 	trace_fscache_cookie(cookie->debug_id, refcount_read(&cookie->ref),
diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
index 8b0f303a7715..d746365f1daf 100644
--- a/fs/fscache/netfs.c
+++ b/fs/fscache/netfs.c
@@ -43,7 +43,7 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
 		fscache_free_cookie(candidate);
 	}
 
-	fscache_cookie_get(cookie->parent, fscache_cookie_get_register_netfs);
+	fscache_get_cookie(cookie->parent, fscache_cookie_get_register_netfs);
 	atomic_inc(&cookie->parent->n_children);
 
 	netfs->primary_index = cookie;
@@ -54,7 +54,7 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
 	return 0;
 
 already_registered:
-	fscache_cookie_put(candidate, fscache_cookie_put_dup_netfs);
+	fscache_put_cookie(candidate, fscache_cookie_put_dup_netfs);
 	_leave(" = -EEXIST");
 	return -EEXIST;
 }
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index c7fbdcf3e987..761d6dc4aa0f 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -316,7 +316,7 @@ void fscache_object_init(struct fscache_object *object,
 	object->events = 0;
 	object->cache = cache;
 	object->cookie = cookie;
-	fscache_cookie_get(cookie, fscache_cookie_get_attach_object);
+	fscache_get_cookie(cookie, fscache_cookie_get_attach_object);
 	object->parent = NULL;
 #ifdef CONFIG_FSCACHE_OBJECT_LIST
 	RB_CLEAR_NODE(&object->objlist_link);
@@ -769,7 +769,7 @@ static void fscache_put_object(struct fscache_object *object,
 void fscache_object_destroy(struct fscache_object *object)
 {
 	/* We can get rid of the cookie now */
-	fscache_cookie_put(object->cookie, fscache_cookie_put_object);
+	fscache_put_cookie(object->cookie, fscache_cookie_put_object);
 	object->cookie = NULL;
 }
 EXPORT_SYMBOL(fscache_object_destroy);
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index c5d454f340c6..0439dc3021c7 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -23,6 +23,7 @@
 struct fscache_cache;
 struct fscache_cache_ops;
 struct fscache_object;
+enum fscache_cookie_trace;
 
 enum fscache_obj_ref_trace {
 	fscache_obj_get_add_to_deps,
@@ -325,6 +326,11 @@ enum fscache_why_object_killed {
 extern void fscache_object_mark_killed(struct fscache_object *object,
 				       enum fscache_why_object_killed why);
 
+extern struct fscache_cookie *fscache_get_cookie(struct fscache_cookie *cookie,
+						 enum fscache_cookie_trace where);
+extern void fscache_put_cookie(struct fscache_cookie *cookie,
+			       enum fscache_cookie_trace where);
+
 /*
  * Find the key on a cookie.
  */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 18/67] cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (16 preceding siblings ...)
  2021-10-18 14:55 ` [PATCH 17/67] fscache: Rename fscache_cookie_{get,put,see}() David Howells
@ 2021-10-18 14:55 ` David Howells
  2021-10-18 14:55 ` [PATCH 19/67] cachefiles: Don't set an xattr on the root of the cache David Howells
                   ` (52 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:55 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove the tree of active dentries from the cachefiles_cache struct and
instead set a flag, S_CACHE_FILE, on the backing inode to indicate that
this file is in use by the kernel so as to ward off other kernel users.

This simplifies the code a lot and also prevents two overlain caches from
fighting with each other.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/daemon.c            |    4 
 fs/cachefiles/interface.c         |   20 --
 fs/cachefiles/internal.h          |   10 -
 fs/cachefiles/namei.c             |  378 +++++++------------------------------
 include/trace/events/cachefiles.h |   54 -----
 5 files changed, 79 insertions(+), 387 deletions(-)

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 752c1e43416f..8a937d6d5e22 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -102,8 +102,6 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file)
 	}
 
 	mutex_init(&cache->daemon_mutex);
-	cache->active_nodes = RB_ROOT;
-	rwlock_init(&cache->active_lock);
 	init_waitqueue_head(&cache->daemon_pollwq);
 
 	/* set default caching limits
@@ -138,8 +136,6 @@ static int cachefiles_daemon_release(struct inode *inode, struct file *file)
 
 	cachefiles_daemon_unbind(cache);
 
-	ASSERT(!cache->active_nodes.rb_node);
-
 	/* clean up the control file interface */
 	cache->cachefilesd = NULL;
 	file->private_data = NULL;
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index a84adf638737..bc28bb6c7ef5 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -36,7 +36,6 @@ static struct fscache_object *cachefiles_alloc_object(
 
 	ASSERTCMP(object->backer, ==, NULL);
 
-	BUG_ON(test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags));
 	atomic_set(&object->usage, 1);
 
 	fscache_object_init(&object->fscache, cookie, &cache->cache);
@@ -74,7 +73,6 @@ static struct fscache_object *cachefiles_alloc_object(
 nomem_key:
 	kfree(buffer);
 nomem_buffer:
-	BUG_ON(test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags));
 	kmem_cache_free(cachefiles_object_jar, object);
 	fscache_object_destroyed(&cache->cache);
 nomem_object:
@@ -190,8 +188,6 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
-	struct inode *inode;
-	blkcnt_t i_blocks = 0;
 
 	ASSERT(_object);
 
@@ -218,10 +214,6 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 		    _object != cache->cache.fsdef
 		    ) {
 			_debug("- retire object OBJ%x", object->fscache.debug_id);
-			inode = d_backing_inode(object->dentry);
-			if (inode)
-				i_blocks = inode->i_blocks;
-
 			cachefiles_begin_secure(cache, &saved_cred);
 			cachefiles_delete_object(cache, object);
 			cachefiles_end_secure(cache, saved_cred);
@@ -231,14 +223,11 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 		if (object->backer != object->dentry)
 			dput(object->backer);
 		object->backer = NULL;
-	}
 
-	/* note that the object is now inactive */
-	if (test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags))
-		cachefiles_mark_object_inactive(cache, object, i_blocks);
-
-	dput(object->dentry);
-	object->dentry = NULL;
+		cachefiles_unmark_inode_in_use(object, object->dentry);
+		dput(object->dentry);
+		object->dentry = NULL;
+	}
 
 	_leave("");
 }
@@ -274,7 +263,6 @@ void cachefiles_put_object(struct fscache_object *_object,
 	if (u == 0) {
 		_debug("- kill object OBJ%x", object->fscache.debug_id);
 
-		ASSERT(!test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags));
 		ASSERTCMP(object->fscache.parent, ==, NULL);
 		ASSERTCMP(object->backer, ==, NULL);
 		ASSERTCMP(object->dentry, ==, NULL);
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index fbe43e1aa517..0515add2b7e8 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -38,12 +38,9 @@ struct cachefiles_object {
 	struct dentry			*dentry;	/* the file/dir representing this object */
 	struct dentry			*backer;	/* backing file */
 	loff_t				i_size;		/* object size */
-	unsigned long			flags;
-#define CACHEFILES_OBJECT_ACTIVE	0		/* T if marked active */
 	atomic_t			usage;		/* object usage count */
 	uint8_t				type;		/* object type */
 	uint8_t				new;		/* T if object new */
-	struct rb_node			active_node;	/* link in active tree (dentry is key) */
 };
 
 extern struct kmem_cache *cachefiles_object_jar;
@@ -59,8 +56,6 @@ struct cachefiles_cache {
 	const struct cred		*cache_cred;	/* security override for accessing cache */
 	struct mutex			daemon_mutex;	/* command serialisation mutex */
 	wait_queue_head_t		daemon_pollwq;	/* poll waitqueue for daemon */
-	struct rb_root			active_nodes;	/* active nodes (can't be culled) */
-	rwlock_t			active_lock;	/* lock for active_nodes */
 	atomic_t			gravecounter;	/* graveyard uniquifier */
 	atomic_t			f_released;	/* number of objects released lately */
 	atomic_long_t			b_released;	/* number of blocks released lately */
@@ -129,9 +124,8 @@ extern char *cachefiles_cook_key(const u8 *raw, int keylen, uint8_t type);
 /*
  * namei.c
  */
-extern void cachefiles_mark_object_inactive(struct cachefiles_cache *cache,
-					    struct cachefiles_object *object,
-					    blkcnt_t i_blocks);
+extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
+				    struct dentry *dentry);
 extern int cachefiles_delete_object(struct cachefiles_cache *cache,
 				    struct cachefiles_object *object);
 extern int cachefiles_walk_to_object(struct cachefiles_object *parent,
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 548aac544293..4bd31be3be30 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -21,251 +21,51 @@
 #define CACHEFILES_KEYBUF_SIZE 512
 
 /*
- * dump debugging info about an object
+ * Mark the backing file as being a cache file if it's not already in use so.
  */
-static noinline
-void __cachefiles_printk_object(struct cachefiles_object *object,
-				const char *prefix)
+static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object,
+					 struct dentry *dentry)
 {
-	struct fscache_cookie *cookie;
-	const u8 *k;
-	unsigned loop;
-
-	pr_err("%sobject: OBJ%x\n", prefix, object->fscache.debug_id);
-	pr_err("%sobjstate=%s fl=%lx wbusy=%x ev=%lx[%lx]\n",
-	       prefix, object->fscache.state->name,
-	       object->fscache.flags, work_busy(&object->fscache.work),
-	       object->fscache.events, object->fscache.event_mask);
-	pr_err("%sops=%u\n",
-	       prefix, object->fscache.n_ops);
-	pr_err("%sparent=%x\n",
-	       prefix, object->fscache.parent ? object->fscache.parent->debug_id : 0);
-
-	spin_lock(&object->fscache.lock);
-	cookie = object->fscache.cookie;
-	if (cookie) {
-		pr_err("%scookie=%x [pr=%x fl=%lx]\n",
-		       prefix,
-		       cookie->debug_id,
-		       cookie->parent ? cookie->parent->debug_id : 0,
-		       cookie->flags);
-		pr_err("%skey=[%u] '", prefix, cookie->key_len);
-		k = (cookie->key_len <= sizeof(cookie->inline_key)) ?
-			cookie->inline_key : cookie->key;
-		for (loop = 0; loop < cookie->key_len; loop++)
-			pr_cont("%02x", k[loop]);
-		pr_cont("'\n");
-	} else {
-		pr_err("%scookie=NULL\n", prefix);
-	}
-	spin_unlock(&object->fscache.lock);
-}
-
-/*
- * dump debugging info about a pair of objects
- */
-static noinline void cachefiles_printk_object(struct cachefiles_object *object,
-					      struct cachefiles_object *xobject)
-{
-	if (object)
-		__cachefiles_printk_object(object, "");
-	if (xobject)
-		__cachefiles_printk_object(xobject, "x");
-}
-
-/*
- * mark the owner of a dentry, if there is one, to indicate that that dentry
- * has been preemptively deleted
- * - the caller must hold the i_mutex on the dentry's parent as required to
- *   call vfs_unlink(), vfs_rmdir() or vfs_rename()
- */
-static void cachefiles_mark_object_buried(struct cachefiles_cache *cache,
-					  struct dentry *dentry,
-					  enum fscache_why_object_killed why)
-{
-	struct cachefiles_object *object;
-	struct rb_node *p;
-
-	_enter(",'%pd'", dentry);
-
-	write_lock(&cache->active_lock);
-
-	p = cache->active_nodes.rb_node;
-	while (p) {
-		object = rb_entry(p, struct cachefiles_object, active_node);
-		if (object->dentry > dentry)
-			p = p->rb_left;
-		else if (object->dentry < dentry)
-			p = p->rb_right;
-		else
-			goto found_dentry;
-	}
-
-	write_unlock(&cache->active_lock);
-	trace_cachefiles_mark_buried(NULL, dentry, why);
-	_leave(" [no owner]");
-	return;
+	struct inode *inode = d_backing_inode(dentry);
+	bool can_use = false;
 
-	/* found the dentry for  */
-found_dentry:
-	kdebug("preemptive burial: OBJ%x [%s] %pd",
-	       object->fscache.debug_id,
-	       object->fscache.state->name,
-	       dentry);
+	_enter(",%x", object->fscache.debug_id);
 
-	trace_cachefiles_mark_buried(object, dentry, why);
+	inode_lock(inode);
 
-	if (fscache_object_is_live(&object->fscache)) {
-		pr_err("\n");
-		pr_err("Error: Can't preemptively bury live object\n");
-		cachefiles_printk_object(object, NULL);
+	if (!(inode->i_flags & S_KERNEL_FILE)) {
+		inode->i_flags |= S_KERNEL_FILE;
+		trace_cachefiles_mark_active(object, dentry);
+		can_use = true;
 	} else {
-		if (why != FSCACHE_OBJECT_IS_STALE)
-			fscache_object_mark_killed(&object->fscache, why);
+		pr_notice("cachefiles: Inode already in use: %pd\n", dentry);
 	}
 
-	write_unlock(&cache->active_lock);
-	_leave(" [owner marked]");
+	inode_unlock(inode);
+	return can_use;
 }
 
 /*
- * record the fact that an object is now active
+ * Unmark a backing inode.
  */
-static int cachefiles_mark_object_active(struct cachefiles_cache *cache,
-					 struct cachefiles_object *object)
+void cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
+				    struct dentry *dentry)
 {
-	struct cachefiles_object *xobject;
-	struct rb_node **_p, *_parent = NULL;
-	struct dentry *dentry;
-
-	_enter(",%x", object->fscache.debug_id);
-
-try_again:
-	write_lock(&cache->active_lock);
-
-	dentry = object->dentry;
-	trace_cachefiles_mark_active(object, dentry);
-
-	if (test_and_set_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags)) {
-		pr_err("Error: Object already active\n");
-		cachefiles_printk_object(object, NULL);
-		BUG();
-	}
-
-	_p = &cache->active_nodes.rb_node;
-	while (*_p) {
-		_parent = *_p;
-		xobject = rb_entry(_parent,
-				   struct cachefiles_object, active_node);
-
-		ASSERT(xobject != object);
-
-		if (xobject->dentry > dentry)
-			_p = &(*_p)->rb_left;
-		else if (xobject->dentry < dentry)
-			_p = &(*_p)->rb_right;
-		else
-			goto wait_for_old_object;
-	}
-
-	rb_link_node(&object->active_node, _parent, _p);
-	rb_insert_color(&object->active_node, &cache->active_nodes);
-
-	write_unlock(&cache->active_lock);
-	_leave(" = 0");
-	return 0;
-
-	/* an old object from a previous incarnation is hogging the slot - we
-	 * need to wait for it to be destroyed */
-wait_for_old_object:
-	trace_cachefiles_wait_active(object, dentry, xobject);
-	clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
-
-	if (fscache_object_is_live(&xobject->fscache)) {
-		pr_err("\n");
-		pr_err("Error: Unexpected object collision\n");
-		cachefiles_printk_object(object, xobject);
-	}
-	atomic_inc(&xobject->usage);
-	write_unlock(&cache->active_lock);
-
-	if (test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)) {
-		wait_queue_head_t *wq;
-
-		signed long timeout = 60 * HZ;
-		wait_queue_entry_t wait;
-		bool requeue;
-
-		/* if the object we're waiting for is queued for processing,
-		 * then just put ourselves on the queue behind it */
-		if (work_pending(&xobject->fscache.work)) {
-			_debug("queue OBJ%x behind OBJ%x immediately",
-			       object->fscache.debug_id,
-			       xobject->fscache.debug_id);
-			goto requeue;
-		}
-
-		/* otherwise we sleep until either the object we're waiting for
-		 * is done, or the fscache_object is congested */
-		wq = bit_waitqueue(&xobject->flags, CACHEFILES_OBJECT_ACTIVE);
-		init_wait(&wait);
-		requeue = false;
-		do {
-			prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
-			if (!test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags))
-				break;
-
-			requeue = fscache_object_sleep_till_congested(&timeout);
-		} while (timeout > 0 && !requeue);
-		finish_wait(wq, &wait);
-
-		if (requeue &&
-		    test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)) {
-			_debug("queue OBJ%x behind OBJ%x after wait",
-			       object->fscache.debug_id,
-			       xobject->fscache.debug_id);
-			goto requeue;
-		}
-
-		if (timeout <= 0) {
-			pr_err("\n");
-			pr_err("Error: Overlong wait for old active object to go away\n");
-			cachefiles_printk_object(object, xobject);
-			goto requeue;
-		}
-	}
-
-	ASSERT(!test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags));
-
-	cache->cache.ops->put_object(&xobject->fscache,
-		(enum fscache_obj_ref_trace)cachefiles_obj_put_wait_retry);
-	goto try_again;
+	struct inode *inode = d_backing_inode(dentry);
 
-requeue:
-	cache->cache.ops->put_object(&xobject->fscache,
-		(enum fscache_obj_ref_trace)cachefiles_obj_put_wait_timeo);
-	_leave(" = -ETIMEDOUT");
-	return -ETIMEDOUT;
+	inode_lock(inode);
+	inode->i_flags &= ~S_KERNEL_FILE;
+	inode_unlock(inode);
+	trace_cachefiles_mark_inactive(object, dentry, inode);
 }
 
 /*
  * Mark an object as being inactive.
  */
-void cachefiles_mark_object_inactive(struct cachefiles_cache *cache,
-				     struct cachefiles_object *object,
-				     blkcnt_t i_blocks)
+static void cachefiles_mark_object_inactive(struct cachefiles_cache *cache,
+					    struct cachefiles_object *object)
 {
-	struct dentry *dentry = object->dentry;
-	struct inode *inode = d_backing_inode(dentry);
-
-	trace_cachefiles_mark_inactive(object, dentry, inode);
-
-	write_lock(&cache->active_lock);
-	rb_erase(&object->active_node, &cache->active_nodes);
-	clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
-	write_unlock(&cache->active_lock);
-
-	wake_up_bit(&object->flags, CACHEFILES_OBJECT_ACTIVE);
+	blkcnt_t i_blocks = d_backing_inode(object->dentry)->i_blocks;
 
 	/* This object can now be culled, so we need to let the daemon know
 	 * that there is something it can remove if it needs to.
@@ -286,7 +86,6 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 				  struct cachefiles_object *object,
 				  struct dentry *dir,
 				  struct dentry *rep,
-				  bool preemptive,
 				  enum fscache_why_object_killed why)
 {
 	struct dentry *grave, *trap;
@@ -307,11 +106,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 			cachefiles_io_error(cache, "Unlink security error");
 		} else {
 			trace_cachefiles_unlink(object, rep, why);
-			ret = vfs_unlink(&init_user_ns, d_inode(dir), rep,
-					 NULL);
-
-			if (preemptive)
-				cachefiles_mark_object_buried(cache, rep, why);
+			ret = vfs_unlink(&init_user_ns, d_inode(dir), rep, NULL);
 		}
 
 		inode_unlock(d_inode(dir));
@@ -372,8 +167,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 			return -ENOMEM;
 		}
 
-		cachefiles_io_error(cache, "Lookup error %ld",
-				    PTR_ERR(grave));
+		cachefiles_io_error(cache, "Lookup error %ld", PTR_ERR(grave));
 		return -EIO;
 	}
 
@@ -422,9 +216,6 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 		if (ret != 0 && ret != -ENOMEM)
 			cachefiles_io_error(cache,
 					    "Rename failed with error %d", ret);
-
-		if (preemptive)
-			cachefiles_mark_object_buried(cache, rep, why);
 	}
 
 	unlock_rename(cache->graveyard, dir);
@@ -452,26 +243,18 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
 
 	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
 
-	if (test_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->fscache.flags)) {
-		/* object allocation for the same key preemptively deleted this
-		 * object's file so that it could create its own file */
-		_debug("object preemptively buried");
+	/* We need to check that our parent is _still_ our parent - it may have
+	 * been renamed.
+	 */
+	if (dir == object->dentry->d_parent) {
+		ret = cachefiles_bury_object(cache, object, dir, object->dentry,
+					     FSCACHE_OBJECT_WAS_RETIRED);
+	} else {
+		/* It got moved, presumably by cachefilesd culling it, so it's
+		 * no longer in the key path and we can ignore it.
+		 */
 		inode_unlock(d_inode(dir));
 		ret = 0;
-	} else {
-		/* we need to check that our parent is _still_ our parent - it
-		 * may have been renamed */
-		if (dir == object->dentry->d_parent) {
-			ret = cachefiles_bury_object(cache, object, dir,
-						     object->dentry, false,
-						     FSCACHE_OBJECT_WAS_RETIRED);
-		} else {
-			/* it got moved, presumably by cachefilesd culling it,
-			 * so it's no longer in the key path and we can ignore
-			 * it */
-			inode_unlock(d_inode(dir));
-			ret = 0;
-		}
 	}
 
 	dput(dir);
@@ -492,6 +275,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	struct inode *inode;
 	struct path path;
 	const char *name;
+	bool marked = false;
 	int ret, nlen;
 
 	_enter("OBJ%x{%pd},OBJ%x,%s,",
@@ -532,6 +316,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	next = lookup_one_len(name, dir, nlen);
 	if (IS_ERR(next)) {
 		trace_cachefiles_lookup(object, next, NULL);
+		ret = PTR_ERR(next);
 		goto lookup_error;
 	}
 
@@ -628,6 +413,13 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	/* we've found the object we were looking for */
 	object->dentry = next;
 
+	/* note that we're now using this object */
+	if (!cachefiles_mark_inode_in_use(object, object->dentry)) {
+		ret = -EBUSY;
+		goto check_error_unlock;
+	}
+	marked = true;
+
 	/* if we've found that the terminal object exists, then we need to
 	 * check its attributes and delete it if it's out of date */
 	if (!object->new) {
@@ -640,13 +432,12 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 			object->dentry = NULL;
 
 			ret = cachefiles_bury_object(cache, object, dir, next,
-						     true,
 						     FSCACHE_OBJECT_IS_STALE);
 			dput(next);
 			next = NULL;
 
 			if (ret < 0)
-				goto delete_error;
+				goto error_out2;
 
 			_debug("redo lookup");
 			fscache_object_retrying_stale(&object->fscache);
@@ -654,16 +445,10 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 		}
 	}
 
-	/* note that we're now using this object */
-	ret = cachefiles_mark_object_active(cache, object);
-
 	inode_unlock(d_inode(dir));
 	dput(dir);
 	dir = NULL;
 
-	if (ret == -ETIMEDOUT)
-		goto mark_active_timed_out;
-
 	_debug("=== OBTAINED_OBJECT ===");
 
 	if (object->new) {
@@ -712,26 +497,19 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 		cachefiles_io_error(cache, "Create/mkdir failed");
 	goto error;
 
-mark_active_timed_out:
-	_debug("mark active timed out");
-	goto release_dentry;
-
+check_error_unlock:
+	inode_unlock(d_inode(dir));
+	dput(dir);
 check_error:
-	_debug("check error %d", ret);
-	cachefiles_mark_object_inactive(
-		cache, object, d_backing_inode(object->dentry)->i_blocks);
-release_dentry:
+	if (marked)
+		cachefiles_unmark_inode_in_use(object, object->dentry);
+	cachefiles_mark_object_inactive(cache, object);
 	dput(object->dentry);
 	object->dentry = NULL;
 	goto error_out;
 
-delete_error:
-	_debug("delete error %d", ret);
-	goto error_out2;
-
 lookup_error:
-	_debug("lookup error %ld", PTR_ERR(next));
-	ret = PTR_ERR(next);
+	_debug("lookup error %d", ret);
 	if (ret == -EIO)
 		cachefiles_io_error(cache, "Lookup failed");
 	next = NULL;
@@ -856,8 +634,6 @@ static struct dentry *cachefiles_check_active(struct cachefiles_cache *cache,
 					      struct dentry *dir,
 					      char *filename)
 {
-	struct cachefiles_object *object;
-	struct rb_node *_n;
 	struct dentry *victim;
 	int ret;
 
@@ -884,34 +660,9 @@ static struct dentry *cachefiles_check_active(struct cachefiles_cache *cache,
 		return ERR_PTR(-ENOENT);
 	}
 
-	/* check to see if we're using this object */
-	read_lock(&cache->active_lock);
-
-	_n = cache->active_nodes.rb_node;
-
-	while (_n) {
-		object = rb_entry(_n, struct cachefiles_object, active_node);
-
-		if (object->dentry > victim)
-			_n = _n->rb_left;
-		else if (object->dentry < victim)
-			_n = _n->rb_right;
-		else
-			goto object_in_use;
-	}
-
-	read_unlock(&cache->active_lock);
-
 	//_leave(" = %pd", victim);
 	return victim;
 
-object_in_use:
-	read_unlock(&cache->active_lock);
-	inode_unlock(d_inode(dir));
-	dput(victim);
-	//_leave(" = -EBUSY [in use]");
-	return ERR_PTR(-EBUSY);
-
 lookup_error:
 	inode_unlock(d_inode(dir));
 	ret = PTR_ERR(victim);
@@ -940,6 +691,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 		    char *filename)
 {
 	struct dentry *victim;
+	struct inode *inode;
 	int ret;
 
 	_enter(",%pd/,%s", dir, filename);
@@ -948,6 +700,19 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	if (IS_ERR(victim))
 		return PTR_ERR(victim);
 
+	/* check to see if someone is using this object */
+	inode = d_inode(victim);
+	inode_lock(inode);
+	if (inode->i_flags & S_KERNEL_FILE) {
+		ret = -EBUSY;
+	} else {
+		inode->i_flags |= S_KERNEL_FILE;
+		ret = 0;
+	}
+	inode_unlock(inode);
+	if (ret < 0)
+		goto error_unlock;
+
 	_debug("victim -> %pd %s",
 	       victim, d_backing_inode(victim) ? "positive" : "negative");
 
@@ -963,7 +728,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	/*  actually remove the victim (drops the dir mutex) */
 	_debug("bury");
 
-	ret = cachefiles_bury_object(cache, NULL, dir, victim, false,
+	ret = cachefiles_bury_object(cache, NULL, dir, victim,
 				     FSCACHE_OBJECT_WAS_CULLED);
 	if (ret < 0)
 		goto error;
@@ -1000,6 +765,7 @@ int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir,
 			    char *filename)
 {
 	struct dentry *victim;
+	int ret = 0;
 
 	//_enter(",%pd/,%s",
 	//       dir, filename);
@@ -1009,7 +775,9 @@ int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir,
 		return PTR_ERR(victim);
 
 	inode_unlock(d_inode(dir));
+	if (d_inode(victim)->i_flags & S_KERNEL_FILE)
+		ret = -EBUSY;
 	dput(victim);
 	//_leave(" = 0");
-	return 0;
+	return ret;
 }
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 920b6a303d60..57d811718953 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -237,35 +237,6 @@ TRACE_EVENT(cachefiles_mark_active,
 		      __entry->obj, __entry->de)
 	    );
 
-TRACE_EVENT(cachefiles_wait_active,
-	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de,
-		     struct cachefiles_object *xobj),
-
-	    TP_ARGS(obj, de, xobj),
-
-	    /* Note that obj may be NULL */
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		obj		)
-		    __field(unsigned int,		xobj		)
-		    __field(struct dentry *,		de		)
-		    __field(u16,			flags		)
-		    __field(u16,			fsc_flags	)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
-		    __entry->de		= de;
-		    __entry->xobj	= xobj->fscache.debug_id;
-		    __entry->flags	= xobj->flags;
-		    __entry->fsc_flags	= xobj->fscache.flags;
-			   ),
-
-	    TP_printk("o=%08x d=%p wo=%08x wf=%x wff=%x",
-		      __entry->obj, __entry->de, __entry->xobj,
-		      __entry->flags, __entry->fsc_flags)
-	    );
-
 TRACE_EVENT(cachefiles_mark_inactive,
 	    TP_PROTO(struct cachefiles_object *obj,
 		     struct dentry *de,
@@ -290,31 +261,6 @@ TRACE_EVENT(cachefiles_mark_inactive,
 		      __entry->obj, __entry->de, __entry->inode)
 	    );
 
-TRACE_EVENT(cachefiles_mark_buried,
-	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de,
-		     enum fscache_why_object_killed why),
-
-	    TP_ARGS(obj, de, why),
-
-	    /* Note that obj may be NULL */
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		obj		)
-		    __field(struct dentry *,		de		)
-		    __field(enum fscache_why_object_killed, why		)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->obj	= obj ? obj->fscache.debug_id : UINT_MAX;
-		    __entry->de		= de;
-		    __entry->why	= why;
-			   ),
-
-	    TP_printk("o=%08x d=%p w=%s",
-		      __entry->obj, __entry->de,
-		      __print_symbolic(__entry->why, cachefiles_obj_kill_traces))
-	    );
-
 #endif /* _TRACE_CACHEFILES_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 19/67] cachefiles: Don't set an xattr on the root of the cache
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (17 preceding siblings ...)
  2021-10-18 14:55 ` [PATCH 18/67] cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag David Howells
@ 2021-10-18 14:55 ` David Howells
  2021-10-18 14:55 ` [PATCH 20/67] cachefiles: Remove some redundant checks on unsigned values David Howells
                   ` (51 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:55 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

cachefiles_check_object_type() sets an xattr on the root of the cache, but
it's pointless because it never conveys any useful information, so don't do
that.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c     |    4 --
 fs/cachefiles/internal.h |    1 -
 fs/cachefiles/xattr.c    |   76 ----------------------------------------------
 3 files changed, 81 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index d463d89f5db8..6720c07f3d85 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -199,10 +199,6 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	fsdef->dentry = cachedir;
 	fsdef->fscache.cookie = NULL;
 
-	ret = cachefiles_check_object_type(fsdef);
-	if (ret < 0)
-		goto error_unsupported;
-
 	/* get the graveyard directory */
 	graveyard = cachefiles_get_directory(cache, root, "graveyard");
 	if (IS_ERR(graveyard)) {
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 0515add2b7e8..6464a6821bfb 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -169,7 +169,6 @@ static inline void cachefiles_end_secure(struct cachefiles_cache *cache,
 /*
  * xattr.c
  */
-extern int cachefiles_check_object_type(struct cachefiles_object *object);
 extern int cachefiles_set_object_xattr(struct cachefiles_object *object,
 				       unsigned int xattr_flags);
 extern int cachefiles_check_auxdata(struct cachefiles_object *object);
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index bcc4a2dfe1e8..8b9f30f9ce21 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -23,82 +23,6 @@ struct cachefiles_xattr {
 static const char cachefiles_xattr_cache[] =
 	XATTR_USER_PREFIX "CacheFiles.cache";
 
-/*
- * check the type label on an object
- * - done using xattrs
- */
-int cachefiles_check_object_type(struct cachefiles_object *object)
-{
-	struct dentry *dentry = object->dentry;
-	char type[3], xtype[3];
-	int ret;
-
-	ASSERT(dentry);
-	ASSERT(d_backing_inode(dentry));
-
-	if (!object->fscache.cookie)
-		strcpy(type, "C3");
-	else
-		snprintf(type, 3, "%02x", object->fscache.cookie->type);
-
-	_enter("%x{%s}", object->fscache.debug_id, type);
-
-	/* attempt to install a type label directly */
-	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache, type,
-			   2, XATTR_CREATE);
-	if (ret == 0) {
-		_debug("SET"); /* we succeeded */
-		goto error;
-	}
-
-	if (ret != -EEXIST) {
-		pr_err("Can't set xattr on %pd [%lu] (err %d)\n",
-		       dentry, d_backing_inode(dentry)->i_ino,
-		       -ret);
-		goto error;
-	}
-
-	/* read the current type label */
-	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, xtype,
-			   3);
-	if (ret < 0) {
-		if (ret == -ERANGE)
-			goto bad_type_length;
-
-		pr_err("Can't read xattr on %pd [%lu] (err %d)\n",
-		       dentry, d_backing_inode(dentry)->i_ino,
-		       -ret);
-		goto error;
-	}
-
-	/* check the type is what we're expecting */
-	if (ret != 2)
-		goto bad_type_length;
-
-	if (xtype[0] != type[0] || xtype[1] != type[1])
-		goto bad_type;
-
-	ret = 0;
-
-error:
-	_leave(" = %d", ret);
-	return ret;
-
-bad_type_length:
-	pr_err("Cache object %lu type xattr length incorrect\n",
-	       d_backing_inode(dentry)->i_ino);
-	ret = -EIO;
-	goto error;
-
-bad_type:
-	xtype[2] = 0;
-	pr_err("Cache object %pd [%lu] type %s not %s\n",
-	       dentry, d_backing_inode(dentry)->i_ino,
-	       xtype, type);
-	ret = -EIO;
-	goto error;
-}
-
 /*
  * set the state xattr on a cache file
  */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 20/67] cachefiles: Remove some redundant checks on unsigned values
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (18 preceding siblings ...)
  2021-10-18 14:55 ` [PATCH 19/67] cachefiles: Don't set an xattr on the root of the cache David Howells
@ 2021-10-18 14:55 ` David Howells
  2021-10-18 14:56 ` [PATCH 21/67] cachefiles: Prevent inode from going away when burying a dentry David Howells
                   ` (50 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:55 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove some redundant checks for unsigned values being >= 0.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c   |    6 ++----
 fs/cachefiles/daemon.c |    6 +++---
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 6720c07f3d85..0bc748081f59 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -36,13 +36,11 @@ int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args)
 	       args);
 
 	/* start by checking things over */
-	ASSERT(cache->fstop_percent >= 0 &&
-	       cache->fstop_percent < cache->fcull_percent &&
+	ASSERT(cache->fstop_percent < cache->fcull_percent &&
 	       cache->fcull_percent < cache->frun_percent &&
 	       cache->frun_percent  < 100);
 
-	ASSERT(cache->bstop_percent >= 0 &&
-	       cache->bstop_percent < cache->bcull_percent &&
+	ASSERT(cache->bstop_percent < cache->bcull_percent &&
 	       cache->bcull_percent < cache->brun_percent &&
 	       cache->brun_percent  < 100);
 
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 8a937d6d5e22..e8ab3ab57147 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -221,7 +221,7 @@ static ssize_t cachefiles_daemon_write(struct file *file,
 	if (test_bit(CACHEFILES_DEAD, &cache->flags))
 		return -EIO;
 
-	if (datalen < 0 || datalen > PAGE_SIZE - 1)
+	if (datalen > PAGE_SIZE - 1)
 		return -EOPNOTSUPP;
 
 	/* drag the command string into the kernel so we can parse it */
@@ -378,7 +378,7 @@ static int cachefiles_daemon_fstop(struct cachefiles_cache *cache, char *args)
 	if (args[0] != '%' || args[1] != '\0')
 		return -EINVAL;
 
-	if (fstop < 0 || fstop >= cache->fcull_percent)
+	if (fstop >= cache->fcull_percent)
 		return cachefiles_daemon_range_error(cache, args);
 
 	cache->fstop_percent = fstop;
@@ -450,7 +450,7 @@ static int cachefiles_daemon_bstop(struct cachefiles_cache *cache, char *args)
 	if (args[0] != '%' || args[1] != '\0')
 		return -EINVAL;
 
-	if (bstop < 0 || bstop >= cache->bcull_percent)
+	if (bstop >= cache->bcull_percent)
 		return cachefiles_daemon_range_error(cache, args);
 
 	cache->bstop_percent = bstop;



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 21/67] cachefiles: Prevent inode from going away when burying a dentry
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (19 preceding siblings ...)
  2021-10-18 14:55 ` [PATCH 20/67] cachefiles: Remove some redundant checks on unsigned values David Howells
@ 2021-10-18 14:56 ` David Howells
  2021-10-18 14:56 ` [PATCH 22/67] cachefiles: Simplify the pathwalk and save the filename for an object David Howells
                   ` (49 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:56 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

When deleting a file, we want to make sure that the inode doesn't get
detached from it (leading to ->d_inode being cleared) as we may still want
to touch the inode afterwards (we want to clear the belongs-to-kernel flag
and we may have the dentry referred to by a file struct).

Do this by getting an extra ref on the dentry around the vfs_unlink() call
so that d_delete() doesn't see the refcount == 1.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/namei.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 4bd31be3be30..04c767624e3d 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -106,7 +106,9 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 			cachefiles_io_error(cache, "Unlink security error");
 		} else {
 			trace_cachefiles_unlink(object, rep, why);
+			dget(rep);
 			ret = vfs_unlink(&init_user_ns, d_inode(dir), rep, NULL);
+			dput(rep);
 		}
 
 		inode_unlock(d_inode(dir));



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 22/67] cachefiles: Simplify the pathwalk and save the filename for an object
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (20 preceding siblings ...)
  2021-10-18 14:56 ` [PATCH 21/67] cachefiles: Prevent inode from going away when burying a dentry David Howells
@ 2021-10-18 14:56 ` David Howells
  2021-10-18 14:56 ` [PATCH 23/67] cachefiles: trace: Improve the lookup tracepoint David Howells
                   ` (48 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:56 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

In the future, I want to move to an invalidation model where invalidation
on a live cache object is achieved by opening a tmpfile, changing the
pointer to cut over I/O and then linking it into place in the parent
directory later rather than by truncating the file on the spot.

To this make this work, however, I need the filename to pass to
vfs_link()[*].  I can either copy it from the outgoing dentry (which I then
need to unlink) or

[*] Unless I can pass a flag to vfs_link() to tell it to replace the file
    at that dentry.

The filename concocted by cook key is a path relative to the parent index,
and may contain multiple components if the object key gets rendered to a
size larger than NAME_MAX.

However, apart from one exception, none of the filesystems have a key
that's anywhere near that large - even NFS filehandles top out at about 128
bytes, which encoded in base64 increase by a quarter to 192 plus a few
extra chars of markup.  The exception is the AFS cell name, which is a
domain name and can be up to 255 chars in size.  Restricting this allows
simplification of the pathwalk.

So, to this end, the following changes are made:

 (1) Restrict keys to those that can be rendered in NAME_MAX size.

 (2) Simplify the filename generation to store the fanout selection
     separately from the key filename and put the key filename alone into
     the filename string.

     As before, if suitable, the key is rendered directly, prefixed by
     I/D/S and an encoded length.  Otherwise, it has the length prepended
     and the whole lot is rendered into base64 and then prefixed with
     J/E/T.

     (The lengths are at least partially redundant, but a later patch is
     going to fully overhaul the encodings anyway).

     Allocating a temporary buffer to expand the key is only necessary when
     rendering to base64.

 (3) cachefiles_cook_key() now returns a bool and attaches the name string
     and key hash (fanout selection) directly to the object.  The name is
     freed when the object is freed.

 (4) Simplify the pathwalk to assume there are exactly two steps in looking
     up an object: one fanout dir and then one directory or file
     representing the object itself.  The fanout filename is rendered on
     the fly.

 (5) Assume that data objects are always going to be files and not a
     directory with further structure therein.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c              |    4 
 fs/cachefiles/interface.c         |   41 +---
 fs/cachefiles/internal.h          |   17 +-
 fs/cachefiles/key.c               |  179 +++++++---------
 fs/cachefiles/namei.c             |  403 +++++++++++++++++--------------------
 include/trace/events/cachefiles.h |   22 --
 6 files changed, 279 insertions(+), 387 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 0bc748081f59..fbc8577477c1 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -188,7 +188,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	       (unsigned long long) cache->bstop);
 
 	/* get the cache directory and check its type */
-	cachedir = cachefiles_get_directory(cache, root, "cache");
+	cachedir = cachefiles_get_directory(cache, root, "cache", NULL);
 	if (IS_ERR(cachedir)) {
 		ret = PTR_ERR(cachedir);
 		goto error_unsupported;
@@ -198,7 +198,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	fsdef->fscache.cookie = NULL;
 
 	/* get the graveyard directory */
-	graveyard = cachefiles_get_directory(cache, root, "graveyard");
+	graveyard = cachefiles_get_directory(cache, root, "graveyard", NULL);
 	if (IS_ERR(graveyard)) {
 		ret = PTR_ERR(graveyard);
 		goto error_unsupported;
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index bc28bb6c7ef5..2083aca6bd0c 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -21,9 +21,6 @@ static struct fscache_object *cachefiles_alloc_object(
 {
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
-	unsigned keylen;
-	void *buffer, *p;
-	char *key;
 
 	cache = container_of(_cache, struct cachefiles_cache, cache);
 
@@ -42,37 +39,14 @@ static struct fscache_object *cachefiles_alloc_object(
 
 	object->type = cookie->type;
 
-	/* get hold of the raw key
-	 * - stick the length on the front and leave space on the back for the
-	 *   encoder
-	 */
-	buffer = kmalloc((2 + 512) + 3, cachefiles_gfp);
-	if (!buffer)
-		goto nomem_buffer;
-
-	keylen = cookie->key_len;
-	p = fscache_get_key(cookie);
-	memcpy(buffer + 2, p, keylen);
-
-	*(uint16_t *)buffer = keylen;
-	((char *)buffer)[keylen + 2] = 0;
-	((char *)buffer)[keylen + 3] = 0;
-	((char *)buffer)[keylen + 4] = 0;
-
 	/* turn the raw key into something that can work with as a filename */
-	key = cachefiles_cook_key(buffer, keylen + 2, object->type);
-	kfree(buffer);
-	if (!key)
+	if (!cachefiles_cook_key(object))
 		goto nomem_key;
 
-	object->lookup_key = key;
-
-	_leave(" = %x [%s]", object->fscache.debug_id, key);
+	_leave(" = %x [%s]", object->fscache.debug_id, object->d_name);
 	return &object->fscache;
 
 nomem_key:
-	kfree(buffer);
-nomem_buffer:
 	kmem_cache_free(cachefiles_object_jar, object);
 	fscache_object_destroyed(&cache->cache);
 nomem_object:
@@ -98,11 +72,11 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
 			      struct cachefiles_object, fscache);
 	object = container_of(_object, struct cachefiles_object, fscache);
 
-	ASSERTCMP(object->lookup_key, !=, NULL);
+	ASSERT(object->d_name);
 
 	/* look up the key, creating any missing bits */
 	cachefiles_begin_secure(cache, &saved_cred);
-	ret = cachefiles_walk_to_object(parent, object, object->lookup_key);
+	ret = cachefiles_walk_to_object(parent, object);
 	cachefiles_end_secure(cache, saved_cred);
 
 	/* polish off by setting the attributes of non-index files */
@@ -130,9 +104,6 @@ static void cachefiles_lookup_complete(struct fscache_object *_object)
 	object = container_of(_object, struct cachefiles_object, fscache);
 
 	_enter("{OBJ%x}", object->fscache.debug_id);
-
-	kfree(object->lookup_key);
-	object->lookup_key = NULL;
 }
 
 /*
@@ -224,7 +195,7 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 			dput(object->backer);
 		object->backer = NULL;
 
-		cachefiles_unmark_inode_in_use(object, object->dentry);
+		cachefiles_unmark_inode_in_use(object);
 		dput(object->dentry);
 		object->dentry = NULL;
 	}
@@ -269,7 +240,7 @@ void cachefiles_put_object(struct fscache_object *_object,
 		ASSERTCMP(object->fscache.n_ops, ==, 0);
 		ASSERTCMP(object->fscache.n_children, ==, 0);
 
-		kfree(object->lookup_key);
+		kfree(object->d_name);
 
 		cache = object->fscache.cache;
 		fscache_object_destroy(&object->fscache);
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 6464a6821bfb..83911cf24769 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -34,13 +34,15 @@ extern unsigned cachefiles_debug;
  */
 struct cachefiles_object {
 	struct fscache_object		fscache;	/* fscache handle */
-	char				*lookup_key;	/* key to look up */
+	char				*d_name;	/* Filename */
 	struct dentry			*dentry;	/* the file/dir representing this object */
 	struct dentry			*backer;	/* backing file */
 	loff_t				i_size;		/* object size */
 	atomic_t			usage;		/* object usage count */
 	uint8_t				type;		/* object type */
-	uint8_t				new;		/* T if object new */
+	bool				new;		/* T if object new */
+	u8				d_name_len;	/* Length of filename */
+	u8				key_hash;
 };
 
 extern struct kmem_cache *cachefiles_object_jar;
@@ -119,21 +121,20 @@ void cachefiles_put_object(struct fscache_object *_object,
 /*
  * key.c
  */
-extern char *cachefiles_cook_key(const u8 *raw, int keylen, uint8_t type);
+extern bool cachefiles_cook_key(struct cachefiles_object *object);
 
 /*
  * namei.c
  */
-extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
-				    struct dentry *dentry);
+extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object);
 extern int cachefiles_delete_object(struct cachefiles_cache *cache,
 				    struct cachefiles_object *object);
 extern int cachefiles_walk_to_object(struct cachefiles_object *parent,
-				     struct cachefiles_object *object,
-				     const char *key);
+				     struct cachefiles_object *object);
 extern struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 					       struct dentry *dir,
-					       const char *name);
+					       const char *name,
+					       struct cachefiles_object *object);
 
 extern int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 			   char *filename);
diff --git a/fs/cachefiles/key.c b/fs/cachefiles/key.c
index 7f94efc97e23..56a9b3201b41 100644
--- a/fs/cachefiles/key.c
+++ b/fs/cachefiles/key.c
@@ -24,132 +24,107 @@ static const char cachefiles_filecharmap[256] = {
 
 /*
  * turn the raw key into something cooked
- * - the raw key should include the length in the two bytes at the front
- * - the key may be up to 514 bytes in length (including the length word)
+ * - the key may be up to NAME_MAX in length (including the length word)
  *   - "base64" encode the strange keys, mapping 3 bytes of raw to four of
  *     cooked
  *   - need to cut the cooked key into 252 char lengths (189 raw bytes)
  */
-char *cachefiles_cook_key(const u8 *raw, int keylen, uint8_t type)
+bool cachefiles_cook_key(struct cachefiles_object *object)
 {
-	unsigned char csum, ch;
-	unsigned int acc;
-	char *key;
-	int loop, len, max, seg, mark, print;
+	const u8 *key = fscache_get_key(object->fscache.cookie);
+	unsigned int acc, sum, keylen = object->fscache.cookie->key_len;
+	char *name;
+	u8 *buffer, *p;
+	int i, len, elem3, print;
+	u8 type;
 
 	_enter(",%d", keylen);
 
-	BUG_ON(keylen < 2 || keylen > 514);
+	BUG_ON(keylen > NAME_MAX - 3);
 
-	csum = raw[0] + raw[1];
+	sum = 0;
 	print = 1;
-	for (loop = 2; loop < keylen; loop++) {
-		ch = raw[loop];
-		csum += ch;
+	for (i = 0; i < keylen; i++) {
+		u8 ch = key[i];
+		sum += ch;
 		print &= cachefiles_filecharmap[ch];
 	}
+	object->key_hash = sum;
 
+	/* If the path is usable ASCII, then we render it directly */
 	if (print) {
-		/* if the path is usable ASCII, then we render it directly */
-		max = keylen - 2;
-		max += 2;	/* two base64'd length chars on the front */
-		max += 5;	/* @checksum/M */
-		max += 3 * 2;	/* maximum number of segment dividers (".../M")
-				 * is ((514 + 251) / 252) = 3
-				 */
-		max += 1;	/* NUL on end */
-	} else {
-		/* calculate the maximum length of the cooked key */
-		keylen = (keylen + 2) / 3;
-
-		max = keylen * 4;
-		max += 5;	/* @checksum/M */
-		max += 3 * 2;	/* maximum number of segment dividers (".../M")
-				 * is ((514 + 188) / 189) = 3
-				 */
-		max += 1;	/* NUL on end */
-	}
+		name = kmalloc(3 + keylen + 1, cachefiles_gfp);
+		if (!name)
+			return false;
 
-	max += 1;	/* 2nd NUL on end */
+		switch (object->fscache.cookie->type) {
+		case FSCACHE_COOKIE_TYPE_INDEX:		type = 'I';	break;
+		case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'D';	break;
+		default:				type = 'S';	break;
+		}
 
-	_debug("max: %d", max);
+		name[0] = type;
+		name[1] = cachefiles_charmap[(keylen >> 6) & 63];
+		name[2] = cachefiles_charmap[keylen & 63];
 
-	key = kmalloc(max, cachefiles_gfp);
-	if (!key)
-		return NULL;
+		memcpy(name + 3, key, keylen);
+		name[3 + keylen] = 0;
+		object->d_name = name;
+		object->d_name_len = 3 + keylen;
+		goto success;
+	}
 
-	len = 0;
+	/* Construct the key we actually want to render.  We stick the length
+	 * on the front and leave NULs on the back for the encoder to overread.
+	 */
+	buffer = kmalloc(2 + keylen + 3, cachefiles_gfp);
+	if (!buffer)
+		return false;
 
-	/* build the cooked key */
-	sprintf(key, "@%02x%c+", (unsigned) csum, 0);
-	len = 5;
-	mark = len - 1;
+	memcpy(buffer + 2, key, keylen);
 
-	if (print) {
-		acc = *(uint16_t *) raw;
-		raw += 2;
+	*(uint16_t *)buffer = keylen;
+	((char *)buffer)[keylen + 2] = 0;
+	((char *)buffer)[keylen + 3] = 0;
+	((char *)buffer)[keylen + 4] = 0;
 
-		key[len + 1] = cachefiles_charmap[acc & 63];
-		acc >>= 6;
-		key[len] = cachefiles_charmap[acc & 63];
-		len += 2;
-
-		seg = 250;
-		for (loop = keylen; loop > 0; loop--) {
-			if (seg <= 0) {
-				key[len++] = '\0';
-				mark = len;
-				key[len++] = '+';
-				seg = 252;
-			}
-
-			key[len++] = *raw++;
-			ASSERT(len < max);
-		}
+	elem3 = DIV_ROUND_UP(2 + keylen, 3); /* Count of 3-byte elements */
+	len = elem3 * 4;
 
-		switch (type) {
-		case FSCACHE_COOKIE_TYPE_INDEX:		type = 'I';	break;
-		case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'D';	break;
-		default:				type = 'S';	break;
-		}
-	} else {
-		seg = 252;
-		for (loop = keylen; loop > 0; loop--) {
-			if (seg <= 0) {
-				key[len++] = '\0';
-				mark = len;
-				key[len++] = '+';
-				seg = 252;
-			}
-
-			acc = *raw++;
-			acc |= *raw++ << 8;
-			acc |= *raw++ << 16;
-
-			_debug("acc: %06x", acc);
-
-			key[len++] = cachefiles_charmap[acc & 63];
-			acc >>= 6;
-			key[len++] = cachefiles_charmap[acc & 63];
-			acc >>= 6;
-			key[len++] = cachefiles_charmap[acc & 63];
-			acc >>= 6;
-			key[len++] = cachefiles_charmap[acc & 63];
-
-			ASSERT(len < max);
-		}
+	name = kmalloc(1 + len + 1, cachefiles_gfp);
+	if (!name) {
+		kfree(buffer);
+		return false;
+	}
 
-		switch (type) {
-		case FSCACHE_COOKIE_TYPE_INDEX:		type = 'J';	break;
-		case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'E';	break;
-		default:				type = 'T';	break;
-		}
+	switch (object->fscache.cookie->type) {
+	case FSCACHE_COOKIE_TYPE_INDEX:		type = 'J';	break;
+	case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'E';	break;
+	default:				type = 'T';	break;
 	}
 
-	key[mark] = type;
-	key[len++] = 0;
-	key[len] = 0;
+	name[0] = type;
+	len = 1;
+	p = buffer;
+	for (i = 0; i < elem3; i++) {
+		acc = *p++;
+		acc |= *p++ << 8;
+		acc |= *p++ << 16;
+
+		name[len++] = cachefiles_charmap[acc & 63];
+		acc >>= 6;
+		name[len++] = cachefiles_charmap[acc & 63];
+		acc >>= 6;
+		name[len++] = cachefiles_charmap[acc & 63];
+		acc >>= 6;
+		name[len++] = cachefiles_charmap[acc & 63];
+	}
 
-	_leave(" = %s %d", key, len);
-	return key;
+	name[len] = 0;
+	object->d_name = name;
+	object->d_name_len = len;
+	kfree(buffer);
+success:
+	_leave(" = %s", object->d_name);
+	return true;
 }
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 04c767624e3d..10b6d571eda8 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -23,9 +23,9 @@
 /*
  * Mark the backing file as being a cache file if it's not already in use so.
  */
-static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object,
-					 struct dentry *dentry)
+static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
 {
+	struct dentry *dentry = object->dentry;
 	struct inode *inode = d_backing_inode(dentry);
 	bool can_use = false;
 
@@ -48,9 +48,9 @@ static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object,
 /*
  * Unmark a backing inode.
  */
-void cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
-				    struct dentry *dentry)
+void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
 {
+	struct dentry *dentry = object->dentry;
 	struct inode *inode = d_backing_inode(dentry);
 
 	inode_lock(inode);
@@ -265,263 +265,221 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
 }
 
 /*
- * walk from the parent object to the child object through the backing
- * filesystem, creating directories as we go
+ * Check and open the terminal object.
  */
-int cachefiles_walk_to_object(struct cachefiles_object *parent,
-			      struct cachefiles_object *object,
-			      const char *key)
+static int cachefiles_check_open_object(struct cachefiles_cache *cache,
+					struct cachefiles_object *object,
+					struct dentry *fan)
 {
-	struct cachefiles_cache *cache;
-	struct dentry *dir, *next = NULL;
-	struct inode *inode;
 	struct path path;
-	const char *name;
-	bool marked = false;
-	int ret, nlen;
+	int ret;
 
-	_enter("OBJ%x{%pd},OBJ%x,%s,",
-	       parent->fscache.debug_id, parent->dentry,
-	       object->fscache.debug_id, key);
+	if (!cachefiles_mark_inode_in_use(object))
+		return -EBUSY;
 
-	cache = container_of(parent->fscache.cache,
-			     struct cachefiles_cache, cache);
-	path.mnt = cache->mnt;
+	/* if we've found that the terminal object exists, then we need to
+	 * check its attributes and delete it if it's out of date */
+	if (!object->new) {
+		_debug("validate '%pd'", object->dentry);
 
-	ASSERT(parent->dentry);
-	ASSERT(d_backing_inode(parent->dentry));
+		ret = cachefiles_check_auxdata(object);
+		if (ret == -ESTALE)
+			goto stale;
+		if (ret < 0)
+			goto error_unmark;
+	}
+
+	_debug("=== OBTAINED_OBJECT ===");
 
-	if (!(d_is_dir(parent->dentry))) {
-		// TODO: convert file to dir
-		_leave("looking up in none directory");
-		return -ENOBUFS;
+	if (object->new) {
+		/* attach data to a newly constructed terminal object */
+		ret = cachefiles_set_object_xattr(object, XATTR_CREATE);
+		if (ret < 0)
+			goto error_unmark;
+	} else {
+		/* always update the atime on an object we've just looked up
+		 * (this is used to keep track of culling, and atimes are only
+		 * updated by read, write and readdir but not lookup or
+		 * open) */
+		path.mnt = cache->mnt;
+		path.dentry = object->dentry;
+		touch_atime(&path);
 	}
 
-	dir = dget(parent->dentry);
+	return 0;
 
-advance:
-	/* attempt to transit the first directory component */
-	name = key;
-	nlen = strlen(key);
+stale:
+	cachefiles_unmark_inode_in_use(object);
+	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
+	ret = cachefiles_bury_object(cache, object, fan, object->dentry,
+				     FSCACHE_OBJECT_IS_STALE);
+	if (ret < 0)
+		return ret;
+	_debug("redo lookup");
+	return -ESTALE;
 
-	/* key ends in a double NUL */
-	key = key + nlen + 1;
-	if (!*key)
-		key = NULL;
+error_unmark:
+	cachefiles_unmark_inode_in_use(object);
+	return ret;
+}
 
-lookup_again:
-	/* search the current directory for the element name */
-	_debug("lookup '%s'", name);
+/*
+ * Walk to a file, creating it if necessary.
+ */
+static int cachefiles_walk_to_file(struct cachefiles_cache *cache,
+				   struct cachefiles_object *object,
+				   struct dentry *fan)
+{
+	struct dentry *dentry;
+	struct inode *dinode = d_backing_inode(fan);
+	struct path fan_path;
+	int ret;
 
-	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
+	_enter("%pd %s", fan, object->d_name);
 
-	next = lookup_one_len(name, dir, nlen);
-	if (IS_ERR(next)) {
-		trace_cachefiles_lookup(object, next, NULL);
-		ret = PTR_ERR(next);
-		goto lookup_error;
-	}
+	inode_lock_nested(dinode, I_MUTEX_PARENT);
 
-	inode = d_backing_inode(next);
-	trace_cachefiles_lookup(object, next, inode);
-	_debug("next -> %pd %s", next, inode ? "positive" : "negative");
+	dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+	if (IS_ERR(dentry)) {
+		trace_cachefiles_lookup(object, dentry, NULL);
+		ret = PTR_ERR(dentry);
+		goto error;
+	}
 
-	if (!key)
-		object->new = !inode;
+	trace_cachefiles_lookup(object, dentry, d_backing_inode(dentry));
 
-	/* if this element of the path doesn't exist, then the lookup phase
-	 * failed, and we can release any readers in the certain knowledge that
-	 * there's nothing for them to actually read */
-	if (d_is_negative(next))
+	if (d_is_negative(dentry)) {
+		/* This element of the path doesn't exist, so we can release
+		 * any readers in the certain knowledge that there's nothing
+		 * for them to actually read */
 		fscache_object_lookup_negative(&object->fscache);
 
-	/* we need to create the object if it's negative */
-	if (key || object->type == FSCACHE_COOKIE_TYPE_INDEX) {
-		/* index objects and intervening tree levels must be subdirs */
-		if (d_is_negative(next)) {
-			ret = cachefiles_has_space(cache, 1, 0);
-			if (ret < 0)
-				goto no_space_error;
-
-			path.dentry = dir;
-			ret = security_path_mkdir(&path, next, 0);
-			if (ret < 0)
-				goto create_error;
-			ret = vfs_mkdir(&init_user_ns, d_inode(dir), next, 0);
-			if (!key)
-				trace_cachefiles_mkdir(object, next, ret);
-			if (ret < 0)
-				goto create_error;
-
-			if (unlikely(d_unhashed(next))) {
-				dput(next);
-				inode_unlock(d_inode(dir));
-				goto lookup_again;
-			}
-			ASSERT(d_backing_inode(next));
-
-			_debug("mkdir -> %pd{ino=%lu}",
-			       next, d_backing_inode(next)->i_ino);
-
-		} else if (!d_can_lookup(next)) {
-			pr_err("inode %lu is not a directory\n",
-			       d_backing_inode(next)->i_ino);
-			ret = -ENOBUFS;
-			goto error;
+		ret = cachefiles_has_space(cache, 1, 0);
+		if (ret < 0) {
+			fscache_object_mark_killed(&object->fscache, FSCACHE_OBJECT_NO_SPACE);
+			goto error_dput;
 		}
 
-	} else {
-		/* non-index objects start out life as files */
-		if (d_is_negative(next)) {
-			ret = cachefiles_has_space(cache, 1, 0);
-			if (ret < 0)
-				goto no_space_error;
-
-			path.dentry = dir;
-			ret = security_path_mknod(&path, next, S_IFREG, 0);
-			if (ret < 0)
-				goto create_error;
-			ret = vfs_create(&init_user_ns, d_inode(dir), next,
-					 S_IFREG, true);
-			trace_cachefiles_create(object, next, ret);
-			if (ret < 0)
-				goto create_error;
-
-			ASSERT(d_backing_inode(next));
-
-			_debug("create -> %pd{ino=%lu}",
-			       next, d_backing_inode(next)->i_ino);
-
-		} else if (!d_can_lookup(next) &&
-			   !d_is_reg(next)
-			   ) {
-			pr_err("inode %lu is not a file or directory\n",
-			       d_backing_inode(next)->i_ino);
-			ret = -ENOBUFS;
-			goto error;
-		}
-	}
+		fan_path.mnt = cache->mnt;
+		fan_path.dentry = fan;
+		ret = security_path_mknod(&fan_path, dentry, S_IFREG, 0);
+		if (ret < 0)
+			goto error_dput;
+		ret = vfs_create(&init_user_ns, dinode, dentry, S_IFREG, true);
+		trace_cachefiles_create(object, dentry, ret);
+		if (ret < 0)
+			goto error_dput;
 
-	/* process the next component */
-	if (key) {
-		_debug("advance");
-		inode_unlock(d_inode(dir));
-		dput(dir);
-		dir = next;
-		next = NULL;
-		goto advance;
-	}
+		_debug("create -> %pd{ino=%lu}",
+		       dentry, d_backing_inode(dentry)->i_ino);
+		object->new = true;
 
-	/* we've found the object we were looking for */
-	object->dentry = next;
+	} else if (!d_is_reg(dentry)) {
+		pr_err("inode %lu is not a file\n",
+		       d_backing_inode(dentry)->i_ino);
+		ret = -EIO;
+		goto error_dput;
+	} else {
+		_debug("file -> %pd positive", dentry);
+	}
 
-	/* note that we're now using this object */
-	if (!cachefiles_mark_inode_in_use(object, object->dentry)) {
-		ret = -EBUSY;
-		goto check_error_unlock;
+	if (dentry->d_sb->s_blocksize > PAGE_SIZE) {
+		pr_warn("cachefiles: Block size too large\n");
+		ret = -EIO;
+		goto error_dput;
 	}
-	marked = true;
 
-	/* if we've found that the terminal object exists, then we need to
-	 * check its attributes and delete it if it's out of date */
-	if (!object->new) {
-		_debug("validate '%pd'", next);
+	object->dentry = dentry;
+	inode_unlock(dinode);
+	return 0;
 
-		ret = cachefiles_check_auxdata(object);
-		if (ret == -ESTALE) {
-			/* delete the object (the deleter drops the directory
-			 * mutex) */
-			object->dentry = NULL;
-
-			ret = cachefiles_bury_object(cache, object, dir, next,
-						     FSCACHE_OBJECT_IS_STALE);
-			dput(next);
-			next = NULL;
-
-			if (ret < 0)
-				goto error_out2;
-
-			_debug("redo lookup");
-			fscache_object_retrying_stale(&object->fscache);
-			goto lookup_again;
-		}
-	}
+error_dput:
+	dput(dentry);
+error:
+	inode_unlock(dinode);
+	return ret;
+}
 
-	inode_unlock(d_inode(dir));
-	dput(dir);
-	dir = NULL;
+/*
+ * Walk over the fanout directory.
+ */
+static struct dentry *cachefiles_walk_over_fanout(struct cachefiles_object *object,
+						  struct cachefiles_cache *cache,
+						  struct dentry *dir)
+{
+	char name[4];
 
-	_debug("=== OBTAINED_OBJECT ===");
+	snprintf(name, sizeof(name), "@%02x", object->key_hash);
+	return cachefiles_get_directory(cache, dir, name, object);
+}
 
-	if (object->new) {
-		/* attach data to a newly constructed terminal object */
-		ret = cachefiles_set_object_xattr(object, XATTR_CREATE);
-		if (ret < 0)
-			goto check_error;
-	} else {
-		/* always update the atime on an object we've just looked up
-		 * (this is used to keep track of culling, and atimes are only
-		 * updated by read, write and readdir but not lookup or
-		 * open) */
-		path.dentry = next;
-		touch_atime(&path);
-	}
+/*
+ * walk from the parent object to the child object through the backing
+ * filesystem, creating directories as we go
+ */
+int cachefiles_walk_to_object(struct cachefiles_object *parent,
+			      struct cachefiles_object *object)
+{
+	struct cachefiles_cache *cache;
+	struct dentry *fan, *dentry;
+	int ret;
 
-	/* open a file interface onto a data file */
-	if (object->type != FSCACHE_COOKIE_TYPE_INDEX) {
-		if (d_is_reg(object->dentry)) {
-			const struct address_space_operations *aops;
+	_enter("OBJ%x{%pd},OBJ%x,%s,",
+	       parent->fscache.debug_id, parent->dentry,
+	       object->fscache.debug_id, object->d_name);
 
-			ret = -EPERM;
-			aops = d_backing_inode(object->dentry)->i_mapping->a_ops;
-			if (!aops->bmap)
-				goto check_error;
-			if (object->dentry->d_sb->s_blocksize > PAGE_SIZE)
-				goto check_error;
+	cache = container_of(parent->fscache.cache,
+			     struct cachefiles_cache, cache);
+	ASSERT(parent->dentry);
+	ASSERT(d_backing_inode(parent->dentry));
 
-			object->backer = object->dentry;
-		} else {
-			BUG(); // TODO: open file in data-class subdir
+lookup_again:
+	fan = cachefiles_walk_over_fanout(object, cache, parent->dentry);
+	if (IS_ERR(fan))
+		return PTR_ERR(fan);
+
+	/* Walk over path "parent/fanout/object". */
+	if (object->type == FSCACHE_COOKIE_TYPE_INDEX) {
+		dentry = cachefiles_get_directory(cache, fan, object->d_name,
+						  object);
+		if (IS_ERR(dentry)) {
+			dput(fan);
+			return PTR_ERR(dentry);
+		}
+		object->dentry = dentry;
+	} else {
+		ret = cachefiles_walk_to_file(cache, object, fan);
+		if (ret < 0) {
+			dput(fan);
+			return ret;
 		}
 	}
 
-	object->new = 0;
-	fscache_obtained_object(&object->fscache);
+	ret = cachefiles_check_open_object(cache, object, fan);
+	dput(fan);
+	fan = NULL;
+	if (ret < 0)
+		goto check_error;
 
+	object->backer = object->dentry;
+	object->new = false;
+	fscache_obtained_object(&object->fscache);
 	_leave(" = 0 [%lu]", d_backing_inode(object->dentry)->i_ino);
 	return 0;
 
-no_space_error:
-	fscache_object_mark_killed(&object->fscache, FSCACHE_OBJECT_NO_SPACE);
-create_error:
-	_debug("create error %d", ret);
-	if (ret == -EIO)
-		cachefiles_io_error(cache, "Create/mkdir failed");
-	goto error;
-
-check_error_unlock:
-	inode_unlock(d_inode(dir));
-	dput(dir);
 check_error:
-	if (marked)
-		cachefiles_unmark_inode_in_use(object, object->dentry);
+	if (ret == -ESTALE) {
+		dput(object->dentry);
+		object->dentry = NULL;
+		fscache_object_retrying_stale(&object->fscache);
+		goto lookup_again;
+	}
+	if (ret == -EIO)
+		cachefiles_io_error_obj(object, "Lookup failed");
 	cachefiles_mark_object_inactive(cache, object);
 	dput(object->dentry);
 	object->dentry = NULL;
-	goto error_out;
-
-lookup_error:
-	_debug("lookup error %d", ret);
-	if (ret == -EIO)
-		cachefiles_io_error(cache, "Lookup failed");
-	next = NULL;
-error:
-	inode_unlock(d_inode(dir));
-	dput(next);
-error_out2:
-	dput(dir);
-error_out:
-	_leave(" = error %d", -ret);
+	_leave(" = error %d", ret);
 	return ret;
 }
 
@@ -530,7 +488,8 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
  */
 struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 					struct dentry *dir,
-					const char *dirname)
+					const char *dirname,
+					struct cachefiles_object *object)
 {
 	struct dentry *subdir;
 	struct path path;
@@ -554,6 +513,12 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 
 	/* we need to create the subdir if it doesn't exist yet */
 	if (d_is_negative(subdir)) {
+		/* This element of the path doesn't exist, so we can release
+		 * any readers in the certain knowledge that there's nothing
+		 * for them to actually read */
+		if (object)
+			fscache_object_lookup_negative(&object->fscache);
+
 		ret = cachefiles_has_space(cache, 1, 0);
 		if (ret < 0)
 			goto mkdir_error;
@@ -577,6 +542,8 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 
 		_debug("mkdir -> %pd{ino=%lu}",
 		       subdir, d_backing_inode(subdir)->i_ino);
+		if (object)
+			object->new = true;
 	}
 
 	inode_unlock(d_inode(dir));
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 57d811718953..bd0b5bbd3889 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -119,28 +119,6 @@ TRACE_EVENT(cachefiles_lookup,
 		      __entry->obj, __entry->de, __entry->inode)
 	    );
 
-TRACE_EVENT(cachefiles_mkdir,
-	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de, int ret),
-
-	    TP_ARGS(obj, de, ret),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		obj	)
-		    __field(struct dentry *,		de	)
-		    __field(int,			ret	)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
-		    __entry->de		= de;
-		    __entry->ret	= ret;
-			   ),
-
-	    TP_printk("o=%08x d=%p r=%u",
-		      __entry->obj, __entry->de, __entry->ret)
-	    );
-
 TRACE_EVENT(cachefiles_create,
 	    TP_PROTO(struct cachefiles_object *obj,
 		     struct dentry *de, int ret),



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 23/67] cachefiles: trace: Improve the lookup tracepoint
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (21 preceding siblings ...)
  2021-10-18 14:56 ` [PATCH 22/67] cachefiles: Simplify the pathwalk and save the filename for an object David Howells
@ 2021-10-18 14:56 ` David Howells
  2021-10-18 14:56 ` [PATCH 24/67] cachefiles: Remove separate backer dentry from cachefiles_object David Howells
                   ` (47 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:56 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Improve the cachefiles_lookup tracepoint:

 - Don't display the dentry address, since it's going to get hashed.

 - Do display any error code.

 - Work out the inode in the tracepoint rather than in the caller so that
   the logic is conditional on the tracepoint being enabled.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/namei.c             |    4 +---
 include/trace/events/cachefiles.h |   18 +++++++++---------
 2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 10b6d571eda8..b5a0aec529af 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -340,14 +340,12 @@ static int cachefiles_walk_to_file(struct cachefiles_cache *cache,
 	inode_lock_nested(dinode, I_MUTEX_PARENT);
 
 	dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+	trace_cachefiles_lookup(object, dentry);
 	if (IS_ERR(dentry)) {
-		trace_cachefiles_lookup(object, dentry, NULL);
 		ret = PTR_ERR(dentry);
 		goto error;
 	}
 
-	trace_cachefiles_lookup(object, dentry, d_backing_inode(dentry));
-
 	if (d_is_negative(dentry)) {
 		/* This element of the path doesn't exist, so we can release
 		 * any readers in the certain knowledge that there's nothing
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index bd0b5bbd3889..87681dd957ec 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -98,25 +98,25 @@ TRACE_EVENT(cachefiles_ref,
 
 TRACE_EVENT(cachefiles_lookup,
 	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de,
-		     struct inode *inode),
+		     struct dentry *de),
 
-	    TP_ARGS(obj, de, inode),
+	    TP_ARGS(obj, de),
 
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		obj	)
-		    __field(struct dentry *,		de	)
-		    __field(struct inode *,		inode	)
+		    __field(short,			error	)
+		    __field(unsigned long,		ino	)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->obj	= obj->fscache.debug_id;
-		    __entry->de		= de;
-		    __entry->inode	= inode;
+		    __entry->ino	= (!IS_ERR(de) && d_backing_inode(de) ?
+					   d_backing_inode(de)->i_ino : 0);
+		    __entry->error	= IS_ERR(de) ? PTR_ERR(de) : 0;
 			   ),
 
-	    TP_printk("o=%08x d=%p i=%p",
-		      __entry->obj, __entry->de, __entry->inode)
+	    TP_printk("o=%08x i=%lx e=%d",
+		      __entry->obj, __entry->ino, __entry->error)
 	    );
 
 TRACE_EVENT(cachefiles_create,



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 24/67] cachefiles: Remove separate backer dentry from cachefiles_object
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (22 preceding siblings ...)
  2021-10-18 14:56 ` [PATCH 23/67] cachefiles: trace: Improve the lookup tracepoint David Howells
@ 2021-10-18 14:56 ` David Howells
  2021-10-18 14:57 ` [PATCH 25/67] cachefiles: Fold fscache_object into cachefiles_object David Howells
                   ` (46 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:56 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

The cachefiles_object struct has two dentry pointers - one for the
file/directory representing the object and a second one in case a data
object is a directory with a file inside of it that contains the data (the
idea being that there might be another file, say, containing a journal of
local changes that need committing or a list of cached xattrs).

At the moment, this isn't implemented, so remove it and always use the main
dentry pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c      |    2 --
 fs/cachefiles/interface.c |   28 +++++++++-------------------
 fs/cachefiles/internal.h  |    1 -
 fs/cachefiles/io.c        |    4 ++--
 fs/cachefiles/namei.c     |    1 -
 5 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index fbc8577477c1..cb3296814056 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -101,8 +101,6 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	if (!fsdef)
 		goto error_root_object;
 
-	ASSERTCMP(fsdef->backer, ==, NULL);
-
 	atomic_set(&fsdef->usage, 1);
 	fsdef->type = FSCACHE_COOKIE_TYPE_INDEX;
 
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 2083aca6bd0c..92bb3ba78c41 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -31,8 +31,6 @@ static struct fscache_object *cachefiles_alloc_object(
 	if (!object)
 		goto nomem_object;
 
-	ASSERTCMP(object->backer, ==, NULL);
-
 	atomic_set(&object->usage, 1);
 
 	fscache_object_init(&object->fscache, cookie, &cache->cache);
@@ -191,10 +189,6 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 		}
 
 		/* close the filesystem stuff attached to the object */
-		if (object->backer != object->dentry)
-			dput(object->backer);
-		object->backer = NULL;
-
 		cachefiles_unmark_inode_in_use(object);
 		dput(object->dentry);
 		object->dentry = NULL;
@@ -235,7 +229,6 @@ void cachefiles_put_object(struct fscache_object *_object,
 		_debug("- kill object OBJ%x", object->fscache.debug_id);
 
 		ASSERTCMP(object->fscache.parent, ==, NULL);
-		ASSERTCMP(object->backer, ==, NULL);
 		ASSERTCMP(object->dentry, ==, NULL);
 		ASSERTCMP(object->fscache.n_ops, ==, 0);
 		ASSERTCMP(object->fscache.n_children, ==, 0);
@@ -303,17 +296,14 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	if (ni_size == object->i_size)
 		return 0;
 
-	if (!object->backer)
-		return -ENOBUFS;
+	ASSERT(d_is_reg(object->dentry));
 
-	ASSERT(d_is_reg(object->backer));
-
-	oi_size = i_size_read(d_backing_inode(object->backer));
+	oi_size = i_size_read(d_backing_inode(object->dentry));
 	if (oi_size == ni_size)
 		return 0;
 
 	cachefiles_begin_secure(cache, &saved_cred);
-	inode_lock(d_inode(object->backer));
+	inode_lock(d_inode(object->dentry));
 
 	/* if there's an extension to a partial page at the end of the backing
 	 * file, we need to discard the partial page so that we pick up new
@@ -322,17 +312,17 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 		_debug("discard tail %llx", oi_size);
 		newattrs.ia_valid = ATTR_SIZE;
 		newattrs.ia_size = oi_size & PAGE_MASK;
-		ret = notify_change(&init_user_ns, object->backer, &newattrs, NULL);
+		ret = notify_change(&init_user_ns, object->dentry, &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
 	}
 
 	newattrs.ia_valid = ATTR_SIZE;
 	newattrs.ia_size = ni_size;
-	ret = notify_change(&init_user_ns, object->backer, &newattrs, NULL);
+	ret = notify_change(&init_user_ns, object->dentry, &newattrs, NULL);
 
 truncate_failed:
-	inode_unlock(d_inode(object->backer));
+	inode_unlock(d_inode(object->dentry));
 	cachefiles_end_secure(cache, saved_cred);
 
 	if (ret == -EIO) {
@@ -365,10 +355,10 @@ static void cachefiles_invalidate_object(struct fscache_object *_object)
 	_enter("{OBJ%x},[%llu]",
 	       object->fscache.debug_id, (unsigned long long)ni_size);
 
-	if (object->backer) {
-		ASSERT(d_is_reg(object->backer));
+	if (object->dentry) {
+		ASSERT(d_is_reg(object->dentry));
 
-		path.dentry = object->backer;
+		path.dentry = object->dentry;
 		path.mnt = cache->mnt;
 
 		cachefiles_begin_secure(cache, &saved_cred);
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 83911cf24769..9f2f837027e0 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -36,7 +36,6 @@ struct cachefiles_object {
 	struct fscache_object		fscache;	/* fscache handle */
 	char				*d_name;	/* Filename */
 	struct dentry			*dentry;	/* the file/dir representing this object */
-	struct dentry			*backer;	/* backing file */
 	loff_t				i_size;		/* object size */
 	atomic_t			usage;		/* object usage count */
 	uint8_t				type;		/* object type */
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 534c81a05918..920ca48eecfa 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -426,9 +426,9 @@ int cachefiles_begin_operation(struct netfs_cache_resources *cres)
 			     struct cachefiles_cache, cache);
 
 	path.mnt = cache->mnt;
-	path.dentry = object->backer;
+	path.dentry = object->dentry;
 	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
-				   d_inode(object->backer), cache->cache_cred);
+				   d_inode(object->dentry), cache->cache_cred);
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 	if (!S_ISREG(file_inode(file)->i_mode))
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index b5a0aec529af..7f02fcb34b1e 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -459,7 +459,6 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	if (ret < 0)
 		goto check_error;
 
-	object->backer = object->dentry;
 	object->new = false;
 	fscache_obtained_object(&object->fscache);
 	_leave(" = 0 [%lu]", d_backing_inode(object->dentry)->i_ino);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 25/67] cachefiles: Fold fscache_object into cachefiles_object
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (23 preceding siblings ...)
  2021-10-18 14:56 ` [PATCH 24/67] cachefiles: Remove separate backer dentry from cachefiles_object David Howells
@ 2021-10-18 14:57 ` David Howells
  2021-10-18 14:57 ` [PATCH 26/67] cachefiles: Change to storing file* rather than dentry* David Howells
                   ` (45 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:57 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Merge struct fscache_object into cachefiles_object so that it can be got
rid of.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c              |    6 +-
 fs/cachefiles/interface.c         |  116 +++++++++++++++----------------------
 fs/cachefiles/internal.h          |   19 +-----
 fs/cachefiles/key.c               |    8 +--
 fs/cachefiles/namei.c             |   21 +++----
 fs/cachefiles/xattr.c             |   16 +++--
 fs/fscache/cache.c                |   10 ++-
 fs/fscache/cookie.c               |   20 +++---
 fs/fscache/internal.h             |    4 +
 fs/fscache/object.c               |  110 ++++++++++++++++++-----------------
 include/linux/fscache-cache.h     |   71 +++++++++++++----------
 include/trace/events/cachefiles.h |   14 ++--
 include/trace/events/fscache.h    |    2 -
 13 files changed, 194 insertions(+), 223 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index cb3296814056..8db1ef2d1196 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -193,7 +193,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	}
 
 	fsdef->dentry = cachedir;
-	fsdef->fscache.cookie = NULL;
+	fsdef->cookie = NULL;
 
 	/* get the graveyard directory */
 	graveyard = cachefiles_get_directory(cache, root, "graveyard", NULL);
@@ -210,10 +210,10 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 			   "%s",
 			   fsdef->dentry->d_sb->s_id);
 
-	fscache_object_init(&fsdef->fscache, &fscache_fsdef_index,
+	fscache_object_init(fsdef, &fscache_fsdef_index,
 			    &cache->cache);
 
-	ret = fscache_add_cache(&cache->cache, &fsdef->fscache, cache->tag);
+	ret = fscache_add_cache(&cache->cache, fsdef, cache->tag);
 	if (ret < 0)
 		goto error_add_cache;
 
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 92bb3ba78c41..7ec2302a3214 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -15,7 +15,7 @@ static int cachefiles_attr_changed(struct cachefiles_object *object);
 /*
  * allocate an object record for a cookie lookup and prepare the lookup data
  */
-static struct fscache_object *cachefiles_alloc_object(
+static struct cachefiles_object *cachefiles_alloc_object(
 	struct fscache_cache *_cache,
 	struct fscache_cookie *cookie)
 {
@@ -33,7 +33,7 @@ static struct fscache_object *cachefiles_alloc_object(
 
 	atomic_set(&object->usage, 1);
 
-	fscache_object_init(&object->fscache, cookie, &cache->cache);
+	fscache_object_init(object, cookie, &cache->cache);
 
 	object->type = cookie->type;
 
@@ -41,8 +41,8 @@ static struct fscache_object *cachefiles_alloc_object(
 	if (!cachefiles_cook_key(object))
 		goto nomem_key;
 
-	_leave(" = %x [%s]", object->fscache.debug_id, object->d_name);
-	return &object->fscache;
+	_leave(" = %x [%s]", object->debug_id, object->d_name);
+	return object;
 
 nomem_key:
 	kmem_cache_free(cachefiles_object_jar, object);
@@ -56,19 +56,17 @@ static struct fscache_object *cachefiles_alloc_object(
  * attempt to look up the nominated node in this cache
  * - return -ETIMEDOUT to be scheduled again
  */
-static int cachefiles_lookup_object(struct fscache_object *_object)
+static int cachefiles_lookup_object(struct cachefiles_object *object)
 {
-	struct cachefiles_object *parent, *object;
+	struct cachefiles_object *parent;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 	int ret;
 
-	_enter("{OBJ%x}", _object->debug_id);
+	_enter("{OBJ%x}", object->debug_id);
 
-	cache = container_of(_object->cache, struct cachefiles_cache, cache);
-	parent = container_of(_object->parent,
-			      struct cachefiles_object, fscache);
-	object = container_of(_object, struct cachefiles_object, fscache);
+	cache = container_of(object->cache, struct cachefiles_cache, cache);
+	parent = object->parent;
 
 	ASSERT(object->d_name);
 
@@ -79,13 +77,13 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
 
 	/* polish off by setting the attributes of non-index files */
 	if (ret == 0 &&
-	    object->fscache.cookie->type != FSCACHE_COOKIE_TYPE_INDEX)
+	    object->cookie->type != FSCACHE_COOKIE_TYPE_INDEX)
 		cachefiles_attr_changed(object);
 
 	if (ret < 0 && ret != -ETIMEDOUT) {
 		if (ret != -ENOBUFS)
 			pr_warn("Lookup failed error %d\n", ret);
-		fscache_object_lookup_error(&object->fscache);
+		fscache_object_lookup_error(object);
 	}
 
 	_leave(" [%d]", ret);
@@ -95,52 +93,43 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
 /*
  * indication of lookup completion
  */
-static void cachefiles_lookup_complete(struct fscache_object *_object)
+static void cachefiles_lookup_complete(struct cachefiles_object *object)
 {
-	struct cachefiles_object *object;
-
-	object = container_of(_object, struct cachefiles_object, fscache);
-
-	_enter("{OBJ%x}", object->fscache.debug_id);
+	_enter("{OBJ%x}", object->debug_id);
 }
 
 /*
  * increment the usage count on an inode object (may fail if unmounting)
  */
 static
-struct fscache_object *cachefiles_grab_object(struct fscache_object *_object,
-					      enum fscache_obj_ref_trace why)
+struct cachefiles_object *cachefiles_grab_object(struct cachefiles_object *object,
+						 enum fscache_obj_ref_trace why)
 {
-	struct cachefiles_object *object =
-		container_of(_object, struct cachefiles_object, fscache);
 	int u;
 
-	_enter("{OBJ%x,%d}", _object->debug_id, atomic_read(&object->usage));
+	_enter("{OBJ%x,%d}", object->debug_id, atomic_read(&object->usage));
 
 #ifdef CACHEFILES_DEBUG_SLAB
 	ASSERT((atomic_read(&object->usage) & 0xffff0000) != 0x6b6b0000);
 #endif
 
 	u = atomic_inc_return(&object->usage);
-	trace_cachefiles_ref(object, _object->cookie,
+	trace_cachefiles_ref(object, object->cookie,
 			     (enum cachefiles_obj_ref_trace)why, u);
-	return &object->fscache;
+	return object;
 }
 
 /*
  * update the auxiliary data for an object object on disk
  */
-static void cachefiles_update_object(struct fscache_object *_object)
+static void cachefiles_update_object(struct cachefiles_object *object)
 {
-	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 
-	_enter("{OBJ%x}", _object->debug_id);
+	_enter("{OBJ%x}", object->debug_id);
 
-	object = container_of(_object, struct cachefiles_object, fscache);
-	cache = container_of(object->fscache.cache, struct cachefiles_cache,
-			     cache);
+	cache = container_of(object->cache, struct cachefiles_cache, cache);
 
 	cachefiles_begin_secure(cache, &saved_cred);
 	cachefiles_set_object_xattr(object, XATTR_REPLACE);
@@ -152,21 +141,16 @@ static void cachefiles_update_object(struct fscache_object *_object)
  * discard the resources pinned by an object and effect retirement if
  * requested
  */
-static void cachefiles_drop_object(struct fscache_object *_object)
+static void cachefiles_drop_object(struct cachefiles_object *object)
 {
-	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 
-	ASSERT(_object);
-
-	object = container_of(_object, struct cachefiles_object, fscache);
+	ASSERT(object);
 
-	_enter("{OBJ%x,%d}",
-	       object->fscache.debug_id, atomic_read(&object->usage));
+	_enter("{OBJ%x,%d}", object->debug_id, atomic_read(&object->usage));
 
-	cache = container_of(object->fscache.cache,
-			     struct cachefiles_cache, cache);
+	cache = container_of(object->cache, struct cachefiles_cache, cache);
 
 #ifdef CACHEFILES_DEBUG_SLAB
 	ASSERT((atomic_read(&object->usage) & 0xffff0000) != 0x6b6b0000);
@@ -179,10 +163,10 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 	 */
 	if (object->dentry) {
 		/* delete retired objects */
-		if (test_bit(FSCACHE_OBJECT_RETIRED, &object->fscache.flags) &&
-		    _object != cache->cache.fsdef
+		if (test_bit(FSCACHE_OBJECT_RETIRED, &object->flags) &&
+		    object != cache->cache.fsdef
 		    ) {
-			_debug("- retire object OBJ%x", object->fscache.debug_id);
+			_debug("- retire object OBJ%x", object->debug_id);
 			cachefiles_begin_secure(cache, &saved_cred);
 			cachefiles_delete_object(cache, object);
 			cachefiles_end_secure(cache, saved_cred);
@@ -200,43 +184,40 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 /*
  * dispose of a reference to an object
  */
-void cachefiles_put_object(struct fscache_object *_object,
+void cachefiles_put_object(struct cachefiles_object *object,
 			   enum fscache_obj_ref_trace why)
 {
-	struct cachefiles_object *object;
 	struct fscache_cache *cache;
 	int u;
 
-	ASSERT(_object);
-
-	object = container_of(_object, struct cachefiles_object, fscache);
+	ASSERT(object);
 
 	_enter("{OBJ%x,%d}",
-	       object->fscache.debug_id, atomic_read(&object->usage));
+	       object->debug_id, atomic_read(&object->usage));
 
 #ifdef CACHEFILES_DEBUG_SLAB
 	ASSERT((atomic_read(&object->usage) & 0xffff0000) != 0x6b6b0000);
 #endif
 
-	ASSERTIFCMP(object->fscache.parent,
-		    object->fscache.parent->n_children, >, 0);
+	ASSERTIFCMP(object->parent,
+		    object->parent->n_children, >, 0);
 
 	u = atomic_dec_return(&object->usage);
-	trace_cachefiles_ref(object, _object->cookie,
+	trace_cachefiles_ref(object, object->cookie,
 			     (enum cachefiles_obj_ref_trace)why, u);
 	ASSERTCMP(u, !=, -1);
 	if (u == 0) {
-		_debug("- kill object OBJ%x", object->fscache.debug_id);
+		_debug("- kill object OBJ%x", object->debug_id);
 
-		ASSERTCMP(object->fscache.parent, ==, NULL);
+		ASSERTCMP(object->parent, ==, NULL);
 		ASSERTCMP(object->dentry, ==, NULL);
-		ASSERTCMP(object->fscache.n_ops, ==, 0);
-		ASSERTCMP(object->fscache.n_children, ==, 0);
+		ASSERTCMP(object->n_ops, ==, 0);
+		ASSERTCMP(object->n_children, ==, 0);
 
 		kfree(object->d_name);
 
-		cache = object->fscache.cache;
-		fscache_object_destroy(&object->fscache);
+		cache = object->cache;
+		fscache_object_destroy(object);
 		kmem_cache_free(cachefiles_object_jar, object);
 		fscache_object_destroyed(cache);
 	}
@@ -285,12 +266,12 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	loff_t oi_size;
 	int ret;
 
-	ni_size = object->fscache.cookie->object_size;
+	ni_size = object->cookie->object_size;
 
 	_enter("{OBJ%x},[%llu]",
-	       object->fscache.debug_id, (unsigned long long) ni_size);
+	       object->debug_id, (unsigned long long) ni_size);
 
-	cache = container_of(object->fscache.cache,
+	cache = container_of(object->cache,
 			     struct cachefiles_cache, cache);
 
 	if (ni_size == object->i_size)
@@ -337,23 +318,20 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 /*
  * Invalidate an object
  */
-static void cachefiles_invalidate_object(struct fscache_object *_object)
+static void cachefiles_invalidate_object(struct cachefiles_object *object)
 {
-	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 	struct path path;
 	uint64_t ni_size;
 	int ret;
 
-	object = container_of(_object, struct cachefiles_object, fscache);
-	cache = container_of(object->fscache.cache,
-			     struct cachefiles_cache, cache);
+	cache = container_of(object->cache, struct cachefiles_cache, cache);
 
-	ni_size = object->fscache.cookie->object_size;
+	ni_size = object->cookie->object_size;
 
 	_enter("{OBJ%x},[%llu]",
-	       object->fscache.debug_id, (unsigned long long)ni_size);
+	       object->debug_id, (unsigned long long)ni_size);
 
 	if (object->dentry) {
 		ASSERT(d_is_reg(object->dentry));
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 9f2f837027e0..f268495ecbc6 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -29,21 +29,6 @@ extern unsigned cachefiles_debug;
 
 #define cachefiles_gfp (__GFP_RECLAIM | __GFP_NORETRY | __GFP_NOMEMALLOC)
 
-/*
- * node records
- */
-struct cachefiles_object {
-	struct fscache_object		fscache;	/* fscache handle */
-	char				*d_name;	/* Filename */
-	struct dentry			*dentry;	/* the file/dir representing this object */
-	loff_t				i_size;		/* object size */
-	atomic_t			usage;		/* object usage count */
-	uint8_t				type;		/* object type */
-	bool				new;		/* T if object new */
-	u8				d_name_len;	/* Length of filename */
-	u8				key_hash;
-};
-
 extern struct kmem_cache *cachefiles_object_jar;
 
 /*
@@ -114,7 +99,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
  */
 extern const struct fscache_cache_ops cachefiles_cache_ops;
 
-void cachefiles_put_object(struct fscache_object *_object,
+void cachefiles_put_object(struct cachefiles_object *_object,
 			   enum fscache_obj_ref_trace why);
 
 /*
@@ -191,7 +176,7 @@ do {							\
 do {									\
 	struct cachefiles_cache *___cache;				\
 									\
-	___cache = container_of((object)->fscache.cache,		\
+	___cache = container_of((object)->cache,			\
 				struct cachefiles_cache, cache);	\
 	cachefiles_io_error(___cache, FMT, ##__VA_ARGS__);		\
 } while (0)
diff --git a/fs/cachefiles/key.c b/fs/cachefiles/key.c
index 56a9b3201b41..ccadbc4776f1 100644
--- a/fs/cachefiles/key.c
+++ b/fs/cachefiles/key.c
@@ -31,8 +31,8 @@ static const char cachefiles_filecharmap[256] = {
  */
 bool cachefiles_cook_key(struct cachefiles_object *object)
 {
-	const u8 *key = fscache_get_key(object->fscache.cookie);
-	unsigned int acc, sum, keylen = object->fscache.cookie->key_len;
+	const u8 *key = fscache_get_key(object->cookie);
+	unsigned int acc, sum, keylen = object->cookie->key_len;
 	char *name;
 	u8 *buffer, *p;
 	int i, len, elem3, print;
@@ -57,7 +57,7 @@ bool cachefiles_cook_key(struct cachefiles_object *object)
 		if (!name)
 			return false;
 
-		switch (object->fscache.cookie->type) {
+		switch (object->cookie->type) {
 		case FSCACHE_COOKIE_TYPE_INDEX:		type = 'I';	break;
 		case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'D';	break;
 		default:				type = 'S';	break;
@@ -97,7 +97,7 @@ bool cachefiles_cook_key(struct cachefiles_object *object)
 		return false;
 	}
 
-	switch (object->fscache.cookie->type) {
+	switch (object->cookie->type) {
 	case FSCACHE_COOKIE_TYPE_INDEX:		type = 'J';	break;
 	case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'E';	break;
 	default:				type = 'T';	break;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7f02fcb34b1e..d3903cbb7de3 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -29,7 +29,7 @@ static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
 	struct inode *inode = d_backing_inode(dentry);
 	bool can_use = false;
 
-	_enter(",%x", object->fscache.debug_id);
+	_enter(",%x", object->debug_id);
 
 	inode_lock(inode);
 
@@ -235,7 +235,7 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
 	struct dentry *dir;
 	int ret;
 
-	_enter(",OBJ%x{%pd}", object->fscache.debug_id, object->dentry);
+	_enter(",OBJ%x{%pd}", object->debug_id, object->dentry);
 
 	ASSERT(object->dentry);
 	ASSERT(d_backing_inode(object->dentry));
@@ -350,11 +350,11 @@ static int cachefiles_walk_to_file(struct cachefiles_cache *cache,
 		/* This element of the path doesn't exist, so we can release
 		 * any readers in the certain knowledge that there's nothing
 		 * for them to actually read */
-		fscache_object_lookup_negative(&object->fscache);
+		fscache_object_lookup_negative(object);
 
 		ret = cachefiles_has_space(cache, 1, 0);
 		if (ret < 0) {
-			fscache_object_mark_killed(&object->fscache, FSCACHE_OBJECT_NO_SPACE);
+			fscache_object_mark_killed(object, FSCACHE_OBJECT_NO_SPACE);
 			goto error_dput;
 		}
 
@@ -423,11 +423,10 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	int ret;
 
 	_enter("OBJ%x{%pd},OBJ%x,%s,",
-	       parent->fscache.debug_id, parent->dentry,
-	       object->fscache.debug_id, object->d_name);
+	       parent->debug_id, parent->dentry,
+	       object->debug_id, object->d_name);
 
-	cache = container_of(parent->fscache.cache,
-			     struct cachefiles_cache, cache);
+	cache = container_of(parent->cache, struct cachefiles_cache, cache);
 	ASSERT(parent->dentry);
 	ASSERT(d_backing_inode(parent->dentry));
 
@@ -460,7 +459,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 		goto check_error;
 
 	object->new = false;
-	fscache_obtained_object(&object->fscache);
+	fscache_obtained_object(object);
 	_leave(" = 0 [%lu]", d_backing_inode(object->dentry)->i_ino);
 	return 0;
 
@@ -468,7 +467,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	if (ret == -ESTALE) {
 		dput(object->dentry);
 		object->dentry = NULL;
-		fscache_object_retrying_stale(&object->fscache);
+		fscache_object_retrying_stale(object);
 		goto lookup_again;
 	}
 	if (ret == -EIO)
@@ -514,7 +513,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		 * any readers in the certain knowledge that there's nothing
 		 * for them to actually read */
 		if (object)
-			fscache_object_lookup_negative(&object->fscache);
+			fscache_object_lookup_negative(object);
 
 		ret = cachefiles_has_space(cache, 1, 0);
 		if (ret < 0)
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 8b9f30f9ce21..464dea0b467c 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -31,23 +31,23 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 {
 	struct cachefiles_xattr *buf;
 	struct dentry *dentry = object->dentry;
-	unsigned int len = object->fscache.cookie->aux_len;
+	unsigned int len = object->cookie->aux_len;
 	int ret;
 
 	if (!dentry)
 		return -ESTALE;
 
-	_enter("%x,#%d", object->fscache.debug_id, len);
+	_enter("%x,#%d", object->debug_id, len);
 
 	buf = kmalloc(sizeof(struct cachefiles_xattr) + len, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
-	buf->type = object->fscache.cookie->type;
+	buf->type = object->cookie->type;
 	if (len > 0)
-		memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len);
+		memcpy(buf->data, fscache_get_aux(object->cookie), len);
 
-	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->fscache.cookie->flags);
+	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->cookie->flags);
 	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
 			   buf, sizeof(struct cachefiles_xattr) + len,
 			   xattr_flags);
@@ -68,8 +68,8 @@ int cachefiles_check_auxdata(struct cachefiles_object *object)
 {
 	struct cachefiles_xattr *buf;
 	struct dentry *dentry = object->dentry;
-	unsigned int len = object->fscache.cookie->aux_len, tlen;
-	const void *p = fscache_get_aux(object->fscache.cookie);
+	unsigned int len = object->cookie->aux_len, tlen;
+	const void *p = fscache_get_aux(object->cookie);
 	ssize_t ret;
 
 	ASSERT(dentry);
@@ -82,7 +82,7 @@ int cachefiles_check_auxdata(struct cachefiles_object *object)
 
 	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
 	if (ret == tlen &&
-	    buf->type == object->fscache.cookie->type &&
+	    buf->type == object->cookie->type &&
 	    memcmp(buf->data, p, len) == 0)
 		ret = 0;
 	else
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index efcd834d6279..8a3191a89c32 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -93,7 +93,7 @@ struct fscache_cache *fscache_select_cache_for_object(
 	struct fscache_cookie *cookie)
 {
 	struct fscache_cache_tag *tag;
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 	struct fscache_cache *cache;
 
 	_enter("");
@@ -110,7 +110,7 @@ struct fscache_cache *fscache_select_cache_for_object(
 	 * cache */
 	if (!hlist_empty(&cookie->backing_objects)) {
 		object = hlist_entry(cookie->backing_objects.first,
-				     struct fscache_object, cookie_link);
+				     struct cachefiles_object, cookie_link);
 
 		cache = object->cache;
 		if (fscache_object_is_dying(object) ||
@@ -200,7 +200,7 @@ EXPORT_SYMBOL(fscache_init_cache);
  * description.
  */
 int fscache_add_cache(struct fscache_cache *cache,
-		      struct fscache_object *ifsdef,
+		      struct cachefiles_object *ifsdef,
 		      const char *tagname)
 {
 	struct fscache_cache_tag *tag;
@@ -313,14 +313,14 @@ EXPORT_SYMBOL(fscache_io_error);
 static void fscache_withdraw_all_objects(struct fscache_cache *cache,
 					 struct list_head *dying_objects)
 {
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 
 	while (!list_empty(&cache->object_list)) {
 		spin_lock(&cache->object_list_lock);
 
 		if (!list_empty(&cache->object_list)) {
 			object = list_entry(cache->object_list.next,
-					    struct fscache_object, cache_link);
+					    struct cachefiles_object, cache_link);
 			list_move_tail(&object->cache_link, dying_objects);
 
 			_debug("withdraw %x", object->cookie->debug_id);
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 0dc27f82e910..3f7bb2eecdc3 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -26,11 +26,11 @@ static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie);
 static int fscache_alloc_object(struct fscache_cache *cache,
 				struct fscache_cookie *cookie);
 static int fscache_attach_object(struct fscache_cookie *cookie,
-				 struct fscache_object *object);
+				 struct cachefiles_object *object);
 
 static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 {
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 	struct hlist_node *o;
 	const u8 *k;
 	unsigned loop;
@@ -48,7 +48,7 @@ static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 
 	o = READ_ONCE(cookie->backing_objects.first);
 	if (o) {
-		object = hlist_entry(o, struct fscache_object, cookie_link);
+		object = hlist_entry(o, struct cachefiles_object, cookie_link);
 		pr_err("%c-cookie o=%u\n", prefix, object->debug_id);
 	}
 
@@ -396,7 +396,7 @@ EXPORT_SYMBOL(__fscache_enable_cookie);
  */
 static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
 {
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 	struct fscache_cache *cache;
 	int ret;
 
@@ -443,7 +443,7 @@ static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
 	}
 
 	object = hlist_entry(cookie->backing_objects.first,
-			     struct fscache_object, cookie_link);
+			     struct cachefiles_object, cookie_link);
 
 	/* initiate the process of looking up all the objects in the chain
 	 * (done by fscache_initialise_object()) */
@@ -476,7 +476,7 @@ static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
 static int fscache_alloc_object(struct fscache_cache *cache,
 				struct fscache_cookie *cookie)
 {
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 	int ret;
 
 	_enter("%s,%x{%s}", cache->tag->name, cookie->debug_id, cookie->type_name);
@@ -548,9 +548,9 @@ static int fscache_alloc_object(struct fscache_cache *cache,
  * attach a cache object to a cookie
  */
 static int fscache_attach_object(struct fscache_cookie *cookie,
-				 struct fscache_object *object)
+				 struct cachefiles_object *object)
 {
-	struct fscache_object *p;
+	struct cachefiles_object *p;
 	struct fscache_cache *cache = object->cache;
 	int ret;
 
@@ -648,7 +648,7 @@ EXPORT_SYMBOL(__fscache_wait_on_invalidate);
  */
 void __fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data)
 {
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 
 	fscache_stat(&fscache_n_updates);
 
@@ -686,7 +686,7 @@ void __fscache_disable_cookie(struct fscache_cookie *cookie,
 			      const void *aux_data,
 			      bool invalidate)
 {
-	struct fscache_object *object;
+	struct cachefiles_object *object;
 	bool awaken = false;
 
 	_enter("%x,%u", cookie->debug_id, invalidate);
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index e78ca3151e41..eefcb6dfee3e 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -106,7 +106,7 @@ static inline bool fscache_object_congested(void)
 /*
  * object.c
  */
-extern void fscache_enqueue_object(struct fscache_object *);
+extern void fscache_enqueue_object(struct cachefiles_object *);
 
 /*
  * proc.c
@@ -227,7 +227,7 @@ int fscache_stats_show(struct seq_file *m, void *v);
  * - if the event is not masked for that object, then the object is
  *   queued for attention by the thread pool.
  */
-static inline void fscache_raise_event(struct fscache_object *object,
+static inline void fscache_raise_event(struct cachefiles_object *object,
 				       unsigned event)
 {
 	BUG_ON(event >= NR_FSCACHE_OBJECT_EVENTS);
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 761d6dc4aa0f..e653d0194f71 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -14,19 +14,19 @@
 #include <linux/prefetch.h>
 #include "internal.h"
 
-static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *, int);
-static const struct fscache_state *fscache_kill_dependents(struct fscache_object *, int);
-static const struct fscache_state *fscache_drop_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_initialise_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_invalidate_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_jumpstart_dependents(struct fscache_object *, int);
-static const struct fscache_state *fscache_kill_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_lookup_failure(struct fscache_object *, int);
-static const struct fscache_state *fscache_look_up_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_object_available(struct fscache_object *, int);
-static const struct fscache_state *fscache_parent_ready(struct fscache_object *, int);
-static const struct fscache_state *fscache_update_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_object_dead(struct fscache_object *, int);
+static const struct fscache_state *fscache_abort_initialisation(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_kill_dependents(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_drop_object(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_initialise_object(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_invalidate_object(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_jumpstart_dependents(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_kill_object(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_lookup_failure(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_look_up_object(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_object_available(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_parent_ready(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_update_object(struct cachefiles_object *, int);
+static const struct fscache_state *fscache_object_dead(struct cachefiles_object *, int);
 
 #define __STATE_NAME(n) fscache_osm_##n
 #define STATE(n) (&__STATE_NAME(n))
@@ -133,21 +133,21 @@ static const struct fscache_transition fscache_osm_run_oob[] = {
 	   { 0, NULL }
 };
 
-static int  fscache_get_object(struct fscache_object *,
+static int  fscache_get_object(struct cachefiles_object *,
 			       enum fscache_obj_ref_trace);
-static void fscache_put_object(struct fscache_object *,
+static void fscache_put_object(struct cachefiles_object *,
 			       enum fscache_obj_ref_trace);
-static bool fscache_enqueue_dependents(struct fscache_object *, int);
-static void fscache_dequeue_object(struct fscache_object *);
-static void fscache_update_aux_data(struct fscache_object *);
+static bool fscache_enqueue_dependents(struct cachefiles_object *, int);
+static void fscache_dequeue_object(struct cachefiles_object *);
+static void fscache_update_aux_data(struct cachefiles_object *);
 
 /*
  * we need to notify the parent when an op completes that we had outstanding
  * upon it
  */
-static inline void fscache_done_parent_op(struct fscache_object *object)
+static inline void fscache_done_parent_op(struct cachefiles_object *object)
 {
-	struct fscache_object *parent = object->parent;
+	struct cachefiles_object *parent = object->parent;
 
 	_enter("OBJ%x {OBJ%x,%x}",
 	       object->debug_id, parent->debug_id, parent->n_ops);
@@ -163,7 +163,7 @@ static inline void fscache_done_parent_op(struct fscache_object *object)
 /*
  * Object state machine dispatcher.
  */
-static void fscache_object_sm_dispatcher(struct fscache_object *object)
+static void fscache_object_sm_dispatcher(struct cachefiles_object *object)
 {
 	const struct fscache_transition *t;
 	const struct fscache_state *state, *new_state;
@@ -274,8 +274,8 @@ static void fscache_object_sm_dispatcher(struct fscache_object *object)
  */
 static void fscache_object_work_func(struct work_struct *work)
 {
-	struct fscache_object *object =
-		container_of(work, struct fscache_object, work);
+	struct cachefiles_object *object =
+		container_of(work, struct cachefiles_object, work);
 
 	_enter("{OBJ%x}", object->debug_id);
 
@@ -294,7 +294,7 @@ static void fscache_object_work_func(struct work_struct *work)
  * See Documentation/filesystems/caching/backend-api.rst for a complete
  * description.
  */
-void fscache_object_init(struct fscache_object *object,
+void fscache_object_init(struct cachefiles_object *object,
 			 struct fscache_cookie *cookie,
 			 struct fscache_cache *cache)
 {
@@ -335,7 +335,7 @@ EXPORT_SYMBOL(fscache_object_init);
  * Mark the object as no longer being live, making sure that we synchronise
  * against op submission.
  */
-static inline void fscache_mark_object_dead(struct fscache_object *object)
+static inline void fscache_mark_object_dead(struct cachefiles_object *object)
 {
 	spin_lock(&object->lock);
 	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
@@ -345,7 +345,7 @@ static inline void fscache_mark_object_dead(struct fscache_object *object)
 /*
  * Abort object initialisation before we start it.
  */
-static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *object,
+static const struct fscache_state *fscache_abort_initialisation(struct cachefiles_object *object,
 								int event)
 {
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -362,10 +362,10 @@ static const struct fscache_state *fscache_abort_initialisation(struct fscache_o
  * - we may need to start the process of creating a parent and we need to wait
  *   for the parent's lookup and creation to complete if it's not there yet
  */
-static const struct fscache_state *fscache_initialise_object(struct fscache_object *object,
+static const struct fscache_state *fscache_initialise_object(struct cachefiles_object *object,
 							     int event)
 {
-	struct fscache_object *parent;
+	struct cachefiles_object *parent;
 	bool success;
 
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -417,10 +417,10 @@ static const struct fscache_state *fscache_initialise_object(struct fscache_obje
 /*
  * Once the parent object is ready, we should kick off our lookup op.
  */
-static const struct fscache_state *fscache_parent_ready(struct fscache_object *object,
+static const struct fscache_state *fscache_parent_ready(struct cachefiles_object *object,
 							int event)
 {
-	struct fscache_object *parent = object->parent;
+	struct cachefiles_object *parent = object->parent;
 
 	_enter("{OBJ%x},%d", object->debug_id, event);
 
@@ -440,11 +440,11 @@ static const struct fscache_state *fscache_parent_ready(struct fscache_object *o
  * - we hold an "access lock" on the parent object, so the parent object cannot
  *   be withdrawn by either party till we've finished
  */
-static const struct fscache_state *fscache_look_up_object(struct fscache_object *object,
+static const struct fscache_state *fscache_look_up_object(struct cachefiles_object *object,
 							  int event)
 {
 	struct fscache_cookie *cookie = object->cookie;
-	struct fscache_object *parent = object->parent;
+	struct cachefiles_object *parent = object->parent;
 	int ret;
 
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -499,7 +499,7 @@ static const struct fscache_state *fscache_look_up_object(struct fscache_object
  * Note negative lookup, permitting those waiting to read data from an already
  * existing backing object to continue as there's no data for them to read.
  */
-void fscache_object_lookup_negative(struct fscache_object *object)
+void fscache_object_lookup_negative(struct cachefiles_object *object)
 {
 	struct fscache_cookie *cookie = object->cookie;
 
@@ -531,7 +531,7 @@ EXPORT_SYMBOL(fscache_object_lookup_negative);
  * Note that after calling this, an object's cookie may be relinquished by the
  * netfs, and so must be accessed with object lock held.
  */
-void fscache_obtained_object(struct fscache_object *object)
+void fscache_obtained_object(struct cachefiles_object *object)
 {
 	struct fscache_cookie *cookie = object->cookie;
 
@@ -563,7 +563,7 @@ EXPORT_SYMBOL(fscache_obtained_object);
 /*
  * handle an object that has just become available
  */
-static const struct fscache_state *fscache_object_available(struct fscache_object *object,
+static const struct fscache_state *fscache_object_available(struct cachefiles_object *object,
 							    int event)
 {
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -588,7 +588,7 @@ static const struct fscache_state *fscache_object_available(struct fscache_objec
 /*
  * Wake up this object's dependent objects now that we've become available.
  */
-static const struct fscache_state *fscache_jumpstart_dependents(struct fscache_object *object,
+static const struct fscache_state *fscache_jumpstart_dependents(struct cachefiles_object *object,
 								int event)
 {
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -601,7 +601,7 @@ static const struct fscache_state *fscache_jumpstart_dependents(struct fscache_o
 /*
  * Handle lookup or creation failute.
  */
-static const struct fscache_state *fscache_lookup_failure(struct fscache_object *object,
+static const struct fscache_state *fscache_lookup_failure(struct cachefiles_object *object,
 							  int event)
 {
 	struct fscache_cookie *cookie;
@@ -629,7 +629,7 @@ static const struct fscache_state *fscache_lookup_failure(struct fscache_object
  * Wait for completion of all active operations on this object and the death of
  * all child objects of this object.
  */
-static const struct fscache_state *fscache_kill_object(struct fscache_object *object,
+static const struct fscache_state *fscache_kill_object(struct cachefiles_object *object,
 						       int event)
 {
 	_enter("{OBJ%x,%d,%d},%d",
@@ -652,7 +652,7 @@ static const struct fscache_state *fscache_kill_object(struct fscache_object *ob
 /*
  * Kill dependent objects.
  */
-static const struct fscache_state *fscache_kill_dependents(struct fscache_object *object,
+static const struct fscache_state *fscache_kill_dependents(struct cachefiles_object *object,
 							   int event)
 {
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -665,10 +665,10 @@ static const struct fscache_state *fscache_kill_dependents(struct fscache_object
 /*
  * Drop an object's attachments
  */
-static const struct fscache_state *fscache_drop_object(struct fscache_object *object,
+static const struct fscache_state *fscache_drop_object(struct cachefiles_object *object,
 						       int event)
 {
-	struct fscache_object *parent = object->parent;
+	struct cachefiles_object *parent = object->parent;
 	struct fscache_cookie *cookie = object->cookie;
 	struct fscache_cache *cache = object->cache;
 	bool awaken = false;
@@ -738,7 +738,7 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob
 /*
  * get a ref on an object
  */
-static int fscache_get_object(struct fscache_object *object,
+static int fscache_get_object(struct cachefiles_object *object,
 			      enum fscache_obj_ref_trace why)
 {
 	int ret;
@@ -752,7 +752,7 @@ static int fscache_get_object(struct fscache_object *object,
 /*
  * Discard a ref on an object
  */
-static void fscache_put_object(struct fscache_object *object,
+static void fscache_put_object(struct cachefiles_object *object,
 			       enum fscache_obj_ref_trace why)
 {
 	fscache_stat(&fscache_n_cop_put_object);
@@ -766,7 +766,7 @@ static void fscache_put_object(struct fscache_object *object,
  *
  * Note the imminent destruction and deallocation of a cache object record.
  */
-void fscache_object_destroy(struct fscache_object *object)
+void fscache_object_destroy(struct cachefiles_object *object)
 {
 	/* We can get rid of the cookie now */
 	fscache_put_cookie(object->cookie, fscache_cookie_put_object);
@@ -777,7 +777,7 @@ EXPORT_SYMBOL(fscache_object_destroy);
 /*
  * enqueue an object for metadata-type processing
  */
-void fscache_enqueue_object(struct fscache_object *object)
+void fscache_enqueue_object(struct cachefiles_object *object)
 {
 	_enter("{OBJ%x}", object->debug_id);
 
@@ -831,9 +831,9 @@ EXPORT_SYMBOL_GPL(fscache_object_sleep_till_congested);
  * again then return false immediately.  We return true if the list was
  * cleared.
  */
-static bool fscache_enqueue_dependents(struct fscache_object *object, int event)
+static bool fscache_enqueue_dependents(struct cachefiles_object *object, int event)
 {
-	struct fscache_object *dep;
+	struct cachefiles_object *dep;
 	bool ret = true;
 
 	_enter("{OBJ%x}", object->debug_id);
@@ -845,7 +845,7 @@ static bool fscache_enqueue_dependents(struct fscache_object *object, int event)
 
 	while (!list_empty(&object->dependents)) {
 		dep = list_entry(object->dependents.next,
-				 struct fscache_object, dep_link);
+				 struct cachefiles_object, dep_link);
 		list_del_init(&dep->dep_link);
 
 		fscache_raise_event(dep, event);
@@ -864,7 +864,7 @@ static bool fscache_enqueue_dependents(struct fscache_object *object, int event)
 /*
  * remove an object from whatever queue it's waiting on
  */
-static void fscache_dequeue_object(struct fscache_object *object)
+static void fscache_dequeue_object(struct cachefiles_object *object)
 {
 	_enter("{OBJ%x}", object->debug_id);
 
@@ -877,7 +877,7 @@ static void fscache_dequeue_object(struct fscache_object *object)
 	_leave("");
 }
 
-static const struct fscache_state *fscache_invalidate_object(struct fscache_object *object,
+static const struct fscache_state *fscache_invalidate_object(struct cachefiles_object *object,
 							     int event)
 {
 	return transit_to(UPDATE_OBJECT);
@@ -886,7 +886,7 @@ static const struct fscache_state *fscache_invalidate_object(struct fscache_obje
 /*
  * Update auxiliary data.
  */
-static void fscache_update_aux_data(struct fscache_object *object)
+static void fscache_update_aux_data(struct cachefiles_object *object)
 {
 	fscache_stat(&fscache_n_updates_run);
 	fscache_stat(&fscache_n_cop_update_object);
@@ -897,7 +897,7 @@ static void fscache_update_aux_data(struct fscache_object *object)
 /*
  * Asynchronously update an object.
  */
-static const struct fscache_state *fscache_update_object(struct fscache_object *object,
+static const struct fscache_state *fscache_update_object(struct cachefiles_object *object,
 							 int event)
 {
 	_enter("{OBJ%x},%d", object->debug_id, event);
@@ -915,7 +915,7 @@ static const struct fscache_state *fscache_update_object(struct fscache_object *
  * Note that an object lookup found an on-disk object that was adjudged to be
  * stale and has been deleted.  The lookup will be retried.
  */
-void fscache_object_retrying_stale(struct fscache_object *object)
+void fscache_object_retrying_stale(struct cachefiles_object *object)
 {
 	fscache_stat(&fscache_n_cache_no_space_reject);
 }
@@ -929,7 +929,7 @@ EXPORT_SYMBOL(fscache_object_retrying_stale);
  * Note that an object was killed.  Returns true if the object was
  * already marked killed, false if it wasn't.
  */
-void fscache_object_mark_killed(struct fscache_object *object,
+void fscache_object_mark_killed(struct cachefiles_object *object,
 				enum fscache_why_object_killed why)
 {
 	if (test_and_set_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->flags)) {
@@ -961,7 +961,7 @@ EXPORT_SYMBOL(fscache_object_mark_killed);
  * already running (and so can be requeued) but hasn't yet cleared the event
  * mask.
  */
-static const struct fscache_state *fscache_object_dead(struct fscache_object *object,
+static const struct fscache_state *fscache_object_dead(struct cachefiles_object *object,
 						       int event)
 {
 	if (!test_and_set_bit(FSCACHE_OBJECT_RUN_AFTER_DEAD,
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 0439dc3021c7..7021c1ec2b64 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -22,7 +22,7 @@
 
 struct fscache_cache;
 struct fscache_cache_ops;
-struct fscache_object;
+struct cachefiles_object;
 enum fscache_cookie_trace;
 
 enum fscache_obj_ref_trace {
@@ -65,7 +65,7 @@ struct fscache_cache {
 	struct list_head	object_list;	/* list of data/index objects */
 	spinlock_t		object_list_lock;
 	atomic_t		object_count;	/* no. of live objects in this cache */
-	struct fscache_object	*fsdef;		/* object for the fsdef index */
+	struct cachefiles_object	*fsdef;		/* object for the fsdef index */
 	unsigned long		flags;
 #define FSCACHE_IOERROR		0	/* cache stopped on I/O error */
 #define FSCACHE_CACHE_WITHDRAWN	1	/* cache has been withdrawn */
@@ -81,46 +81,46 @@ struct fscache_cache_ops {
 	const char *name;
 
 	/* allocate an object record for a cookie */
-	struct fscache_object *(*alloc_object)(struct fscache_cache *cache,
+	struct cachefiles_object *(*alloc_object)(struct fscache_cache *cache,
 					       struct fscache_cookie *cookie);
 
 	/* look up the object for a cookie
 	 * - return -ETIMEDOUT to be requeued
 	 */
-	int (*lookup_object)(struct fscache_object *object);
+	int (*lookup_object)(struct cachefiles_object *object);
 
 	/* finished looking up */
-	void (*lookup_complete)(struct fscache_object *object);
+	void (*lookup_complete)(struct cachefiles_object *object);
 
 	/* increment the usage count on this object (may fail if unmounting) */
-	struct fscache_object *(*grab_object)(struct fscache_object *object,
+	struct cachefiles_object *(*grab_object)(struct cachefiles_object *object,
 					      enum fscache_obj_ref_trace why);
 
 	/* pin an object in the cache */
-	int (*pin_object)(struct fscache_object *object);
+	int (*pin_object)(struct cachefiles_object *object);
 
 	/* unpin an object in the cache */
-	void (*unpin_object)(struct fscache_object *object);
+	void (*unpin_object)(struct cachefiles_object *object);
 
 	/* store the updated auxiliary data on an object */
-	void (*update_object)(struct fscache_object *object);
+	void (*update_object)(struct cachefiles_object *object);
 
 	/* Invalidate an object */
-	void (*invalidate_object)(struct fscache_object *object);
+	void (*invalidate_object)(struct cachefiles_object *object);
 
 	/* discard the resources pinned by an object and effect retirement if
 	 * necessary */
-	void (*drop_object)(struct fscache_object *object);
+	void (*drop_object)(struct cachefiles_object *object);
 
 	/* dispose of a reference to an object */
-	void (*put_object)(struct fscache_object *object,
+	void (*put_object)(struct cachefiles_object *object,
 			   enum fscache_obj_ref_trace why);
 
 	/* sync a cache */
 	void (*sync_cache)(struct fscache_cache *cache);
 
 	/* reserve space for an object's data and associated metadata */
-	int (*reserve_space)(struct fscache_object *object, loff_t i_size);
+	int (*reserve_space)(struct cachefiles_object *object, loff_t i_size);
 
 	/* Begin an operation for the netfs lib */
 	int (*begin_operation)(struct netfs_cache_resources *cres);
@@ -155,7 +155,7 @@ struct fscache_transition {
 struct fscache_state {
 	char name[24];
 	char short_name[8];
-	const struct fscache_state *(*work)(struct fscache_object *object,
+	const struct fscache_state *(*work)(struct cachefiles_object *object,
 					    int event);
 	const struct fscache_transition transitions[];
 };
@@ -163,7 +163,7 @@ struct fscache_state {
 /*
  * on-disk cache file or index handle
  */
-struct fscache_object {
+struct cachefiles_object {
 	const struct fscache_state *state;	/* Object state machine state */
 	const struct fscache_transition *oob_table; /* OOB state transition table */
 	int			debug_id;	/* debugging ID */
@@ -192,40 +192,49 @@ struct fscache_object {
 	struct hlist_node	cookie_link;	/* link in cookie->backing_objects */
 	struct fscache_cache	*cache;		/* cache that supplied this object */
 	struct fscache_cookie	*cookie;	/* netfs's file/index object */
-	struct fscache_object	*parent;	/* parent object */
+	struct cachefiles_object	*parent;	/* parent object */
 	struct work_struct	work;		/* attention scheduling record */
 	struct list_head	dependents;	/* FIFO of dependent objects */
 	struct list_head	dep_link;	/* link in parent's dependents list */
+
+	char				*d_name;	/* Filename */
+	struct dentry			*dentry;	/* the file/dir representing this object */
+	loff_t				i_size;		/* object size */
+	atomic_t			usage;		/* object usage count */
+	uint8_t				type;		/* object type */
+	bool				new;		/* T if object new */
+	u8				d_name_len;	/* Length of filename */
+	u8				key_hash;
 };
 
-extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
+extern void fscache_object_init(struct cachefiles_object *, struct fscache_cookie *,
 				struct fscache_cache *);
-extern void fscache_object_destroy(struct fscache_object *);
+extern void fscache_object_destroy(struct cachefiles_object *);
 
-extern void fscache_object_lookup_negative(struct fscache_object *object);
-extern void fscache_obtained_object(struct fscache_object *object);
+extern void fscache_object_lookup_negative(struct cachefiles_object *object);
+extern void fscache_obtained_object(struct cachefiles_object *object);
 
-static inline bool fscache_object_is_live(struct fscache_object *object)
+static inline bool fscache_object_is_live(struct cachefiles_object *object)
 {
 	return test_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
 }
 
-static inline bool fscache_object_is_dying(struct fscache_object *object)
+static inline bool fscache_object_is_dying(struct cachefiles_object *object)
 {
 	return !fscache_object_is_live(object);
 }
 
-static inline bool fscache_object_is_available(struct fscache_object *object)
+static inline bool fscache_object_is_available(struct cachefiles_object *object)
 {
 	return test_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
 }
 
-static inline bool fscache_cache_is_broken(struct fscache_object *object)
+static inline bool fscache_cache_is_broken(struct cachefiles_object *object)
 {
 	return test_bit(FSCACHE_IOERROR, &object->cache->flags);
 }
 
-static inline bool fscache_object_is_active(struct fscache_object *object)
+static inline bool fscache_object_is_active(struct cachefiles_object *object)
 {
 	return fscache_object_is_available(object) &&
 		fscache_object_is_live(object) &&
@@ -251,7 +260,7 @@ static inline void fscache_object_destroyed(struct fscache_cache *cache)
  * Note that an object encountered a fatal error (usually an I/O error) and
  * that it should be withdrawn as soon as possible.
  */
-static inline void fscache_object_lookup_error(struct fscache_object *object)
+static inline void fscache_object_lookup_error(struct cachefiles_object *object)
 {
 	set_bit(FSCACHE_OBJECT_EV_ERROR, &object->events);
 }
@@ -268,7 +277,7 @@ static inline void __fscache_use_cookie(struct fscache_cookie *cookie)
  * Request usage of the cookie attached to an object.  NULL is returned if the
  * relinquishment had reduced the cookie usage count to 0.
  */
-static inline bool fscache_use_cookie(struct fscache_object *object)
+static inline bool fscache_use_cookie(struct cachefiles_object *object)
 {
 	struct fscache_cookie *cookie = object->cookie;
 	return atomic_inc_not_zero(&cookie->n_active) != 0;
@@ -291,7 +300,7 @@ static inline void __fscache_wake_unused_cookie(struct fscache_cookie *cookie)
  * Cease usage of the cookie attached to an object.  When the users count
  * reaches zero then the cookie relinquishment will be permitted to proceed.
  */
-static inline void fscache_unuse_cookie(struct fscache_object *object)
+static inline void fscache_unuse_cookie(struct cachefiles_object *object)
 {
 	struct fscache_cookie *cookie = object->cookie;
 	if (__fscache_unuse_cookie(cookie))
@@ -307,7 +316,7 @@ void fscache_init_cache(struct fscache_cache *cache,
 			const char *idfmt, ...);
 
 extern int fscache_add_cache(struct fscache_cache *cache,
-			     struct fscache_object *fsdef,
+			     struct cachefiles_object *fsdef,
 			     const char *tagname);
 extern void fscache_withdraw_cache(struct fscache_cache *cache);
 
@@ -315,7 +324,7 @@ extern void fscache_io_error(struct fscache_cache *cache);
 
 extern bool fscache_object_sleep_till_congested(signed long *timeoutp);
 
-extern void fscache_object_retrying_stale(struct fscache_object *object);
+extern void fscache_object_retrying_stale(struct cachefiles_object *object);
 
 enum fscache_why_object_killed {
 	FSCACHE_OBJECT_IS_STALE,
@@ -323,7 +332,7 @@ enum fscache_why_object_killed {
 	FSCACHE_OBJECT_WAS_RETIRED,
 	FSCACHE_OBJECT_WAS_CULLED,
 };
-extern void fscache_object_mark_killed(struct fscache_object *object,
+extern void fscache_object_mark_killed(struct cachefiles_object *object,
 				       enum fscache_why_object_killed why);
 
 extern struct fscache_cookie *fscache_get_cookie(struct fscache_cookie *cookie,
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 87681dd957ec..5aec097d51ae 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -85,7 +85,7 @@ TRACE_EVENT(cachefiles_ref,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
+		    __entry->obj	= obj->debug_id;
 		    __entry->cookie	= cookie->debug_id;
 		    __entry->usage	= usage;
 		    __entry->why	= why;
@@ -109,7 +109,7 @@ TRACE_EVENT(cachefiles_lookup,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
+		    __entry->obj	= obj->debug_id;
 		    __entry->ino	= (!IS_ERR(de) && d_backing_inode(de) ?
 					   d_backing_inode(de)->i_ino : 0);
 		    __entry->error	= IS_ERR(de) ? PTR_ERR(de) : 0;
@@ -132,7 +132,7 @@ TRACE_EVENT(cachefiles_create,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
+		    __entry->obj	= obj->debug_id;
 		    __entry->de		= de;
 		    __entry->ret	= ret;
 			   ),
@@ -156,7 +156,7 @@ TRACE_EVENT(cachefiles_unlink,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj ? obj->fscache.debug_id : UINT_MAX;
+		    __entry->obj	= obj ? obj->debug_id : UINT_MAX;
 		    __entry->de		= de;
 		    __entry->why	= why;
 			   ),
@@ -183,7 +183,7 @@ TRACE_EVENT(cachefiles_rename,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj ? obj->fscache.debug_id : UINT_MAX;
+		    __entry->obj	= obj ? obj->debug_id : UINT_MAX;
 		    __entry->de		= de;
 		    __entry->to		= to;
 		    __entry->why	= why;
@@ -207,7 +207,7 @@ TRACE_EVENT(cachefiles_mark_active,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
+		    __entry->obj	= obj->debug_id;
 		    __entry->de		= de;
 			   ),
 
@@ -230,7 +230,7 @@ TRACE_EVENT(cachefiles_mark_inactive,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj->fscache.debug_id;
+		    __entry->obj	= obj->debug_id;
 		    __entry->de		= de;
 		    __entry->inode	= inode;
 			   ),
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index eac561d3ac51..412f016f6975 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -228,7 +228,7 @@ TRACE_EVENT(fscache_disable,
 	    );
 
 TRACE_EVENT(fscache_osm,
-	    TP_PROTO(struct fscache_object *object,
+	    TP_PROTO(struct cachefiles_object *object,
 		     const struct fscache_state *state,
 		     bool wait, bool oob, s8 event_num),
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 26/67] cachefiles: Change to storing file* rather than dentry*
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (24 preceding siblings ...)
  2021-10-18 14:57 ` [PATCH 25/67] cachefiles: Fold fscache_object into cachefiles_object David Howells
@ 2021-10-18 14:57 ` David Howells
  2021-10-18 14:57 ` [PATCH 27/67] cachefiles: trace: Log coherency checks David Howells
                   ` (44 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:57 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Attempting to open an ext4 file while files are being closed by do_exit()
results in an oops in ext4 because it uses d_path() which expects
current->fs to point somewhere.

This is a problem if I want to open a cache file in order to write out data
to it from the writeback code.

Mitigate this by opening a file right at the beginning and keeping it open,
replacing it with a new file when invalidation occurs.

This has the downside of keeping a bunch of file resources open.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c              |   17 +++-
 fs/cachefiles/interface.c         |   41 +++++-----
 fs/cachefiles/internal.h          |    6 ++
 fs/cachefiles/io.c                |   41 ++--------
 fs/cachefiles/namei.c             |  146 ++++++++++++++++++++++---------------
 fs/cachefiles/xattr.c             |   11 +--
 include/linux/fscache-cache.h     |    2 -
 include/trace/events/cachefiles.h |   25 +++---
 8 files changed, 151 insertions(+), 138 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 8db1ef2d1196..af7386ad14af 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -82,6 +82,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	struct path path;
 	struct kstatfs stats;
 	struct dentry *graveyard, *cachedir, *root;
+	struct file *dirf;
 	const struct cred *saved_cred;
 	int ret;
 
@@ -192,7 +193,13 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 		goto error_unsupported;
 	}
 
-	fsdef->dentry = cachedir;
+	dirf = open_with_fake_path(&path, O_RDONLY | O_DIRECTORY,
+				   d_inode(cachedir), cache->cache_cred);
+	if (IS_ERR(dirf)) {
+		ret = PTR_ERR(dirf);
+		goto error_unsupported;
+	}
+	fsdef->file = dirf;
 	fsdef->cookie = NULL;
 
 	/* get the graveyard directory */
@@ -208,7 +215,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	fscache_init_cache(&cache->cache,
 			   &cachefiles_cache_ops,
 			   "%s",
-			   fsdef->dentry->d_sb->s_id);
+			   graveyard->d_sb->s_id);
 
 	fscache_object_init(fsdef, &fscache_fsdef_index,
 			    &cache->cache);
@@ -234,8 +241,10 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 error_unsupported:
 	mntput(cache->mnt);
 	cache->mnt = NULL;
-	dput(fsdef->dentry);
-	fsdef->dentry = NULL;
+	if (fsdef->file) {
+		fput(fsdef->file);
+		fsdef->file = NULL;
+	}
 	dput(root);
 error_open_root:
 	kmem_cache_free(cachefiles_object_jar, fsdef);
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 7ec2302a3214..4edb3a09932a 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -161,7 +161,7 @@ static void cachefiles_drop_object(struct cachefiles_object *object)
 	 * initialised if the parent goes away or the object gets retired
 	 * before we set it up.
 	 */
-	if (object->dentry) {
+	if (object->file) {
 		/* delete retired objects */
 		if (test_bit(FSCACHE_OBJECT_RETIRED, &object->flags) &&
 		    object != cache->cache.fsdef
@@ -174,8 +174,8 @@ static void cachefiles_drop_object(struct cachefiles_object *object)
 
 		/* close the filesystem stuff attached to the object */
 		cachefiles_unmark_inode_in_use(object);
-		dput(object->dentry);
-		object->dentry = NULL;
+		fput(object->file);
+		object->file = NULL;
 	}
 
 	_leave("");
@@ -210,7 +210,7 @@ void cachefiles_put_object(struct cachefiles_object *object,
 		_debug("- kill object OBJ%x", object->debug_id);
 
 		ASSERTCMP(object->parent, ==, NULL);
-		ASSERTCMP(object->dentry, ==, NULL);
+		ASSERTCMP(object->file, ==, NULL);
 		ASSERTCMP(object->n_ops, ==, 0);
 		ASSERTCMP(object->n_children, ==, 0);
 
@@ -262,6 +262,7 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 	struct iattr newattrs;
+	struct file *file = object->file;
 	uint64_t ni_size;
 	loff_t oi_size;
 	int ret;
@@ -271,20 +272,22 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	_enter("{OBJ%x},[%llu]",
 	       object->debug_id, (unsigned long long) ni_size);
 
-	cache = container_of(object->cache,
-			     struct cachefiles_cache, cache);
+	if (!file)
+		return -ENOBUFS;
+
+	cache = container_of(object->cache, struct cachefiles_cache, cache);
 
 	if (ni_size == object->i_size)
 		return 0;
 
-	ASSERT(d_is_reg(object->dentry));
+	ASSERT(d_is_reg(file->f_path.dentry));
 
-	oi_size = i_size_read(d_backing_inode(object->dentry));
+	oi_size = i_size_read(file_inode(file));
 	if (oi_size == ni_size)
 		return 0;
 
 	cachefiles_begin_secure(cache, &saved_cred);
-	inode_lock(d_inode(object->dentry));
+	inode_lock(file_inode(file));
 
 	/* if there's an extension to a partial page at the end of the backing
 	 * file, we need to discard the partial page so that we pick up new
@@ -293,17 +296,18 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 		_debug("discard tail %llx", oi_size);
 		newattrs.ia_valid = ATTR_SIZE;
 		newattrs.ia_size = oi_size & PAGE_MASK;
-		ret = notify_change(&init_user_ns, object->dentry, &newattrs, NULL);
+		ret = notify_change(&init_user_ns, file->f_path.dentry,
+				    &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
 	}
 
 	newattrs.ia_valid = ATTR_SIZE;
 	newattrs.ia_size = ni_size;
-	ret = notify_change(&init_user_ns, object->dentry, &newattrs, NULL);
+	ret = notify_change(&init_user_ns, file->f_path.dentry, &newattrs, NULL);
 
 truncate_failed:
-	inode_unlock(d_inode(object->dentry));
+	inode_unlock(file_inode(file));
 	cachefiles_end_secure(cache, saved_cred);
 
 	if (ret == -EIO) {
@@ -322,7 +326,7 @@ static void cachefiles_invalidate_object(struct cachefiles_object *object)
 {
 	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
-	struct path path;
+	struct file *file = object->file;
 	uint64_t ni_size;
 	int ret;
 
@@ -333,16 +337,13 @@ static void cachefiles_invalidate_object(struct cachefiles_object *object)
 	_enter("{OBJ%x},[%llu]",
 	       object->debug_id, (unsigned long long)ni_size);
 
-	if (object->dentry) {
-		ASSERT(d_is_reg(object->dentry));
-
-		path.dentry = object->dentry;
-		path.mnt = cache->mnt;
+	if (file) {
+		ASSERT(d_is_reg(file->f_path.dentry));
 
 		cachefiles_begin_secure(cache, &saved_cred);
-		ret = vfs_truncate(&path, 0);
+		ret = vfs_truncate(&file->f_path, 0);
 		if (ret == 0)
-			ret = vfs_truncate(&path, ni_size);
+			ret = vfs_truncate(&file->f_path, ni_size);
 		cachefiles_end_secure(cache, saved_cred);
 
 		if (ret != 0) {
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index f268495ecbc6..4705f968e661 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -71,6 +71,12 @@ struct cachefiles_cache {
 
 #include <trace/events/cachefiles.h>
 
+static inline
+struct file *cachefiles_cres_file(struct netfs_cache_resources *cres)
+{
+	return cres->cache_priv2;
+}
+
 /*
  * note change of state for daemon
  */
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 920ca48eecfa..0e1fb44b5d04 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -63,7 +63,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 			   void *term_func_priv)
 {
 	struct cachefiles_kiocb *ki;
-	struct file *file = cres->cache_priv2;
+	struct file *file = cachefiles_cres_file(cres);
 	unsigned int old_nofs;
 	ssize_t ret = -ENODATA;
 	size_t len = iov_iter_count(iter), skipped = 0;
@@ -190,7 +190,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 {
 	struct cachefiles_kiocb *ki;
 	struct inode *inode;
-	struct file *file = cres->cache_priv2;
+	struct file *file = cachefiles_cres_file(cres);
 	unsigned int old_nofs;
 	ssize_t ret = -ENOBUFS;
 	size_t len = iov_iter_count(iter);
@@ -385,7 +385,7 @@ static void cachefiles_end_operation(struct netfs_cache_resources *cres)
 {
 #if 0
 	struct fscache_operation *op = cres->cache_priv;
-	struct file *file = cres->cache_priv2;
+	struct file *file = cachefiles_cres_file(cres);
 
 	_enter("");
 
@@ -414,41 +414,16 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 int cachefiles_begin_operation(struct netfs_cache_resources *cres)
 {
 #if 0
-	struct cachefiles_object *object;
-	struct cachefiles_cache *cache;
-	struct path path;
-	struct file *file;
+	struct cachefiles_object *object = op->object;
 
 	_enter("");
 
-	object = container_of(op->object, struct cachefiles_object, fscache);
-	cache = container_of(object->fscache.cache,
-			     struct cachefiles_cache, cache);
-
-	path.mnt = cache->mnt;
-	path.dentry = object->dentry;
-	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
-				   d_inode(object->dentry), cache->cache_cred);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-	if (!S_ISREG(file_inode(file)->i_mode))
-		goto error_file;
-	if (unlikely(!file->f_op->read_iter) ||
-	    unlikely(!file->f_op->write_iter)) {
-		pr_notice("Cache does not support read_iter and write_iter\n");
-		goto error_file;
-	}
-
-	atomic_inc(&op->usage);
-	cres->cache_priv = op;
-	cres->cache_priv2 = file;
-	cres->ops = &cachefiles_netfs_cache_ops;
-	cres->debug_id = object->fscache.debug_id;
+	cres->cache_priv	= op;
+	cres->cache_priv2	= get_file(object->file);
+	cres->ops		= &cachefiles_netfs_cache_ops;
+	cres->debug_id		= object->cookie->debug_id;
 	_leave("");
 	return 0;
-
-error_file:
-	fput(file);
 #endif
 	cres->ops = &cachefiles_netfs_cache_ops;
 	return -EIO;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index d3903cbb7de3..a60ef6f1cf1e 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -25,8 +25,7 @@
  */
 static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
 {
-	struct dentry *dentry = object->dentry;
-	struct inode *inode = d_backing_inode(dentry);
+	struct inode *inode = file_inode(object->file);
 	bool can_use = false;
 
 	_enter(",%x", object->debug_id);
@@ -35,10 +34,10 @@ static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
 
 	if (!(inode->i_flags & S_KERNEL_FILE)) {
 		inode->i_flags |= S_KERNEL_FILE;
-		trace_cachefiles_mark_active(object, dentry);
+		trace_cachefiles_mark_active(object, inode);
 		can_use = true;
 	} else {
-		pr_notice("cachefiles: Inode already in use: %pd\n", dentry);
+		pr_notice("cachefiles: Inode already in use: %pD\n", object->file);
 	}
 
 	inode_unlock(inode);
@@ -50,13 +49,12 @@ static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
  */
 void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
 {
-	struct dentry *dentry = object->dentry;
-	struct inode *inode = d_backing_inode(dentry);
+	struct inode *inode = file_inode(object->file);
 
 	inode_lock(inode);
 	inode->i_flags &= ~S_KERNEL_FILE;
 	inode_unlock(inode);
-	trace_cachefiles_mark_inactive(object, dentry, inode);
+	trace_cachefiles_mark_inactive(object, inode);
 }
 
 /*
@@ -65,7 +63,7 @@ void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
 static void cachefiles_mark_object_inactive(struct cachefiles_cache *cache,
 					    struct cachefiles_object *object)
 {
-	blkcnt_t i_blocks = d_backing_inode(object->dentry)->i_blocks;
+	blkcnt_t i_blocks = file_inode(object->file)->i_blocks;
 
 	/* This object can now be culled, so we need to let the daemon know
 	 * that there is something it can remove if it needs to.
@@ -106,7 +104,9 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 			cachefiles_io_error(cache, "Unlink security error");
 		} else {
 			trace_cachefiles_unlink(object, rep, why);
-			dget(rep);
+			dget(rep); /* Stop the dentry being negated if it's
+				    * only pinned by a file struct.
+				    */
 			ret = vfs_unlink(&init_user_ns, d_inode(dir), rep, NULL);
 			dput(rep);
 		}
@@ -232,30 +232,29 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 int cachefiles_delete_object(struct cachefiles_cache *cache,
 			     struct cachefiles_object *object)
 {
-	struct dentry *dir;
+	struct dentry *dentry = object->file->f_path.dentry, *dir;
 	int ret;
 
-	_enter(",OBJ%x{%pd}", object->debug_id, object->dentry);
+	_enter(",OBJ%x{%pD}", object->debug_id, object->file);
 
-	ASSERT(object->dentry);
-	ASSERT(d_backing_inode(object->dentry));
-	ASSERT(object->dentry->d_parent);
+	ASSERT(d_backing_inode(dentry));
+	ASSERT(dentry->d_parent);
 
-	dir = dget_parent(object->dentry);
+	dir = dget_parent(dentry);
 
-	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
+	inode_lock_nested(d_backing_inode(dir), I_MUTEX_PARENT);
 
 	/* We need to check that our parent is _still_ our parent - it may have
 	 * been renamed.
 	 */
-	if (dir == object->dentry->d_parent) {
-		ret = cachefiles_bury_object(cache, object, dir, object->dentry,
+	if (dir == dentry->d_parent) {
+		ret = cachefiles_bury_object(cache, object, dir, dentry,
 					     FSCACHE_OBJECT_WAS_RETIRED);
 	} else {
 		/* It got moved, presumably by cachefilesd culling it, so it's
 		 * no longer in the key path and we can ignore it.
 		 */
-		inode_unlock(d_inode(dir));
+		inode_unlock(d_backing_inode(dir));
 		ret = 0;
 	}
 
@@ -271,7 +270,6 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 					struct cachefiles_object *object,
 					struct dentry *fan)
 {
-	struct path path;
 	int ret;
 
 	if (!cachefiles_mark_inode_in_use(object))
@@ -280,7 +278,7 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 	/* if we've found that the terminal object exists, then we need to
 	 * check its attributes and delete it if it's out of date */
 	if (!object->new) {
-		_debug("validate '%pd'", object->dentry);
+		_debug("validate '%pD'", object->file);
 
 		ret = cachefiles_check_auxdata(object);
 		if (ret == -ESTALE)
@@ -301,9 +299,7 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 		 * (this is used to keep track of culling, and atimes are only
 		 * updated by read, write and readdir but not lookup or
 		 * open) */
-		path.mnt = cache->mnt;
-		path.dentry = object->dentry;
-		touch_atime(&path);
+		touch_atime(&object->file->f_path);
 	}
 
 	return 0;
@@ -311,7 +307,8 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 stale:
 	cachefiles_unmark_inode_in_use(object);
 	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
-	ret = cachefiles_bury_object(cache, object, fan, object->dentry,
+	ret = cachefiles_bury_object(cache, object, fan,
+				     object->file->f_path.dentry,
 				     FSCACHE_OBJECT_IS_STALE);
 	if (ret < 0)
 		return ret;
@@ -326,13 +323,14 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 /*
  * Walk to a file, creating it if necessary.
  */
-static int cachefiles_walk_to_file(struct cachefiles_cache *cache,
-				   struct cachefiles_object *object,
-				   struct dentry *fan)
+static int cachefiles_open_file(struct cachefiles_cache *cache,
+				struct cachefiles_object *object,
+				struct dentry *fan)
 {
 	struct dentry *dentry;
-	struct inode *dinode = d_backing_inode(fan);
-	struct path fan_path;
+	struct inode *dinode = d_backing_inode(fan), *inode;
+	struct file *file;
+	struct path fan_path, path;
 	int ret;
 
 	_enter("%pd %s", fan, object->d_name);
@@ -343,7 +341,7 @@ static int cachefiles_walk_to_file(struct cachefiles_cache *cache,
 	trace_cachefiles_lookup(object, dentry);
 	if (IS_ERR(dentry)) {
 		ret = PTR_ERR(dentry);
-		goto error;
+		goto error_unlock;
 	}
 
 	if (d_is_negative(dentry)) {
@@ -368,34 +366,53 @@ static int cachefiles_walk_to_file(struct cachefiles_cache *cache,
 		if (ret < 0)
 			goto error_dput;
 
-		_debug("create -> %pd{ino=%lu}",
-		       dentry, d_backing_inode(dentry)->i_ino);
-		object->new = true;
+		inode = d_backing_inode(dentry);
+		_debug("create -> %pd{ino=%lu}", dentry, inode->i_ino);
 
 	} else if (!d_is_reg(dentry)) {
-		pr_err("inode %lu is not a file\n",
-		       d_backing_inode(dentry)->i_ino);
+		inode = d_backing_inode(dentry);
+		pr_err("inode %lu is not a file\n", inode->i_ino);
 		ret = -EIO;
 		goto error_dput;
 	} else {
+		inode = d_backing_inode(dentry);
 		_debug("file -> %pd positive", dentry);
 	}
 
-	if (dentry->d_sb->s_blocksize > PAGE_SIZE) {
-		pr_warn("cachefiles: Block size too large\n");
+	inode_unlock(dinode);
+
+	/* We need to open a file interface onto a data file now as we can't do
+	 * it on demand because writeback called from do_exit() sees
+	 * current->fs == NULL - which breaks d_path() called from ext4 open.
+	 */
+	path.mnt = cache->mnt;
+	path.dentry = dentry;
+	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
+				   inode, cache->cache_cred);
+	dput(dentry);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto error;
+	}
+	if (unlikely(!file->f_op->read_iter) ||
+	    unlikely(!file->f_op->write_iter)) {
+		pr_notice("Cache does not support read_iter and write_iter\n");
 		ret = -EIO;
-		goto error_dput;
+		goto error_fput;
 	}
 
-	object->dentry = dentry;
-	inode_unlock(dinode);
+	object->file = file;
 	return 0;
 
 error_dput:
 	dput(dentry);
-error:
+error_unlock:
 	inode_unlock(dinode);
 	return ret;
+error_fput:
+	fput(file);
+error:
+	return ret;
 }
 
 /*
@@ -419,33 +436,46 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 			      struct cachefiles_object *object)
 {
 	struct cachefiles_cache *cache;
-	struct dentry *fan, *dentry;
+	struct dentry *fan;
 	int ret;
 
-	_enter("OBJ%x{%pd},OBJ%x,%s,",
-	       parent->debug_id, parent->dentry,
+	_enter("OBJ%x{%pD},OBJ%x,%s,",
+	       parent->debug_id, parent->file,
 	       object->debug_id, object->d_name);
 
 	cache = container_of(parent->cache, struct cachefiles_cache, cache);
-	ASSERT(parent->dentry);
-	ASSERT(d_backing_inode(parent->dentry));
+	ASSERT(parent->file);
 
 lookup_again:
-	fan = cachefiles_walk_over_fanout(object, cache, parent->dentry);
+	fan = cachefiles_walk_over_fanout(object, cache, parent->file->f_path.dentry);
 	if (IS_ERR(fan))
 		return PTR_ERR(fan);
 
-	/* Walk over path "parent/fanout/object". */
+	/* Open path "parent/fanout/object". */
 	if (object->type == FSCACHE_COOKIE_TYPE_INDEX) {
+		struct dentry *dentry;
+		struct file *file;
+		struct path path;
+
 		dentry = cachefiles_get_directory(cache, fan, object->d_name,
 						  object);
 		if (IS_ERR(dentry)) {
 			dput(fan);
 			return PTR_ERR(dentry);
 		}
-		object->dentry = dentry;
+		path.mnt = cache->mnt;
+		path.dentry = dentry;
+		file = open_with_fake_path(&path, O_RDONLY | O_DIRECTORY,
+					   d_backing_inode(dentry),
+					   cache->cache_cred);
+		dput(dentry);
+		if (IS_ERR(file)) {
+			dput(fan);
+			return PTR_ERR(file);
+		}
+		object->file = file;
 	} else {
-		ret = cachefiles_walk_to_file(cache, object, fan);
+		ret = cachefiles_open_file(cache, object, fan);
 		if (ret < 0) {
 			dput(fan);
 			return ret;
@@ -460,21 +490,17 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 
 	object->new = false;
 	fscache_obtained_object(object);
-	_leave(" = 0 [%lu]", d_backing_inode(object->dentry)->i_ino);
-	return 0;
+	_leave(" = t [%lu]", file_inode(object->file)->i_ino);
+	return true;
 
 check_error:
-	if (ret == -ESTALE) {
-		dput(object->dentry);
-		object->dentry = NULL;
-		fscache_object_retrying_stale(object);
+	fput(object->file);
+	object->file = NULL;
+	if (ret == -ESTALE)
 		goto lookup_again;
-	}
 	if (ret == -EIO)
 		cachefiles_io_error_obj(object, "Lookup failed");
 	cachefiles_mark_object_inactive(cache, object);
-	dput(object->dentry);
-	object->dentry = NULL;
 	_leave(" = error %d", ret);
 	return ret;
 }
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 464dea0b467c..2f14618d9a36 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -30,12 +30,14 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 				unsigned int xattr_flags)
 {
 	struct cachefiles_xattr *buf;
-	struct dentry *dentry = object->dentry;
+	struct dentry *dentry;
+	struct file *file = object->file;
 	unsigned int len = object->cookie->aux_len;
 	int ret;
 
-	if (!dentry)
+	if (!file)
 		return -ESTALE;
+	dentry = file->f_path.dentry;
 
 	_enter("%x,#%d", object->debug_id, len);
 
@@ -67,14 +69,11 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 int cachefiles_check_auxdata(struct cachefiles_object *object)
 {
 	struct cachefiles_xattr *buf;
-	struct dentry *dentry = object->dentry;
+	struct dentry *dentry = object->file->f_path.dentry;
 	unsigned int len = object->cookie->aux_len, tlen;
 	const void *p = fscache_get_aux(object->cookie);
 	ssize_t ret;
 
-	ASSERT(dentry);
-	ASSERT(d_backing_inode(dentry));
-
 	tlen = sizeof(struct cachefiles_xattr) + len;
 	buf = kmalloc(tlen, GFP_KERNEL);
 	if (!buf)
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 7021c1ec2b64..90a7c92fca98 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -198,7 +198,7 @@ struct cachefiles_object {
 	struct list_head	dep_link;	/* link in parent's dependents list */
 
 	char				*d_name;	/* Filename */
-	struct dentry			*dentry;	/* the file/dir representing this object */
+	struct file			*file;		/* The file representing this object */
 	loff_t				i_size;		/* object size */
 	atomic_t			usage;		/* object usage count */
 	uint8_t				type;		/* object type */
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 5aec097d51ae..10ad5305d1c5 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -196,47 +196,44 @@ TRACE_EVENT(cachefiles_rename,
 
 TRACE_EVENT(cachefiles_mark_active,
 	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de),
+		     struct inode *inode),
 
-	    TP_ARGS(obj, de),
+	    TP_ARGS(obj, inode),
 
 	    /* Note that obj may be NULL */
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		obj		)
-		    __field(struct dentry *,		de		)
+		    __field(ino_t,			inode		)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->obj	= obj->debug_id;
-		    __entry->de		= de;
+		    __entry->inode	= inode->i_ino;
 			   ),
 
-	    TP_printk("o=%08x d=%p",
-		      __entry->obj, __entry->de)
+	    TP_printk("o=%08x i=%lx",
+		      __entry->obj, __entry->inode)
 	    );
 
 TRACE_EVENT(cachefiles_mark_inactive,
 	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de,
 		     struct inode *inode),
 
-	    TP_ARGS(obj, de, inode),
+	    TP_ARGS(obj, inode),
 
 	    /* Note that obj may be NULL */
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		obj		)
-		    __field(struct dentry *,		de		)
-		    __field(struct inode *,		inode		)
+		    __field(ino_t,			inode		)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->obj	= obj->debug_id;
-		    __entry->de		= de;
-		    __entry->inode	= inode;
+		    __entry->inode	= inode->i_ino;
 			   ),
 
-	    TP_printk("o=%08x d=%p i=%p",
-		      __entry->obj, __entry->de, __entry->inode)
+	    TP_printk("o=%08x i=%lx",
+		      __entry->obj, __entry->inode)
 	    );
 
 #endif /* _TRACE_CACHEFILES_H */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 27/67] cachefiles: trace: Log coherency checks
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (25 preceding siblings ...)
  2021-10-18 14:57 ` [PATCH 26/67] cachefiles: Change to storing file* rather than dentry* David Howells
@ 2021-10-18 14:57 ` David Howells
  2021-10-18 14:57 ` [PATCH 28/67] cachefiles: Trace truncations David Howells
                   ` (43 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:57 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add a cachefiles tracepoint that logs the result of coherency management
when the coherency data on a file in the cache is checked or committed.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/xattr.c             |   44 +++++++++++++++++++++--------
 include/trace/events/cachefiles.h |   56 +++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+), 12 deletions(-)

diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 2f14618d9a36..bfb2f4d605af 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -53,12 +53,21 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
 			   buf, sizeof(struct cachefiles_xattr) + len,
 			   xattr_flags);
-	kfree(buf);
-	if (ret < 0 && ret != -ENOMEM)
-		cachefiles_io_error_obj(
-			object,
-			"Failed to set xattr with error %d", ret);
+	if (ret < 0) {
+		trace_cachefiles_coherency(object, file_inode(file)->i_ino,
+					   0,
+					   cachefiles_coherency_set_fail);
+		if (ret != -ENOMEM)
+			cachefiles_io_error_obj(
+				object,
+				"Failed to set xattr with error %d", ret);
+	} else {
+		trace_cachefiles_coherency(object, file_inode(file)->i_ino,
+					   0,
+					   cachefiles_coherency_set_ok);
+	}
 
+	kfree(buf);
 	_leave(" = %d", ret);
 	return ret;
 }
@@ -72,21 +81,32 @@ int cachefiles_check_auxdata(struct cachefiles_object *object)
 	struct dentry *dentry = object->file->f_path.dentry;
 	unsigned int len = object->cookie->aux_len, tlen;
 	const void *p = fscache_get_aux(object->cookie);
-	ssize_t ret;
+	enum cachefiles_coherency_trace why;
+	ssize_t xlen;
+	int ret = -ESTALE;
 
 	tlen = sizeof(struct cachefiles_xattr) + len;
 	buf = kmalloc(tlen, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
-	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
-	if (ret == tlen &&
-	    buf->type == object->cookie->type &&
-	    memcmp(buf->data, p, len) == 0)
+	xlen = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
+	if (xlen != tlen) {
+		if (xlen == -EIO)
+			cachefiles_io_error_obj(
+				object,
+				"Failed to read aux with error %zd", xlen);
+		why = cachefiles_coherency_check_xattr;
+	} else if (buf->type != object->cookie->type) {
+		why = cachefiles_coherency_check_type;
+	} else if (memcmp(buf->data, p, len) != 0) {
+		why = cachefiles_coherency_check_aux;
+	} else {
+		why = cachefiles_coherency_check_ok;
 		ret = 0;
-	else
-		ret = -ESTALE;
+	}
 
+	trace_cachefiles_coherency(object, file_inode(object->file)->i_ino, 0, why);
 	kfree(buf);
 	return ret;
 }
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 10ad5305d1c5..6e9f6462833d 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -24,6 +24,19 @@ enum cachefiles_obj_ref_trace {
 	cachefiles_obj_ref__nr_traces
 };
 
+enum cachefiles_coherency_trace {
+	cachefiles_coherency_check_aux,
+	cachefiles_coherency_check_content,
+	cachefiles_coherency_check_dirty,
+	cachefiles_coherency_check_len,
+	cachefiles_coherency_check_objsize,
+	cachefiles_coherency_check_ok,
+	cachefiles_coherency_check_type,
+	cachefiles_coherency_check_xattr,
+	cachefiles_coherency_set_fail,
+	cachefiles_coherency_set_ok,
+};
+
 #endif
 
 /*
@@ -47,6 +60,18 @@ enum cachefiles_obj_ref_trace {
 	EM(cachefiles_obj_put_wait_retry,	"PUT wait_retry")	\
 	E_(cachefiles_obj_put_wait_timeo,	"PUT wait_timeo")
 
+#define cachefiles_coherency_traces					\
+	EM(cachefiles_coherency_check_aux,	"BAD aux ")		\
+	EM(cachefiles_coherency_check_content,	"BAD cont")		\
+	EM(cachefiles_coherency_check_dirty,	"BAD dirt")		\
+	EM(cachefiles_coherency_check_len,	"BAD len ")		\
+	EM(cachefiles_coherency_check_objsize,	"BAD osiz")		\
+	EM(cachefiles_coherency_check_ok,	"OK      ")		\
+	EM(cachefiles_coherency_check_type,	"BAD type")		\
+	EM(cachefiles_coherency_check_xattr,	"BAD xatt")		\
+	EM(cachefiles_coherency_set_fail,	"SET fail")		\
+	E_(cachefiles_coherency_set_ok,		"SET ok  ")
+
 /*
  * Export enum symbols via userspace.
  */
@@ -57,6 +82,7 @@ enum cachefiles_obj_ref_trace {
 
 cachefiles_obj_kill_traces;
 cachefiles_obj_ref_traces;
+cachefiles_coherency_traces;
 
 /*
  * Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -236,6 +262,36 @@ TRACE_EVENT(cachefiles_mark_inactive,
 		      __entry->obj, __entry->inode)
 	    );
 
+TRACE_EVENT(cachefiles_coherency,
+	    TP_PROTO(struct cachefiles_object *obj,
+		     ino_t ino,
+		     int content,
+		     enum cachefiles_coherency_trace why),
+
+	    TP_ARGS(obj, ino, content, why),
+
+	    /* Note that obj may be NULL */
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(enum cachefiles_coherency_trace,	why	)
+		    __field(int,				content	)
+		    __field(u64,				ino	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj->debug_id;
+		    __entry->why	= why;
+		    __entry->content	= content;
+		    __entry->ino	= ino;
+			   ),
+
+	    TP_printk("o=%08x %s i=%llx c=%u",
+		      __entry->obj,
+		      __print_symbolic(__entry->why, cachefiles_coherency_traces),
+		      __entry->ino,
+		      __entry->content)
+	    );
+
 #endif /* _TRACE_CACHEFILES_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 28/67] cachefiles: Trace truncations
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (26 preceding siblings ...)
  2021-10-18 14:57 ` [PATCH 27/67] cachefiles: trace: Log coherency checks David Howells
@ 2021-10-18 14:57 ` David Howells
  2021-10-18 14:57 ` [PATCH 29/67] cachefiles: Trace read and write operations David Howells
                   ` (42 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:57 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add a tracepoint to trace truncation operations.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c         |    9 +++++++-
 include/trace/events/cachefiles.h |   40 +++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 4edb3a09932a..8f98e5c27d66 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -341,9 +341,16 @@ static void cachefiles_invalidate_object(struct cachefiles_object *object)
 		ASSERT(d_is_reg(file->f_path.dentry));
 
 		cachefiles_begin_secure(cache, &saved_cred);
+		trace_cachefiles_trunc(object, file_inode(file),
+				       i_size_read(file_inode(file)), 0,
+				       cachefiles_trunc_invalidate);
 		ret = vfs_truncate(&file->f_path, 0);
-		if (ret == 0)
+		if (ret == 0) {
+			trace_cachefiles_trunc(object, file_inode(file),
+					       0, ni_size,
+					       cachefiles_trunc_set_size);
 			ret = vfs_truncate(&file->f_path, ni_size);
+		}
 		cachefiles_end_secure(cache, saved_cred);
 
 		if (ret != 0) {
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 6e9f6462833d..09d76c160451 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -37,6 +37,11 @@ enum cachefiles_coherency_trace {
 	cachefiles_coherency_set_ok,
 };
 
+enum cachefiles_trunc_trace {
+	cachefiles_trunc_invalidate,
+	cachefiles_trunc_set_size,
+};
+
 #endif
 
 /*
@@ -72,6 +77,10 @@ enum cachefiles_coherency_trace {
 	EM(cachefiles_coherency_set_fail,	"SET fail")		\
 	E_(cachefiles_coherency_set_ok,		"SET ok  ")
 
+#define cachefiles_trunc_traces						\
+	EM(cachefiles_trunc_invalidate,		"INVAL ")		\
+	E_(cachefiles_trunc_set_size,		"SETSIZ")
+
 /*
  * Export enum symbols via userspace.
  */
@@ -83,6 +92,7 @@ enum cachefiles_coherency_trace {
 cachefiles_obj_kill_traces;
 cachefiles_obj_ref_traces;
 cachefiles_coherency_traces;
+cachefiles_trunc_traces;
 
 /*
  * Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -292,6 +302,36 @@ TRACE_EVENT(cachefiles_coherency,
 		      __entry->content)
 	    );
 
+TRACE_EVENT(cachefiles_trunc,
+	    TP_PROTO(struct cachefiles_object *obj, struct inode *backer,
+		     loff_t from, loff_t to, enum cachefiles_trunc_trace why),
+
+	    TP_ARGS(obj, backer, from, to, why),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
+		    __field(enum cachefiles_trunc_trace,	why	)
+		    __field(loff_t,				from	)
+		    __field(loff_t,				to	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj->debug_id;
+		    __entry->backer	= backer->i_ino;
+		    __entry->from	= from;
+		    __entry->to		= to;
+		    __entry->why	= why;
+			   ),
+
+	    TP_printk("o=%08x b=%08x %s l=%llx->%llx",
+		      __entry->obj,
+		      __entry->backer,
+		      __print_symbolic(__entry->why, cachefiles_trunc_traces),
+		      __entry->from,
+		      __entry->to)
+	    );
+
 #endif /* _TRACE_CACHEFILES_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 29/67] cachefiles: Trace read and write operations
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (27 preceding siblings ...)
  2021-10-18 14:57 ` [PATCH 28/67] cachefiles: Trace truncations David Howells
@ 2021-10-18 14:57 ` David Howells
  2021-10-18 14:58 ` [PATCH 30/67] cachefiles: Round the cachefile size up to DIO block size David Howells
                   ` (41 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:57 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add tracepoints for read and write operations on cache files.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/io.c                |    6 +++-
 include/trace/events/cachefiles.h |   58 +++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 0e1fb44b5d04..f703a93e238b 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -62,6 +62,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 			   netfs_io_terminated_t term_func,
 			   void *term_func_priv)
 {
+	struct cachefiles_object *object = cres->cache_priv;
 	struct cachefiles_kiocb *ki;
 	struct file *file = cachefiles_cres_file(cres);
 	unsigned int old_nofs;
@@ -124,6 +125,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 
 	get_file(ki->iocb.ki_filp);
 
+	trace_cachefiles_read(object, file_inode(file), ki->iocb.ki_pos, len - skipped);
 	old_nofs = memalloc_nofs_save();
 	ret = vfs_iocb_iter_read(file, &ki->iocb, iter);
 	memalloc_nofs_restore(old_nofs);
@@ -188,6 +190,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 			    netfs_io_terminated_t term_func,
 			    void *term_func_priv)
 {
+	struct cachefiles_object *object = cres->cache_priv;
 	struct cachefiles_kiocb *ki;
 	struct inode *inode;
 	struct file *file = cachefiles_cres_file(cres);
@@ -229,6 +232,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 
 	get_file(ki->iocb.ki_filp);
 
+	trace_cachefiles_write(object, inode, ki->iocb.ki_pos, len);
 	old_nofs = memalloc_nofs_save();
 	ret = vfs_iocb_iter_write(file, &ki->iocb, iter);
 	memalloc_nofs_restore(old_nofs);
@@ -418,7 +422,7 @@ int cachefiles_begin_operation(struct netfs_cache_resources *cres)
 
 	_enter("");
 
-	cres->cache_priv	= op;
+	cres->cache_priv	= object;
 	cres->cache_priv2	= get_file(object->file);
 	cres->ops		= &cachefiles_netfs_cache_ops;
 	cres->debug_id		= object->cookie->debug_id;
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 09d76c160451..47df44550ad6 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -302,6 +302,64 @@ TRACE_EVENT(cachefiles_coherency,
 		      __entry->content)
 	    );
 
+TRACE_EVENT(cachefiles_read,
+	    TP_PROTO(struct cachefiles_object *obj,
+		     struct inode *backer,
+		     loff_t start,
+		     size_t len),
+
+	    TP_ARGS(obj, backer, start, len),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
+		    __field(size_t,				len	)
+		    __field(loff_t,				start	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj->debug_id;
+		    __entry->backer	= backer->i_ino;
+		    __entry->start	= start;
+		    __entry->len	= len;
+			   ),
+
+	    TP_printk("o=%08x b=%08x s=%llx l=%zx",
+		      __entry->obj,
+		      __entry->backer,
+		      __entry->start,
+		      __entry->len)
+	    );
+
+TRACE_EVENT(cachefiles_write,
+	    TP_PROTO(struct cachefiles_object *obj,
+		     struct inode *backer,
+		     loff_t start,
+		     size_t len),
+
+	    TP_ARGS(obj, backer, start, len),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
+		    __field(size_t,				len	)
+		    __field(loff_t,				start	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj->debug_id;
+		    __entry->backer	= backer->i_ino;
+		    __entry->start	= start;
+		    __entry->len	= len;
+			   ),
+
+	    TP_printk("o=%08x b=%08x s=%llx l=%zx",
+		      __entry->obj,
+		      __entry->backer,
+		      __entry->start,
+		      __entry->len)
+	    );
+
 TRACE_EVENT(cachefiles_trunc,
 	    TP_PROTO(struct cachefiles_object *obj, struct inode *backer,
 		     loff_t from, loff_t to, enum cachefiles_trunc_trace why),



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 30/67] cachefiles: Round the cachefile size up to DIO block size
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (28 preceding siblings ...)
  2021-10-18 14:57 ` [PATCH 29/67] cachefiles: Trace read and write operations David Howells
@ 2021-10-18 14:58 ` David Howells
  2021-10-18 14:58 ` [PATCH 31/67] cachefiles: Don't use XATTR_ flags with vfs_setxattr() David Howells
                   ` (40 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:58 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Round the size of a cachefile up to DIO block size so that we can always
read back the last partial page of a file using direct I/O.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/bind.c      |    3 ++-
 fs/cachefiles/interface.c |    2 ++
 fs/cachefiles/internal.h  |    2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index af7386ad14af..4ea8c93e14d8 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -126,7 +126,8 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	    !d_backing_inode(root)->i_op->mkdir ||
 	    !(d_backing_inode(root)->i_opflags & IOP_XATTR) ||
 	    !root->d_sb->s_op->statfs ||
-	    !root->d_sb->s_op->sync_fs)
+	    !root->d_sb->s_op->sync_fs ||
+	    root->d_sb->s_blocksize > PAGE_SIZE)
 		goto error_unsupported;
 
 	ret = -EROFS;
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 8f98e5c27d66..3e678ab14c85 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -268,6 +268,7 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	int ret;
 
 	ni_size = object->cookie->object_size;
+	ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);
 
 	_enter("{OBJ%x},[%llu]",
 	       object->debug_id, (unsigned long long) ni_size);
@@ -346,6 +347,7 @@ static void cachefiles_invalidate_object(struct cachefiles_object *object)
 				       cachefiles_trunc_invalidate);
 		ret = vfs_truncate(&file->f_path, 0);
 		if (ret == 0) {
+			ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);
 			trace_cachefiles_trunc(object, file_inode(file),
 					       0, ni_size,
 					       cachefiles_trunc_set_size);
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 4705f968e661..ff00c5249f4f 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -19,6 +19,8 @@
 #include <linux/workqueue.h>
 #include <linux/security.h>
 
+#define CACHEFILES_DIO_BLOCK_SIZE 4096
+
 struct cachefiles_cache;
 struct cachefiles_object;
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 31/67] cachefiles: Don't use XATTR_ flags with vfs_setxattr()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (29 preceding siblings ...)
  2021-10-18 14:58 ` [PATCH 30/67] cachefiles: Round the cachefile size up to DIO block size David Howells
@ 2021-10-18 14:58 ` David Howells
  2021-10-18 14:59 ` [PATCH 32/67] fscache: Replace the object management state machine David Howells
                   ` (39 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:58 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Don't pass XATTR_CREATE/XATTR_REPLACE to vfs_setxattr(); just pass 0 as we
don't want to have to deal with the error or try to guess which we want to
use.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c |    2 +-
 fs/cachefiles/internal.h  |    3 +--
 fs/cachefiles/namei.c     |    2 +-
 fs/cachefiles/xattr.c     |    6 ++----
 4 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 3e678ab14c85..674d3d75fa70 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -132,7 +132,7 @@ static void cachefiles_update_object(struct cachefiles_object *object)
 	cache = container_of(object->cache, struct cachefiles_cache, cache);
 
 	cachefiles_begin_secure(cache, &saved_cred);
-	cachefiles_set_object_xattr(object, XATTR_REPLACE);
+	cachefiles_set_object_xattr(object);
 	cachefiles_end_secure(cache, saved_cred);
 	_leave("");
 }
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index ff00c5249f4f..92f90a5a4e93 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -162,8 +162,7 @@ static inline void cachefiles_end_secure(struct cachefiles_cache *cache,
 /*
  * xattr.c
  */
-extern int cachefiles_set_object_xattr(struct cachefiles_object *object,
-				       unsigned int xattr_flags);
+extern int cachefiles_set_object_xattr(struct cachefiles_object *object);
 extern int cachefiles_check_auxdata(struct cachefiles_object *object);
 extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 					  struct dentry *dentry);
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index a60ef6f1cf1e..cb08be5fb28e 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -291,7 +291,7 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 
 	if (object->new) {
 		/* attach data to a newly constructed terminal object */
-		ret = cachefiles_set_object_xattr(object, XATTR_CREATE);
+		ret = cachefiles_set_object_xattr(object);
 		if (ret < 0)
 			goto error_unmark;
 	} else {
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index bfb2f4d605af..82c822bb71af 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -26,8 +26,7 @@ static const char cachefiles_xattr_cache[] =
 /*
  * set the state xattr on a cache file
  */
-int cachefiles_set_object_xattr(struct cachefiles_object *object,
-				unsigned int xattr_flags)
+int cachefiles_set_object_xattr(struct cachefiles_object *object)
 {
 	struct cachefiles_xattr *buf;
 	struct dentry *dentry;
@@ -51,8 +50,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 
 	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->cookie->flags);
 	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
-			   buf, sizeof(struct cachefiles_xattr) + len,
-			   xattr_flags);
+			   buf, sizeof(struct cachefiles_xattr) + len, 0);
 	if (ret < 0) {
 		trace_cachefiles_coherency(object, file_inode(file)->i_ino,
 					   0,



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 32/67] fscache: Replace the object management state machine
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (30 preceding siblings ...)
  2021-10-18 14:58 ` [PATCH 31/67] cachefiles: Don't use XATTR_ flags with vfs_setxattr() David Howells
@ 2021-10-18 14:59 ` David Howells
  2021-10-18 14:59 ` [PATCH 33/67] cachefiles: Trace decisions in cachefiles_prepare_read() David Howells
                   ` (38 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:59 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Replace the object management state machine with something a lot simpler.
The entire process of setting up or tearing down a cookie is done in one
go, and the dispatcher either punts it to a worker thread, or if all the
worker threads are all busy, does it in the current thread.

fscache_enable_cookie() and fscache_disable_cookie() are replaced by 'use'
and 'unuse' routines to which the mode of access (readonly or writable) is
declared - these then impose the policy of what to do with the backing
object.

The policy for handling local writes is declared to
fscache_acquire_cookie() using FSACHE_ADV_WRITE_*CACHE flags.  This only
allows for the possibility of suspending caching whilst a file is open for
writing; policies such as write-through and write-back have to be handled
at the netfs level.

Filesystems that use the fallback I/O API must set FSCACHE_ADV_FALLBACK_IO
when creating a cookie so that the content mapper doesn't try to interfere.
For cachefiles, this will cause I/O paths to use SEEK_DATA/SEEK_HOLE rather
than using a separate content map; this may result in data corruption if
the backing filesystem inserts bridging blocks[1].

At some point in the future, object records that aren't in use will get put
on an LRU and discarded under memory pressure or if they haven't been used
for a while.

Whilst accessing the cache, either fscache or the cache backend must hold
an access count on the cookie or volume cookie for the duration of an
operation to prevent the cache's structures from being deallocated if the
cache is simultaneously withdrawn; further, operations must be held up
until the backend finishes or fails at creating its structures in the
background.  In this case, cachefiles cannot access cache_priv pointers
until it has waited for the appropriate state to be achieved.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/YO17ZNOcq+9PajfQ@mit.edu [1]
---

 fs/9p/vfs_addr.c                  |    2 
 fs/afs/Makefile                   |    3 
 fs/afs/cache.c                    |   14 
 fs/afs/cell.c                     |   15 -
 fs/afs/file.c                     |   17 +
 fs/afs/inode.c                    |   16 -
 fs/afs/internal.h                 |    5 
 fs/afs/main.c                     |   14 
 fs/afs/volume.c                   |   16 -
 fs/cachefiles/Makefile            |    1 
 fs/cachefiles/bind.c              |  184 +++++--
 fs/cachefiles/daemon.c            |    8 
 fs/cachefiles/interface.c         |  340 +++++++-----
 fs/cachefiles/internal.h          |   90 +++
 fs/cachefiles/io.c                |  144 +++--
 fs/cachefiles/key.c               |  126 ++--
 fs/cachefiles/main.c              |   11 
 fs/cachefiles/namei.c             |  209 ++-----
 fs/cachefiles/volume.c            |  128 +++++
 fs/cachefiles/xattr.c             |    7 
 fs/fscache/Makefile               |    4 
 fs/fscache/cache.c                |  530 +++++++++----------
 fs/fscache/cookie.c               | 1043 +++++++++++++++++--------------------
 fs/fscache/fsdef.c                |   46 --
 fs/fscache/internal.h             |  160 +-----
 fs/fscache/io.c                   |  238 ++++----
 fs/fscache/main.c                 |  134 -----
 fs/fscache/netfs.c                |   76 ---
 fs/fscache/object.c               |  973 -----------------------------------
 fs/fscache/proc.c                 |   43 --
 fs/fscache/stats.c                |  143 +----
 fs/fscache/volume.c               |  449 ++++++++++++++++
 include/linux/fscache-cache.h     |  371 ++++---------
 include/linux/fscache.h           |  405 ++++++--------
 include/trace/events/cachefiles.h |   64 ++
 include/trace/events/fscache.h    |  395 +++++++++-----
 36 files changed, 2715 insertions(+), 3709 deletions(-)
 delete mode 100644 fs/afs/cache.c
 create mode 100644 fs/cachefiles/volume.c
 delete mode 100644 fs/fscache/fsdef.c
 delete mode 100644 fs/fscache/netfs.c
 delete mode 100644 fs/fscache/object.c
 create mode 100644 fs/fscache/volume.c

diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index cff99f5c05e3..de857fa4629b 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -80,7 +80,7 @@ static bool v9fs_is_cache_enabled(struct inode *inode)
 {
 	struct fscache_cookie *cookie = v9fs_inode_cookie(V9FS_I(inode));
 
-	return fscache_cookie_enabled(cookie) && !hlist_empty(&cookie->backing_objects);
+	return fscache_cookie_valid(cookie) && cookie->cache_priv;
 }
 
 /**
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 75c4e4043d1d..e8956b65d7ff 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -3,10 +3,7 @@
 # Makefile for Red Hat Linux AFS client.
 #
 
-afs-cache-$(CONFIG_AFS_FSCACHE) := cache.o
-
 kafs-y := \
-	$(afs-cache-y) \
 	addr_list.o \
 	callback.o \
 	cell.o \
diff --git a/fs/afs/cache.c b/fs/afs/cache.c
deleted file mode 100644
index 0ee9ede6fc67..000000000000
--- a/fs/afs/cache.c
+++ /dev/null
@@ -1,14 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* AFS caching stuff
- *
- * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- */
-
-#include <linux/sched.h>
-#include "internal.h"
-
-struct fscache_netfs afs_cache_netfs = {
-	.name			= "afs",
-	.version		= 2,
-};
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index ebf140317232..07ad744eef77 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -680,16 +680,6 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell)
 			return ret;
 	}
 
-#ifdef CONFIG_AFS_FSCACHE
-	cell->cache = fscache_acquire_cookie(afs_cache_netfs.primary_index,
-					     FSCACHE_COOKIE_TYPE_INDEX,
-					     "AFS.cell",
-					     0,
-					     NULL,
-					     cell->name, strlen(cell->name),
-					     NULL, 0,
-					     0, true);
-#endif
 	ret = afs_proc_cell_setup(cell);
 	if (ret < 0)
 		return ret;
@@ -726,11 +716,6 @@ static void afs_deactivate_cell(struct afs_net *net, struct afs_cell *cell)
 	afs_dynroot_rmdir(net, cell);
 	mutex_unlock(&net->proc_cells_lock);
 
-#ifdef CONFIG_AFS_FSCACHE
-	fscache_relinquish_cookie(cell->cache, NULL, false);
-	cell->cache = NULL;
-#endif
-
 	_leave("");
 }
 
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 4d5b6bfcf815..b4666da93b54 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -151,7 +151,9 @@ int afs_open(struct inode *inode, struct file *file)
 
 	if (file->f_flags & O_TRUNC)
 		set_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
-	
+
+	fscache_use_cookie(afs_vnode_cache(vnode), file->f_mode & FMODE_WRITE);
+
 	file->private_data = af;
 	_leave(" = 0");
 	return 0;
@@ -170,8 +172,10 @@ int afs_open(struct inode *inode, struct file *file)
  */
 int afs_release(struct inode *inode, struct file *file)
 {
+	struct afs_vnode_cache_aux aux;
 	struct afs_vnode *vnode = AFS_FS_I(inode);
 	struct afs_file *af = file->private_data;
+	loff_t i_size;
 	int ret = 0;
 
 	_enter("{%llx:%llu},", vnode->fid.vid, vnode->fid.vnode);
@@ -182,6 +186,15 @@ int afs_release(struct inode *inode, struct file *file)
 	file->private_data = NULL;
 	if (af->wb)
 		afs_put_wb_key(af->wb);
+
+	if ((file->f_mode & FMODE_WRITE)) {
+		i_size = i_size_read(&vnode->vfs_inode);
+		aux.data_version = vnode->status.data_version;
+		fscache_unuse_cookie(afs_vnode_cache(vnode), &aux, &i_size);
+	} else {
+		fscache_unuse_cookie(afs_vnode_cache(vnode), NULL, NULL);
+	}
+
 	key_put(af->key);
 	kfree(af);
 	afs_prune_wb_keys(vnode);
@@ -344,7 +357,7 @@ static bool afs_is_cache_enabled(struct inode *inode)
 {
 	struct fscache_cookie *cookie = afs_vnode_cache(AFS_FS_I(inode));
 
-	return fscache_cookie_enabled(cookie) && !hlist_empty(&cookie->backing_objects);
+	return fscache_cookie_valid(cookie) && cookie->cache_priv;
 }
 
 static int afs_begin_cache_operation(struct netfs_read_request *rreq)
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index f761c9a5067f..c2f4afff9837 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -432,13 +432,10 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
 
 	vnode->cache = fscache_acquire_cookie(
 		vnode->volume->cache,
-		FSCACHE_COOKIE_TYPE_DATAFILE,
-		"AFS.vnode",
 		vnode->status.type == AFS_FTYPE_FILE ? 0 : FSCACHE_ADV_SINGLE_CHUNK,
-		NULL,
 		&key, sizeof(key),
 		&aux, sizeof(aux),
-		vnode->status.size, true);
+		vnode->status.size);
 #endif
 }
 
@@ -790,14 +787,9 @@ void afs_evict_inode(struct inode *inode)
 	}
 
 #ifdef CONFIG_AFS_FSCACHE
-	{
-		struct afs_vnode_cache_aux aux;
-
-		aux.data_version = vnode->status.data_version;
-		fscache_relinquish_cookie(vnode->cache, &aux,
-					  test_bit(AFS_VNODE_DELETED, &vnode->flags));
-		vnode->cache = NULL;
-	}
+	fscache_relinquish_cookie(vnode->cache,
+				  test_bit(AFS_VNODE_DELETED, &vnode->flags));
+	vnode->cache = NULL;
 #endif
 
 	afs_prune_wb_keys(vnode);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index bdcc677338fb..8e168c3fa5d1 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -364,9 +364,6 @@ struct afs_cell {
 	struct key		*anonymous_key;	/* anonymous user key for this cell */
 	struct work_struct	manager;	/* Manager for init/deinit/dns */
 	struct hlist_node	proc_link;	/* /proc cell list link */
-#ifdef CONFIG_AFS_FSCACHE
-	struct fscache_cookie	*cache;		/* caching cookie */
-#endif
 	time64_t		dns_expiry;	/* Time AFSDB/SRV record expires */
 	time64_t		last_inactive;	/* Time of last drop of usage count */
 	atomic_t		ref;		/* Struct refcount */
@@ -590,7 +587,7 @@ struct afs_volume {
 #define AFS_VOLUME_BUSY		5	/* - T if volume busy notice given */
 #define AFS_VOLUME_MAYBE_NO_IBULK 6	/* - T if some servers don't have InlineBulkStatus */
 #ifdef CONFIG_AFS_FSCACHE
-	struct fscache_cookie	*cache;		/* caching cookie */
+	struct fscache_volume	*cache;		/* Caching cookie */
 #endif
 	struct afs_server_list __rcu *servers;	/* List of servers on which volume resides */
 	rwlock_t		servers_lock;	/* Lock for ->servers */
diff --git a/fs/afs/main.c b/fs/afs/main.c
index 179004b15566..eae288c8d40a 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -186,13 +186,6 @@ static int __init afs_init(void)
 	if (!afs_lock_manager)
 		goto error_lockmgr;
 
-#ifdef CONFIG_AFS_FSCACHE
-	/* we want to be able to cache */
-	ret = fscache_register_netfs(&afs_cache_netfs);
-	if (ret < 0)
-		goto error_cache;
-#endif
-
 	ret = register_pernet_device(&afs_net_ops);
 	if (ret < 0)
 		goto error_net;
@@ -215,10 +208,6 @@ static int __init afs_init(void)
 error_fs:
 	unregister_pernet_device(&afs_net_ops);
 error_net:
-#ifdef CONFIG_AFS_FSCACHE
-	fscache_unregister_netfs(&afs_cache_netfs);
-error_cache:
-#endif
 	destroy_workqueue(afs_lock_manager);
 error_lockmgr:
 	destroy_workqueue(afs_async_calls);
@@ -245,9 +234,6 @@ static void __exit afs_exit(void)
 	proc_remove(afs_proc_symlink);
 	afs_fs_exit();
 	unregister_pernet_device(&afs_net_ops);
-#ifdef CONFIG_AFS_FSCACHE
-	fscache_unregister_netfs(&afs_cache_netfs);
-#endif
 	destroy_workqueue(afs_lock_manager);
 	destroy_workqueue(afs_async_calls);
 	destroy_workqueue(afs_wq);
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 5eaaa762fbd9..1269ec08170e 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -271,12 +271,14 @@ void afs_put_volume(struct afs_net *net, struct afs_volume *volume,
 void afs_activate_volume(struct afs_volume *volume)
 {
 #ifdef CONFIG_AFS_FSCACHE
-	volume->cache = fscache_acquire_cookie(volume->cell->cache,
-					       FSCACHE_COOKIE_TYPE_INDEX,
-					       "AFS.vol",
-					       0, NULL,
-					       &volume->vid, sizeof(volume->vid),
-					       NULL, 0, 0, true);
+	char *name;
+
+	name = kasprintf(GFP_KERNEL, "afs,%s,%llx",
+			 volume->cell->name, volume->vid);
+	if (name) {
+		volume->cache = fscache_acquire_volume(name, NULL, 0);
+		kfree(name);
+	}
 #endif
 }
 
@@ -288,7 +290,7 @@ void afs_deactivate_volume(struct afs_volume *volume)
 	_enter("%s", volume->name);
 
 #ifdef CONFIG_AFS_FSCACHE
-	fscache_relinquish_cookie(volume->cache, NULL,
+	fscache_relinquish_volume(volume->cache, 0,
 				  test_bit(AFS_VOLUME_DELETED, &volume->flags));
 	volume->cache = NULL;
 #endif
diff --git a/fs/cachefiles/Makefile b/fs/cachefiles/Makefile
index 714e84b3ca24..9062767331e8 100644
--- a/fs/cachefiles/Makefile
+++ b/fs/cachefiles/Makefile
@@ -12,6 +12,7 @@ cachefiles-y := \
 	main.o \
 	namei.o \
 	security.o \
+	volume.o \
 	xattr.o
 
 obj-$(CONFIG_CACHEFILES) := cachefiles.o
diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 4ea8c93e14d8..53aac6323753 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -17,8 +17,11 @@
 #include <linux/statfs.h>
 #include <linux/ctype.h>
 #include <linux/xattr.h>
+#include <trace/events/fscache.h>
 #include "internal.h"
 
+DECLARE_WAIT_QUEUE_HEAD(cachefiles_clearance_wq);
+
 static int cachefiles_daemon_add_cache(struct cachefiles_cache *caches);
 
 /*
@@ -60,16 +63,6 @@ int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args)
 		return -EBUSY;
 	}
 
-	/* make sure we have copies of the tag and dirname strings */
-	if (!cache->tag) {
-		/* the tag string is released by the fops->release()
-		 * function, so we don't release it on error here */
-		cache->tag = kstrdup("CacheFiles", GFP_KERNEL);
-		if (!cache->tag)
-			return -ENOMEM;
-	}
-
-	/* add the cache */
 	return cachefiles_daemon_add_cache(cache);
 }
 
@@ -78,33 +71,34 @@ int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args)
  */
 static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 {
-	struct cachefiles_object *fsdef;
+	struct fscache_cache *cache_cookie;
 	struct path path;
 	struct kstatfs stats;
 	struct dentry *graveyard, *cachedir, *root;
-	struct file *dirf;
 	const struct cred *saved_cred;
 	int ret;
 
 	_enter("");
 
+	cache_cookie = fscache_acquire_cache(cache->tag);
+	if (IS_ERR(cache_cookie))
+		return PTR_ERR(cache_cookie);
+
+	if (!fscache_set_cache_state_maybe(cache_cookie,
+					   FSCACHE_CACHE_IS_NOT_PRESENT,
+					   FSCACHE_CACHE_IS_PREPARING)) {
+		pr_warn("Cache tag in use\n");
+		ret = -EBUSY;
+		goto error_preparing;
+	}
+
 	/* we want to work under the module's security ID */
 	ret = cachefiles_get_security_ID(cache);
 	if (ret < 0)
-		return ret;
+		goto error_getsec;
 
 	cachefiles_begin_secure(cache, &saved_cred);
 
-	/* allocate the root index object */
-	ret = -ENOMEM;
-
-	fsdef = kmem_cache_alloc(cachefiles_object_jar, GFP_KERNEL);
-	if (!fsdef)
-		goto error_root_object;
-
-	atomic_set(&fsdef->usage, 1);
-	fsdef->type = FSCACHE_COOKIE_TYPE_INDEX;
-
 	/* look up the directory at the root of the cache */
 	ret = kern_path(cache->rootdirname, LOOKUP_DIRECTORY, &path);
 	if (ret < 0)
@@ -188,40 +182,25 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	       (unsigned long long) cache->bstop);
 
 	/* get the cache directory and check its type */
-	cachedir = cachefiles_get_directory(cache, root, "cache", NULL);
+	cachedir = cachefiles_get_directory(cache, root, "cache");
 	if (IS_ERR(cachedir)) {
 		ret = PTR_ERR(cachedir);
 		goto error_unsupported;
 	}
 
-	dirf = open_with_fake_path(&path, O_RDONLY | O_DIRECTORY,
-				   d_inode(cachedir), cache->cache_cred);
-	if (IS_ERR(dirf)) {
-		ret = PTR_ERR(dirf);
-		goto error_unsupported;
-	}
-	fsdef->file = dirf;
-	fsdef->cookie = NULL;
+	cache->store = cachedir;
 
 	/* get the graveyard directory */
-	graveyard = cachefiles_get_directory(cache, root, "graveyard", NULL);
+	graveyard = cachefiles_get_directory(cache, root, "graveyard");
 	if (IS_ERR(graveyard)) {
 		ret = PTR_ERR(graveyard);
 		goto error_unsupported;
 	}
 
 	cache->graveyard = graveyard;
+	cache->cache = cache_cookie;
 
-	/* publish the cache */
-	fscache_init_cache(&cache->cache,
-			   &cachefiles_cache_ops,
-			   "%s",
-			   graveyard->d_sb->s_id);
-
-	fscache_object_init(fsdef, &fscache_fsdef_index,
-			    &cache->cache);
-
-	ret = fscache_add_cache(&cache->cache, fsdef, cache->tag);
+	ret = fscache_add_cache(cache_cookie, &cachefiles_cache_ops, cache);
 	if (ret < 0)
 		goto error_add_cache;
 
@@ -229,47 +208,140 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	set_bit(CACHEFILES_READY, &cache->flags);
 	dput(root);
 
-	pr_info("File cache on %s registered\n", cache->cache.identifier);
+	pr_info("File cache on %s registered\n", cache_cookie->name);
 
 	/* check how much space the cache has */
 	cachefiles_has_space(cache, 0, 0);
 	cachefiles_end_secure(cache, saved_cred);
+	_leave(" = 0 [%px]", cache->cache);
 	return 0;
 
 error_add_cache:
 	dput(cache->graveyard);
 	cache->graveyard = NULL;
 error_unsupported:
+	dput(cache->store);
+	cache->store = NULL;
 	mntput(cache->mnt);
 	cache->mnt = NULL;
-	if (fsdef->file) {
-		fput(fsdef->file);
-		fsdef->file = NULL;
-	}
 	dput(root);
 error_open_root:
-	kmem_cache_free(cachefiles_object_jar, fsdef);
-error_root_object:
 	cachefiles_end_secure(cache, saved_cred);
+error_getsec:
+	fscache_set_cache_state(cache_cookie, FSCACHE_CACHE_IS_NOT_PRESENT);
+error_preparing:
+	fscache_put_cache(cache_cookie, fscache_cache_put_cache);
+	cache->cache = NULL;
 	pr_err("Failed to register: %d\n", ret);
 	return ret;
 }
 
 /*
- * unbind a cache on fd release
+ * Mark all the objects as being out of service and queue them all for cleanup.
  */
-void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
+static void cachefiles_withdraw_objects(struct cachefiles_cache *cache)
 {
+	struct cachefiles_object *object;
+	unsigned int count = 0;
+
 	_enter("");
 
-	if (test_bit(CACHEFILES_READY, &cache->flags)) {
-		pr_info("File cache on %s unregistering\n",
-			cache->cache.identifier);
+	spin_lock(&cache->object_list_lock);
+
+	while (!list_empty(&cache->object_list)) {
+		object = list_first_entry(&cache->object_list,
+					  struct cachefiles_object, cache_link);
+		cachefiles_see_object(object, cachefiles_obj_see_withdrawal);
+		list_del_init(&object->cache_link);
+		fscache_withdraw_cookie(object->cookie);
+		count++;
+		if ((count & 63) == 0) {
+			spin_unlock(&cache->object_list_lock);
+			cond_resched();
+			spin_lock(&cache->object_list_lock);
+		}
+	}
+
+	spin_unlock(&cache->object_list_lock);
+	_leave(" [%u objs]", count);
+}
+
+/*
+ * Withdraw volumes.
+ */
+static void cachefiles_withdraw_volumes(struct cachefiles_cache *cache)
+{
+	_enter("");
 
-		fscache_withdraw_cache(&cache->cache);
+	for (;;) {
+		struct cachefiles_volume *volume = NULL;
+
+		spin_lock(&cache->object_list_lock);
+		if (!list_empty(&cache->volumes)) {
+			volume = list_first_entry(&cache->volumes,
+						  struct cachefiles_volume, cache_link);
+			list_del_init(&volume->cache_link);
+		}
+		spin_unlock(&cache->object_list_lock);
+		if (!volume)
+			break;
+
+		cachefiles_withdraw_volume(volume);
 	}
 
+	_leave("");
+}
+
+/*
+ * Withdraw cache objects.
+ */
+static void cachefiles_withdraw_cache(struct cachefiles_cache *cache)
+{
+	struct fscache_cache *fscache = cache->cache;
+
+	pr_info("File cache on %s unregistering\n", fscache->name);
+
+	fscache_withdraw_cache(fscache);
+
+	/* we now have to destroy all the active objects pertaining to this
+	 * cache - which we do by passing them off to thread pool to be
+	 * disposed of */
+	cachefiles_withdraw_objects(cache);
+
+	/* wait for all extant objects to finish their outstanding operations
+	 * and go away */
+	_debug("wait for finish %u", atomic_read(&fscache->object_count));
+	wait_event(cachefiles_clearance_wq,
+		   atomic_read(&fscache->object_count) == 0);
+	_debug("cleared");
+
+	cachefiles_withdraw_volumes(cache);
+
+	/* make sure all outstanding data is written to disk */
+	cachefiles_sync_cache(cache);
+
+	_debug("wait for clearance");
+	wait_event(cachefiles_clearance_wq, list_empty(&cache->object_list));
+
+	cache->cache = NULL;
+	fscache->ops = NULL;
+	fscache->cache_priv = NULL;
+	fscache_set_cache_state(fscache, FSCACHE_CACHE_IS_NOT_PRESENT);
+	fscache_put_cache(fscache, fscache_cache_put_withdraw);
+}
+
+/*
+ * unbind a cache on fd release
+ */
+void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
+{
+	_enter("%px", cache->cache);
+
+	if (test_bit(CACHEFILES_READY, &cache->flags))
+		cachefiles_withdraw_cache(cache);
+
 	dput(cache->graveyard);
+	dput(cache->store);
 	mntput(cache->mnt);
 
 	kfree(cache->rootdirname);
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index e8ab3ab57147..6d31fba31ce9 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -103,6 +103,9 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file)
 
 	mutex_init(&cache->daemon_mutex);
 	init_waitqueue_head(&cache->daemon_pollwq);
+	INIT_LIST_HEAD(&cache->volumes);
+	INIT_LIST_HEAD(&cache->object_list);
+	spin_lock_init(&cache->object_list_lock);
 
 	/* set default caching limits
 	 * - limit at 1% free space and/or free files
@@ -668,11 +671,12 @@ int cachefiles_has_space(struct cachefiles_cache *cache,
 			 unsigned fnr, unsigned bnr)
 {
 	struct kstatfs stats;
+	int ret;
+
 	struct path path = {
 		.mnt	= cache->mnt,
-		.dentry	= cache->mnt->mnt_root,
+		.dentry	= cache->store,
 	};
-	int ret;
 
 	//_enter("{%llu,%llu,%llu,%llu,%llu,%llu},%u,%u",
 	//       (unsigned long long) cache->frun,
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 674d3d75fa70..d186a68ff810 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -8,114 +8,128 @@
 #include <linux/slab.h>
 #include <linux/mount.h>
 #include <linux/xattr.h>
+#include <linux/file.h>
+#include <trace/events/fscache.h>
 #include "internal.h"
 
+static atomic_t cachefiles_object_debug_id;
+
 static int cachefiles_attr_changed(struct cachefiles_object *object);
 
 /*
- * allocate an object record for a cookie lookup and prepare the lookup data
+ * Allocate a cache object record.
  */
-static struct cachefiles_object *cachefiles_alloc_object(
-	struct fscache_cache *_cache,
-	struct fscache_cookie *cookie)
+static
+struct cachefiles_object *cachefiles_alloc_object(struct fscache_cookie *cookie)
 {
+	struct fscache_volume *vcookie = cookie->volume;
 	struct cachefiles_object *object;
-	struct cachefiles_cache *cache;
-
-	cache = container_of(_cache, struct cachefiles_cache, cache);
+	struct cachefiles_volume *volume = vcookie->cache_priv;
+	int n_accesses;
 
-	_enter("{%s},%x,", cache->cache.identifier, cookie->debug_id);
+	_enter("{%s},%x,", vcookie->key, cookie->debug_id);
 
-	/* create a new object record and a temporary leaf image */
-	object = kmem_cache_alloc(cachefiles_object_jar, cachefiles_gfp);
+	object = kmem_cache_zalloc(cachefiles_object_jar, cachefiles_gfp);
 	if (!object)
-		goto nomem_object;
+		return NULL;
 
 	atomic_set(&object->usage, 1);
 
-	fscache_object_init(object, cookie, &cache->cache);
+	spin_lock_init(&object->lock);
+	INIT_LIST_HEAD(&object->cache_link);
+	object->volume = volume;
+	object->debug_id = atomic_inc_return(&cachefiles_object_debug_id);
+	object->cookie = fscache_get_cookie(cookie, fscache_cookie_get_attach_object);
 
-	object->type = cookie->type;
-
-	/* turn the raw key into something that can work with as a filename */
-	if (!cachefiles_cook_key(object))
-		goto nomem_key;
+	atomic_inc(&vcookie->cache->object_count);
+	trace_cachefiles_ref(object->debug_id, cookie->debug_id, 1,
+			     cachefiles_obj_new);
 
-	_leave(" = %x [%s]", object->debug_id, object->d_name);
+	/* Get a ref on the cookie and keep its n_accesses counter raised by 1
+	 * to prevent wakeups from transitioning it to 0 until we're
+	 * withdrawing caching services from it.
+	 */
+	n_accesses = atomic_inc_return(&cookie->n_accesses);
+	trace_fscache_access(cookie->debug_id, refcount_read(&cookie->ref),
+			     n_accesses, fscache_access_cache_pin);
+	set_bit(FSCACHE_COOKIE_NACC_ELEVATED, &cookie->flags);
 	return object;
-
-nomem_key:
-	kmem_cache_free(cachefiles_object_jar, object);
-	fscache_object_destroyed(&cache->cache);
-nomem_object:
-	_leave(" = -ENOMEM");
-	return ERR_PTR(-ENOMEM);
 }
 
 /*
- * attempt to look up the nominated node in this cache
- * - return -ETIMEDOUT to be scheduled again
+ * Attempt to look up the nominated node in this cache
  */
-static int cachefiles_lookup_object(struct cachefiles_object *object)
+static bool cachefiles_lookup_cookie(struct fscache_cookie *cookie)
 {
-	struct cachefiles_object *parent;
-	struct cachefiles_cache *cache;
+	struct cachefiles_object *object;
+	struct cachefiles_cache *cache = cookie->volume->cache->cache_priv;
 	const struct cred *saved_cred;
-	int ret;
+	bool success;
+
+	object = cachefiles_alloc_object(cookie);
+	if (!object)
+		goto fail;
 
 	_enter("{OBJ%x}", object->debug_id);
 
-	cache = container_of(object->cache, struct cachefiles_cache, cache);
-	parent = object->parent;
+	if (!cachefiles_cook_key(object))
+		goto fail_put;
 
-	ASSERT(object->d_name);
+	cookie->cache_priv = object;
 
 	/* look up the key, creating any missing bits */
 	cachefiles_begin_secure(cache, &saved_cred);
-	ret = cachefiles_walk_to_object(parent, object);
+	success = cachefiles_walk_to_object(object);
 	cachefiles_end_secure(cache, saved_cred);
 
-	/* polish off by setting the attributes of non-index files */
-	if (ret == 0 &&
-	    object->cookie->type != FSCACHE_COOKIE_TYPE_INDEX)
-		cachefiles_attr_changed(object);
-
-	if (ret < 0 && ret != -ETIMEDOUT) {
-		if (ret != -ENOBUFS)
-			pr_warn("Lookup failed error %d\n", ret);
-		fscache_object_lookup_error(object);
-	}
+	if (!success)
+		goto fail_withdraw;
+
+	cachefiles_see_object(object, cachefiles_obj_see_lookup_cookie);
+
+	spin_lock(&cache->object_list_lock);
+	list_add(&object->cache_link, &cache->object_list);
+	spin_unlock(&cache->object_list_lock);
+	cachefiles_attr_changed(object);
+	_leave(" = t");
+	return true;
+
+fail_withdraw:
+	cachefiles_see_object(object, cachefiles_obj_see_lookup_failed);
+	clear_bit(FSCACHE_COOKIE_IS_CACHING, &object->flags);
+	fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_FAILED);
+	kdebug("failed c=%08x o=%08x", cookie->debug_id, object->debug_id);
+	/* The caller holds an access count on the cookie, so we need them to
+	 * drop it before we can withdraw the object.
+	 */
+	return false;
 
-	_leave(" [%d]", ret);
-	return ret;
+fail_put:
+	cachefiles_put_object(object, cachefiles_obj_put_alloc_fail);
+fail:
+	return false;
 }
 
 /*
- * indication of lookup completion
+ * Note that an object has been seen.
  */
-static void cachefiles_lookup_complete(struct cachefiles_object *object)
+void cachefiles_see_object(struct cachefiles_object *object,
+			   enum cachefiles_obj_ref_trace why)
 {
-	_enter("{OBJ%x}", object->debug_id);
+	trace_cachefiles_ref(object->debug_id, object->cookie->debug_id,
+			     atomic_read(&object->usage), why);
 }
 
 /*
  * increment the usage count on an inode object (may fail if unmounting)
  */
-static
 struct cachefiles_object *cachefiles_grab_object(struct cachefiles_object *object,
-						 enum fscache_obj_ref_trace why)
+						 enum cachefiles_obj_ref_trace why)
 {
 	int u;
 
-	_enter("{OBJ%x,%d}", object->debug_id, atomic_read(&object->usage));
-
-#ifdef CACHEFILES_DEBUG_SLAB
-	ASSERT((atomic_read(&object->usage) & 0xffff0000) != 0x6b6b0000);
-#endif
-
 	u = atomic_inc_return(&object->usage);
-	trace_cachefiles_ref(object, object->cookie,
-			     (enum cachefiles_obj_ref_trace)why, u);
+	trace_cachefiles_ref(object->debug_id, object->cookie->debug_id, u, why);
 	return object;
 }
 
@@ -124,102 +138,144 @@ struct cachefiles_object *cachefiles_grab_object(struct cachefiles_object *objec
  */
 static void cachefiles_update_object(struct cachefiles_object *object)
 {
-	struct cachefiles_cache *cache;
+	struct cachefiles_cache *cache = object->volume->cache;
 	const struct cred *saved_cred;
+	struct file *file = object->file;
+	loff_t object_size, i_size;
+	int ret;
 
 	_enter("{OBJ%x}", object->debug_id);
 
-	cache = container_of(object->cache, struct cachefiles_cache, cache);
-
 	cachefiles_begin_secure(cache, &saved_cred);
+
+	object_size = object->cookie->object_size;
+	i_size = i_size_read(file_inode(file));
+	if (i_size > object_size) {
+		_debug("trunc %llx -> %llx", i_size, object_size);
+		trace_cachefiles_trunc(object, file_inode(file),
+				       i_size, object_size,
+				       cachefiles_trunc_shrink);
+		ret = vfs_truncate(&file->f_path, object_size);
+		if (ret < 0) {
+			cachefiles_io_error_obj(object, "Trunc-to-size failed");
+			cachefiles_remove_object_xattr(cache, file->f_path.dentry);
+			goto out;
+		}
+
+		object_size = round_up(object_size, CACHEFILES_DIO_BLOCK_SIZE);
+		i_size = i_size_read(file_inode(file));
+		_debug("trunc %llx -> %llx", i_size, object_size);
+		if (i_size < object_size) {
+			trace_cachefiles_trunc(object, file_inode(file),
+					       i_size, object_size,
+					       cachefiles_trunc_dio_adjust);
+			ret = vfs_truncate(&file->f_path, object_size);
+			if (ret < 0) {
+				cachefiles_io_error_obj(object, "Trunc-to-dio-size failed");
+				cachefiles_remove_object_xattr(cache, file->f_path.dentry);
+				goto out;
+			}
+		}
+	}
+
 	cachefiles_set_object_xattr(object);
+
+out:
 	cachefiles_end_secure(cache, saved_cred);
 	_leave("");
 }
 
 /*
- * discard the resources pinned by an object and effect retirement if
- * requested
+ * Commit changes to the object as we drop it.
  */
-static void cachefiles_drop_object(struct cachefiles_object *object)
+static void cachefiles_commit_object(struct cachefiles_object *object,
+				     struct cachefiles_cache *cache)
 {
-	struct cachefiles_cache *cache;
-	const struct cred *saved_cred;
-
-	ASSERT(object);
+	bool update = false;
 
-	_enter("{OBJ%x,%d}", object->debug_id, atomic_read(&object->usage));
-
-	cache = container_of(object->cache, struct cachefiles_cache, cache);
+	if (test_and_clear_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags))
+		update = true;
+	if (update)
+		cachefiles_update_object(object);
+}
 
-#ifdef CACHEFILES_DEBUG_SLAB
-	ASSERT((atomic_read(&object->usage) & 0xffff0000) != 0x6b6b0000);
-#endif
+/*
+ * Finalise and object and close the VFS structs that we have.
+ */
+static void cachefiles_clean_up_object(struct cachefiles_object *object,
+				       struct cachefiles_cache *cache)
+{
+	if (test_bit(FSCACHE_COOKIE_RETIRED, &object->cookie->flags)) {
+		cachefiles_see_object(object, cachefiles_obj_see_clean_delete);
+		_debug("- inval object OBJ%x", object->debug_id);
+		cachefiles_delete_object(object, FSCACHE_OBJECT_WAS_RETIRED);
+	} else {
+		cachefiles_see_object(object, cachefiles_obj_see_clean_commit);
+		cachefiles_commit_object(object, cache);
+	}
 
-	/* We need to tidy the object up if we did in fact manage to open it.
-	 * It's possible for us to get here before the object is fully
-	 * initialised if the parent goes away or the object gets retired
-	 * before we set it up.
-	 */
+	cachefiles_unmark_inode_in_use(object);
 	if (object->file) {
-		/* delete retired objects */
-		if (test_bit(FSCACHE_OBJECT_RETIRED, &object->flags) &&
-		    object != cache->cache.fsdef
-		    ) {
-			_debug("- retire object OBJ%x", object->debug_id);
-			cachefiles_begin_secure(cache, &saved_cred);
-			cachefiles_delete_object(cache, object);
-			cachefiles_end_secure(cache, saved_cred);
-		}
-
-		/* close the filesystem stuff attached to the object */
-		cachefiles_unmark_inode_in_use(object);
 		fput(object->file);
 		object->file = NULL;
 	}
+}
 
-	_leave("");
+/*
+ * Withdraw caching for a cookie.
+ */
+static void cachefiles_withdraw_cookie(struct fscache_cookie *cookie)
+{
+	struct cachefiles_object *object = cookie->cache_priv;
+	struct cachefiles_cache *cache = object->volume->cache;
+	const struct cred *saved_cred;
+
+	_enter("o=%x", object->debug_id);
+	cachefiles_see_object(object, cachefiles_obj_see_withdraw_cookie);
+
+	if (!list_empty(&object->cache_link)) {
+		spin_lock(&cache->object_list_lock);
+		cachefiles_see_object(object, cachefiles_obj_see_withdrawal);
+		list_del_init(&object->cache_link);
+		spin_unlock(&cache->object_list_lock);
+	}
+
+	if (object->file) {
+		cachefiles_begin_secure(cache, &saved_cred);
+		cachefiles_clean_up_object(object, cache);
+		cachefiles_end_secure(cache, saved_cred);
+	}
+
+	cookie->cache_priv = NULL;
+	cachefiles_put_object(object, cachefiles_obj_put_detach);
 }
 
 /*
  * dispose of a reference to an object
  */
 void cachefiles_put_object(struct cachefiles_object *object,
-			   enum fscache_obj_ref_trace why)
+			   enum cachefiles_obj_ref_trace why)
 {
+	unsigned int object_debug_id = object->debug_id;
+	unsigned int cookie_debug_id = object->cookie->debug_id;
 	struct fscache_cache *cache;
 	int u;
 
-	ASSERT(object);
-
-	_enter("{OBJ%x,%d}",
-	       object->debug_id, atomic_read(&object->usage));
-
-#ifdef CACHEFILES_DEBUG_SLAB
-	ASSERT((atomic_read(&object->usage) & 0xffff0000) != 0x6b6b0000);
-#endif
-
-	ASSERTIFCMP(object->parent,
-		    object->parent->n_children, >, 0);
-
 	u = atomic_dec_return(&object->usage);
-	trace_cachefiles_ref(object, object->cookie,
-			     (enum cachefiles_obj_ref_trace)why, u);
-	ASSERTCMP(u, !=, -1);
+	trace_cachefiles_ref(object_debug_id, cookie_debug_id, u, why);
 	if (u == 0) {
-		_debug("- kill object OBJ%x", object->debug_id);
+		_debug("- kill object OBJ%x", object_debug_id);
 
-		ASSERTCMP(object->parent, ==, NULL);
 		ASSERTCMP(object->file, ==, NULL);
-		ASSERTCMP(object->n_ops, ==, 0);
-		ASSERTCMP(object->n_children, ==, 0);
 
 		kfree(object->d_name);
 
-		cache = object->cache;
-		fscache_object_destroy(object);
+		cache = object->volume->cache->cache;
+		fscache_put_cookie(object->cookie, fscache_cookie_put_object);
+		object->cookie = NULL;
 		kmem_cache_free(cachefiles_object_jar, object);
-		fscache_object_destroyed(cache);
+		if (atomic_dec_and_test(&cache->object_count))
+			wake_up_all(&cachefiles_clearance_wq);
 	}
 
 	_leave("");
@@ -228,15 +284,12 @@ void cachefiles_put_object(struct cachefiles_object *object,
 /*
  * sync a cache
  */
-static void cachefiles_sync_cache(struct fscache_cache *_cache)
+void cachefiles_sync_cache(struct cachefiles_cache *cache)
 {
-	struct cachefiles_cache *cache;
 	const struct cred *saved_cred;
 	int ret;
 
-	_enter("%s", _cache->tag->name);
-
-	cache = container_of(_cache, struct cachefiles_cache, cache);
+	_enter("%s", cache->cache->name);
 
 	/* make sure all pages pinned by operations on behalf of the netfs are
 	 * written to disc */
@@ -248,8 +301,7 @@ static void cachefiles_sync_cache(struct fscache_cache *_cache)
 
 	if (ret == -EIO)
 		cachefiles_io_error(cache,
-				    "Attempt to sync backing fs superblock"
-				    " returned error %d",
+				    "Attempt to sync backing fs superblock returned error %d",
 				    ret);
 }
 
@@ -259,7 +311,7 @@ static void cachefiles_sync_cache(struct fscache_cache *_cache)
  */
 static int cachefiles_attr_changed(struct cachefiles_object *object)
 {
-	struct cachefiles_cache *cache;
+	struct cachefiles_cache *cache = object->volume->cache;
 	const struct cred *saved_cred;
 	struct iattr newattrs;
 	struct file *file = object->file;
@@ -276,13 +328,6 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	if (!file)
 		return -ENOBUFS;
 
-	cache = container_of(object->cache, struct cachefiles_cache, cache);
-
-	if (ni_size == object->i_size)
-		return 0;
-
-	ASSERT(d_is_reg(file->f_path.dentry));
-
 	oi_size = i_size_read(file_inode(file));
 	if (oi_size == ni_size)
 		return 0;
@@ -321,20 +366,18 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 }
 
 /*
- * Invalidate an object
+ * Invalidate the storage associated with a cookie.
  */
-static void cachefiles_invalidate_object(struct cachefiles_object *object)
+static bool cachefiles_invalidate_cookie(struct fscache_cookie *cookie,
+					 unsigned int flags)
 {
-	struct cachefiles_cache *cache;
+	struct cachefiles_object *object = cookie->cache_priv;
+	struct cachefiles_cache *cache = object->volume->cache;
 	const struct cred *saved_cred;
 	struct file *file = object->file;
-	uint64_t ni_size;
+	uint64_t ni_size = cookie->object_size;
 	int ret;
 
-	cache = container_of(object->cache, struct cachefiles_cache, cache);
-
-	ni_size = object->cookie->object_size;
-
 	_enter("{OBJ%x},[%llu]",
 	       object->debug_id, (unsigned long long)ni_size);
 
@@ -359,22 +402,19 @@ static void cachefiles_invalidate_object(struct cachefiles_object *object)
 			if (ret == -EIO)
 				cachefiles_io_error_obj(object,
 							"Invalidate failed");
+			return false;
 		}
 	}
 
-	_leave("");
+	return true;
 }
 
 const struct fscache_cache_ops cachefiles_cache_ops = {
 	.name			= "cachefiles",
-	.alloc_object		= cachefiles_alloc_object,
-	.lookup_object		= cachefiles_lookup_object,
-	.lookup_complete	= cachefiles_lookup_complete,
-	.grab_object		= cachefiles_grab_object,
-	.update_object		= cachefiles_update_object,
-	.invalidate_object	= cachefiles_invalidate_object,
-	.drop_object		= cachefiles_drop_object,
-	.put_object		= cachefiles_put_object,
-	.sync_cache		= cachefiles_sync_cache,
+	.acquire_volume		= cachefiles_acquire_volume,
+	.free_volume		= cachefiles_free_volume,
+	.lookup_cookie		= cachefiles_lookup_cookie,
+	.withdraw_cookie	= cachefiles_withdraw_cookie,
+	.invalidate_cookie	= cachefiles_invalidate_cookie,
 	.begin_operation	= cachefiles_begin_operation,
 };
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 92f90a5a4e93..d8a70ecbe94a 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -31,16 +31,50 @@ extern unsigned cachefiles_debug;
 
 #define cachefiles_gfp (__GFP_RECLAIM | __GFP_NORETRY | __GFP_NOMEMALLOC)
 
+/*
+ * Cached volume representation.
+ */
+struct cachefiles_volume {
+	struct cachefiles_cache		*cache;
+	struct list_head		cache_link;	/* Link in cache->volumes */
+	struct fscache_volume		*vcookie;	/* The netfs's representation */
+	struct dentry			*dentry;	/* The volume dentry */
+	struct dentry			*fanout[256];	/* Fanout subdirs */
+};
+
+/*
+ * node records
+ */
+struct cachefiles_object {
+	int				debug_id;	/* debugging ID */
+	spinlock_t			lock;		/* state and operations lock */
+
+	struct list_head		cache_link;	/* Link in cache->*_list */
+	struct cachefiles_volume	*volume;	/* Cache volume that holds this object */
+	struct fscache_cookie		*cookie;	/* netfs's file/index object */
+	struct file			*file;		/* The file representing this object */
+	char				*d_name;	/* Filename */
+	atomic_t			usage;		/* object usage count */
+	u8				d_name_len;	/* Length of filename */
+	u8				key_hash;	/* Hash of object key */
+	unsigned long			flags;
+#define CACHEFILES_OBJECT_IS_NEW	0		/* Set if object is new */
+};
+
 extern struct kmem_cache *cachefiles_object_jar;
 
 /*
  * Cache files cache definition
  */
 struct cachefiles_cache {
-	struct fscache_cache		cache;		/* FS-Cache record */
+	struct fscache_cache		*cache;		/* Cache cookie */
 	struct vfsmount			*mnt;		/* mountpoint holding the cache */
+	struct dentry			*store;		/* Directory into which live objects go */
 	struct dentry			*graveyard;	/* directory into which dead objects go */
 	struct file			*cachefilesd;	/* manager daemon handle */
+	struct list_head		volumes;	/* List of volume objects */
+	struct list_head		object_list;	/* List of active objects */
+	spinlock_t			object_list_lock;
 	const struct cred		*cache_cred;	/* security override for accessing cache */
 	struct mutex			daemon_mutex;	/* command serialisation mutex */
 	wait_queue_head_t		daemon_pollwq;	/* poll waitqueue for daemon */
@@ -79,6 +113,12 @@ struct file *cachefiles_cres_file(struct netfs_cache_resources *cres)
 	return cres->cache_priv2;
 }
 
+static inline
+struct cachefiles_object *cachefiles_cres_object(struct netfs_cache_resources *cres)
+{
+	return fscache_cres_cookie(cres)->cache_priv;
+}
+
 /*
  * note change of state for daemon
  */
@@ -91,6 +131,8 @@ static inline void cachefiles_state_changed(struct cachefiles_cache *cache)
 /*
  * bind.c
  */
+extern wait_queue_head_t cachefiles_clearance_wq;
+
 extern int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args);
 extern void cachefiles_daemon_unbind(struct cachefiles_cache *cache);
 
@@ -106,9 +148,19 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
  * interface.c
  */
 extern const struct fscache_cache_ops cachefiles_cache_ops;
+extern void cachefiles_see_object(struct cachefiles_object *object,
+				  enum cachefiles_obj_ref_trace why);
+extern struct cachefiles_object *cachefiles_grab_object(struct cachefiles_object *object,
+							enum cachefiles_obj_ref_trace why);
+extern void cachefiles_put_object(struct cachefiles_object *object,
+				  enum cachefiles_obj_ref_trace why);
+extern void cachefiles_sync_cache(struct cachefiles_cache *cache);
 
-void cachefiles_put_object(struct cachefiles_object *_object,
-			   enum fscache_obj_ref_trace why);
+/*
+ * io.c
+ */
+extern bool cachefiles_begin_operation(struct netfs_cache_resources *cres,
+				       enum fscache_want_stage want_stage);
 
 /*
  * key.c
@@ -119,14 +171,12 @@ extern bool cachefiles_cook_key(struct cachefiles_object *object);
  * namei.c
  */
 extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object);
-extern int cachefiles_delete_object(struct cachefiles_cache *cache,
-				    struct cachefiles_object *object);
-extern int cachefiles_walk_to_object(struct cachefiles_object *parent,
-				     struct cachefiles_object *object);
+extern int cachefiles_delete_object(struct cachefiles_object *object,
+				    enum fscache_why_object_killed why);
+extern bool cachefiles_walk_to_object(struct cachefiles_object *object);
 extern struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 					       struct dentry *dir,
-					       const char *name,
-					       struct cachefiles_object *object);
+					       const char *name);
 
 extern int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 			   char *filename);
@@ -134,11 +184,6 @@ extern int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 extern int cachefiles_check_in_use(struct cachefiles_cache *cache,
 				   struct dentry *dir, char *filename);
 
-/*
- * rdwr2.c
- */
-extern int cachefiles_begin_operation(struct netfs_cache_resources *);
-
 /*
  * security.c
  */
@@ -159,6 +204,13 @@ static inline void cachefiles_end_secure(struct cachefiles_cache *cache,
 	revert_creds(saved_cred);
 }
 
+/*
+ * volume.c
+ */
+void cachefiles_acquire_volume(struct fscache_volume *volume);
+void cachefiles_free_volume(struct fscache_volume *volume);
+void cachefiles_withdraw_volume(struct cachefiles_volume *volume);
+
 /*
  * xattr.c
  */
@@ -175,7 +227,7 @@ extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 #define cachefiles_io_error(___cache, FMT, ...)		\
 do {							\
 	pr_err("I/O Error: " FMT"\n", ##__VA_ARGS__);	\
-	fscache_io_error(&(___cache)->cache);		\
+	fscache_io_error((___cache)->cache);		\
 	set_bit(CACHEFILES_DEAD, &(___cache)->flags);	\
 } while (0)
 
@@ -183,9 +235,9 @@ do {							\
 do {									\
 	struct cachefiles_cache *___cache;				\
 									\
-	___cache = container_of((object)->cache,			\
-				struct cachefiles_cache, cache);	\
-	cachefiles_io_error(___cache, FMT, ##__VA_ARGS__);		\
+	___cache = (object)->volume->cache;				\
+	cachefiles_io_error(___cache, FMT " [o=%08x]", ##__VA_ARGS__,	\
+			    (object)->debug_id);			\
 } while (0)
 
 
@@ -193,7 +245,7 @@ do {									\
  * debug tracing
  */
 #define dbgprintk(FMT, ...) \
-	printk(KERN_DEBUG "[%-6.6s] "FMT"\n", current->comm, ##__VA_ARGS__)
+	printk("[%-6.6s] "FMT"\n", current->comm, ##__VA_ARGS__)
 
 #define kenter(FMT, ...) dbgprintk("==> %s("FMT")", __func__, ##__VA_ARGS__)
 #define kleave(FMT, ...) dbgprintk("<== %s()"FMT"", __func__, ##__VA_ARGS__)
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index f703a93e238b..e5c29c0decea 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -10,7 +10,7 @@
 #include <linux/file.h>
 #include <linux/uio.h>
 #include <linux/sched/mm.h>
-#include <linux/netfs.h>
+#include <trace/events/fscache.h>
 #include "internal.h"
 
 struct cachefiles_kiocb {
@@ -21,14 +21,17 @@ struct cachefiles_kiocb {
 		size_t		skipped;
 		size_t		len;
 	};
+	struct cachefiles_object *object;
 	netfs_io_terminated_t	term_func;
 	void			*term_func_priv;
 	bool			was_async;
+	unsigned int		inval_counter;	/* Copy of cookie->inval_counter */
 };
 
 static inline void cachefiles_put_kiocb(struct cachefiles_kiocb *ki)
 {
 	if (refcount_dec_and_test(&ki->ki_refcnt)) {
+		cachefiles_put_object(ki->object, cachefiles_obj_put_ioreq);
 		fput(ki->iocb.ki_filp);
 		kfree(ki);
 	}
@@ -44,8 +47,13 @@ static void cachefiles_read_complete(struct kiocb *iocb, long ret, long ret2)
 	_enter("%ld,%ld", ret, ret2);
 
 	if (ki->term_func) {
-		if (ret >= 0)
-			ret += ki->skipped;
+		if (ret >= 0) {
+			if (ki->object->cookie->inval_counter == ki->inval_counter)
+				ki->skipped += ret;
+			else
+				ret = -ESTALE;
+		}
+
 		ki->term_func(ki->term_func_priv, ret, ki->was_async);
 	}
 
@@ -62,13 +70,20 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 			   netfs_io_terminated_t term_func,
 			   void *term_func_priv)
 {
-	struct cachefiles_object *object = cres->cache_priv;
+	struct cachefiles_object *object;
 	struct cachefiles_kiocb *ki;
-	struct file *file = cachefiles_cres_file(cres);
+	struct file *file;
 	unsigned int old_nofs;
-	ssize_t ret = -ENODATA;
+	ssize_t ret = -ENOBUFS;
 	size_t len = iov_iter_count(iter), skipped = 0;
 
+	if (!fscache_wait_for_operation(cres, FSCACHE_WANT_READ))
+		goto presubmission_error;
+
+	fscache_count_read();
+	object = cachefiles_cres_object(cres);
+	file = cachefiles_cres_file(cres);
+
 	_enter("%pD,%li,%llx,%zx/%llx",
 	       file, file_inode(file)->i_ino, start_pos, len,
 	       i_size_read(file_inode(file)));
@@ -91,6 +106,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 			 * in the region, so clear the rest of the buffer and
 			 * return success.
 			 */
+			ret = -ENODATA;
 			if (read_hole == NETFS_READ_HOLE_FAIL)
 				goto presubmission_error;
 
@@ -104,7 +120,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 		iov_iter_zero(skipped, iter);
 	}
 
-	ret = -ENOBUFS;
+	ret = -ENOMEM;
 	ki = kzalloc(sizeof(struct cachefiles_kiocb), GFP_KERNEL);
 	if (!ki)
 		goto presubmission_error;
@@ -116,6 +132,8 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 	ki->iocb.ki_hint	= ki_hint_validate(file_write_hint(file));
 	ki->iocb.ki_ioprio	= get_current_ioprio();
 	ki->skipped		= skipped;
+	ki->object		= object;
+	ki->inval_counter	= object->cookie->inval_counter;
 	ki->term_func		= term_func;
 	ki->term_func_priv	= term_func_priv;
 	ki->was_async		= true;
@@ -124,6 +142,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 		ki->iocb.ki_complete = cachefiles_read_complete;
 
 	get_file(ki->iocb.ki_filp);
+	cachefiles_grab_object(object, cachefiles_obj_get_ioreq);
 
 	trace_cachefiles_read(object, file_inode(file), ki->iocb.ki_pos, len - skipped);
 	old_nofs = memalloc_nofs_save();
@@ -177,7 +196,6 @@ static void cachefiles_write_complete(struct kiocb *iocb, long ret, long ret2)
 
 	if (ki->term_func)
 		ki->term_func(ki->term_func_priv, ret, ki->was_async);
-
 	cachefiles_put_kiocb(ki);
 }
 
@@ -190,18 +208,25 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 			    netfs_io_terminated_t term_func,
 			    void *term_func_priv)
 {
-	struct cachefiles_object *object = cres->cache_priv;
+	struct cachefiles_object *object;
 	struct cachefiles_kiocb *ki;
 	struct inode *inode;
-	struct file *file = cachefiles_cres_file(cres);
+	struct file *file;
 	unsigned int old_nofs;
 	ssize_t ret = -ENOBUFS;
 	size_t len = iov_iter_count(iter);
 
+	if (!fscache_wait_for_operation(cres, FSCACHE_WANT_WRITE))
+		goto presubmission_error;
+	fscache_count_write();
+	object = cachefiles_cres_object(cres);
+	file = cachefiles_cres_file(cres);
+
 	_enter("%pD,%li,%llx,%zx/%llx",
 	       file, file_inode(file)->i_ino, start_pos, len,
 	       i_size_read(file_inode(file)));
 
+	ret = -ENOMEM;
 	ki = kzalloc(sizeof(struct cachefiles_kiocb), GFP_KERNEL);
 	if (!ki)
 		goto presubmission_error;
@@ -212,6 +237,8 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 	ki->iocb.ki_flags	= IOCB_DIRECT | IOCB_WRITE;
 	ki->iocb.ki_hint	= ki_hint_validate(file_write_hint(file));
 	ki->iocb.ki_ioprio	= get_current_ioprio();
+	ki->object		= object;
+	ki->inval_counter	= object->cookie->inval_counter;
 	ki->start		= start_pos;
 	ki->len			= len;
 	ki->term_func		= term_func;
@@ -231,6 +258,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 	__sb_writers_release(inode->i_sb, SB_FREEZE_WRITE);
 
 	get_file(ki->iocb.ki_filp);
+	cachefiles_grab_object(object, cachefiles_obj_get_ioreq);
 
 	trace_cachefiles_write(object, inode, ki->iocb.ki_pos, len);
 	old_nofs = memalloc_nofs_save();
@@ -264,8 +292,8 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 
 presubmission_error:
 	if (term_func)
-		term_func(term_func_priv, -ENOMEM, false);
-	return -ENOMEM;
+		term_func(term_func_priv, ret, false);
+	return ret;
 }
 
 /*
@@ -275,33 +303,40 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subrequest *subreq,
 						      loff_t i_size)
 {
-#if 0
-	struct fscache_operation *op = subreq->rreq->cache_resources.cache_priv;
+	struct netfs_read_request *rreq = subreq->rreq;
+	struct netfs_cache_resources *cres = &rreq->cache_resources;
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
+	struct fscache_cookie *cookie = fscache_cres_cookie(cres);
 	const struct cred *saved_cred;
-	struct file *file = subreq->rreq->cache_resources.cache_priv2;
+	struct file *file = cachefiles_cres_file(cres);
 	enum netfs_read_source ret = NETFS_DOWNLOAD_FROM_SERVER;
 	loff_t off, to;
 
 	_enter("%zx @%llx/%llx", subreq->len, subreq->start, i_size);
 
-	object = container_of(op->object, struct cachefiles_object, fscache);
-	cache = container_of(object->fscache.cache,
-			     struct cachefiles_cache, cache);
-
-	cachefiles_begin_secure(cache, &saved_cred);
-
 	if (subreq->start >= i_size) {
 		ret = NETFS_FILL_WITH_ZEROES;
-		goto out;
+		goto out_no_object;
 	}
 
-	if (!file)
-		goto out;
+	if (test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags)) {
+		__set_bit(NETFS_SREQ_WRITE_TO_CACHE, &subreq->flags);
+		goto out_no_object;
+	}
 
-	if (test_bit(FSCACHE_COOKIE_NO_DATA_YET, &object->fscache.cookie->flags))
-		goto download_and_store;
+	/* The object and the file may be being created in the background. */
+	if (!file) {
+		if (!fscache_wait_for_operation(cres, FSCACHE_WANT_READ))
+			goto out_no_object;
+		file = cachefiles_cres_file(cres);
+		if (!file)
+			goto out_no_object;
+	}
+
+	object = cachefiles_cres_object(cres);
+	cache = object->volume->cache;
+	cachefiles_begin_secure(cache, &saved_cred);
 
 	off = vfs_llseek(file, subreq->start, SEEK_DATA);
 	if (off < 0 && off >= (loff_t)-MAX_ERRNO) {
@@ -339,10 +374,8 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 		__set_bit(NETFS_SREQ_WRITE_TO_CACHE, &subreq->flags);
 out:
 	cachefiles_end_secure(cache, saved_cred);
+out_no_object:
 	return ret;
-#endif
-	return subreq->start >= i_size ?
-		NETFS_FILL_WITH_ZEROES : NETFS_DOWNLOAD_FROM_SERVER;
 }
 
 /*
@@ -367,19 +400,12 @@ static int cachefiles_prepare_write(struct netfs_cache_resources *cres,
 static int cachefiles_prepare_fallback_write(struct netfs_cache_resources *cres,
 					     pgoff_t index)
 {
-#if 0
-	struct fscache_operation *op = cres->cache_priv;
-	struct cachefiles_object *object;
-	struct cachefiles_cache *cache;
+	struct cachefiles_object *object = cachefiles_cres_object(cres);
+	struct cachefiles_cache *cache = object->volume->cache;
 
 	_enter("%lx", index);
 
-	object = container_of(op->object, struct cachefiles_object, fscache);
-	cache = container_of(object->fscache.cache,
-			     struct cachefiles_cache, cache);
 	return cachefiles_has_space(cache, 0, 1);
-#endif
-	return -ENOBUFS;
 }
 
 /*
@@ -387,20 +413,11 @@ static int cachefiles_prepare_fallback_write(struct netfs_cache_resources *cres,
  */
 static void cachefiles_end_operation(struct netfs_cache_resources *cres)
 {
-#if 0
-	struct fscache_operation *op = cres->cache_priv;
 	struct file *file = cachefiles_cres_file(cres);
 
-	_enter("");
-
 	if (file)
 		fput(file);
-	if (op) {
-		fscache_op_complete(op, false);
-		fscache_put_operation(op);
-	}
-	_leave("");
-#endif
+	fscache_end_cookie_access(fscache_cres_cookie(cres), fscache_access_io_end);
 }
 
 static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
@@ -415,20 +432,25 @@ static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
 /*
  * Open the cache file when beginning a cache operation.
  */
-int cachefiles_begin_operation(struct netfs_cache_resources *cres)
+bool cachefiles_begin_operation(struct netfs_cache_resources *cres,
+				enum fscache_want_stage want_stage)
 {
-#if 0
-	struct cachefiles_object *object = op->object;
+	struct cachefiles_object *object = cachefiles_cres_object(cres);
+
+	if (!cachefiles_cres_file(cres)) {
+		cres->ops = &cachefiles_netfs_cache_ops;
+		if (object) {
+			spin_lock(&object->lock);
+			if (!cres->cache_priv2 && object->file)
+				cres->cache_priv2 = get_file(object->file);
+			spin_unlock(&object->lock);
+		}
+	}
 
-	_enter("");
+	if (!cachefiles_cres_file(cres) && want_stage != FSCACHE_WANT_PARAMS) {
+		pr_err("failed to get cres->file\n");
+		return false;
+	}
 
-	cres->cache_priv	= object;
-	cres->cache_priv2	= get_file(object->file);
-	cres->ops		= &cachefiles_netfs_cache_ops;
-	cres->debug_id		= object->cookie->debug_id;
-	_leave("");
-	return 0;
-#endif
-	cres->ops = &cachefiles_netfs_cache_ops;
-	return -EIO;
+	return true;
 }
diff --git a/fs/cachefiles/key.c b/fs/cachefiles/key.c
index ccadbc4776f1..635166d2d7a9 100644
--- a/fs/cachefiles/key.c
+++ b/fs/cachefiles/key.c
@@ -22,6 +22,11 @@ static const char cachefiles_filecharmap[256] = {
 	[48 ... 127] = 1,		/* '0' -> '~' */
 };
 
+static inline unsigned int how_many_hex_digits(unsigned int x)
+{
+	return x ? round_up(ilog2(x) + 1, 4) / 4 : 0;
+}
+
 /*
  * turn the raw key into something cooked
  * - the key may be up to NAME_MAX in length (including the length word)
@@ -31,21 +36,20 @@ static const char cachefiles_filecharmap[256] = {
  */
 bool cachefiles_cook_key(struct cachefiles_object *object)
 {
-	const u8 *key = fscache_get_key(object->cookie);
-	unsigned int acc, sum, keylen = object->cookie->key_len;
-	char *name;
-	u8 *buffer, *p;
-	int i, len, elem3, print;
-	u8 type;
+	const u8 *key = fscache_get_key(object->cookie), *kend;
+	unsigned char sum, ch;
+	unsigned int acc, i, n, nle, nbe, keylen = object->cookie->key_len;
+	unsigned int b64len, len, print, pad;
+	char *name, sep;
 
-	_enter(",%d", keylen);
+	_enter(",%u,%*phN", keylen, keylen, key);
 
 	BUG_ON(keylen > NAME_MAX - 3);
 
 	sum = 0;
 	print = 1;
 	for (i = 0; i < keylen; i++) {
-		u8 ch = key[i];
+		ch = key[i];
 		sum += ch;
 		print &= cachefiles_filecharmap[ch];
 	}
@@ -53,63 +57,72 @@ bool cachefiles_cook_key(struct cachefiles_object *object)
 
 	/* If the path is usable ASCII, then we render it directly */
 	if (print) {
-		name = kmalloc(3 + keylen + 1, cachefiles_gfp);
+		len = 1 + keylen + 1;
+		name = kmalloc(len, cachefiles_gfp);
 		if (!name)
 			return false;
 
-		switch (object->cookie->type) {
-		case FSCACHE_COOKIE_TYPE_INDEX:		type = 'I';	break;
-		case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'D';	break;
-		default:				type = 'S';	break;
-		}
-
-		name[0] = type;
-		name[1] = cachefiles_charmap[(keylen >> 6) & 63];
-		name[2] = cachefiles_charmap[keylen & 63];
-
-		memcpy(name + 3, key, keylen);
-		name[3 + keylen] = 0;
-		object->d_name = name;
-		object->d_name_len = 3 + keylen;
+		name[0] = 'D'; /* Data object type, string encoding */
+		name[1 + keylen] = 0;
+		memcpy(name + 1, key, keylen);
 		goto success;
 	}
 
-	/* Construct the key we actually want to render.  We stick the length
-	 * on the front and leave NULs on the back for the encoder to overread.
+	/* See if it makes sense to encode it as "hex,hex,hex" for each 32-bit
+	 * chunk.  We rely on the key having been padded out to a whole number
+	 * of 32-bit words.
 	 */
-	buffer = kmalloc(2 + keylen + 3, cachefiles_gfp);
-	if (!buffer)
-		return false;
-
-	memcpy(buffer + 2, key, keylen);
-
-	*(uint16_t *)buffer = keylen;
-	((char *)buffer)[keylen + 2] = 0;
-	((char *)buffer)[keylen + 3] = 0;
-	((char *)buffer)[keylen + 4] = 0;
-
-	elem3 = DIV_ROUND_UP(2 + keylen, 3); /* Count of 3-byte elements */
-	len = elem3 * 4;
-
-	name = kmalloc(1 + len + 1, cachefiles_gfp);
-	if (!name) {
-		kfree(buffer);
-		return false;
+	n = round_up(keylen, 4);
+	nbe = nle = 0;
+	for (i = 0; i < n; i += 4) {
+		u32 be = be32_to_cpu(*(__be32 *)(key + i));
+		u32 le = le32_to_cpu(*(__le32 *)(key + i));
+
+		nbe += 1 + how_many_hex_digits(be);
+		nle += 1 + how_many_hex_digits(le);
 	}
 
-	switch (object->cookie->type) {
-	case FSCACHE_COOKIE_TYPE_INDEX:		type = 'J';	break;
-	case FSCACHE_COOKIE_TYPE_DATAFILE:	type = 'E';	break;
-	default:				type = 'T';	break;
+	b64len = DIV_ROUND_UP(keylen, 3);
+	pad = b64len * 3 - keylen;
+	b64len = 2 + b64len * 4; /* Length if we base64-encode it */
+	_debug("len=%u nbe=%u nle=%u b64=%u", keylen, nbe, nle, b64len);
+	if (nbe < b64len || nle < b64len) {
+		unsigned int nlen = min(nbe, nle) + 1;
+		name = kmalloc(nlen, cachefiles_gfp);
+		if (!name)
+			return false;
+		sep = (nbe <= nle) ? 'S' : 'T'; /* Encoding indicator */
+		len = 0;
+		for (i = 0; i < n; i += 4) {
+			u32 x;
+			if (nbe <= nle)
+				x = be32_to_cpu(*(__be32 *)(key + i));
+			else
+				x = le32_to_cpu(*(__le32 *)(key + i));
+			name[len++] = sep;
+			if (x != 0)
+				len += snprintf(name + len, nlen - len, "%x", x);
+			sep = ',';
+		}
+		goto success;
 	}
 
-	name[0] = type;
-	len = 1;
-	p = buffer;
-	for (i = 0; i < elem3; i++) {
-		acc = *p++;
-		acc |= *p++ << 8;
-		acc |= *p++ << 16;
+	/* We need to base64-encode it */
+	name = kmalloc(b64len + 1, cachefiles_gfp);
+	if (!name)
+		return false;
+
+	name[0] = 'E';
+	name[1] = '0' + pad;
+	len = 2;
+	kend = key + keylen;
+	do {
+		acc  = *key++;
+		if (key < kend) {
+			acc |= *key++ << 8;
+			if (key < kend)
+				acc |= *key++ << 16;
+		}
 
 		name[len++] = cachefiles_charmap[acc & 63];
 		acc >>= 6;
@@ -118,13 +131,12 @@ bool cachefiles_cook_key(struct cachefiles_object *object)
 		name[len++] = cachefiles_charmap[acc & 63];
 		acc >>= 6;
 		name[len++] = cachefiles_charmap[acc & 63];
-	}
+	} while (key < kend);
 
+success:
 	name[len] = 0;
 	object->d_name = name;
 	object->d_name_len = len;
-	kfree(buffer);
-success:
 	_leave(" = %s", object->d_name);
 	return true;
 }
diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
index d3115106b22b..dc7731812b98 100644
--- a/fs/cachefiles/main.c
+++ b/fs/cachefiles/main.c
@@ -37,13 +37,6 @@ static struct miscdevice cachefiles_dev = {
 	.fops	= &cachefiles_daemon_fops,
 };
 
-static void cachefiles_object_init_once(void *_object)
-{
-	struct cachefiles_object *object = _object;
-
-	memset(object, 0, sizeof(*object));
-}
-
 /*
  * initialise the fs caching module
  */
@@ -60,9 +53,7 @@ static int __init cachefiles_init(void)
 	cachefiles_object_jar =
 		kmem_cache_create("cachefiles_object_jar",
 				  sizeof(struct cachefiles_object),
-				  0,
-				  SLAB_HWCACHE_ALIGN,
-				  cachefiles_object_init_once);
+				  0, SLAB_HWCACHE_ALIGN, NULL);
 	if (!cachefiles_object_jar) {
 		pr_notice("Failed to allocate an object jar\n");
 		goto error_object_jar;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index cb08be5fb28e..f7e73aba9104 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -18,8 +18,6 @@
 #include <linux/slab.h>
 #include "internal.h"
 
-#define CACHEFILES_KEYBUF_SIZE 512
-
 /*
  * Mark the backing file as being a cache file if it's not already in use so.
  */
@@ -51,6 +49,9 @@ void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
 {
 	struct inode *inode = file_inode(object->file);
 
+	if (!inode)
+		return;
+
 	inode_lock(inode);
 	inode->i_flags &= ~S_KERNEL_FILE;
 	inode_unlock(inode);
@@ -60,9 +61,9 @@ void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
 /*
  * Mark an object as being inactive.
  */
-static void cachefiles_mark_object_inactive(struct cachefiles_cache *cache,
-					    struct cachefiles_object *object)
+static void cachefiles_mark_object_inactive(struct cachefiles_object *object)
 {
+	struct cachefiles_cache *cache = object->volume->cache;
 	blkcnt_t i_blocks = file_inode(object->file)->i_blocks;
 
 	/* This object can now be culled, so we need to let the daemon know
@@ -78,7 +79,6 @@ static void cachefiles_mark_object_inactive(struct cachefiles_cache *cache,
  * - file backed objects are unlinked
  * - directory backed objects are stuffed into the graveyard for userspace to
  *   delete
- * - unlocks the directory mutex
  */
 static int cachefiles_bury_object(struct cachefiles_cache *cache,
 				  struct cachefiles_object *object,
@@ -93,6 +93,12 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 
 	_enter(",'%pd','%pd'", dir, rep);
 
+	if (rep->d_parent != dir) {
+		inode_unlock(d_inode(dir));
+		_leave(" = -ESTALE");
+		return -ESTALE;
+	}
+
 	/* non-directories can just be unlinked */
 	if (!d_is_dir(rep)) {
 		_debug("unlink stale object");
@@ -229,45 +235,24 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 /*
  * delete an object representation from the cache
  */
-int cachefiles_delete_object(struct cachefiles_cache *cache,
-			     struct cachefiles_object *object)
+int cachefiles_delete_object(struct cachefiles_object *object,
+			     enum fscache_why_object_killed why)
 {
-	struct dentry *dentry = object->file->f_path.dentry, *dir;
-	int ret;
+	struct cachefiles_volume *volume = object->volume;
+	struct dentry *fan = volume->fanout[(u8)object->key_hash];
 
 	_enter(",OBJ%x{%pD}", object->debug_id, object->file);
 
-	ASSERT(d_backing_inode(dentry));
-	ASSERT(dentry->d_parent);
-
-	dir = dget_parent(dentry);
-
-	inode_lock_nested(d_backing_inode(dir), I_MUTEX_PARENT);
-
-	/* We need to check that our parent is _still_ our parent - it may have
-	 * been renamed.
-	 */
-	if (dir == dentry->d_parent) {
-		ret = cachefiles_bury_object(cache, object, dir, dentry,
-					     FSCACHE_OBJECT_WAS_RETIRED);
-	} else {
-		/* It got moved, presumably by cachefilesd culling it, so it's
-		 * no longer in the key path and we can ignore it.
-		 */
-		inode_unlock(d_backing_inode(dir));
-		ret = 0;
-	}
-
-	dput(dir);
-	_leave(" = %d", ret);
-	return ret;
+	inode_lock_nested(d_backing_inode(fan), I_MUTEX_PARENT);
+	return cachefiles_bury_object(volume->cache, object, fan,
+				      object->file->f_path.dentry, why);
 }
 
 /*
- * Check and open the terminal object.
+ * Check the attributes on a file we've just opened and delete it if it's out
+ * of date.
  */
-static int cachefiles_check_open_object(struct cachefiles_cache *cache,
-					struct cachefiles_object *object,
+static int cachefiles_check_open_object(struct cachefiles_object *object,
 					struct dentry *fan)
 {
 	int ret;
@@ -275,43 +260,32 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 	if (!cachefiles_mark_inode_in_use(object))
 		return -EBUSY;
 
-	/* if we've found that the terminal object exists, then we need to
-	 * check its attributes and delete it if it's out of date */
-	if (!object->new) {
-		_debug("validate '%pD'", object->file);
-
-		ret = cachefiles_check_auxdata(object);
-		if (ret == -ESTALE)
-			goto stale;
-		if (ret < 0)
-			goto error_unmark;
-	}
-
-	_debug("=== OBTAINED_OBJECT ===");
+	_enter("%pD", object->file);
 
-	if (object->new) {
-		/* attach data to a newly constructed terminal object */
-		ret = cachefiles_set_object_xattr(object);
-		if (ret < 0)
-			goto error_unmark;
-	} else {
-		/* always update the atime on an object we've just looked up
-		 * (this is used to keep track of culling, and atimes are only
-		 * updated by read, write and readdir but not lookup or
-		 * open) */
-		touch_atime(&object->file->f_path);
-	}
+	ret = cachefiles_check_auxdata(object);
+	if (ret == -ESTALE)
+		goto stale;
+	if (ret < 0)
+		goto error_unmark;
 
+	/* Always update the atime on an object we've just looked up (this is
+	 * used to keep track of culling, and atimes are only updated by read,
+	 * write and readdir but not lookup or open).
+	 */
+	touch_atime(&object->file->f_path);
 	return 0;
 
 stale:
+	set_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
+	fscache_cookie_lookup_negative(object->cookie);
 	cachefiles_unmark_inode_in_use(object);
 	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
-	ret = cachefiles_bury_object(cache, object, fan,
+	ret = cachefiles_bury_object(object->volume->cache, object, fan,
 				     object->file->f_path.dentry,
 				     FSCACHE_OBJECT_IS_STALE);
 	if (ret < 0)
 		return ret;
+	cachefiles_mark_object_inactive(object);
 	_debug("redo lookup");
 	return -ESTALE;
 
@@ -321,12 +295,12 @@ static int cachefiles_check_open_object(struct cachefiles_cache *cache,
 }
 
 /*
- * Walk to a file, creating it if necessary.
+ * Look up a file, creating it if necessary.
  */
-static int cachefiles_open_file(struct cachefiles_cache *cache,
-				struct cachefiles_object *object,
+static int cachefiles_open_file(struct cachefiles_object *object,
 				struct dentry *fan)
 {
+	struct cachefiles_cache *cache = object->volume->cache;
 	struct dentry *dentry;
 	struct inode *dinode = d_backing_inode(fan), *inode;
 	struct file *file;
@@ -345,16 +319,11 @@ static int cachefiles_open_file(struct cachefiles_cache *cache,
 	}
 
 	if (d_is_negative(dentry)) {
-		/* This element of the path doesn't exist, so we can release
-		 * any readers in the certain knowledge that there's nothing
-		 * for them to actually read */
-		fscache_object_lookup_negative(object);
+		fscache_cookie_lookup_negative(object->cookie);
 
 		ret = cachefiles_has_space(cache, 1, 0);
-		if (ret < 0) {
-			fscache_object_mark_killed(object, FSCACHE_OBJECT_NO_SPACE);
+		if (ret < 0)
 			goto error_dput;
-		}
 
 		fan_path.mnt = cache->mnt;
 		fan_path.dentry = fan;
@@ -368,6 +337,7 @@ static int cachefiles_open_file(struct cachefiles_cache *cache,
 
 		inode = d_backing_inode(dentry);
 		_debug("create -> %pd{ino=%lu}", dentry, inode->i_ino);
+		set_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
 
 	} else if (!d_is_reg(dentry)) {
 		inode = d_backing_inode(dentry);
@@ -415,81 +385,36 @@ static int cachefiles_open_file(struct cachefiles_cache *cache,
 	return ret;
 }
 
-/*
- * Walk over the fanout directory.
- */
-static struct dentry *cachefiles_walk_over_fanout(struct cachefiles_object *object,
-						  struct cachefiles_cache *cache,
-						  struct dentry *dir)
-{
-	char name[4];
-
-	snprintf(name, sizeof(name), "@%02x", object->key_hash);
-	return cachefiles_get_directory(cache, dir, name, object);
-}
-
 /*
  * walk from the parent object to the child object through the backing
  * filesystem, creating directories as we go
  */
-int cachefiles_walk_to_object(struct cachefiles_object *parent,
-			      struct cachefiles_object *object)
+bool cachefiles_walk_to_object(struct cachefiles_object *object)
 {
-	struct cachefiles_cache *cache;
+	struct cachefiles_volume *volume = object->cookie->volume->cache_priv;
 	struct dentry *fan;
 	int ret;
 
-	_enter("OBJ%x{%pD},OBJ%x,%s,",
-	       parent->debug_id, parent->file,
-	       object->debug_id, object->d_name);
-
-	cache = container_of(parent->cache, struct cachefiles_cache, cache);
-	ASSERT(parent->file);
+	_enter("OBJ%x,%s,", object->debug_id, object->d_name);
 
 lookup_again:
-	fan = cachefiles_walk_over_fanout(object, cache, parent->file->f_path.dentry);
-	if (IS_ERR(fan))
-		return PTR_ERR(fan);
-
-	/* Open path "parent/fanout/object". */
-	if (object->type == FSCACHE_COOKIE_TYPE_INDEX) {
-		struct dentry *dentry;
-		struct file *file;
-		struct path path;
-
-		dentry = cachefiles_get_directory(cache, fan, object->d_name,
-						  object);
-		if (IS_ERR(dentry)) {
-			dput(fan);
-			return PTR_ERR(dentry);
-		}
-		path.mnt = cache->mnt;
-		path.dentry = dentry;
-		file = open_with_fake_path(&path, O_RDONLY | O_DIRECTORY,
-					   d_backing_inode(dentry),
-					   cache->cache_cred);
-		dput(dentry);
-		if (IS_ERR(file)) {
-			dput(fan);
-			return PTR_ERR(file);
-		}
-		object->file = file;
+	/* Open path "cache/vol/fanout/file". */
+	fan = volume->fanout[(u8)object->key_hash];
+	ret = cachefiles_open_file(object, fan);
+	if (ret < 0)
+		goto lookup_error;
+
+	if (!test_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags)) {
+		ret = cachefiles_check_open_object(object, fan);
+		if (ret < 0)
+			goto check_error;
 	} else {
-		ret = cachefiles_open_file(cache, object, fan);
-		if (ret < 0) {
-			dput(fan);
-			return ret;
-		}
+		ret = -EBUSY;
+		if (!cachefiles_mark_inode_in_use(object))
+			goto check_error;
 	}
 
-	ret = cachefiles_check_open_object(cache, object, fan);
-	dput(fan);
-	fan = NULL;
-	if (ret < 0)
-		goto check_error;
-
-	object->new = false;
-	fscache_obtained_object(object);
+	clear_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
 	_leave(" = t [%lu]", file_inode(object->file)->i_ino);
 	return true;
 
@@ -498,11 +423,10 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
 	object->file = NULL;
 	if (ret == -ESTALE)
 		goto lookup_again;
+lookup_error:
 	if (ret == -EIO)
 		cachefiles_io_error_obj(object, "Lookup failed");
-	cachefiles_mark_object_inactive(cache, object);
-	_leave(" = error %d", ret);
-	return ret;
+	return false;
 }
 
 /*
@@ -510,8 +434,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
  */
 struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 					struct dentry *dir,
-					const char *dirname,
-					struct cachefiles_object *object)
+					const char *dirname)
 {
 	struct dentry *subdir;
 	struct path path;
@@ -535,12 +458,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 
 	/* we need to create the subdir if it doesn't exist yet */
 	if (d_is_negative(subdir)) {
-		/* This element of the path doesn't exist, so we can release
-		 * any readers in the certain knowledge that there's nothing
-		 * for them to actually read */
-		if (object)
-			fscache_object_lookup_negative(object);
-
 		ret = cachefiles_has_space(cache, 1, 0);
 		if (ret < 0)
 			goto mkdir_error;
@@ -564,8 +481,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 
 		_debug("mkdir -> %pd{ino=%lu}",
 		       subdir, d_backing_inode(subdir)->i_ino);
-		if (object)
-			object->new = true;
 	}
 
 	inode_unlock(d_inode(dir));
diff --git a/fs/cachefiles/volume.c b/fs/cachefiles/volume.c
new file mode 100644
index 000000000000..f5e527b56228
--- /dev/null
+++ b/fs/cachefiles/volume.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Volume handling.
+ *
+ * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include "internal.h"
+#include <trace/events/fscache.h>
+
+/*
+ * Allocate and set up a volume representation.  We make sure all the fanout
+ * directories are created and pinned.
+ */
+void cachefiles_acquire_volume(struct fscache_volume *vcookie)
+{
+	struct cachefiles_volume *volume;
+	struct cachefiles_cache *cache = vcookie->cache->cache_priv;
+	const struct cred *saved_cred;
+	struct dentry *vdentry, *fan;
+	size_t len;
+	char *name;
+	int n_accesses, i;
+
+	_enter("");
+
+	volume = kzalloc(sizeof(struct cachefiles_volume), GFP_KERNEL);
+	if (!volume)
+		return;
+	volume->vcookie = vcookie;
+	volume->cache = cache;
+	INIT_LIST_HEAD(&volume->cache_link);
+
+	cachefiles_begin_secure(cache, &saved_cred);
+
+	len = vcookie->key[0];
+	name = kmalloc(len + 3, GFP_NOFS);
+	if (!name)
+		goto error_vol;
+	name[0] = 'I';
+	memcpy(name + 1, vcookie->key + 1, len);
+	name[len + 1] = 0;
+
+	vdentry = cachefiles_get_directory(cache, cache->store, name);
+	if (IS_ERR(vdentry))
+		goto error_name;
+	volume->dentry = vdentry;
+
+	for (i = 0; i < 256; i++) {
+		sprintf(name, "@%02x", i);
+		fan = cachefiles_get_directory(cache, vdentry, name);
+		if (IS_ERR(fan))
+			goto error_fan;
+		volume->fanout[i] = fan;
+	}
+
+	cachefiles_end_secure(cache, saved_cred);
+
+	vcookie->cache_priv = volume;
+	n_accesses = atomic_inc_return(&vcookie->n_accesses); /* Stop wakeups on dec-to-0 */
+	trace_fscache_access_volume(vcookie->debug_id, refcount_read(&vcookie->ref),
+				    n_accesses, fscache_access_cache_pin);
+
+	spin_lock(&cache->object_list_lock);
+	list_add(&volume->cache_link, &volume->cache->volumes);
+	spin_unlock(&cache->object_list_lock);
+
+	kfree(name);
+	return;
+
+error_fan:
+	for (i = 0; i < 256; i++)
+		dput(volume->fanout[i]);
+	dput(volume->dentry);
+error_name:
+	kfree(name);
+error_vol:
+	kfree(volume);
+	cachefiles_end_secure(cache, saved_cred);
+}
+
+/*
+ * Release a volume representation.
+ */
+static void __cachefiles_free_volume(struct cachefiles_volume *volume)
+{
+	int i;
+
+	_enter("");
+
+	volume->vcookie->cache_priv = NULL;
+
+	for (i = 0; i < 256; i++)
+		dput(volume->fanout[i]);
+	dput(volume->dentry);
+	kfree(volume);
+}
+
+void cachefiles_free_volume(struct fscache_volume *vcookie)
+{
+	struct cachefiles_volume *volume = vcookie->cache_priv;
+
+	if (volume) {
+		spin_lock(&volume->cache->object_list_lock);
+		list_del_init(&volume->cache_link);
+		spin_unlock(&volume->cache->object_list_lock);
+		__cachefiles_free_volume(volume);
+	}
+}
+
+void cachefiles_withdraw_volume(struct cachefiles_volume *volume)
+{
+	struct fscache_volume *vcookie = volume->vcookie;
+	int n_accesses;
+
+	_debug("withdraw V=%x", vcookie->debug_id);
+
+	/* Allow wakeups on dec-to-0 */
+	n_accesses = atomic_dec_return(&vcookie->n_accesses);
+	trace_fscache_access_volume(vcookie->debug_id, refcount_read(&vcookie->ref),
+				    n_accesses, fscache_access_cache_unpin);
+
+	wait_var_event(&vcookie->n_accesses,
+		       atomic_read(&vcookie->n_accesses) == 0);
+	__cachefiles_free_volume(volume);
+}
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 82c822bb71af..b77bbb6c4a17 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -15,6 +15,8 @@
 #include <linux/slab.h>
 #include "internal.h"
 
+#define CACHEFILES_COOKIE_TYPE_DATA 1
+
 struct cachefiles_xattr {
 	uint8_t				type;
 	uint8_t				data[];
@@ -44,11 +46,10 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	if (!buf)
 		return -ENOMEM;
 
-	buf->type = object->cookie->type;
+	buf->type = CACHEFILES_COOKIE_TYPE_DATA;
 	if (len > 0)
 		memcpy(buf->data, fscache_get_aux(object->cookie), len);
 
-	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->cookie->flags);
 	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
 			   buf, sizeof(struct cachefiles_xattr) + len, 0);
 	if (ret < 0) {
@@ -95,7 +96,7 @@ int cachefiles_check_auxdata(struct cachefiles_object *object)
 				object,
 				"Failed to read aux with error %zd", xlen);
 		why = cachefiles_coherency_check_xattr;
-	} else if (buf->type != object->cookie->type) {
+	} else if (buf->type != CACHEFILES_COOKIE_TYPE_DATA) {
 		why = cachefiles_coherency_check_type;
 	} else if (memcmp(buf->data, p, len) != 0) {
 		why = cachefiles_coherency_check_aux;
diff --git a/fs/fscache/Makefile b/fs/fscache/Makefile
index 14dfdce1c045..afb090ea16c4 100644
--- a/fs/fscache/Makefile
+++ b/fs/fscache/Makefile
@@ -6,11 +6,9 @@
 fscache-y := \
 	cache.o \
 	cookie.o \
-	fsdef.o \
 	io.o \
 	main.o \
-	netfs.o \
-	object.o
+	volume.o
 
 fscache-$(CONFIG_PROC_FS) += proc.o
 fscache-$(CONFIG_FSCACHE_STATS) += stats.o
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index 8a3191a89c32..45bb38f5cf1c 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -1,198 +1,181 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /* FS-Cache cache handling
  *
- * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2007, 2021 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  */
 
 #define FSCACHE_DEBUG_LEVEL CACHE
-#include <linux/module.h>
+#include <linux/export.h>
 #include <linux/slab.h>
 #include "internal.h"
 
-LIST_HEAD(fscache_cache_list);
+static LIST_HEAD(fscache_caches);
 DECLARE_RWSEM(fscache_addremove_sem);
-DECLARE_WAIT_QUEUE_HEAD(fscache_cache_cleared_wq);
-EXPORT_SYMBOL(fscache_cache_cleared_wq);
+EXPORT_SYMBOL(fscache_addremove_sem);
 
-static LIST_HEAD(fscache_cache_tag_list);
+static atomic_t fscache_cache_debug_id;
 
 /*
- * look up a cache tag
+ * Allocate a cache cookie.
  */
-struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
+static struct fscache_cache *fscache_alloc_cache(const char *name)
 {
-	struct fscache_cache_tag *tag, *xtag;
-
-	/* firstly check for the existence of the tag under read lock */
-	down_read(&fscache_addremove_sem);
-
-	list_for_each_entry(tag, &fscache_cache_tag_list, link) {
-		if (strcmp(tag->name, name) == 0) {
-			atomic_inc(&tag->usage);
-			refcount_inc(&tag->ref);
-			up_read(&fscache_addremove_sem);
-			return tag;
-		}
-	}
-
-	up_read(&fscache_addremove_sem);
-
-	/* the tag does not exist - create a candidate */
-	xtag = kzalloc(sizeof(*xtag) + strlen(name) + 1, GFP_KERNEL);
-	if (!xtag)
-		/* return a dummy tag if out of memory */
-		return ERR_PTR(-ENOMEM);
-
-	atomic_set(&xtag->usage, 1);
-	refcount_set(&xtag->ref, 1);
-	strcpy(xtag->name, name);
-
-	/* write lock, search again and add if still not present */
-	down_write(&fscache_addremove_sem);
+	struct fscache_cache *cache;
 
-	list_for_each_entry(tag, &fscache_cache_tag_list, link) {
-		if (strcmp(tag->name, name) == 0) {
-			atomic_inc(&tag->usage);
-			refcount_inc(&tag->ref);
-			up_write(&fscache_addremove_sem);
-			kfree(xtag);
-			return tag;
+	cache = kzalloc(sizeof(*cache), GFP_KERNEL);
+	if (cache) {
+		if (name) {
+			cache->name = kstrdup(name, GFP_KERNEL);
+			if (!cache->name) {
+				kfree(cache);
+				return NULL;
+			}
 		}
+		refcount_set(&cache->ref, 1);
+		INIT_LIST_HEAD(&cache->cache_link);
+		cache->debug_id = atomic_inc_return(&fscache_cache_debug_id);
 	}
-
-	list_add_tail(&xtag->link, &fscache_cache_tag_list);
-	up_write(&fscache_addremove_sem);
-	return xtag;
+	return cache;
 }
 
-/*
- * Unuse a cache tag
- */
-void __fscache_release_cache_tag(struct fscache_cache_tag *tag)
+static bool fscache_get_cache_maybe(struct fscache_cache *cache,
+				    enum fscache_cache_trace where)
 {
-	if (tag != ERR_PTR(-ENOMEM)) {
-		down_write(&fscache_addremove_sem);
-
-		if (atomic_dec_and_test(&tag->usage))
-			list_del_init(&tag->link);
-		else
-			tag = NULL;
+	bool success;
+	int ref;
 
-		up_write(&fscache_addremove_sem);
-		fscache_put_cache_tag(tag);
-	}
+	success = __refcount_inc_not_zero(&cache->ref, &ref);
+	if (success)
+		trace_fscache_cache(cache->debug_id, ref + 1, where);
+	return success;
 }
 
 /*
- * select a cache in which to store an object
- * - the cache addremove semaphore must be at least read-locked by the caller
- * - the object will never be an index
+ * Look up a cache cookie.
  */
-struct fscache_cache *fscache_select_cache_for_object(
-	struct fscache_cookie *cookie)
+struct fscache_cache *fscache_lookup_cache(const char *name, bool is_cache)
 {
-	struct fscache_cache_tag *tag;
-	struct cachefiles_object *object;
-	struct fscache_cache *cache;
+	struct fscache_cache *candidate, *cache, *unnamed = NULL;
 
-	_enter("");
+	/* firstly check for the existence of the cache under read lock */
+	down_read(&fscache_addremove_sem);
 
-	if (list_empty(&fscache_cache_list)) {
-		_leave(" = NULL [no cache]");
-		return NULL;
+	list_for_each_entry(cache, &fscache_caches, cache_link) {
+		if (cache->name && name && strcmp(cache->name, name) == 0 &&
+		    fscache_get_cache_maybe(cache, fscache_cache_get_acquire))
+			goto got_cache_r;
+		if (!cache->name && !name &&
+		    fscache_get_cache_maybe(cache, fscache_cache_get_acquire))
+			goto got_cache_r;
 	}
 
-	/* we check the parent to determine the cache to use */
-	spin_lock(&cookie->lock);
+	if (!name) {
+		list_for_each_entry(cache, &fscache_caches, cache_link) {
+			if (cache->name &&
+			    fscache_get_cache_maybe(cache, fscache_cache_get_acquire))
+				goto got_cache_r;
+		}
+	}
 
-	/* the first in the parent's backing list should be the preferred
-	 * cache */
-	if (!hlist_empty(&cookie->backing_objects)) {
-		object = hlist_entry(cookie->backing_objects.first,
-				     struct cachefiles_object, cookie_link);
+	up_read(&fscache_addremove_sem);
 
-		cache = object->cache;
-		if (fscache_object_is_dying(object) ||
-		    test_bit(FSCACHE_IOERROR, &cache->flags))
-			cache = NULL;
+	/* the cache does not exist - create a candidate */
+	candidate = fscache_alloc_cache(name);
+	if (!candidate)
+		return ERR_PTR(-ENOMEM);
 
-		spin_unlock(&cookie->lock);
-		_leave(" = %s [parent]", cache ? cache->tag->name : "NULL");
-		return cache;
-	}
+	/* write lock, search again and add if still not present */
+	down_write(&fscache_addremove_sem);
 
-	/* the parent is unbacked */
-	if (cookie->type != FSCACHE_COOKIE_TYPE_INDEX) {
-		/* cookie not an index and is unbacked */
-		spin_unlock(&cookie->lock);
-		_leave(" = NULL [cookie ub,ni]");
-		return NULL;
+	list_for_each_entry(cache, &fscache_caches, cache_link) {
+		if (cache->name && name && strcmp(cache->name, name) == 0 &&
+		    fscache_get_cache_maybe(cache, fscache_cache_get_acquire))
+			goto got_cache_w;
+		if (!cache->name) {
+			unnamed = cache;
+			if (!name &&
+			    fscache_get_cache_maybe(cache, fscache_cache_get_acquire))
+				goto got_cache_w;
+		}
 	}
 
-	spin_unlock(&cookie->lock);
-
-	tag = cookie->preferred_cache;
-	if (!tag)
-		goto no_preference;
+	if (unnamed && is_cache &&
+	    fscache_get_cache_maybe(unnamed, fscache_cache_get_acquire))
+		goto use_unnamed_cache;
 
-	if (!tag->cache) {
-		_leave(" = NULL [unbacked tag]");
-		return NULL;
+	if (!name) {
+		list_for_each_entry(cache, &fscache_caches, cache_link) {
+			if (cache->name &&
+			    fscache_get_cache_maybe(cache, fscache_cache_get_acquire))
+				goto got_cache_w;
+		}
 	}
 
-	if (test_bit(FSCACHE_IOERROR, &tag->cache->flags))
-		return NULL;
-
-	_leave(" = %s [specific]", tag->name);
-	return tag->cache;
+	list_add_tail(&candidate->cache_link, &fscache_caches);
+	trace_fscache_cache(candidate->debug_id,
+			    refcount_read(&candidate->ref),
+			    fscache_cache_new_acquire);
+	up_write(&fscache_addremove_sem);
+	return candidate;
 
-no_preference:
-	/* netfs has no preference - just select first cache */
-	cache = list_entry(fscache_cache_list.next,
-			   struct fscache_cache, link);
-	_leave(" = %s [first]", cache->tag->name);
+got_cache_r:
+	up_read(&fscache_addremove_sem);
+	return cache;
+use_unnamed_cache:
+	cache = unnamed;
+	cache->name = candidate->name;
+	candidate->name = NULL;
+got_cache_w:
+	up_write(&fscache_addremove_sem);
+	kfree(candidate->name);
+	kfree(candidate);
 	return cache;
 }
 
 /**
- * fscache_init_cache - Initialise a cache record
- * @cache: The cache record to be initialised
- * @ops: The cache operations to be installed in that record
- * @idfmt: Format string to define identifier
- * @...: sprintf-style arguments
- *
- * Initialise a record of a cache and fill in the name.
+ * fscache_acquire_cache - Acquire a cache record for a cache.
+ * @name: The name of the cache.
  *
- * See Documentation/filesystems/caching/backend-api.rst for a complete
- * description.
+ * Get a cache record for a cache.  If there is a nameless cache record
+ * available, this will acquire that and set its name, directing all the
+ * volumes using it to this cache.
  */
-void fscache_init_cache(struct fscache_cache *cache,
-			const struct fscache_cache_ops *ops,
-			const char *idfmt,
-			...)
+struct fscache_cache *fscache_acquire_cache(const char *name)
 {
-	va_list va;
+	ASSERT(name);
+	return fscache_lookup_cache(name, true);
+}
+EXPORT_SYMBOL(fscache_acquire_cache);
 
-	memset(cache, 0, sizeof(*cache));
+void fscache_put_cache(struct fscache_cache *cache,
+		       enum fscache_cache_trace where)
+{
+	unsigned int debug_id = cache->debug_id;
+	bool zero;
+	int ref;
 
-	cache->ops = ops;
+	if (IS_ERR_OR_NULL(cache))
+		return;
 
-	va_start(va, idfmt);
-	vsnprintf(cache->identifier, sizeof(cache->identifier), idfmt, va);
-	va_end(va);
+	zero = __refcount_dec_and_test(&cache->ref, &ref);
+	trace_fscache_cache(debug_id, ref - 1, where);
 
-	INIT_LIST_HEAD(&cache->link);
-	INIT_LIST_HEAD(&cache->object_list);
-	spin_lock_init(&cache->object_list_lock);
+	if (zero) {
+		down_write(&fscache_addremove_sem);
+		list_del_init(&cache->cache_link);
+		up_write(&fscache_addremove_sem);
+		kfree(cache->name);
+		kfree(cache);
+	}
 }
-EXPORT_SYMBOL(fscache_init_cache);
+EXPORT_SYMBOL(fscache_put_cache);
 
 /**
  * fscache_add_cache - Declare a cache as being open for business
  * @cache: The record describing the cache
- * @ifsdef: The record of the cache object describing the top-level index
- * @tagname: The tag describing this cache
+ * @ops: Table of cache operations to use
+ * @cache_priv: Private data for the cache record
  *
  * Add a cache to the system, making it available for netfs's to use.
  *
@@ -200,93 +183,72 @@ EXPORT_SYMBOL(fscache_init_cache);
  * description.
  */
 int fscache_add_cache(struct fscache_cache *cache,
-		      struct cachefiles_object *ifsdef,
-		      const char *tagname)
+		      const struct fscache_cache_ops *ops,
+		      void *cache_priv)
 {
-	struct fscache_cache_tag *tag;
-
-	ASSERTCMP(ifsdef->cookie, ==, &fscache_fsdef_index);
-	BUG_ON(!cache->ops);
-	BUG_ON(!ifsdef);
-
-	cache->flags = 0;
-	ifsdef->event_mask =
-		((1 << NR_FSCACHE_OBJECT_EVENTS) - 1) &
-		~(1 << FSCACHE_OBJECT_EV_CLEARED);
-	__set_bit(FSCACHE_OBJECT_IS_AVAILABLE, &ifsdef->flags);
-
-	if (!tagname)
-		tagname = cache->identifier;
-
-	BUG_ON(!tagname[0]);
-
-	_enter("{%s.%s},,%s", cache->ops->name, cache->identifier, tagname);
-
-	/* we use the cache tag to uniquely identify caches */
-	tag = __fscache_lookup_cache_tag(tagname);
-	if (IS_ERR(tag))
-		goto nomem;
+	int n_accesses;
 
-	if (test_and_set_bit(FSCACHE_TAG_RESERVED, &tag->flags))
-		goto tag_in_use;
+	_enter("{%s,%s}", ops->name, cache->name);
 
-	cache->kobj = kobject_create_and_add(tagname, fscache_root);
-	if (!cache->kobj)
-		goto error;
+	BUG_ON(fscache_cache_state(cache) != FSCACHE_CACHE_IS_PREPARING);
 
-	ifsdef->cache = cache;
-	cache->fsdef = ifsdef;
+	/* Get a ref on the cache cookie and keep its n_accesses counter raised
+	 * by 1 to prevent wakeups from transitioning it to 0 until we're
+	 * withdrawing caching services from it.
+	 */
+	n_accesses = atomic_inc_return(&cache->n_accesses);
+	trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref),
+				   n_accesses, fscache_access_cache_pin);
 
 	down_write(&fscache_addremove_sem);
 
-	tag->cache = cache;
-	cache->tag = tag;
-
-	/* add the cache to the list */
-	list_add(&cache->link, &fscache_cache_list);
-
-	/* add the cache's netfs definition index object to the cache's
-	 * list */
-	spin_lock(&cache->object_list_lock);
-	list_add_tail(&ifsdef->cache_link, &cache->object_list);
-	spin_unlock(&cache->object_list_lock);
-
-	/* add the cache's netfs definition index object to the top level index
-	 * cookie as a known backing object */
-	spin_lock(&fscache_fsdef_index.lock);
-
-	hlist_add_head(&ifsdef->cookie_link,
-		       &fscache_fsdef_index.backing_objects);
-
-	refcount_inc(&fscache_fsdef_index.ref);
+	cache->ops = ops;
+	cache->cache_priv = cache_priv;
+	fscache_set_cache_state(cache, FSCACHE_CACHE_IS_ACTIVE);
 
-	/* done */
-	spin_unlock(&fscache_fsdef_index.lock);
 	up_write(&fscache_addremove_sem);
-
-	pr_notice("Cache \"%s\" added (type %s)\n",
-		  cache->tag->name, cache->ops->name);
-	kobject_uevent(cache->kobj, KOBJ_ADD);
-
-	_leave(" = 0 [%s]", cache->identifier);
+	pr_notice("Cache \"%s\" added (type %s)\n", cache->name, ops->name);
+	_leave(" = 0 [%s]", cache->name);
 	return 0;
+}
+EXPORT_SYMBOL(fscache_add_cache);
 
-tag_in_use:
-	pr_err("Cache tag '%s' already in use\n", tagname);
-	__fscache_release_cache_tag(tag);
-	_leave(" = -EXIST");
-	return -EEXIST;
-
-error:
-	__fscache_release_cache_tag(tag);
-	_leave(" = -EINVAL");
-	return -EINVAL;
+/*
+ * Get an increment on a cache's access counter if the cache is live to prevent
+ * it from going away whilst we're accessing it.
+ */
+bool fscache_begin_cache_access(struct fscache_cache *cache, enum fscache_access_trace why)
+{
+	int n_accesses;
+
+	if (!fscache_cache_is_live(cache))
+		return false;
+
+	n_accesses = atomic_inc_return(&cache->n_accesses);
+	smp_mb__after_atomic(); /* Reread live flag after n_accesses */
+	trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref),
+				   n_accesses, why);
+	if (!fscache_cache_is_live(cache)) {
+		fscache_end_cache_access(cache, fscache_access_unlive);
+		return false;
+	}
+	return true;
+}
 
-nomem:
-	_leave(" = -ENOMEM");
-	return -ENOMEM;
+/*
+ * Drop an increment on a cache's access counter.
+ */
+void fscache_end_cache_access(struct fscache_cache *cache, enum fscache_access_trace why)
+{
+	int n_accesses;
+
+	smp_mb__before_atomic();
+	n_accesses = atomic_dec_return(&cache->n_accesses);
+	trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref),
+				   n_accesses, why);
+	if (n_accesses == 0)
+		wake_up_var(&cache->n_accesses);
 }
-EXPORT_SYMBOL(fscache_add_cache);
 
 /**
  * fscache_io_error - Note a cache I/O error
@@ -300,100 +262,92 @@ EXPORT_SYMBOL(fscache_add_cache);
  */
 void fscache_io_error(struct fscache_cache *cache)
 {
-	if (!test_and_set_bit(FSCACHE_IOERROR, &cache->flags))
+	if (fscache_set_cache_state_maybe(cache,
+					  FSCACHE_CACHE_IS_ACTIVE,
+					  FSCACHE_CACHE_GOT_IOERROR))
 		pr_err("Cache '%s' stopped due to I/O error\n",
-		       cache->ops->name);
+		       cache->name);
 }
 EXPORT_SYMBOL(fscache_io_error);
 
-/*
- * request withdrawal of all the objects in a cache
- * - all the objects being withdrawn are moved onto the supplied list
+/**
+ * fscache_withdraw_cache - Withdraw a cache from the active service
+ * @cache: The cache cookie
+ *
+ * Begin the process of withdrawing a cache from service.
  */
-static void fscache_withdraw_all_objects(struct fscache_cache *cache,
-					 struct list_head *dying_objects)
+void fscache_withdraw_cache(struct fscache_cache *cache)
 {
-	struct cachefiles_object *object;
-
-	while (!list_empty(&cache->object_list)) {
-		spin_lock(&cache->object_list_lock);
+	int n_accesses;
 
-		if (!list_empty(&cache->object_list)) {
-			object = list_entry(cache->object_list.next,
-					    struct cachefiles_object, cache_link);
-			list_move_tail(&object->cache_link, dying_objects);
+	pr_notice("Withdrawing cache \"%s\" (%u objs)\n",
+		  cache->name, atomic_read(&cache->object_count));
 
-			_debug("withdraw %x", object->cookie->debug_id);
+	fscache_set_cache_state(cache, FSCACHE_CACHE_IS_WITHDRAWN);
 
-			/* This must be done under object_list_lock to prevent
-			 * a race with fscache_drop_object().
-			 */
-			fscache_raise_event(object, FSCACHE_OBJECT_EV_KILL);
-		}
+	/* Allow wakeups on dec-to-0 */
+	n_accesses = atomic_dec_return(&cache->n_accesses);
+	trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref),
+				   n_accesses, fscache_access_cache_unpin);
 
-		spin_unlock(&cache->object_list_lock);
-		cond_resched();
-	}
+	wait_var_event(&cache->n_accesses,
+		       atomic_read(&cache->n_accesses) == 0);
 }
+EXPORT_SYMBOL(fscache_withdraw_cache);
 
-/**
- * fscache_withdraw_cache - Withdraw a cache from the active service
- * @cache: The record describing the cache
- *
- * Withdraw a cache from service, unbinding all its cache objects from the
- * netfs cookies they're currently representing.
- *
- * See Documentation/filesystems/caching/backend-api.rst for a complete
- * description.
+#ifdef CONFIG_PROC_FS
+static const char fscache_cache_states[NR__FSCACHE_CACHE_STATE] = "-PAEW";
+
+/*
+ * Generate a list of caches in /proc/fs/fscache/caches
  */
-void fscache_withdraw_cache(struct fscache_cache *cache)
+static int fscache_caches_seq_show(struct seq_file *m, void *v)
 {
-	LIST_HEAD(dying_objects);
+	struct fscache_cache *cache;
 
-	_enter("");
+	if (v == &fscache_caches) {
+		seq_puts(m,
+			 "CACHE    REF   VOLS  OBJS  ACCES S NAME\n"
+			 "======== ===== ===== ===== ===== = ===============\n"
+			 );
+		return 0;
+	}
 
-	pr_notice("Withdrawing cache \"%s\"\n",
-		  cache->tag->name);
+	cache = list_entry(v, struct fscache_cache, cache_link);
+	seq_printf(m,
+		   "%08x %5d %5d %5d %5d %c %s\n",
+		   cache->debug_id,
+		   refcount_read(&cache->ref),
+		   atomic_read(&cache->n_volumes),
+		   atomic_read(&cache->object_count),
+		   atomic_read(&cache->n_accesses),
+		   fscache_cache_states[cache->state],
+		   cache->name ?: "-");
+	return 0;
+}
 
-	/* make the cache unavailable for cookie acquisition */
-	if (test_and_set_bit(FSCACHE_CACHE_WITHDRAWN, &cache->flags))
-		BUG();
+static void *fscache_caches_seq_start(struct seq_file *m, loff_t *_pos)
+	__acquires(fscache_addremove_sem)
+{
+	down_read(&fscache_addremove_sem);
+	return seq_list_start_head(&fscache_caches, *_pos);
+}
 
-	down_write(&fscache_addremove_sem);
-	list_del_init(&cache->link);
-	cache->tag->cache = NULL;
-	up_write(&fscache_addremove_sem);
+static void *fscache_caches_seq_next(struct seq_file *m, void *v, loff_t *_pos)
+{
+	return seq_list_next(v, &fscache_caches, _pos);
+}
 
-	/* make sure all pages pinned by operations on behalf of the netfs are
-	 * written to disk */
-	fscache_stat(&fscache_n_cop_sync_cache);
-	cache->ops->sync_cache(cache);
-	fscache_stat_d(&fscache_n_cop_sync_cache);
-
-	/* we now have to destroy all the active objects pertaining to this
-	 * cache - which we do by passing them off to thread pool to be
-	 * disposed of */
-	_debug("destroy");
-
-	fscache_withdraw_all_objects(cache, &dying_objects);
-
-	/* wait for all extant objects to finish their outstanding operations
-	 * and go away */
-	_debug("wait for finish");
-	wait_event(fscache_cache_cleared_wq,
-		   atomic_read(&cache->object_count) == 0);
-	_debug("wait for clearance");
-	wait_event(fscache_cache_cleared_wq,
-		   list_empty(&cache->object_list));
-	_debug("cleared");
-	ASSERT(list_empty(&dying_objects));
-
-	kobject_put(cache->kobj);
-
-	clear_bit(FSCACHE_TAG_RESERVED, &cache->tag->flags);
-	fscache_release_cache_tag(cache->tag);
-	cache->tag = NULL;
-
-	_leave("");
+static void fscache_caches_seq_stop(struct seq_file *m, void *v)
+	__releases(fscache_addremove_sem)
+{
+	up_read(&fscache_addremove_sem);
 }
-EXPORT_SYMBOL(fscache_withdraw_cache);
+
+const struct seq_operations fscache_caches_seq_ops = {
+	.start  = fscache_caches_seq_start,
+	.next   = fscache_caches_seq_next,
+	.stop   = fscache_caches_seq_stop,
+	.show   = fscache_caches_seq_show,
+};
+#endif /* CONFIG_PROC_FS */
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 3f7bb2eecdc3..90a16e6d6917 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /* netfs cookie management
  *
- * Copyright (C) 2004-2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2004-2007, 2020 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  *
  * See Documentation/filesystems/caching/netfs-api.rst for more information on
@@ -15,66 +15,164 @@
 
 struct kmem_cache *fscache_cookie_jar;
 
-static atomic_t fscache_object_debug_id = ATOMIC_INIT(0);
+static void fscache_cookie_worker(struct work_struct *work);
+static void fscache_drop_cookie(struct fscache_cookie *cookie);
+static void fscache_lookup_cookie(struct fscache_cookie *cookie);
+static void fscache_invalidate_cookie(struct fscache_cookie *cookie);
 
 #define fscache_cookie_hash_shift 15
 static struct hlist_bl_head fscache_cookie_hash[1 << fscache_cookie_hash_shift];
 static LIST_HEAD(fscache_cookies);
 static DEFINE_RWLOCK(fscache_cookies_lock);
+static const char fscache_cookie_stages[FSCACHE_COOKIE_STAGE__NR] = "-LCAIFWRD";
 
-static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie);
-static int fscache_alloc_object(struct fscache_cache *cache,
-				struct fscache_cookie *cookie);
-static int fscache_attach_object(struct fscache_cookie *cookie,
-				 struct cachefiles_object *object);
-
-static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
+void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 {
-	struct cachefiles_object *object;
-	struct hlist_node *o;
 	const u8 *k;
-	unsigned loop;
 
-	pr_err("%c-cookie c=%08x [p=%08x fl=%lx nc=%u na=%u]\n",
+	pr_err("%c-cookie c=%08x [fl=%lx na=%u nA=%u s=%c]\n",
 	       prefix,
 	       cookie->debug_id,
-	       cookie->parent ? cookie->parent->debug_id : 0,
 	       cookie->flags,
-	       atomic_read(&cookie->n_children),
-	       atomic_read(&cookie->n_active));
-	pr_err("%c-cookie d=%s\n",
+	       atomic_read(&cookie->n_active),
+	       atomic_read(&cookie->n_accesses),
+	       fscache_cookie_stages[cookie->stage]);
+	pr_err("%c-cookie V=%08x [%s]\n",
 	       prefix,
-	       cookie->type_name);
-
-	o = READ_ONCE(cookie->backing_objects.first);
-	if (o) {
-		object = hlist_entry(o, struct cachefiles_object, cookie_link);
-		pr_err("%c-cookie o=%u\n", prefix, object->debug_id);
-	}
+	       cookie->volume->debug_id,
+	       cookie->volume->key);
 
-	pr_err("%c-key=[%u] '", prefix, cookie->key_len);
 	k = (cookie->key_len <= sizeof(cookie->inline_key)) ?
 		cookie->inline_key : cookie->key;
-	for (loop = 0; loop < cookie->key_len; loop++)
-		pr_cont("%02x", k[loop]);
-	pr_cont("'\n");
-}
-
-void fscache_free_cookie(struct fscache_cookie *cookie)
-{
-	if (cookie) {
-		BUG_ON(!hlist_empty(&cookie->backing_objects));
-		write_lock(&fscache_cookies_lock);
-		list_del(&cookie->proc_link);
-		write_unlock(&fscache_cookies_lock);
-		if (cookie->aux_len > sizeof(cookie->inline_aux))
-			kfree(cookie->aux);
-		if (cookie->key_len > sizeof(cookie->inline_key))
-			kfree(cookie->key);
-		fscache_put_cache_tag(cookie->preferred_cache);
-		kmem_cache_free(fscache_cookie_jar, cookie);
+	pr_err("%c-key=[%u] '%*phN'\n", prefix, cookie->key_len, cookie->key_len, k);
+}
+
+static void fscache_free_cookie(struct fscache_cookie *cookie)
+{
+	write_lock(&fscache_cookies_lock);
+	list_del(&cookie->proc_link);
+	write_unlock(&fscache_cookies_lock);
+	if (cookie->aux_len > sizeof(cookie->inline_aux))
+		kfree(cookie->aux);
+	if (cookie->key_len > sizeof(cookie->inline_key))
+		kfree(cookie->key);
+	fscache_stat_d(&fscache_n_cookies);
+	kmem_cache_free(fscache_cookie_jar, cookie);
+}
+
+static void __fscache_queue_cookie(struct fscache_cookie *cookie)
+{
+	if (!queue_work(fscache_wq, &cookie->work))
+		fscache_put_cookie(cookie, fscache_cookie_put_over_queued);
+}
+
+static void fscache_queue_cookie(struct fscache_cookie *cookie,
+				 enum fscache_cookie_trace where)
+{
+	fscache_get_cookie(cookie, where);
+	__fscache_queue_cookie(cookie);
+}
+
+static void __fscache_end_cookie_access(struct fscache_cookie *cookie)
+{
+	if (test_bit(FSCACHE_COOKIE_DO_RELINQUISH, &cookie->flags))
+		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_RELINQUISHING);
+	else if (test_bit(FSCACHE_COOKIE_DO_WITHDRAW, &cookie->flags))
+		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_WITHDRAWING);
+	fscache_queue_cookie(cookie, fscache_cookie_get_end_access);
+}
+
+/*
+ * Mark the end of an access on a cookie.  This brings a deferred
+ * relinquishment or withdrawal stage into effect.
+ */
+void fscache_end_cookie_access(struct fscache_cookie *cookie,
+			       enum fscache_access_trace why)
+{
+	int n_accesses;
+
+	smp_mb__before_atomic();
+	n_accesses = atomic_dec_return(&cookie->n_accesses);
+	trace_fscache_access(cookie->debug_id, refcount_read(&cookie->ref),
+			     n_accesses, why);
+	if (n_accesses == 0)
+		__fscache_end_cookie_access(cookie);
+}
+EXPORT_SYMBOL(fscache_end_cookie_access);
+
+/*
+ * Pin the cache behind a cookie so that we can access it.
+ */
+static void __fscache_begin_cookie_access(struct fscache_cookie *cookie,
+					  enum fscache_access_trace why)
+{
+	int n_accesses;
+
+	n_accesses = atomic_inc_return(&cookie->n_accesses);
+	smp_mb__after_atomic(); /* (Future) read stage after is-caching.
+				 * Reread n_accesses after is-caching
+				 */
+	trace_fscache_access(cookie->debug_id, refcount_read(&cookie->ref),
+			     n_accesses, why);
+}
+
+/*
+ * Pin the cache behind a cookie so that we can access it.
+ */
+bool fscache_begin_cookie_access(struct fscache_cookie *cookie,
+				 enum fscache_access_trace why)
+{
+	if (!test_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags))
+		return false;
+	__fscache_begin_cookie_access(cookie, why);
+	if (!test_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags) ||
+	    !fscache_cache_is_live(cookie->volume->cache)) {
+		fscache_end_cookie_access(cookie, fscache_access_unlive);
+		return false;
+	}
+	return true;
+}
+
+static inline void wake_up_cookie_stage(struct fscache_cookie *cookie)
+{
+	/* Use a barrier to ensure that waiters see the stage variable
+	 * change, as spin_unlock doesn't guarantee a barrier.
+	 *
+	 * See comments over wake_up_bit() and waitqueue_active().
+	 */
+	smp_mb();
+	wake_up_var(&cookie->stage);
+}
+
+static void __fscache_set_cookie_stage(struct fscache_cookie *cookie,
+				       enum fscache_cookie_stage stage)
+{
+	cookie->stage = stage;
+}
+
+/*
+ * Change the stage a cookie is at and wake up anyone waiting for that - but
+ * only if the cookie isn't already marked as being in a cleanup state.
+ */
+void fscache_set_cookie_stage(struct fscache_cookie *cookie,
+			      enum fscache_cookie_stage stage)
+{
+	bool changed = false;
+
+	spin_lock(&cookie->lock);
+	switch (cookie->stage) {
+	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
+		break;
+	default:
+		__fscache_set_cookie_stage(cookie, stage);
+		changed = true;
+		break;
 	}
+	spin_unlock(&cookie->lock);
+	if (changed)
+		wake_up_cookie_stage(cookie);
 }
+EXPORT_SYMBOL(fscache_set_cookie_stage);
 
 /*
  * Set the index key in a cookie.  The cookie struct has space for a 16-byte
@@ -100,7 +198,7 @@ static int fscache_set_key(struct fscache_cookie *cookie,
 	}
 
 	memcpy(buf, index_key, index_key_len);
-	cookie->key_hash = fscache_hash(0, buf, bufs);
+	cookie->key_hash = fscache_hash(cookie->volume->key_hash, buf, bufs);
 	return 0;
 }
 
@@ -111,12 +209,10 @@ static long fscache_compare_cookie(const struct fscache_cookie *a,
 
 	if (a->key_hash != b->key_hash)
 		return (long)a->key_hash - (long)b->key_hash;
-	if (a->parent != b->parent)
-		return (long)a->parent - (long)b->parent;
+	if (a->volume != b->volume)
+		return (long)a->volume - (long)b->volume;
 	if (a->key_len != b->key_len)
 		return (long)a->key_len - (long)b->key_len;
-	if (a->type != b->type)
-		return (long)a->type - (long)b->type;
 
 	if (a->key_len <= sizeof(a->inline_key)) {
 		ka = &a->inline_key;
@@ -133,12 +229,9 @@ static atomic_t fscache_cookie_debug_id = ATOMIC_INIT(1);
 /*
  * Allocate a cookie.
  */
-struct fscache_cookie *fscache_alloc_cookie(
-	struct fscache_cookie *parent,
-	enum fscache_cookie_type type,
-	const char *type_name,
+static struct fscache_cookie *fscache_alloc_cookie(
+	struct fscache_volume *volume,
 	u8 advice,
-	struct fscache_cache_tag *preferred_cache,
 	const void *index_key, size_t index_key_len,
 	const void *aux_data, size_t aux_data_len,
 	loff_t object_size)
@@ -149,13 +242,13 @@ struct fscache_cookie *fscache_alloc_cookie(
 	cookie = kmem_cache_zalloc(fscache_cookie_jar, GFP_KERNEL);
 	if (!cookie)
 		return NULL;
+	fscache_stat(&fscache_n_cookies);
 
-	cookie->type = type;
-	cookie->advice = advice;
-	cookie->key_len = index_key_len;
-	cookie->aux_len = aux_data_len;
-	cookie->object_size = object_size;
-	strlcpy(cookie->type_name, type_name, sizeof(cookie->type_name));
+	cookie->volume		= volume;
+	cookie->advice		= advice;
+	cookie->key_len		= index_key_len;
+	cookie->aux_len		= aux_data_len;
+	cookie->object_size	= object_size;
 
 	if (fscache_set_key(cookie, index_key, index_key_len) < 0)
 		goto nomem;
@@ -169,24 +262,15 @@ struct fscache_cookie *fscache_alloc_cookie(
 	}
 
 	refcount_set(&cookie->ref, 1);
-	atomic_set(&cookie->n_children, 0);
 	cookie->debug_id = atomic_inc_return(&fscache_cookie_debug_id);
-
-	/* We keep the active count elevated until relinquishment to prevent an
-	 * attempt to wake up every time the object operations queue quiesces.
-	 */
-	atomic_set(&cookie->n_active, 1);
-
-	cookie->parent		= parent;
-	cookie->preferred_cache	= fscache_get_cache_tag(preferred_cache);
-
-	cookie->flags		= (1 << FSCACHE_COOKIE_NO_DATA_YET);
+	cookie->stage = FSCACHE_COOKIE_STAGE_QUIESCENT;
 	spin_lock_init(&cookie->lock);
-	INIT_HLIST_HEAD(&cookie->backing_objects);
+	INIT_WORK(&cookie->work, fscache_cookie_worker);
 
 	write_lock(&fscache_cookies_lock);
 	list_add_tail(&cookie->proc_link, &fscache_cookies);
 	write_unlock(&fscache_cookies_lock);
+	fscache_see_cookie(cookie, fscache_cookie_new_acquire);
 	return cookie;
 
 nomem:
@@ -194,13 +278,28 @@ struct fscache_cookie *fscache_alloc_cookie(
 	return NULL;
 }
 
+static void fscache_wait_on_collision(struct fscache_cookie *candidate,
+				      struct fscache_cookie *wait_for)
+{
+	enum fscache_cookie_stage *stagep = &wait_for->stage;
+
+	wait_var_event_timeout(stagep, READ_ONCE(*stagep) == FSCACHE_COOKIE_STAGE_DROPPED,
+			       20 * HZ);
+	if (READ_ONCE(*stagep) != FSCACHE_COOKIE_STAGE_DROPPED) {
+		pr_notice("Potential collision c=%08x old: c=%08x",
+			  candidate->debug_id, wait_for->debug_id);
+		wait_var_event(stagep, READ_ONCE(*stagep) == FSCACHE_COOKIE_STAGE_DROPPED);
+	}
+}
+
 /*
  * Attempt to insert the new cookie into the hash.  If there's a collision, we
- * return the old cookie if it's not in use and an error otherwise.
+ * wait for the old cookie to complete if it's being relinquished and an error
+ * otherwise.
  */
-struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
+static bool fscache_hash_cookie(struct fscache_cookie *candidate)
 {
-	struct fscache_cookie *cursor;
+	struct fscache_cookie *cursor, *wait_for = NULL;
 	struct hlist_bl_head *h;
 	struct hlist_bl_node *p;
 	unsigned int bucket;
@@ -210,61 +309,52 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
 
 	hlist_bl_lock(h);
 	hlist_bl_for_each_entry(cursor, p, h, hash_link) {
-		if (fscache_compare_cookie(candidate, cursor) == 0)
-			goto collision;
+		if (fscache_compare_cookie(candidate, cursor) == 0) {
+			if (!test_bit(FSCACHE_COOKIE_RELINQUISHED, &cursor->flags))
+				goto collision;
+			wait_for = fscache_get_cookie(cursor,
+						      fscache_cookie_get_hash_collision);
+			break;
+		}
 	}
 
-	__set_bit(FSCACHE_COOKIE_ACQUIRED, &candidate->flags);
-	fscache_get_cookie(candidate->parent, fscache_cookie_get_acquire_parent);
-	atomic_inc(&candidate->parent->n_children);
+	fscache_get_volume(candidate->volume, fscache_volume_get_cookie);
+	atomic_inc(&candidate->volume->n_cookies);
 	hlist_bl_add_head(&candidate->hash_link, h);
 	hlist_bl_unlock(h);
-	return candidate;
 
-collision:
-	if (test_and_set_bit(FSCACHE_COOKIE_ACQUIRED, &cursor->flags)) {
-		trace_fscache_cookie(cursor->debug_id, refcount_read(&cursor->ref),
-				     fscache_cookie_collision);
-		pr_err("Duplicate cookie detected\n");
-		fscache_print_cookie(cursor, 'O');
-		fscache_print_cookie(candidate, 'N');
-		hlist_bl_unlock(h);
-		return NULL;
+	if (wait_for) {
+		fscache_wait_on_collision(candidate, wait_for);
+		fscache_put_cookie(wait_for, fscache_cookie_put_hash_collision);
 	}
+	return true;
 
-	fscache_get_cookie(cursor, fscache_cookie_get_reacquire);
+collision:
+	trace_fscache_cookie(cursor->debug_id, refcount_read(&cursor->ref),
+			     fscache_cookie_collision);
+	pr_err("Duplicate cookie detected\n");
+	fscache_print_cookie(cursor, 'O');
+	fscache_print_cookie(candidate, 'N');
 	hlist_bl_unlock(h);
-	return cursor;
+	return false;
 }
 
 /*
- * request a cookie to represent an object (index, datafile, xattr, etc)
- * - parent specifies the parent object
- *   - the top level index cookie for each netfs is stored in the fscache_netfs
- *     struct upon registration
- * - all attached caches will be searched to see if they contain this object
- * - index objects aren't stored on disk until there's a dependent file that
- *   needs storing
- * - other objects are stored in a selected cache immediately, and all the
- *   indices forming the path to it are instantiated if necessary
- * - we never let on to the netfs about errors
- *   - we may set a negative cookie pointer, but that's okay
+ * Request a cookie to represent a data storage object within a volume.
+ *
+ * We never let on to the netfs about errors.  We may set a negative cookie
+ * pointer, but that's okay
  */
 struct fscache_cookie *__fscache_acquire_cookie(
-	struct fscache_cookie *parent,
-	enum fscache_cookie_type type,
-	const char *type_name,
+	struct fscache_volume *volume,
 	u8 advice,
-	struct fscache_cache_tag *preferred_cache,
 	const void *index_key, size_t index_key_len,
 	const void *aux_data, size_t aux_data_len,
-	loff_t object_size,
-	bool enable)
+	loff_t object_size)
 {
-	struct fscache_cookie *candidate, *cookie;
+	struct fscache_cookie *cookie;
 
-	_enter("{%s},{%s},%u",
-	       parent ? parent->type_name : "<no-parent>", type_name, enable);
+	_enter("V=%x", volume->debug_id);
 
 	if (!index_key || !index_key_len || index_key_len > 255 || aux_data_len > 255)
 		return NULL;
@@ -275,336 +365,229 @@ struct fscache_cookie *__fscache_acquire_cookie(
 
 	fscache_stat(&fscache_n_acquires);
 
-	/* if there's no parent cookie, then we don't create one here either */
-	if (!parent) {
-		fscache_stat(&fscache_n_acquires_null);
-		_leave(" [no parent]");
-		return NULL;
-	}
-
-	/* validate the definition */
-	BUG_ON(type == FSCACHE_COOKIE_TYPE_INDEX &&
-	       parent->type != FSCACHE_COOKIE_TYPE_INDEX);
-
-	candidate = fscache_alloc_cookie(parent, type, type_name, advice,
-					 preferred_cache,
-					 index_key, index_key_len,
-					 aux_data, aux_data_len,
-					 object_size);
-	if (!candidate) {
+	cookie = fscache_alloc_cookie(volume, advice,
+				      index_key, index_key_len,
+				      aux_data, aux_data_len,
+				      object_size);
+	if (!cookie) {
 		fscache_stat(&fscache_n_acquires_oom);
-		_leave(" [ENOMEM]");
 		return NULL;
 	}
 
-	cookie = fscache_hash_cookie(candidate);
-	if (!cookie) {
-		trace_fscache_cookie(candidate->debug_id, 1,
-				     fscache_cookie_discard);
-		goto out;
-	}
-
-	if (cookie == candidate)
-		candidate = NULL;
-
-	switch (cookie->type) {
-	case FSCACHE_COOKIE_TYPE_INDEX:
-		fscache_stat(&fscache_n_cookie_index);
-		break;
-	case FSCACHE_COOKIE_TYPE_DATAFILE:
-		fscache_stat(&fscache_n_cookie_data);
-		break;
-	default:
-		fscache_stat(&fscache_n_cookie_special);
-		break;
+	if (!fscache_hash_cookie(cookie)) {
+		fscache_see_cookie(cookie, fscache_cookie_discard);
+		fscache_free_cookie(cookie);
+		return NULL;
 	}
 
 	trace_fscache_acquire(cookie);
-
-	if (enable) {
-		/* if the object is an index then we need do nothing more here
-		 * - we create indices on disk when we need them as an index
-		 * may exist in multiple caches */
-		if (cookie->type != FSCACHE_COOKIE_TYPE_INDEX) {
-			if (fscache_acquire_non_index_cookie(cookie) == 0) {
-				set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
-			} else {
-				atomic_dec(&parent->n_children);
-				fscache_put_cookie(cookie,
-						   fscache_cookie_put_acquire_nobufs);
-				fscache_stat(&fscache_n_acquires_nobufs);
-				_leave(" = NULL");
-				return NULL;
-			}
-		} else {
-			set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
-		}
-	}
-
 	fscache_stat(&fscache_n_acquires_ok);
-
-out:
-	fscache_free_cookie(candidate);
+	_leave(" = c=%08x", cookie->debug_id);
 	return cookie;
 }
 EXPORT_SYMBOL(__fscache_acquire_cookie);
 
 /*
- * Enable a cookie to permit it to accept new operations.
+ * Look up a cookie to the cache.
  */
-void __fscache_enable_cookie(struct fscache_cookie *cookie,
-			     const void *aux_data,
-			     loff_t object_size,
-			     bool (*can_enable)(void *data),
-			     void *data)
+static void fscache_lookup_cookie(struct fscache_cookie *cookie)
 {
-	_enter("%x", cookie->debug_id);
-
-	trace_fscache_enable(cookie);
-
-	wait_on_bit_lock(&cookie->flags, FSCACHE_COOKIE_ENABLEMENT_LOCK,
-			 TASK_UNINTERRUPTIBLE);
-
-	cookie->object_size = object_size;
-	fscache_update_aux(cookie, aux_data);
-
-	if (test_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags))
-		goto out_unlock;
-
-	if (can_enable && !can_enable(data)) {
-		/* The netfs decided it didn't want to enable after all */
-	} else if (cookie->type != FSCACHE_COOKIE_TYPE_INDEX) {
-		/* Wait for outstanding disablement to complete */
-		__fscache_wait_on_invalidate(cookie);
-
-		if (fscache_acquire_non_index_cookie(cookie) == 0)
-			set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
-	} else {
-		set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
-	}
-
-out_unlock:
-	clear_bit_unlock(FSCACHE_COOKIE_ENABLEMENT_LOCK, &cookie->flags);
-	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_ENABLEMENT_LOCK);
-}
-EXPORT_SYMBOL(__fscache_enable_cookie);
-
-/*
- * acquire a non-index cookie
- * - this must make sure the index chain is instantiated and instantiate the
- *   object representation too
- */
-static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
-{
-	struct cachefiles_object *object;
-	struct fscache_cache *cache;
-	int ret;
+	bool changed_stage = false, need_withdraw = false;
 
 	_enter("");
 
-	set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
-
-	/* now we need to see whether the backing objects for this cookie yet
-	 * exist, if not there'll be nothing to search */
-	down_read(&fscache_addremove_sem);
-
-	if (list_empty(&fscache_cache_list)) {
-		up_read(&fscache_addremove_sem);
-		_leave(" = 0 [no caches]");
-		return 0;
-	}
-
-	/* select a cache in which to store the object */
-	cache = fscache_select_cache_for_object(cookie->parent);
-	if (!cache) {
-		up_read(&fscache_addremove_sem);
-		fscache_stat(&fscache_n_acquires_no_cache);
-		_leave(" = -ENOMEDIUM [no cache]");
-		return -ENOMEDIUM;
+	if (!cookie->volume->cache_priv) {
+		fscache_create_volume(cookie->volume, true);
+		if (!cookie->volume->cache_priv) {
+			fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_QUIESCENT);
+			goto out;
+		}
 	}
 
-	_debug("cache %s", cache->tag->name);
-
-	set_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
-
-	/* ask the cache to allocate objects for this cookie and its parent
-	 * chain */
-	ret = fscache_alloc_object(cache, cookie);
-	if (ret < 0) {
-		up_read(&fscache_addremove_sem);
-		_leave(" = %d", ret);
-		return ret;
+	if (!cookie->volume->cache->ops->lookup_cookie(cookie)) {
+		if (cookie->stage != FSCACHE_COOKIE_STAGE_FAILED)
+			fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_QUIESCENT);
+		need_withdraw = true;
+		_leave(" [fail]");
+		goto out;
 	}
 
 	spin_lock(&cookie->lock);
-	if (hlist_empty(&cookie->backing_objects)) {
-		spin_unlock(&cookie->lock);
-		goto unavailable;
+	if (cookie->stage != FSCACHE_COOKIE_STAGE_RELINQUISHING) {
+		__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_ACTIVE);
+		fscache_see_cookie(cookie, fscache_cookie_see_active);
+		changed_stage = true;
 	}
-
-	object = hlist_entry(cookie->backing_objects.first,
-			     struct cachefiles_object, cookie_link);
-
-	/* initiate the process of looking up all the objects in the chain
-	 * (done by fscache_initialise_object()) */
-	fscache_raise_event(object, FSCACHE_OBJECT_EV_NEW_CHILD);
-
 	spin_unlock(&cookie->lock);
+	if (changed_stage)
+		wake_up_cookie_stage(cookie);
 
-	/* we may be required to wait for lookup to complete at this point */
-	if (!fscache_defer_lookup) {
-		wait_on_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP,
-			    TASK_UNINTERRUPTIBLE);
-		if (test_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags))
-			goto unavailable;
-	}
-
-	up_read(&fscache_addremove_sem);
-	_leave(" = 0 [deferred]");
-	return 0;
-
-unavailable:
-	up_read(&fscache_addremove_sem);
-	_leave(" = -ENOBUFS");
-	return -ENOBUFS;
+out:
+	fscache_end_cookie_access(cookie, fscache_access_lookup_cookie_end);
+	if (need_withdraw)
+		cookie->volume->cache->ops->withdraw_cookie(cookie);
+	fscache_end_volume_access(cookie->volume, fscache_access_lookup_cookie_end);
 }
 
 /*
- * recursively allocate cache object records for a cookie/cache combination
- * - caller must be holding the addremove sem
+ * Start using the cookie for I/O.  This prevents the backing object from being
+ * reaped by VM pressure.
  */
-static int fscache_alloc_object(struct fscache_cache *cache,
-				struct fscache_cookie *cookie)
+void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 {
-	struct cachefiles_object *object;
-	int ret;
+	enum fscache_cookie_stage stage;
+	bool changed_stage = false, queue = false;
 
-	_enter("%s,%x{%s}", cache->tag->name, cookie->debug_id, cookie->type_name);
+	_enter("c=%08x", cookie->debug_id);
 
-	spin_lock(&cookie->lock);
-	hlist_for_each_entry(object, &cookie->backing_objects,
-			     cookie_link) {
-		if (object->cache == cache)
-			goto object_already_extant;
-	}
-	spin_unlock(&cookie->lock);
+	if (WARN(test_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags),
+		 "Trying to use relinquished cookie\n"))
+		return;
 
-	/* ask the cache to allocate an object (we may end up with duplicate
-	 * objects at this stage, but we sort that out later) */
-	fscache_stat(&fscache_n_cop_alloc_object);
-	object = cache->ops->alloc_object(cache, cookie);
-	fscache_stat_d(&fscache_n_cop_alloc_object);
-	if (IS_ERR(object)) {
-		fscache_stat(&fscache_n_object_no_alloc);
-		ret = PTR_ERR(object);
-		goto error;
-	}
+	spin_lock(&cookie->lock);
 
-	ASSERTCMP(object->cookie, ==, cookie);
-	fscache_stat(&fscache_n_object_alloc);
+	atomic_inc(&cookie->n_active);
 
-	object->debug_id = atomic_inc_return(&fscache_object_debug_id);
+	stage = cookie->stage;
+	switch (stage) {
+	case FSCACHE_COOKIE_STAGE_QUIESCENT:
+		if (!fscache_begin_volume_access(cookie->volume,
+						 fscache_access_lookup_cookie))
+			break;
 
-	_debug("ALLOC OBJ%x: %s {%lx}",
-	       object->debug_id, cookie->type_name, object->events);
+		__fscache_begin_cookie_access(cookie, fscache_access_lookup_cookie);
+		__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_LOOKING_UP);
+		smp_mb__before_atomic(); /* Set stage before is-caching
+					  * vs __fscache_begin_cookie_access()
+					  */
+		set_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags);
+		set_bit(FSCACHE_COOKIE_HAS_BEEN_CACHED, &cookie->flags);
+		changed_stage = true;
+		queue = true;
+		break;
 
-	ret = fscache_alloc_object(cache, cookie->parent);
-	if (ret < 0)
-		goto error_put;
+	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+	case FSCACHE_COOKIE_STAGE_CREATING:
+	case FSCACHE_COOKIE_STAGE_ACTIVE:
+	case FSCACHE_COOKIE_STAGE_INVALIDATING:
+	case FSCACHE_COOKIE_STAGE_FAILED:
+	case FSCACHE_COOKIE_STAGE_WITHDRAWING:
+		break;
 
-	/* only attach if we managed to allocate all we needed, otherwise
-	 * discard the object we just allocated and instead use the one
-	 * attached to the cookie */
-	if (fscache_attach_object(cookie, object) < 0) {
-		fscache_stat(&fscache_n_cop_put_object);
-		cache->ops->put_object(object, fscache_obj_put_attach_fail);
-		fscache_stat_d(&fscache_n_cop_put_object);
+	case FSCACHE_COOKIE_STAGE_DROPPED:
+	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
+		WARN(1, "Can't use cookie in stage %u\n", cookie->stage);
+		break;
 	}
 
-	_leave(" = 0");
-	return 0;
-
-object_already_extant:
-	ret = -ENOBUFS;
-	if (fscache_object_is_dying(object) ||
-	    fscache_cache_is_broken(object)) {
-		spin_unlock(&cookie->lock);
-		goto error;
-	}
 	spin_unlock(&cookie->lock);
-	_leave(" = 0 [found]");
-	return 0;
+	if (changed_stage)
+		wake_up_cookie_stage(cookie);
+	if (queue)
+		fscache_queue_cookie(cookie, fscache_cookie_get_use_work);
+	_leave("");
+}
+EXPORT_SYMBOL(__fscache_use_cookie);
 
-error_put:
-	fscache_stat(&fscache_n_cop_put_object);
-	cache->ops->put_object(object, fscache_obj_put_alloc_fail);
-	fscache_stat_d(&fscache_n_cop_put_object);
-error:
-	_leave(" = %d", ret);
-	return ret;
+/*
+ * Stop using the cookie for I/O.
+ */
+void __fscache_unuse_cookie(struct fscache_cookie *cookie,
+			    const void *aux_data, const loff_t *object_size)
+{
+	if (aux_data || object_size)
+		__fscache_update_cookie(cookie, aux_data, object_size);
+	atomic_dec(&cookie->n_active);
 }
+EXPORT_SYMBOL(__fscache_unuse_cookie);
 
 /*
- * attach a cache object to a cookie
+ * Perform work upon the cookie, such as committing its cache state,
+ * relinquishing it or withdrawing the backing cache.  We're protected from the
+ * cache going away under us as object withdrawal must come through this
+ * non-reentrant work item.
  */
-static int fscache_attach_object(struct fscache_cookie *cookie,
-				 struct cachefiles_object *object)
+static void __fscache_cookie_worker(struct fscache_cookie *cookie)
 {
-	struct cachefiles_object *p;
-	struct fscache_cache *cache = object->cache;
-	int ret;
+	_enter("c=%x", cookie->debug_id);
 
-	_enter("{%s},{OBJ%x}", cookie->type_name, object->debug_id);
+again:
+	switch (READ_ONCE(cookie->stage)) {
+	case FSCACHE_COOKIE_STAGE_ACTIVE:
+		break;
 
-	ASSERTCMP(object->cookie, ==, cookie);
+	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+		fscache_lookup_cookie(cookie);
+		goto again;
 
-	spin_lock(&cookie->lock);
+	case FSCACHE_COOKIE_STAGE_CREATING:
+		WARN_ONCE(1, "Cookie %x in unexpected stage %u\n",
+			  cookie->debug_id, cookie->stage);
+		break;
 
-	/* there may be multiple initial creations of this object, but we only
-	 * want one */
-	ret = -EEXIST;
-	hlist_for_each_entry(p, &cookie->backing_objects, cookie_link) {
-		if (p->cache == object->cache) {
-			if (fscache_object_is_dying(p))
-				ret = -ENOBUFS;
-			goto cant_attach_object;
-		}
-	}
+	case FSCACHE_COOKIE_STAGE_INVALIDATING:
+		fscache_invalidate_cookie(cookie);
+		goto again;
 
-	/* pin the parent object */
-	spin_lock_nested(&cookie->parent->lock, 1);
-	hlist_for_each_entry(p, &cookie->parent->backing_objects,
-			     cookie_link) {
-		if (p->cache == object->cache) {
-			if (fscache_object_is_dying(p)) {
-				ret = -ENOBUFS;
-				spin_unlock(&cookie->parent->lock);
-				goto cant_attach_object;
-			}
-			object->parent = p;
-			spin_lock(&p->lock);
-			p->n_children++;
-			spin_unlock(&p->lock);
+	case FSCACHE_COOKIE_STAGE_FAILED:
+		break;
+
+	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
+	case FSCACHE_COOKIE_STAGE_WITHDRAWING:
+		if (test_and_clear_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags) &&
+		    cookie->cache_priv)
+			cookie->volume->cache->ops->withdraw_cookie(cookie);
+		if (cookie->stage == FSCACHE_COOKIE_STAGE_RELINQUISHING) {
+			fscache_see_cookie(cookie, fscache_cookie_see_relinquish);
+			fscache_drop_cookie(cookie);
 			break;
+		} else {
+			fscache_see_cookie(cookie, fscache_cookie_see_withdraw);
 		}
+		fallthrough;
+
+	case FSCACHE_COOKIE_STAGE_QUIESCENT:
+	case FSCACHE_COOKIE_STAGE_DROPPED:
+		clear_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
+		clear_bit(FSCACHE_COOKIE_DO_WITHDRAW, &cookie->flags);
+		set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
+		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_QUIESCENT);
+		break;
 	}
-	spin_unlock(&cookie->parent->lock);
+	_leave("");
+}
 
-	/* attach to the cache's object list */
-	if (list_empty(&object->cache_link)) {
-		spin_lock(&cache->object_list_lock);
-		list_add(&object->cache_link, &cache->object_list);
-		spin_unlock(&cache->object_list_lock);
-	}
+static void fscache_cookie_worker(struct work_struct *work)
+{
+	struct fscache_cookie *cookie = container_of(work, struct fscache_cookie, work);
 
-	/* Attach to the cookie.  The object already has a ref on it. */
-	hlist_add_head(&object->cookie_link, &cookie->backing_objects);
-	ret = 0;
+	fscache_see_cookie(cookie, fscache_cookie_see_work);
+	__fscache_cookie_worker(cookie);
+	fscache_put_cookie(cookie, fscache_cookie_put_work);
+}
 
-cant_attach_object:
-	spin_unlock(&cookie->lock);
-	_leave(" = %d", ret);
-	return ret;
+/*
+ * Wait for the object to become inactive.  The cookie's work item will be
+ * scheduled when someone transitions n_accesses to 0.
+ */
+static void __fscache_withdraw_cookie(struct fscache_cookie *cookie)
+{
+	if (test_and_clear_bit(FSCACHE_COOKIE_NACC_ELEVATED, &cookie->flags))
+		fscache_end_cookie_access(cookie, fscache_access_cache_unpin);
+	else
+		__fscache_end_cookie_access(cookie);
+}
+
+/*
+ * Ask the cache to effect invalidation of a cookie.
+ */
+static void fscache_invalidate_cookie(struct fscache_cookie *cookie)
+{
+	if (cookie->volume->cache->ops->invalidate_cookie(cookie, 0))
+		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_ACTIVE);
+	else
+		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_FAILED);
+	fscache_end_cookie_access(cookie, fscache_access_invalidate_cookie_end);
 }
 
 /*
@@ -612,67 +595,60 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
  */
 void __fscache_invalidate(struct fscache_cookie *cookie)
 {
-	_enter("{%s}", cookie->type_name);
+	bool is_caching;
+
+	_enter("c=%x", cookie->debug_id);
 
 	fscache_stat(&fscache_n_invalidates);
 
-	/* Only permit invalidation of data files.  Invalidating an index will
-	 * require the caller to release all its attachments to the tree rooted
-	 * there, and if it's doing that, it may as well just retire the
-	 * cookie.
-	 */
-	ASSERTCMP(cookie->type, ==, FSCACHE_COOKIE_TYPE_DATAFILE);
+	if (WARN(test_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags),
+		 "Trying to invalidate relinquished cookie\n"))
+		return;
 
-	/* TODO: Do invalidation */
+	spin_lock(&cookie->lock);
+	set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
+	cookie->inval_counter++;
 
-	_leave("");
-}
-EXPORT_SYMBOL(__fscache_invalidate);
+	switch (cookie->stage) {
+	case FSCACHE_COOKIE_STAGE_INVALIDATING: /* is_still_valid will catch it */
+	default:
+		spin_unlock(&cookie->lock);
+		_leave(" [no %u]", cookie->stage);
+		return;
 
-/*
- * Wait for object invalidation to complete.
- */
-void __fscache_wait_on_invalidate(struct fscache_cookie *cookie)
-{
-	_enter("%x", cookie->debug_id);
+	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+	case FSCACHE_COOKIE_STAGE_CREATING:
+		spin_unlock(&cookie->lock);
+		_leave(" [look %x]", cookie->inval_counter);
+		return;
 
-	wait_on_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING,
-		    TASK_UNINTERRUPTIBLE);
+	case FSCACHE_COOKIE_STAGE_ACTIVE:
+		__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_INVALIDATING);
+		is_caching = fscache_begin_cookie_access(
+			cookie, fscache_access_invalidate_cookie);
+		spin_unlock(&cookie->lock);
+		wake_up_cookie_stage(cookie);
 
-	_leave("");
+		if (is_caching)
+			fscache_queue_cookie(cookie, fscache_cookie_get_inval_work);
+		_leave(" [inv]");
+		return;
+	}
 }
-EXPORT_SYMBOL(__fscache_wait_on_invalidate);
+EXPORT_SYMBOL(__fscache_invalidate);
 
 /*
- * update the index entries backing a cookie
+ * Update the index entries backing a cookie.  The writeback is done lazily.
  */
-void __fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data)
+void __fscache_update_cookie(struct fscache_cookie *cookie,
+			     const void *aux_data, const loff_t *object_size)
 {
-	struct cachefiles_object *object;
-
 	fscache_stat(&fscache_n_updates);
 
-	if (!cookie) {
-		fscache_stat(&fscache_n_updates_null);
-		_leave(" [no cookie]");
-		return;
-	}
-
-	_enter("{%s}", cookie->type_name);
-
 	spin_lock(&cookie->lock);
 
-	fscache_update_aux(cookie, aux_data);
-
-	if (fscache_cookie_enabled(cookie)) {
-		/* update the index entry on disk in each cache backing this
-		 * cookie.
-		 */
-		hlist_for_each_entry(object,
-				     &cookie->backing_objects, cookie_link) {
-			fscache_raise_event(object, FSCACHE_OBJECT_EV_UPDATE);
-		}
-	}
+	fscache_update_aux(cookie, aux_data, object_size);
+	set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
 
 	spin_unlock(&cookie->lock);
 	_leave("");
@@ -680,169 +656,106 @@ void __fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data
 EXPORT_SYMBOL(__fscache_update_cookie);
 
 /*
- * Disable a cookie to stop it from accepting new requests from the netfs.
+ * Remove a cookie from the hash table.
  */
-void __fscache_disable_cookie(struct fscache_cookie *cookie,
-			      const void *aux_data,
-			      bool invalidate)
+static void fscache_unhash_cookie(struct fscache_cookie *cookie)
 {
-	struct cachefiles_object *object;
-	bool awaken = false;
-
-	_enter("%x,%u", cookie->debug_id, invalidate);
-
-	trace_fscache_disable(cookie);
-
-	ASSERTCMP(atomic_read(&cookie->n_active), >, 0);
-
-	if (atomic_read(&cookie->n_children) != 0) {
-		pr_err("Cookie '%s' still has children\n",
-		       cookie->type_name);
-		BUG();
-	}
-
-	wait_on_bit_lock(&cookie->flags, FSCACHE_COOKIE_ENABLEMENT_LOCK,
-			 TASK_UNINTERRUPTIBLE);
-
-	fscache_update_aux(cookie, aux_data);
-
-	if (!test_and_clear_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags))
-		goto out_unlock_enable;
+	struct hlist_bl_head *h;
+	unsigned int bucket;
 
-	/* If the cookie is being invalidated, wait for that to complete first
-	 * so that we can reuse the flag.
-	 */
-	__fscache_wait_on_invalidate(cookie);
+	bucket = cookie->key_hash & (ARRAY_SIZE(fscache_cookie_hash) - 1);
+	h = &fscache_cookie_hash[bucket];
 
-	/* Dispose of the backing objects */
-	set_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags);
+	hlist_bl_lock(h);
+	hlist_bl_del(&cookie->hash_link);
+	hlist_bl_unlock(h);
+}
 
+/*
+ * Finalise a cookie after all its resources have been disposed of.
+ */
+static void fscache_drop_cookie(struct fscache_cookie *cookie)
+{
 	spin_lock(&cookie->lock);
-	if (!hlist_empty(&cookie->backing_objects)) {
-		hlist_for_each_entry(object, &cookie->backing_objects, cookie_link) {
-			if (invalidate)
-				set_bit(FSCACHE_OBJECT_RETIRED, &object->flags);
-			fscache_raise_event(object, FSCACHE_OBJECT_EV_KILL);
-		}
-	} else {
-		if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
-			awaken = true;
-	}
+	__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_DROPPED);
 	spin_unlock(&cookie->lock);
-	if (awaken)
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
+	wake_up_cookie_stage(cookie);
 
-	/* Wait for cessation of activity requiring access to the netfs (when
-	 * n_active reaches 0).  This makes sure outstanding reads and writes
-	 * have completed.
-	 */
-	if (!atomic_dec_and_test(&cookie->n_active)) {
-		wait_var_event(&cookie->n_active,
-			       !atomic_read(&cookie->n_active));
-	}
+	fscache_unhash_cookie(cookie);
+	fscache_stat(&fscache_n_relinquishes_dropped);
+}
 
-	/* Reset the cookie state if it wasn't relinquished */
-	if (!test_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags)) {
-		atomic_inc(&cookie->n_active);
-		set_bit(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
-	}
+static void fscache_drop_withdraw_cookie(struct fscache_cookie *cookie)
+{
+	__fscache_withdraw_cookie(cookie);
+}
 
-out_unlock_enable:
-	clear_bit_unlock(FSCACHE_COOKIE_ENABLEMENT_LOCK, &cookie->flags);
-	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_ENABLEMENT_LOCK);
-	_leave("");
+/**
+ * fscache_withdraw_cookie - Mark a cookie for withdrawal
+ * @cookie: The cookie to be withdrawn.
+ *
+ * Allow the cache backend to withdraw the backing for a cookie for its own
+ * reasons, even if that cookie is in active use.
+ */
+void fscache_withdraw_cookie(struct fscache_cookie *cookie)
+{
+	set_bit(FSCACHE_COOKIE_DO_WITHDRAW, &cookie->flags);
+	fscache_drop_withdraw_cookie(cookie);
 }
-EXPORT_SYMBOL(__fscache_disable_cookie);
+EXPORT_SYMBOL(fscache_withdraw_cookie);
 
 /*
- * release a cookie back to the cache
+ * Allow the netfs to release a cookie back to the cache.
  * - the object will be marked as recyclable on disk if retire is true
- * - all dependents of this cookie must have already been unregistered
- *   (indices/files/pages)
  */
-void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
-				 const void *aux_data,
-				 bool retire)
+void __fscache_relinquish_cookie(struct fscache_cookie *cookie, bool retire)
 {
 	fscache_stat(&fscache_n_relinquishes);
 	if (retire)
 		fscache_stat(&fscache_n_relinquishes_retire);
 
-	if (!cookie) {
-		fscache_stat(&fscache_n_relinquishes_null);
-		_leave(" [no cookie]");
-		return;
-	}
+	_enter("c=%08x{%d},%d",
+	       cookie->debug_id, atomic_read(&cookie->n_active), retire);
 
-	_enter("%x{%s,%d},%d",
-	       cookie->debug_id, cookie->type_name,
-	       atomic_read(&cookie->n_active), retire);
+	if (WARN(test_and_set_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags),
+		 "Cookie c=%x already relinquished\n", cookie->debug_id))
+		return;
 
+	if (retire)
+		set_bit(FSCACHE_COOKIE_RETIRED, &cookie->flags);
 	trace_fscache_relinquish(cookie, retire);
 
-	/* No further netfs-accessing operations on this cookie permitted */
-	if (test_and_set_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags))
-		BUG();
+	ASSERTCMP(atomic_read(&cookie->n_active), ==, 0);
+	ASSERTCMP(atomic_read(&cookie->volume->n_cookies), >, 0);
+	atomic_dec(&cookie->volume->n_cookies);
 
-	__fscache_disable_cookie(cookie, aux_data, retire);
+	set_bit(FSCACHE_COOKIE_DO_RELINQUISH, &cookie->flags);
 
-	if (cookie->parent) {
-		ASSERTCMP(refcount_read(&cookie->parent->ref), >, 0);
-		ASSERTCMP(atomic_read(&cookie->parent->n_children), >, 0);
-		atomic_dec(&cookie->parent->n_children);
-	}
-
-	/* Dispose of the netfs's link to the cookie */
+	if (test_bit(FSCACHE_COOKIE_HAS_BEEN_CACHED, &cookie->flags))
+		fscache_drop_withdraw_cookie(cookie);
+	else
+		fscache_drop_cookie(cookie);
 	fscache_put_cookie(cookie, fscache_cookie_put_relinquish);
-
-	_leave("");
 }
 EXPORT_SYMBOL(__fscache_relinquish_cookie);
 
-/*
- * Remove a cookie from the hash table.
- */
-static void fscache_unhash_cookie(struct fscache_cookie *cookie)
-{
-	struct hlist_bl_head *h;
-	unsigned int bucket;
-
-	bucket = cookie->key_hash & (ARRAY_SIZE(fscache_cookie_hash) - 1);
-	h = &fscache_cookie_hash[bucket];
-
-	hlist_bl_lock(h);
-	hlist_bl_del(&cookie->hash_link);
-	hlist_bl_unlock(h);
-}
-
 /*
  * Drop a reference to a cookie.
  */
 void fscache_put_cookie(struct fscache_cookie *cookie,
 			enum fscache_cookie_trace where)
 {
-	struct fscache_cookie *parent;
+	struct fscache_volume *volume = cookie->volume;
+	unsigned int cookie_debug_id = cookie->debug_id;
+	bool zero;
 	int ref;
 
-	_enter("%x", cookie->debug_id);
-
-	do {
-		unsigned int cookie_debug_id = cookie->debug_id;
-		bool zero = __refcount_dec_and_test(&cookie->ref, &ref);
-
-		trace_fscache_cookie(cookie_debug_id, ref - 1, where);
-		if (!zero)
-			return;
-
-		parent = cookie->parent;
-		fscache_unhash_cookie(cookie);
+	zero = __refcount_dec_and_test(&cookie->ref, &ref);
+	trace_fscache_cookie(cookie_debug_id, ref - 1, where);
+	if (zero) {
 		fscache_free_cookie(cookie);
-
-		cookie = parent;
-		where = fscache_cookie_put_parent;
-	} while (cookie);
-
-	_leave("");
+		fscache_put_volume(volume, fscache_volume_put_cookie);
+	}
 }
 EXPORT_SYMBOL(fscache_put_cookie);
 
@@ -867,43 +780,27 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
 {
 	struct fscache_cookie *cookie;
 	unsigned int keylen = 0, auxlen = 0;
-	char _type[3], *type;
 	u8 *p;
 
 	if (v == &fscache_cookies) {
 		seq_puts(m,
-			 "COOKIE   PARENT   USAGE CHILD ACT TY FL  DEF             \n"
-			 "======== ======== ===== ===== === == === ================\n"
+			 "COOKIE   VOLUME   REF ACT ACC S FL DEF             \n"
+			 "======== ======== === === === = == ================\n"
 			 );
 		return 0;
 	}
 
 	cookie = list_entry(v, struct fscache_cookie, proc_link);
 
-	switch (cookie->type) {
-	case 0:
-		type = "IX";
-		break;
-	case 1:
-		type = "DT";
-		break;
-	default:
-		snprintf(_type, sizeof(_type), "%02u",
-			 cookie->type);
-		type = _type;
-		break;
-	}
-
 	seq_printf(m,
-		   "%08x %08x %5u %5u %3u %s %03lx %-16s",
+		   "%08x %08x %3d %3d %3d %c %02lx",
 		   cookie->debug_id,
-		   cookie->parent ? cookie->parent->debug_id : 0,
+		   cookie->volume->debug_id,
 		   refcount_read(&cookie->ref),
-		   atomic_read(&cookie->n_children),
 		   atomic_read(&cookie->n_active),
-		   type,
-		   cookie->flags,
-		   cookie->type_name);
+		   atomic_read(&cookie->n_accesses) - 1,
+		   fscache_cookie_stages[cookie->stage],
+		   cookie->flags);
 
 	keylen = cookie->key_len;
 	auxlen = cookie->aux_len;
diff --git a/fs/fscache/fsdef.c b/fs/fscache/fsdef.c
deleted file mode 100644
index 15312f15848b..000000000000
--- a/fs/fscache/fsdef.c
+++ /dev/null
@@ -1,46 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* Filesystem index definition
- *
- * Copyright (C) 2004-2007 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- */
-
-#define FSCACHE_DEBUG_LEVEL CACHE
-#include <linux/module.h>
-#include "internal.h"
-
-/*
- * The root index is owned by FS-Cache itself.
- *
- * When a netfs requests caching facilities, FS-Cache will, if one doesn't
- * already exist, create an entry in the root index with the key being the name
- * of the netfs ("AFS" for example), and the auxiliary data holding the index
- * structure version supplied by the netfs:
- *
- *				     FSDEF
- *				       |
- *				 +-----------+
- *				 |           |
- *				NFS         AFS
- *			       [v=1]       [v=1]
- *
- * If an entry with the appropriate name does already exist, the version is
- * compared.  If the version is different, the entire subtree from that entry
- * will be discarded and a new entry created.
- *
- * The new entry will be an index, and a cookie referring to it will be passed
- * to the netfs.  This is then the root handle by which the netfs accesses the
- * cache.  It can create whatever objects it likes in that index, including
- * further indices.
- */
-struct fscache_cookie fscache_fsdef_index = {
-	.debug_id	= 1,
-	.ref		= REFCOUNT_INIT(1),
-	.n_active	= ATOMIC_INIT(1),
-	.lock		= __SPIN_LOCK_UNLOCKED(fscache_fsdef_index.lock),
-	.backing_objects = HLIST_HEAD_INIT,
-	.type_name	= ".fscach",
-	.flags		= 1 << FSCACHE_COOKIE_ENABLED,
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-EXPORT_SYMBOL(fscache_fsdef_index);
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index eefcb6dfee3e..f74f7bdea633 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -1,23 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /* Internal definitions for FS-Cache
  *
- * Copyright (C) 2004-2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2004-2007, 2001 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  */
 
-/*
- * Lock order, in the order in which multiple locks should be obtained:
- * - fscache_addremove_sem
- * - cookie->lock
- * - cookie->parent->lock
- * - cache->object_list_lock
- * - object->lock
- * - object->parent->lock
- * - cookie->stores_lock
- * - fscache_thread_lock
- *
- */
-
 #ifdef pr_fmt
 #undef pr_fmt
 #endif
@@ -30,31 +17,15 @@
 #include <linux/sched.h>
 #include <linux/seq_file.h>
 
-#define FSCACHE_MIN_THREADS	4
-#define FSCACHE_MAX_THREADS	32
-
 /*
  * cache.c
  */
-extern struct list_head fscache_cache_list;
-extern struct rw_semaphore fscache_addremove_sem;
-
-extern struct fscache_cache *fscache_select_cache_for_object(
-	struct fscache_cookie *);
-
-static inline
-struct fscache_cache_tag *fscache_get_cache_tag(struct fscache_cache_tag *tag)
-{
-	if (tag)
-		refcount_inc(&tag->ref);
-	return tag;
-}
-
-static inline void fscache_put_cache_tag(struct fscache_cache_tag *tag)
-{
-	if (tag && refcount_dec_and_test(&tag->ref))
-		kfree(tag);
-}
+#ifdef CONFIG_PROC_FS
+extern const struct seq_operations fscache_caches_seq_ops;
+#endif
+bool fscache_begin_cache_access(struct fscache_cache *cache, enum fscache_access_trace why);
+void fscache_end_cache_access(struct fscache_cache *cache, enum fscache_access_trace why);
+struct fscache_cache *fscache_lookup_cache(const char *name, bool is_cache);
 
 /*
  * cookie.c
@@ -62,16 +33,9 @@ static inline void fscache_put_cache_tag(struct fscache_cache_tag *tag)
 extern struct kmem_cache *fscache_cookie_jar;
 extern const struct seq_operations fscache_cookies_seq_ops;
 
-extern void fscache_free_cookie(struct fscache_cookie *);
-extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
-						   enum fscache_cookie_type,
-						   const char *,
-						   u8,
-						   struct fscache_cache_tag *,
-						   const void *, size_t,
-						   const void *, size_t,
-						   loff_t);
-extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *);
+extern void fscache_print_cookie(struct fscache_cookie *cookie, char prefix);
+extern bool fscache_begin_cookie_access(struct fscache_cookie *cookie,
+					enum fscache_access_trace why);
 
 static inline void fscache_see_cookie(struct fscache_cookie *cookie,
 				      enum fscache_cookie_trace where)
@@ -80,34 +44,13 @@ static inline void fscache_see_cookie(struct fscache_cookie *cookie,
 			     where);
 }
 
-/*
- * fsdef.c
- */
-extern struct fscache_cookie fscache_fsdef_index;
-
 /*
  * main.c
  */
-extern unsigned fscache_defer_lookup;
-extern unsigned fscache_defer_create;
 extern unsigned fscache_debug;
-extern struct kobject *fscache_root;
-extern struct workqueue_struct *fscache_object_wq;
-extern struct workqueue_struct *fscache_op_wq;
-DECLARE_PER_CPU(wait_queue_head_t, fscache_object_cong_wait);
 
 extern unsigned int fscache_hash(unsigned int salt, unsigned int *data, unsigned int n);
 
-static inline bool fscache_object_congested(void)
-{
-	return workqueue_congested(WORK_CPU_UNBOUND, fscache_object_wq);
-}
-
-/*
- * object.c
- */
-extern void fscache_enqueue_object(struct cachefiles_object *);
-
 /*
  * proc.c
  */
@@ -123,15 +66,10 @@ extern void fscache_proc_cleanup(void);
  * stats.c
  */
 #ifdef CONFIG_FSCACHE_STATS
-extern atomic_t fscache_n_op_pend;
-extern atomic_t fscache_n_op_run;
-extern atomic_t fscache_n_op_enqueue;
-extern atomic_t fscache_n_op_deferred_release;
-extern atomic_t fscache_n_op_initialised;
-extern atomic_t fscache_n_op_release;
-extern atomic_t fscache_n_op_gc;
-extern atomic_t fscache_n_op_cancelled;
-extern atomic_t fscache_n_op_rejected;
+extern atomic_t fscache_n_volumes;
+extern atomic_t fscache_n_volumes_collision;
+extern atomic_t fscache_n_volumes_nomem;
+extern atomic_t fscache_n_cookies;
 
 extern atomic_t fscache_n_retrievals;
 extern atomic_t fscache_n_retrievals_ok;
@@ -171,36 +109,10 @@ extern atomic_t fscache_n_updates_run;
 extern atomic_t fscache_n_relinquishes;
 extern atomic_t fscache_n_relinquishes_null;
 extern atomic_t fscache_n_relinquishes_retire;
+extern atomic_t fscache_n_relinquishes_dropped;
 
-extern atomic_t fscache_n_cookie_index;
-extern atomic_t fscache_n_cookie_data;
-extern atomic_t fscache_n_cookie_special;
-
-extern atomic_t fscache_n_object_alloc;
-extern atomic_t fscache_n_object_no_alloc;
-extern atomic_t fscache_n_object_lookups;
-extern atomic_t fscache_n_object_lookups_negative;
-extern atomic_t fscache_n_object_lookups_positive;
-extern atomic_t fscache_n_object_lookups_timed_out;
-extern atomic_t fscache_n_object_created;
-extern atomic_t fscache_n_object_avail;
-extern atomic_t fscache_n_object_dead;
-
-extern atomic_t fscache_n_cop_alloc_object;
-extern atomic_t fscache_n_cop_lookup_object;
-extern atomic_t fscache_n_cop_lookup_complete;
-extern atomic_t fscache_n_cop_grab_object;
-extern atomic_t fscache_n_cop_invalidate_object;
-extern atomic_t fscache_n_cop_update_object;
-extern atomic_t fscache_n_cop_drop_object;
-extern atomic_t fscache_n_cop_put_object;
-extern atomic_t fscache_n_cop_sync_cache;
-extern atomic_t fscache_n_cop_attr_changed;
-
-extern atomic_t fscache_n_cache_no_space_reject;
-extern atomic_t fscache_n_cache_stale_objects;
-extern atomic_t fscache_n_cache_retired_objects;
-extern atomic_t fscache_n_cache_culled_objects;
+extern atomic_t fscache_n_resizes;
+extern atomic_t fscache_n_resizes_null;
 
 static inline void fscache_stat(atomic_t *stat)
 {
@@ -223,35 +135,31 @@ int fscache_stats_show(struct seq_file *m, void *v);
 #endif
 
 /*
- * raise an event on an object
- * - if the event is not masked for that object, then the object is
- *   queued for attention by the thread pool.
+ * volume.c
  */
-static inline void fscache_raise_event(struct cachefiles_object *object,
-				       unsigned event)
-{
-	BUG_ON(event >= NR_FSCACHE_OBJECT_EVENTS);
-#if 0
-	printk("*** fscache_raise_event(OBJ%d{%lx},%x)\n",
-	       object->debug_id, object->event_mask, (1 << event));
-#endif
-	if (!test_and_set_bit(event, &object->events) &&
-	    test_bit(event, &object->event_mask))
-		fscache_enqueue_object(object);
-}
+extern const struct seq_operations fscache_volumes_seq_ops;
+
+struct fscache_volume *fscache_get_volume(struct fscache_volume *volume,
+					  enum fscache_volume_trace where);
+void fscache_put_volume(struct fscache_volume *volume,
+			enum fscache_volume_trace where);
+bool fscache_begin_volume_access(struct fscache_volume *volume,
+				 enum fscache_access_trace why);
+void fscache_create_volume(struct fscache_volume *volume, bool wait);
 
 /*
  * Update the auxiliary data on a cookie.
  */
 static inline
-void fscache_update_aux(struct fscache_cookie *cookie, const void *aux_data)
+void fscache_update_aux(struct fscache_cookie *cookie,
+			const void *aux_data, const loff_t *object_size)
 {
 	void *p = fscache_get_aux(cookie);
 
-	if (p && memcmp(p, aux_data, cookie->aux_len) != 0) {
+	if (aux_data && p)
 		memcpy(p, aux_data, cookie->aux_len);
-		set_bit(FSCACHE_COOKIE_AUX_UPDATED, &cookie->flags);
-	}
+	if (object_size)
+		cookie->object_size = *object_size;
 }
 
 /*****************************************************************************/
@@ -259,7 +167,7 @@ void fscache_update_aux(struct fscache_cookie *cookie, const void *aux_data)
  * debug tracing
  */
 #define dbgprintk(FMT, ...) \
-	printk(KERN_DEBUG "[%-6.6s] "FMT"\n", current->comm, ##__VA_ARGS__)
+	printk("[%-6.6s] "FMT"\n", current->comm, ##__VA_ARGS__)
 
 #define kenter(FMT, ...) dbgprintk("==> %s("FMT")", __func__, ##__VA_ARGS__)
 #define kleave(FMT, ...) dbgprintk("<== %s()"FMT"", __func__, ##__VA_ARGS__)
@@ -312,7 +220,7 @@ do {						\
 
 #define FSCACHE_DEBUG_CACHE	0
 #define FSCACHE_DEBUG_COOKIE	1
-#define FSCACHE_DEBUG_PAGE	2
+#define FSCACHE_DEBUG_OBJECT	2
 #define FSCACHE_DEBUG_OPERATION	3
 
 #define FSCACHE_POINT_ENTER	1
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 2547892a6064..2b1c9f433530 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -4,160 +4,148 @@
  * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  */
-
-#define FSCACHE_DEBUG_LEVEL PAGE
-#include <linux/module.h>
+#define FSCACHE_DEBUG_LEVEL OPERATION
 #define FSCACHE_USE_NEW_IO_API
 #define FSCACHE_USE_FALLBACK_IO_API
 #include <linux/fscache-cache.h>
 #include <linux/uio.h>
 #include <linux/bvec.h>
 #include <linux/slab.h>
-#include <linux/netfs.h>
 #include "internal.h"
 
-/*
- * Start a cache operation.
- * - we return:
- *   -ENOMEM	- out of memory, some pages may be being read
- *   -ERESTARTSYS - interrupted, some pages may be being read
- *   -ENOBUFS	- no backing object or space available in which to cache any
- *                pages not being read
- *   -ENODATA	- no data available in the backing object for some or all of
- *                the pages
- *   0		- dispatched a read on all pages
+/**
+ * fscache_wait_for_operation - Wait for an object become accessible
+ * @cres: The cache resources for the operation being performed
+ * @want_stage: The minimum stage the object must be at
+ *
+ * See if the target cache object is at the specified minimum stage of
+ * accessibility yet, and if not, wait for it.
  */
-int __fscache_begin_operation(struct netfs_cache_resources *cres,
-			      struct fscache_cookie *cookie,
-			      bool for_write)
+bool fscache_wait_for_operation(struct netfs_cache_resources *cres,
+				enum fscache_want_stage want_stage)
 {
-#if 0
-	struct fscache_operation *op;
-	struct fscache_object *object;
-	bool wake_cookie = false;
-	int ret;
-
-	_enter("c=%08x", cres->debug_id);
-
-	if (for_write)
-		fscache_stat(&fscache_n_stores);
-	else
-		fscache_stat(&fscache_n_retrievals);
+	struct fscache_cookie *cookie = fscache_cres_cookie(cres);
+	enum fscache_cookie_stage stage;
 
-	if (hlist_empty(&cookie->backing_objects))
-		goto nobufs;
+again:
+	if (!fscache_cache_is_live(cookie->volume->cache)) {
+		_leave(" [broken]");
+		return false;
+	}
 
-	if (test_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags)) {
-		_leave(" = -ENOBUFS [invalidating]");
-		return -ENOBUFS;
+	stage = READ_ONCE(cookie->stage);
+	_enter("c=%08x{%u},%x", cookie->debug_id, stage, want_stage);
+
+	switch (stage) {
+	case FSCACHE_COOKIE_STAGE_CREATING:
+	case FSCACHE_COOKIE_STAGE_INVALIDATING:
+		if (want_stage == FSCACHE_WANT_PARAMS)
+			goto ready; /* There can be no content */
+		fallthrough;
+	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+		wait_var_event(&cookie->stage, READ_ONCE(cookie->stage) != stage);
+		goto again;
+
+	case FSCACHE_COOKIE_STAGE_ACTIVE:
+		goto ready;
+	case FSCACHE_COOKIE_STAGE_DROPPED:
+	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
+	default:
+		_leave(" [not live]");
+		return false;
 	}
 
-	ASSERTCMP(cookie->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
+ready:
+	if (!cres->cache_priv2)
+		return cookie->volume->cache->ops->begin_operation(cres, want_stage);
+	return true;
+}
+EXPORT_SYMBOL(fscache_wait_for_operation);
 
-	if (fscache_wait_for_deferred_lookup(cookie) < 0)
-		return -ERESTARTSYS;
+/*
+ * Begin an I/O operation on the cache, waiting till we reach the right state.
+ *
+ * Attaches the resources required to the operation resources record.
+ */
+static int fscache_begin_operation(struct netfs_cache_resources *cres,
+				   struct fscache_cookie *cookie,
+				   enum fscache_want_stage want_stage,
+				   enum fscache_access_trace why)
+{
+	enum fscache_cookie_stage stage;
+	long timeo;
+	bool once_only = false;
 
-	op = kzalloc(sizeof(*op), GFP_KERNEL);
-	if (!op)
-		return -ENOMEM;
+	cres->ops		= NULL;
+	cres->cache_priv	= cookie;
+	cres->cache_priv2	= NULL;
+	cres->debug_id		= cookie->debug_id;
 
-	fscache_operation_init(cookie, op, NULL, NULL, NULL);
-	op->flags = FSCACHE_OP_MYTHREAD |
-		(1UL << FSCACHE_OP_WAITING) |
-		(1UL << FSCACHE_OP_UNUSE_COOKIE);
+	if (!fscache_begin_cookie_access(cookie, why))
+		return -ENOBUFS;
 
+again:
 	spin_lock(&cookie->lock);
 
-	if (!fscache_cookie_enabled(cookie) ||
-	    hlist_empty(&cookie->backing_objects))
-		goto nobufs_unlock;
-	object = hlist_entry(cookie->backing_objects.first,
-			     struct fscache_object, cookie_link);
-
-	__fscache_use_cookie(cookie);
-	atomic_inc(&object->n_reads);
-	__set_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags);
+	stage = cookie->stage;
+	_enter("c=%08x{%u},%x", cookie->debug_id, stage, want_stage);
+
+	switch (stage) {
+	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+		goto wait_and_validate;
+	case FSCACHE_COOKIE_STAGE_INVALIDATING:
+	case FSCACHE_COOKIE_STAGE_CREATING:
+		if (want_stage == FSCACHE_WANT_PARAMS)
+			goto ready; /* There can be no content */
+		goto wait_and_validate;
+	case FSCACHE_COOKIE_STAGE_ACTIVE:
+		goto ready;
+	case FSCACHE_COOKIE_STAGE_DROPPED:
+	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
+		WARN(1, "Can't use cookie in stage %u\n", cookie->stage);
+		goto not_live;
+	default:
+		goto not_live;
+	}
 
-	if (fscache_submit_op(object, op) < 0)
-		goto nobufs_unlock_dec;
+ready:
 	spin_unlock(&cookie->lock);
+	if (!cookie->volume->cache->ops->begin_operation(cres, want_stage))
+		goto failed;
+	return 0;
 
-	/* we wait for the operation to become active, and then process it
-	 * *here*, in this thread, and not in the thread pool */
-	if (for_write) {
-		fscache_stat(&fscache_n_store_ops);
-
-		ret = fscache_wait_for_operation_activation(
-			object, op,
-			__fscache_stat(&fscache_n_store_op_waits),
-			__fscache_stat(&fscache_n_stores_object_dead));
-	} else {
-		fscache_stat(&fscache_n_retrieval_ops);
-
-		ret = fscache_wait_for_operation_activation(
-			object, op,
-			__fscache_stat(&fscache_n_retrieval_op_waits),
-			__fscache_stat(&fscache_n_retrievals_object_dead));
-	}
-	if (ret < 0)
-		goto error;
-
-	/* ask the cache to honour the operation */
-	ret = object->cache->ops->begin_operation(cres, op);
-
-error:
-	if (for_write) {
-		if (ret == -ENOMEM)
-			fscache_stat(&fscache_n_stores_oom);
-		else if (ret == -ERESTARTSYS)
-			fscache_stat(&fscache_n_stores_intr);
-		else if (ret < 0)
-			fscache_stat(&fscache_n_stores_nobufs);
-		else
-			fscache_stat(&fscache_n_stores_ok);
-	} else {
-		if (ret == -ENOMEM)
-			fscache_stat(&fscache_n_retrievals_nomem);
-		else if (ret == -ERESTARTSYS)
-			fscache_stat(&fscache_n_retrievals_intr);
-		else if (ret == -ENODATA)
-			fscache_stat(&fscache_n_retrievals_nodata);
-		else if (ret < 0)
-			fscache_stat(&fscache_n_retrievals_nobufs);
-		else
-			fscache_stat(&fscache_n_retrievals_ok);
+wait_and_validate:
+	spin_unlock(&cookie->lock);
+	trace_fscache_access(cookie->debug_id, refcount_read(&cookie->ref),
+			     atomic_read(&cookie->n_accesses),
+			     fscache_access_io_wait);
+	timeo = wait_var_event_timeout(&cookie->stage,
+				       READ_ONCE(cookie->stage) != stage, 20 * HZ);
+	if (timeo <= 1 && !once_only) {
+		pr_warn("%s: cookie stage change wait timed out: cookie->stage=%u stage=%u",
+			__func__, READ_ONCE(cookie->stage), stage);
+		fscache_print_cookie(cookie, 'O');
+		once_only = true;
 	}
+	goto again;
 
-	fscache_put_operation(op);
-	_leave(" = %d", ret);
-	return ret;
-
-nobufs_unlock_dec:
-	atomic_dec(&object->n_reads);
-	wake_cookie = __fscache_unuse_cookie(cookie);
-nobufs_unlock:
+not_live:
 	spin_unlock(&cookie->lock);
-	fscache_put_operation(op);
-	if (wake_cookie)
-		__fscache_wake_unused_cookie(cookie);
-nobufs:
-	if (for_write)
-		fscache_stat(&fscache_n_stores_nobufs);
-	else
-		fscache_stat(&fscache_n_retrievals_nobufs);
-#endif
+failed:
+	cres->cache_priv = NULL;
+	cres->ops = NULL;
+	fscache_end_cookie_access(cookie, fscache_access_io_not_live);
 	_leave(" = -ENOBUFS");
 	return -ENOBUFS;
 }
-EXPORT_SYMBOL(__fscache_begin_operation);
 
-/*
- * Clean up an operation.
- */
-static void fscache_end_operation(struct netfs_cache_resources *cres)
+int __fscache_begin_read_operation(struct netfs_cache_resources *cres,
+				   struct fscache_cookie *cookie)
 {
-	cres->ops->end_operation(cres);
+	return fscache_begin_operation(cres, cookie, FSCACHE_WANT_PARAMS,
+				       fscache_access_io_read);
 }
+EXPORT_SYMBOL(__fscache_begin_read_operation);
 
 /*
  * Fallback page reading interface.
@@ -177,7 +165,8 @@ int __fscache_fallback_read_page(struct fscache_cookie *cookie, struct page *pag
 	bvec[0].bv_len		= PAGE_SIZE;
 	iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
 
-	ret = fscache_begin_read_operation(&cres, cookie);
+	ret = fscache_begin_operation(&cres, cookie, FSCACHE_WANT_READ,
+				      fscache_access_io_write);
 	if (ret < 0)
 		return ret;
 
@@ -207,7 +196,8 @@ int __fscache_fallback_write_page(struct fscache_cookie *cookie, struct page *pa
 	bvec[0].bv_len		= PAGE_SIZE;
 	iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
 
-	ret = __fscache_begin_operation(&cres, cookie, true);
+	ret = fscache_begin_operation(&cres, cookie, FSCACHE_WANT_WRITE,
+				      fscache_access_io_write);
 	if (ret < 0)
 		return ret;
 
diff --git a/fs/fscache/main.c b/fs/fscache/main.c
index 4207f98e405f..ba23745146cf 100644
--- a/fs/fscache/main.c
+++ b/fs/fscache/main.c
@@ -8,10 +8,6 @@
 #define FSCACHE_DEBUG_LEVEL CACHE
 #include <linux/module.h>
 #include <linux/init.h>
-#include <linux/sched.h>
-#include <linux/completion.h>
-#include <linux/slab.h>
-#include <linux/seq_file.h>
 #define CREATE_TRACE_POINTS
 #include "internal.h"
 
@@ -19,79 +15,18 @@ MODULE_DESCRIPTION("FS Cache Manager");
 MODULE_AUTHOR("Red Hat, Inc.");
 MODULE_LICENSE("GPL");
 
-unsigned fscache_defer_lookup = 1;
-module_param_named(defer_lookup, fscache_defer_lookup, uint,
-		   S_IWUSR | S_IRUGO);
-MODULE_PARM_DESC(fscache_defer_lookup,
-		 "Defer cookie lookup to background thread");
-
-unsigned fscache_defer_create = 1;
-module_param_named(defer_create, fscache_defer_create, uint,
-		   S_IWUSR | S_IRUGO);
-MODULE_PARM_DESC(fscache_defer_create,
-		 "Defer cookie creation to background thread");
-
 unsigned fscache_debug;
 module_param_named(debug, fscache_debug, uint,
 		   S_IWUSR | S_IRUGO);
 MODULE_PARM_DESC(fscache_debug,
 		 "FS-Cache debugging mask");
 
-struct kobject *fscache_root;
-struct workqueue_struct *fscache_object_wq;
-struct workqueue_struct *fscache_op_wq;
-
-DEFINE_PER_CPU(wait_queue_head_t, fscache_object_cong_wait);
+EXPORT_TRACEPOINT_SYMBOL(fscache_access_cache);
+EXPORT_TRACEPOINT_SYMBOL(fscache_access_volume);
+EXPORT_TRACEPOINT_SYMBOL(fscache_access);
 
-/* these values serve as lower bounds, will be adjusted in fscache_init() */
-static unsigned fscache_object_max_active = 4;
-static unsigned fscache_op_max_active = 2;
-
-#ifdef CONFIG_SYSCTL
-static struct ctl_table_header *fscache_sysctl_header;
-
-static int fscache_max_active_sysctl(struct ctl_table *table, int write,
-				     void *buffer, size_t *lenp, loff_t *ppos)
-{
-	struct workqueue_struct **wqp = table->extra1;
-	unsigned int *datap = table->data;
-	int ret;
-
-	ret = proc_dointvec(table, write, buffer, lenp, ppos);
-	if (ret == 0)
-		workqueue_set_max_active(*wqp, *datap);
-	return ret;
-}
-
-static struct ctl_table fscache_sysctls[] = {
-	{
-		.procname	= "object_max_active",
-		.data		= &fscache_object_max_active,
-		.maxlen		= sizeof(unsigned),
-		.mode		= 0644,
-		.proc_handler	= fscache_max_active_sysctl,
-		.extra1		= &fscache_object_wq,
-	},
-	{
-		.procname	= "operation_max_active",
-		.data		= &fscache_op_max_active,
-		.maxlen		= sizeof(unsigned),
-		.mode		= 0644,
-		.proc_handler	= fscache_max_active_sysctl,
-		.extra1		= &fscache_op_wq,
-	},
-	{}
-};
-
-static struct ctl_table fscache_sysctls_root[] = {
-	{
-		.procname	= "fscache",
-		.mode		= 0555,
-		.child		= fscache_sysctls,
-	},
-	{}
-};
-#endif
+struct workqueue_struct *fscache_wq;
+EXPORT_SYMBOL(fscache_wq);
 
 /*
  * Mixing scores (in bits) for (7,20):
@@ -137,44 +72,16 @@ unsigned int fscache_hash(unsigned int salt, unsigned int *data, unsigned int n)
  */
 static int __init fscache_init(void)
 {
-	unsigned int nr_cpus = num_possible_cpus();
-	unsigned int cpu;
-	int ret;
-
-	fscache_object_max_active =
-		clamp_val(nr_cpus,
-			  fscache_object_max_active, WQ_UNBOUND_MAX_ACTIVE);
-
-	ret = -ENOMEM;
-	fscache_object_wq = alloc_workqueue("fscache_object", WQ_UNBOUND,
-					    fscache_object_max_active);
-	if (!fscache_object_wq)
-		goto error_object_wq;
-
-	fscache_op_max_active =
-		clamp_val(fscache_object_max_active / 2,
-			  fscache_op_max_active, WQ_UNBOUND_MAX_ACTIVE);
+	int ret = -ENOMEM;
 
-	ret = -ENOMEM;
-	fscache_op_wq = alloc_workqueue("fscache_operation", WQ_UNBOUND,
-					fscache_op_max_active);
-	if (!fscache_op_wq)
-		goto error_op_wq;
-
-	for_each_possible_cpu(cpu)
-		init_waitqueue_head(&per_cpu(fscache_object_cong_wait, cpu));
+	fscache_wq = alloc_workqueue("fscache", WQ_UNBOUND | WQ_FREEZABLE, 0);
+	if (!fscache_wq)
+		goto error_wq;
 
 	ret = fscache_proc_init();
 	if (ret < 0)
 		goto error_proc;
 
-#ifdef CONFIG_SYSCTL
-	ret = -ENOMEM;
-	fscache_sysctl_header = register_sysctl_table(fscache_sysctls_root);
-	if (!fscache_sysctl_header)
-		goto error_sysctl;
-#endif
-
 	fscache_cookie_jar = kmem_cache_create("fscache_cookie_jar",
 					       sizeof(struct fscache_cookie),
 					       0, 0, NULL);
@@ -184,26 +91,14 @@ static int __init fscache_init(void)
 		goto error_cookie_jar;
 	}
 
-	fscache_root = kobject_create_and_add("fscache", kernel_kobj);
-	if (!fscache_root)
-		goto error_kobj;
-
 	pr_notice("Loaded\n");
 	return 0;
 
-error_kobj:
-	kmem_cache_destroy(fscache_cookie_jar);
 error_cookie_jar:
-#ifdef CONFIG_SYSCTL
-	unregister_sysctl_table(fscache_sysctl_header);
-error_sysctl:
-#endif
 	fscache_proc_cleanup();
 error_proc:
-	destroy_workqueue(fscache_op_wq);
-error_op_wq:
-	destroy_workqueue(fscache_object_wq);
-error_object_wq:
+	destroy_workqueue(fscache_wq);
+error_wq:
 	return ret;
 }
 
@@ -216,14 +111,9 @@ static void __exit fscache_exit(void)
 {
 	_enter("");
 
-	kobject_put(fscache_root);
 	kmem_cache_destroy(fscache_cookie_jar);
-#ifdef CONFIG_SYSCTL
-	unregister_sysctl_table(fscache_sysctl_header);
-#endif
 	fscache_proc_cleanup();
-	destroy_workqueue(fscache_op_wq);
-	destroy_workqueue(fscache_object_wq);
+	destroy_workqueue(fscache_wq);
 	pr_notice("Unloaded\n");
 }
 
diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
deleted file mode 100644
index d746365f1daf..000000000000
--- a/fs/fscache/netfs.c
+++ /dev/null
@@ -1,76 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* FS-Cache netfs (client) registration
- *
- * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- */
-
-#define FSCACHE_DEBUG_LEVEL COOKIE
-#include <linux/module.h>
-#include <linux/slab.h>
-#include "internal.h"
-
-/*
- * register a network filesystem for caching
- */
-int __fscache_register_netfs(struct fscache_netfs *netfs)
-{
-	struct fscache_cookie *candidate, *cookie;
-
-	_enter("{%s}", netfs->name);
-
-	/* allocate a cookie for the primary index */
-	candidate = fscache_alloc_cookie(&fscache_fsdef_index,
-					 FSCACHE_COOKIE_TYPE_INDEX,
-					 ".netfs",
-					 0, NULL,
-					 netfs->name, strlen(netfs->name),
-					 &netfs->version, sizeof(netfs->version),
-					 0);
-	if (!candidate) {
-		_leave(" = -ENOMEM");
-		return -ENOMEM;
-	}
-
-	candidate->flags = 1 << FSCACHE_COOKIE_ENABLED;
-
-	/* check the netfs type is not already present */
-	cookie = fscache_hash_cookie(candidate);
-	if (!cookie)
-		goto already_registered;
-	if (cookie != candidate) {
-		trace_fscache_cookie(candidate->debug_id, 1, fscache_cookie_discard);
-		fscache_free_cookie(candidate);
-	}
-
-	fscache_get_cookie(cookie->parent, fscache_cookie_get_register_netfs);
-	atomic_inc(&cookie->parent->n_children);
-
-	netfs->primary_index = cookie;
-
-	pr_notice("Netfs '%s' registered for caching\n", netfs->name);
-	trace_fscache_netfs(netfs);
-	_leave(" = 0");
-	return 0;
-
-already_registered:
-	fscache_put_cookie(candidate, fscache_cookie_put_dup_netfs);
-	_leave(" = -EEXIST");
-	return -EEXIST;
-}
-EXPORT_SYMBOL(__fscache_register_netfs);
-
-/*
- * unregister a network filesystem from the cache
- * - all cookies must have been released first
- */
-void __fscache_unregister_netfs(struct fscache_netfs *netfs)
-{
-	_enter("{%s.%u}", netfs->name, netfs->version);
-
-	fscache_relinquish_cookie(netfs->primary_index, NULL, false);
-	pr_notice("Netfs '%s' unregistered from caching\n", netfs->name);
-
-	_leave("");
-}
-EXPORT_SYMBOL(__fscache_unregister_netfs);
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
deleted file mode 100644
index e653d0194f71..000000000000
--- a/fs/fscache/object.c
+++ /dev/null
@@ -1,973 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* FS-Cache object state machine handler
- *
- * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- *
- * See Documentation/filesystems/caching/object.rst for a description of the
- * object state machine and the in-kernel representations.
- */
-
-#define FSCACHE_DEBUG_LEVEL COOKIE
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/prefetch.h>
-#include "internal.h"
-
-static const struct fscache_state *fscache_abort_initialisation(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_kill_dependents(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_drop_object(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_initialise_object(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_invalidate_object(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_jumpstart_dependents(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_kill_object(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_lookup_failure(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_look_up_object(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_object_available(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_parent_ready(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_update_object(struct cachefiles_object *, int);
-static const struct fscache_state *fscache_object_dead(struct cachefiles_object *, int);
-
-#define __STATE_NAME(n) fscache_osm_##n
-#define STATE(n) (&__STATE_NAME(n))
-
-/*
- * Define a work state.  Work states are execution states.  No event processing
- * is performed by them.  The function attached to a work state returns a
- * pointer indicating the next state to which the state machine should
- * transition.  Returning NO_TRANSIT repeats the current state, but goes back
- * to the scheduler first.
- */
-#define WORK_STATE(n, sn, f) \
-	const struct fscache_state __STATE_NAME(n) = {			\
-		.name = #n,						\
-		.short_name = sn,					\
-		.work = f						\
-	}
-
-/*
- * Returns from work states.
- */
-#define transit_to(state) ({ prefetch(&STATE(state)->work); STATE(state); })
-
-#define NO_TRANSIT ((struct fscache_state *)NULL)
-
-/*
- * Define a wait state.  Wait states are event processing states.  No execution
- * is performed by them.  Wait states are just tables of "if event X occurs,
- * clear it and transition to state Y".  The dispatcher returns to the
- * scheduler if none of the events in which the wait state has an interest are
- * currently pending.
- */
-#define WAIT_STATE(n, sn, ...) \
-	const struct fscache_state __STATE_NAME(n) = {			\
-		.name = #n,						\
-		.short_name = sn,					\
-		.work = NULL,						\
-		.transitions = { __VA_ARGS__, { 0, NULL } }		\
-	}
-
-#define TRANSIT_TO(state, emask) \
-	{ .events = (emask), .transit_to = STATE(state) }
-
-/*
- * The object state machine.
- */
-static WORK_STATE(INIT_OBJECT,		"INIT", fscache_initialise_object);
-static WORK_STATE(PARENT_READY,		"PRDY", fscache_parent_ready);
-static WORK_STATE(ABORT_INIT,		"ABRT", fscache_abort_initialisation);
-static WORK_STATE(LOOK_UP_OBJECT,	"LOOK", fscache_look_up_object);
-static WORK_STATE(OBJECT_AVAILABLE,	"AVBL", fscache_object_available);
-static WORK_STATE(JUMPSTART_DEPS,	"JUMP", fscache_jumpstart_dependents);
-
-static WORK_STATE(INVALIDATE_OBJECT,	"INVL", fscache_invalidate_object);
-static WORK_STATE(UPDATE_OBJECT,	"UPDT", fscache_update_object);
-
-static WORK_STATE(LOOKUP_FAILURE,	"LCFL", fscache_lookup_failure);
-static WORK_STATE(KILL_OBJECT,		"KILL", fscache_kill_object);
-static WORK_STATE(KILL_DEPENDENTS,	"KDEP", fscache_kill_dependents);
-static WORK_STATE(DROP_OBJECT,		"DROP", fscache_drop_object);
-static WORK_STATE(OBJECT_DEAD,		"DEAD", fscache_object_dead);
-
-static WAIT_STATE(WAIT_FOR_INIT,	"?INI",
-		  TRANSIT_TO(INIT_OBJECT,	1 << FSCACHE_OBJECT_EV_NEW_CHILD));
-
-static WAIT_STATE(WAIT_FOR_PARENT,	"?PRN",
-		  TRANSIT_TO(PARENT_READY,	1 << FSCACHE_OBJECT_EV_PARENT_READY));
-
-static WAIT_STATE(WAIT_FOR_CMD,		"?CMD",
-		  TRANSIT_TO(INVALIDATE_OBJECT,	1 << FSCACHE_OBJECT_EV_INVALIDATE),
-		  TRANSIT_TO(UPDATE_OBJECT,	1 << FSCACHE_OBJECT_EV_UPDATE),
-		  TRANSIT_TO(JUMPSTART_DEPS,	1 << FSCACHE_OBJECT_EV_NEW_CHILD));
-
-static WAIT_STATE(WAIT_FOR_CLEARANCE,	"?CLR",
-		  TRANSIT_TO(KILL_OBJECT,	1 << FSCACHE_OBJECT_EV_CLEARED));
-
-/*
- * Out-of-band event transition tables.  These are for handling unexpected
- * events, such as an I/O error.  If an OOB event occurs, the state machine
- * clears and disables the event and forces a transition to the nominated work
- * state (acurrently executing work states will complete first).
- *
- * In such a situation, object->state remembers the state the machine should
- * have been in/gone to and returning NO_TRANSIT returns to that.
- */
-static const struct fscache_transition fscache_osm_init_oob[] = {
-	   TRANSIT_TO(ABORT_INIT,
-		      (1 << FSCACHE_OBJECT_EV_ERROR) |
-		      (1 << FSCACHE_OBJECT_EV_KILL)),
-	   { 0, NULL }
-};
-
-static const struct fscache_transition fscache_osm_lookup_oob[] = {
-	   TRANSIT_TO(LOOKUP_FAILURE,
-		      (1 << FSCACHE_OBJECT_EV_ERROR) |
-		      (1 << FSCACHE_OBJECT_EV_KILL)),
-	   { 0, NULL }
-};
-
-static const struct fscache_transition fscache_osm_run_oob[] = {
-	   TRANSIT_TO(KILL_OBJECT,
-		      (1 << FSCACHE_OBJECT_EV_ERROR) |
-		      (1 << FSCACHE_OBJECT_EV_KILL)),
-	   { 0, NULL }
-};
-
-static int  fscache_get_object(struct cachefiles_object *,
-			       enum fscache_obj_ref_trace);
-static void fscache_put_object(struct cachefiles_object *,
-			       enum fscache_obj_ref_trace);
-static bool fscache_enqueue_dependents(struct cachefiles_object *, int);
-static void fscache_dequeue_object(struct cachefiles_object *);
-static void fscache_update_aux_data(struct cachefiles_object *);
-
-/*
- * we need to notify the parent when an op completes that we had outstanding
- * upon it
- */
-static inline void fscache_done_parent_op(struct cachefiles_object *object)
-{
-	struct cachefiles_object *parent = object->parent;
-
-	_enter("OBJ%x {OBJ%x,%x}",
-	       object->debug_id, parent->debug_id, parent->n_ops);
-
-	spin_lock_nested(&parent->lock, 1);
-	parent->n_obj_ops--;
-	parent->n_ops--;
-	if (parent->n_ops == 0)
-		fscache_raise_event(parent, FSCACHE_OBJECT_EV_CLEARED);
-	spin_unlock(&parent->lock);
-}
-
-/*
- * Object state machine dispatcher.
- */
-static void fscache_object_sm_dispatcher(struct cachefiles_object *object)
-{
-	const struct fscache_transition *t;
-	const struct fscache_state *state, *new_state;
-	unsigned long events, event_mask;
-	bool oob;
-	int event = -1;
-
-	ASSERT(object != NULL);
-
-	_enter("{OBJ%x,%s,%lx}",
-	       object->debug_id, object->state->name, object->events);
-
-	event_mask = object->event_mask;
-restart:
-	object->event_mask = 0; /* Mask normal event handling */
-	state = object->state;
-restart_masked:
-	events = object->events;
-
-	/* Handle any out-of-band events (typically an error) */
-	if (events & object->oob_event_mask) {
-		_debug("{OBJ%x} oob %lx",
-		       object->debug_id, events & object->oob_event_mask);
-		oob = true;
-		for (t = object->oob_table; t->events; t++) {
-			if (events & t->events) {
-				state = t->transit_to;
-				ASSERT(state->work != NULL);
-				event = fls(events & t->events) - 1;
-				__clear_bit(event, &object->oob_event_mask);
-				clear_bit(event, &object->events);
-				goto execute_work_state;
-			}
-		}
-	}
-	oob = false;
-
-	/* Wait states are just transition tables */
-	if (!state->work) {
-		if (events & event_mask) {
-			for (t = state->transitions; t->events; t++) {
-				if (events & t->events) {
-					new_state = t->transit_to;
-					event = fls(events & t->events) - 1;
-					trace_fscache_osm(object, state,
-							  true, false, event);
-					clear_bit(event, &object->events);
-					_debug("{OBJ%x} ev %d: %s -> %s",
-					       object->debug_id, event,
-					       state->name, new_state->name);
-					object->state = state = new_state;
-					goto execute_work_state;
-				}
-			}
-
-			/* The event mask didn't include all the tabled bits */
-			BUG();
-		}
-		/* Randomly woke up */
-		goto unmask_events;
-	}
-
-execute_work_state:
-	_debug("{OBJ%x} exec %s", object->debug_id, state->name);
-
-	trace_fscache_osm(object, state, false, oob, event);
-	new_state = state->work(object, event);
-	event = -1;
-	if (new_state == NO_TRANSIT) {
-		_debug("{OBJ%x} %s notrans", object->debug_id, state->name);
-		if (unlikely(state == STATE(OBJECT_DEAD))) {
-			_leave(" [dead]");
-			return;
-		}
-		fscache_enqueue_object(object);
-		event_mask = object->oob_event_mask;
-		goto unmask_events;
-	}
-
-	_debug("{OBJ%x} %s -> %s",
-	       object->debug_id, state->name, new_state->name);
-	object->state = state = new_state;
-
-	if (state->work) {
-		if (unlikely(state == STATE(OBJECT_DEAD))) {
-			_leave(" [dead]");
-			return;
-		}
-		goto restart_masked;
-	}
-
-	/* Transited to wait state */
-	event_mask = object->oob_event_mask;
-	for (t = state->transitions; t->events; t++)
-		event_mask |= t->events;
-
-unmask_events:
-	object->event_mask = event_mask;
-	smp_mb();
-	events = object->events;
-	if (events & event_mask)
-		goto restart;
-	_leave(" [msk %lx]", event_mask);
-}
-
-/*
- * execute an object
- */
-static void fscache_object_work_func(struct work_struct *work)
-{
-	struct cachefiles_object *object =
-		container_of(work, struct cachefiles_object, work);
-
-	_enter("{OBJ%x}", object->debug_id);
-
-	fscache_object_sm_dispatcher(object);
-	fscache_put_object(object, fscache_obj_put_work);
-}
-
-/**
- * fscache_object_init - Initialise a cache object description
- * @object: Object description
- * @cookie: Cookie object will be attached to
- * @cache: Cache in which backing object will be found
- *
- * Initialise a cache object description to its basic values.
- *
- * See Documentation/filesystems/caching/backend-api.rst for a complete
- * description.
- */
-void fscache_object_init(struct cachefiles_object *object,
-			 struct fscache_cookie *cookie,
-			 struct fscache_cache *cache)
-{
-	const struct fscache_transition *t;
-
-	atomic_inc(&cache->object_count);
-
-	object->state = STATE(WAIT_FOR_INIT);
-	object->oob_table = fscache_osm_init_oob;
-	object->flags = 1 << FSCACHE_OBJECT_IS_LIVE;
-	spin_lock_init(&object->lock);
-	INIT_LIST_HEAD(&object->cache_link);
-	INIT_HLIST_NODE(&object->cookie_link);
-	INIT_WORK(&object->work, fscache_object_work_func);
-	INIT_LIST_HEAD(&object->dependents);
-	INIT_LIST_HEAD(&object->dep_link);
-	object->n_children = 0;
-	object->n_ops = 0;
-	object->events = 0;
-	object->cache = cache;
-	object->cookie = cookie;
-	fscache_get_cookie(cookie, fscache_cookie_get_attach_object);
-	object->parent = NULL;
-#ifdef CONFIG_FSCACHE_OBJECT_LIST
-	RB_CLEAR_NODE(&object->objlist_link);
-#endif
-
-	object->oob_event_mask = 0;
-	for (t = object->oob_table; t->events; t++)
-		object->oob_event_mask |= t->events;
-	object->event_mask = object->oob_event_mask;
-	for (t = object->state->transitions; t->events; t++)
-		object->event_mask |= t->events;
-}
-EXPORT_SYMBOL(fscache_object_init);
-
-/*
- * Mark the object as no longer being live, making sure that we synchronise
- * against op submission.
- */
-static inline void fscache_mark_object_dead(struct cachefiles_object *object)
-{
-	spin_lock(&object->lock);
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
-	spin_unlock(&object->lock);
-}
-
-/*
- * Abort object initialisation before we start it.
- */
-static const struct fscache_state *fscache_abort_initialisation(struct cachefiles_object *object,
-								int event)
-{
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	object->oob_event_mask = 0;
-	fscache_dequeue_object(object);
-	return transit_to(KILL_OBJECT);
-}
-
-/*
- * initialise an object
- * - check the specified object's parent to see if we can make use of it
- *   immediately to do a creation
- * - we may need to start the process of creating a parent and we need to wait
- *   for the parent's lookup and creation to complete if it's not there yet
- */
-static const struct fscache_state *fscache_initialise_object(struct cachefiles_object *object,
-							     int event)
-{
-	struct cachefiles_object *parent;
-	bool success;
-
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	ASSERT(list_empty(&object->dep_link));
-
-	parent = object->parent;
-	if (!parent) {
-		_leave(" [no parent]");
-		return transit_to(DROP_OBJECT);
-	}
-
-	_debug("parent: %s of:%lx", parent->state->name, parent->flags);
-
-	if (fscache_object_is_dying(parent)) {
-		_leave(" [bad parent]");
-		return transit_to(DROP_OBJECT);
-	}
-
-	if (fscache_object_is_available(parent)) {
-		_leave(" [ready]");
-		return transit_to(PARENT_READY);
-	}
-
-	_debug("wait");
-
-	spin_lock(&parent->lock);
-	fscache_stat(&fscache_n_cop_grab_object);
-	success = false;
-	if (fscache_object_is_live(parent) &&
-	    object->cache->ops->grab_object(object, fscache_obj_get_add_to_deps)) {
-		list_add(&object->dep_link, &parent->dependents);
-		success = true;
-	}
-	fscache_stat_d(&fscache_n_cop_grab_object);
-	spin_unlock(&parent->lock);
-	if (!success) {
-		_leave(" [grab failed]");
-		return transit_to(DROP_OBJECT);
-	}
-
-	/* fscache_acquire_non_index_cookie() uses this
-	 * to wake the chain up */
-	fscache_raise_event(parent, FSCACHE_OBJECT_EV_NEW_CHILD);
-	_leave(" [wait]");
-	return transit_to(WAIT_FOR_PARENT);
-}
-
-/*
- * Once the parent object is ready, we should kick off our lookup op.
- */
-static const struct fscache_state *fscache_parent_ready(struct cachefiles_object *object,
-							int event)
-{
-	struct cachefiles_object *parent = object->parent;
-
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	ASSERT(parent != NULL);
-
-	spin_lock(&parent->lock);
-	parent->n_ops++;
-	parent->n_obj_ops++;
-	spin_unlock(&parent->lock);
-
-	_leave("");
-	return transit_to(LOOK_UP_OBJECT);
-}
-
-/*
- * look an object up in the cache from which it was allocated
- * - we hold an "access lock" on the parent object, so the parent object cannot
- *   be withdrawn by either party till we've finished
- */
-static const struct fscache_state *fscache_look_up_object(struct cachefiles_object *object,
-							  int event)
-{
-	struct fscache_cookie *cookie = object->cookie;
-	struct cachefiles_object *parent = object->parent;
-	int ret;
-
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	object->oob_table = fscache_osm_lookup_oob;
-
-	ASSERT(parent != NULL);
-	ASSERTCMP(parent->n_ops, >, 0);
-	ASSERTCMP(parent->n_obj_ops, >, 0);
-
-	/* make sure the parent is still available */
-	ASSERT(fscache_object_is_available(parent));
-
-	if (fscache_object_is_dying(parent) ||
-	    test_bit(FSCACHE_IOERROR, &object->cache->flags) ||
-	    !fscache_use_cookie(object)) {
-		_leave(" [unavailable]");
-		return transit_to(LOOKUP_FAILURE);
-	}
-
-	_debug("LOOKUP \"%s\" in \"%s\"",
-	       cookie->type_name, object->cache->tag->name);
-
-	fscache_stat(&fscache_n_object_lookups);
-	fscache_stat(&fscache_n_cop_lookup_object);
-	ret = object->cache->ops->lookup_object(object);
-	fscache_stat_d(&fscache_n_cop_lookup_object);
-
-	fscache_unuse_cookie(object);
-
-	if (ret == -ETIMEDOUT) {
-		/* probably stuck behind another object, so move this one to
-		 * the back of the queue */
-		fscache_stat(&fscache_n_object_lookups_timed_out);
-		_leave(" [timeout]");
-		return NO_TRANSIT;
-	}
-
-	if (ret < 0) {
-		_leave(" [error]");
-		return transit_to(LOOKUP_FAILURE);
-	}
-
-	_leave(" [ok]");
-	return transit_to(OBJECT_AVAILABLE);
-}
-
-/**
- * fscache_object_lookup_negative - Note negative cookie lookup
- * @object: Object pointing to cookie to mark
- *
- * Note negative lookup, permitting those waiting to read data from an already
- * existing backing object to continue as there's no data for them to read.
- */
-void fscache_object_lookup_negative(struct cachefiles_object *object)
-{
-	struct fscache_cookie *cookie = object->cookie;
-
-	_enter("{OBJ%x,%s}", object->debug_id, object->state->name);
-
-	if (!test_and_set_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
-		fscache_stat(&fscache_n_object_lookups_negative);
-
-		/* Allow write requests to begin stacking up and read requests to begin
-		 * returning ENODATA.
-		 */
-		set_bit(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
-		clear_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
-
-		clear_bit_unlock(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-	}
-	_leave("");
-}
-EXPORT_SYMBOL(fscache_object_lookup_negative);
-
-/**
- * fscache_obtained_object - Note successful object lookup or creation
- * @object: Object pointing to cookie to mark
- *
- * Note successful lookup and/or creation, permitting those waiting to write
- * data to a backing object to continue.
- *
- * Note that after calling this, an object's cookie may be relinquished by the
- * netfs, and so must be accessed with object lock held.
- */
-void fscache_obtained_object(struct cachefiles_object *object)
-{
-	struct fscache_cookie *cookie = object->cookie;
-
-	_enter("{OBJ%x,%s}", object->debug_id, object->state->name);
-
-	/* if we were still looking up, then we must have a positive lookup
-	 * result, in which case there may be data available */
-	if (!test_and_set_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
-		fscache_stat(&fscache_n_object_lookups_positive);
-
-		/* We do (presumably) have data */
-		clear_bit_unlock(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
-		clear_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
-
-		/* Allow write requests to begin stacking up and read requests
-		 * to begin shovelling data.
-		 */
-		clear_bit_unlock(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-	} else {
-		fscache_stat(&fscache_n_object_created);
-	}
-
-	set_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
-	_leave("");
-}
-EXPORT_SYMBOL(fscache_obtained_object);
-
-/*
- * handle an object that has just become available
- */
-static const struct fscache_state *fscache_object_available(struct cachefiles_object *object,
-							    int event)
-{
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	object->oob_table = fscache_osm_run_oob;
-
-	spin_lock(&object->lock);
-
-	fscache_done_parent_op(object);
-	spin_unlock(&object->lock);
-
-	fscache_stat(&fscache_n_cop_lookup_complete);
-	object->cache->ops->lookup_complete(object);
-	fscache_stat_d(&fscache_n_cop_lookup_complete);
-
-	fscache_stat(&fscache_n_object_avail);
-
-	_leave("");
-	return transit_to(JUMPSTART_DEPS);
-}
-
-/*
- * Wake up this object's dependent objects now that we've become available.
- */
-static const struct fscache_state *fscache_jumpstart_dependents(struct cachefiles_object *object,
-								int event)
-{
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	if (!fscache_enqueue_dependents(object, FSCACHE_OBJECT_EV_PARENT_READY))
-		return NO_TRANSIT; /* Not finished; requeue */
-	return transit_to(WAIT_FOR_CMD);
-}
-
-/*
- * Handle lookup or creation failute.
- */
-static const struct fscache_state *fscache_lookup_failure(struct cachefiles_object *object,
-							  int event)
-{
-	struct fscache_cookie *cookie;
-
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	object->oob_event_mask = 0;
-
-	fscache_stat(&fscache_n_cop_lookup_complete);
-	object->cache->ops->lookup_complete(object);
-	fscache_stat_d(&fscache_n_cop_lookup_complete);
-
-	set_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->flags);
-
-	cookie = object->cookie;
-	set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
-	if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-
-	fscache_done_parent_op(object);
-	return transit_to(KILL_OBJECT);
-}
-
-/*
- * Wait for completion of all active operations on this object and the death of
- * all child objects of this object.
- */
-static const struct fscache_state *fscache_kill_object(struct cachefiles_object *object,
-						       int event)
-{
-	_enter("{OBJ%x,%d,%d},%d",
-	       object->debug_id, object->n_ops, object->n_children, event);
-
-	fscache_mark_object_dead(object);
-	object->oob_event_mask = 0;
-
-	if (list_empty(&object->dependents) &&
-	    object->n_ops == 0 &&
-	    object->n_children == 0)
-		return transit_to(DROP_OBJECT);
-
-	if (!list_empty(&object->dependents))
-		return transit_to(KILL_DEPENDENTS);
-
-	return transit_to(WAIT_FOR_CLEARANCE);
-}
-
-/*
- * Kill dependent objects.
- */
-static const struct fscache_state *fscache_kill_dependents(struct cachefiles_object *object,
-							   int event)
-{
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	if (!fscache_enqueue_dependents(object, FSCACHE_OBJECT_EV_KILL))
-		return NO_TRANSIT; /* Not finished */
-	return transit_to(WAIT_FOR_CLEARANCE);
-}
-
-/*
- * Drop an object's attachments
- */
-static const struct fscache_state *fscache_drop_object(struct cachefiles_object *object,
-						       int event)
-{
-	struct cachefiles_object *parent = object->parent;
-	struct fscache_cookie *cookie = object->cookie;
-	struct fscache_cache *cache = object->cache;
-	bool awaken = false;
-
-	_enter("{OBJ%x,%d},%d", object->debug_id, object->n_children, event);
-
-	ASSERT(cookie != NULL);
-	ASSERT(!hlist_unhashed(&object->cookie_link));
-
-	if (test_bit(FSCACHE_COOKIE_AUX_UPDATED, &cookie->flags)) {
-		_debug("final update");
-		fscache_update_aux_data(object);
-	}
-
-	/* Make sure the cookie no longer points here and that the netfs isn't
-	 * waiting for us.
-	 */
-	spin_lock(&cookie->lock);
-	hlist_del_init(&object->cookie_link);
-	if (hlist_empty(&cookie->backing_objects) &&
-	    test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
-		awaken = true;
-	spin_unlock(&cookie->lock);
-
-	if (awaken)
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
-	if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-
-
-	/* Prevent a race with our last child, which has to signal EV_CLEARED
-	 * before dropping our spinlock.
-	 */
-	spin_lock(&object->lock);
-	spin_unlock(&object->lock);
-
-	/* Discard from the cache's collection of objects */
-	spin_lock(&cache->object_list_lock);
-	list_del_init(&object->cache_link);
-	spin_unlock(&cache->object_list_lock);
-
-	fscache_stat(&fscache_n_cop_drop_object);
-	cache->ops->drop_object(object);
-	fscache_stat_d(&fscache_n_cop_drop_object);
-
-	/* The parent object wants to know when all it dependents have gone */
-	if (parent) {
-		_debug("release parent OBJ%x {%d}",
-		       parent->debug_id, parent->n_children);
-
-		spin_lock(&parent->lock);
-		parent->n_children--;
-		if (parent->n_children == 0)
-			fscache_raise_event(parent, FSCACHE_OBJECT_EV_CLEARED);
-		spin_unlock(&parent->lock);
-		object->parent = NULL;
-	}
-
-	/* this just shifts the object release to the work processor */
-	fscache_put_object(object, fscache_obj_put_drop_obj);
-	fscache_stat(&fscache_n_object_dead);
-
-	_leave("");
-	return transit_to(OBJECT_DEAD);
-}
-
-/*
- * get a ref on an object
- */
-static int fscache_get_object(struct cachefiles_object *object,
-			      enum fscache_obj_ref_trace why)
-{
-	int ret;
-
-	fscache_stat(&fscache_n_cop_grab_object);
-	ret = object->cache->ops->grab_object(object, why) ? 0 : -EAGAIN;
-	fscache_stat_d(&fscache_n_cop_grab_object);
-	return ret;
-}
-
-/*
- * Discard a ref on an object
- */
-static void fscache_put_object(struct cachefiles_object *object,
-			       enum fscache_obj_ref_trace why)
-{
-	fscache_stat(&fscache_n_cop_put_object);
-	object->cache->ops->put_object(object, why);
-	fscache_stat_d(&fscache_n_cop_put_object);
-}
-
-/**
- * fscache_object_destroy - Note that a cache object is about to be destroyed
- * @object: The object to be destroyed
- *
- * Note the imminent destruction and deallocation of a cache object record.
- */
-void fscache_object_destroy(struct cachefiles_object *object)
-{
-	/* We can get rid of the cookie now */
-	fscache_put_cookie(object->cookie, fscache_cookie_put_object);
-	object->cookie = NULL;
-}
-EXPORT_SYMBOL(fscache_object_destroy);
-
-/*
- * enqueue an object for metadata-type processing
- */
-void fscache_enqueue_object(struct cachefiles_object *object)
-{
-	_enter("{OBJ%x}", object->debug_id);
-
-	if (fscache_get_object(object, fscache_obj_get_queue) >= 0) {
-		wait_queue_head_t *cong_wq =
-			&get_cpu_var(fscache_object_cong_wait);
-
-		if (queue_work(fscache_object_wq, &object->work)) {
-			if (fscache_object_congested())
-				wake_up(cong_wq);
-		} else
-			fscache_put_object(object, fscache_obj_put_queue);
-
-		put_cpu_var(fscache_object_cong_wait);
-	}
-}
-
-/**
- * fscache_object_sleep_till_congested - Sleep until object wq is congested
- * @timeoutp: Scheduler sleep timeout
- *
- * Allow an object handler to sleep until the object workqueue is congested.
- *
- * The caller must set up a wake up event before calling this and must have set
- * the appropriate sleep mode (such as TASK_UNINTERRUPTIBLE) and tested its own
- * condition before calling this function as no test is made here.
- *
- * %true is returned if the object wq is congested, %false otherwise.
- */
-bool fscache_object_sleep_till_congested(signed long *timeoutp)
-{
-	wait_queue_head_t *cong_wq = this_cpu_ptr(&fscache_object_cong_wait);
-	DEFINE_WAIT(wait);
-
-	if (fscache_object_congested())
-		return true;
-
-	add_wait_queue_exclusive(cong_wq, &wait);
-	if (!fscache_object_congested())
-		*timeoutp = schedule_timeout(*timeoutp);
-	finish_wait(cong_wq, &wait);
-
-	return fscache_object_congested();
-}
-EXPORT_SYMBOL_GPL(fscache_object_sleep_till_congested);
-
-/*
- * Enqueue the dependents of an object for metadata-type processing.
- *
- * If we don't manage to finish the list before the scheduler wants to run
- * again then return false immediately.  We return true if the list was
- * cleared.
- */
-static bool fscache_enqueue_dependents(struct cachefiles_object *object, int event)
-{
-	struct cachefiles_object *dep;
-	bool ret = true;
-
-	_enter("{OBJ%x}", object->debug_id);
-
-	if (list_empty(&object->dependents))
-		return true;
-
-	spin_lock(&object->lock);
-
-	while (!list_empty(&object->dependents)) {
-		dep = list_entry(object->dependents.next,
-				 struct cachefiles_object, dep_link);
-		list_del_init(&dep->dep_link);
-
-		fscache_raise_event(dep, event);
-		fscache_put_object(dep, fscache_obj_put_enq_dep);
-
-		if (!list_empty(&object->dependents) && need_resched()) {
-			ret = false;
-			break;
-		}
-	}
-
-	spin_unlock(&object->lock);
-	return ret;
-}
-
-/*
- * remove an object from whatever queue it's waiting on
- */
-static void fscache_dequeue_object(struct cachefiles_object *object)
-{
-	_enter("{OBJ%x}", object->debug_id);
-
-	if (!list_empty(&object->dep_link)) {
-		spin_lock(&object->parent->lock);
-		list_del_init(&object->dep_link);
-		spin_unlock(&object->parent->lock);
-	}
-
-	_leave("");
-}
-
-static const struct fscache_state *fscache_invalidate_object(struct cachefiles_object *object,
-							     int event)
-{
-	return transit_to(UPDATE_OBJECT);
-}
-
-/*
- * Update auxiliary data.
- */
-static void fscache_update_aux_data(struct cachefiles_object *object)
-{
-	fscache_stat(&fscache_n_updates_run);
-	fscache_stat(&fscache_n_cop_update_object);
-	object->cache->ops->update_object(object);
-	fscache_stat_d(&fscache_n_cop_update_object);
-}
-
-/*
- * Asynchronously update an object.
- */
-static const struct fscache_state *fscache_update_object(struct cachefiles_object *object,
-							 int event)
-{
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	fscache_update_aux_data(object);
-
-	_leave("");
-	return transit_to(WAIT_FOR_CMD);
-}
-
-/**
- * fscache_object_retrying_stale - Note retrying stale object
- * @object: The object that will be retried
- *
- * Note that an object lookup found an on-disk object that was adjudged to be
- * stale and has been deleted.  The lookup will be retried.
- */
-void fscache_object_retrying_stale(struct cachefiles_object *object)
-{
-	fscache_stat(&fscache_n_cache_no_space_reject);
-}
-EXPORT_SYMBOL(fscache_object_retrying_stale);
-
-/**
- * fscache_object_mark_killed - Note that an object was killed
- * @object: The object that was culled
- * @why: The reason the object was killed.
- *
- * Note that an object was killed.  Returns true if the object was
- * already marked killed, false if it wasn't.
- */
-void fscache_object_mark_killed(struct cachefiles_object *object,
-				enum fscache_why_object_killed why)
-{
-	if (test_and_set_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->flags)) {
-		pr_err("Error: Object already killed by cache [%s]\n",
-		       object->cache->identifier);
-		return;
-	}
-
-	switch (why) {
-	case FSCACHE_OBJECT_NO_SPACE:
-		fscache_stat(&fscache_n_cache_no_space_reject);
-		break;
-	case FSCACHE_OBJECT_IS_STALE:
-		fscache_stat(&fscache_n_cache_stale_objects);
-		break;
-	case FSCACHE_OBJECT_WAS_RETIRED:
-		fscache_stat(&fscache_n_cache_retired_objects);
-		break;
-	case FSCACHE_OBJECT_WAS_CULLED:
-		fscache_stat(&fscache_n_cache_culled_objects);
-		break;
-	}
-}
-EXPORT_SYMBOL(fscache_object_mark_killed);
-
-/*
- * The object is dead.  We can get here if an object gets queued by an event
- * that would lead to its death (such as EV_KILL) when the dispatcher is
- * already running (and so can be requeued) but hasn't yet cleared the event
- * mask.
- */
-static const struct fscache_state *fscache_object_dead(struct cachefiles_object *object,
-						       int event)
-{
-	if (!test_and_set_bit(FSCACHE_OBJECT_RUN_AFTER_DEAD,
-			      &object->flags))
-		return NO_TRANSIT;
-
-	WARN(true, "FS-Cache object redispatched after death");
-	return NO_TRANSIT;
-}
diff --git a/fs/fscache/proc.c b/fs/fscache/proc.c
index 061df8f61ffc..b3fc14f08ced 100644
--- a/fs/fscache/proc.c
+++ b/fs/fscache/proc.c
@@ -5,7 +5,7 @@
  * Written by David Howells (dhowells@redhat.com)
  */
 
-#define FSCACHE_DEBUG_LEVEL OPERATION
+#define FSCACHE_DEBUG_LEVEL CACHE
 #include <linux/module.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -16,42 +16,32 @@
  */
 int __init fscache_proc_init(void)
 {
-	_enter("");
-
 	if (!proc_mkdir("fs/fscache", NULL))
 		goto error_dir;
 
+	if (!proc_create_seq("fs/fscache/caches", S_IFREG | 0444, NULL,
+			     &fscache_caches_seq_ops))
+		goto error;
+
+	if (!proc_create_seq("fs/fscache/volumes", S_IFREG | 0444, NULL,
+			     &fscache_volumes_seq_ops))
+		goto error;
+
 	if (!proc_create_seq("fs/fscache/cookies", S_IFREG | 0444, NULL,
 			     &fscache_cookies_seq_ops))
-		goto error_cookies;
+		goto error;
 
 #ifdef CONFIG_FSCACHE_STATS
 	if (!proc_create_single("fs/fscache/stats", S_IFREG | 0444, NULL,
-			fscache_stats_show))
-		goto error_stats;
+				fscache_stats_show))
+		goto error;
 #endif
 
-#ifdef CONFIG_FSCACHE_OBJECT_LIST
-	if (!proc_create("fs/fscache/objects", S_IFREG | 0444, NULL,
-			 &fscache_objlist_proc_ops))
-		goto error_objects;
-#endif
-
-	_leave(" = 0");
 	return 0;
 
-#ifdef CONFIG_FSCACHE_OBJECT_LIST
-error_objects:
-#endif
-#ifdef CONFIG_FSCACHE_STATS
-	remove_proc_entry("fs/fscache/stats", NULL);
-error_stats:
-#endif
-	remove_proc_entry("fs/fscache/cookies", NULL);
-error_cookies:
+error:
 	remove_proc_entry("fs/fscache", NULL);
 error_dir:
-	_leave(" = -ENOMEM");
 	return -ENOMEM;
 }
 
@@ -60,12 +50,5 @@ int __init fscache_proc_init(void)
  */
 void fscache_proc_cleanup(void)
 {
-#ifdef CONFIG_FSCACHE_OBJECT_LIST
-	remove_proc_entry("fs/fscache/objects", NULL);
-#endif
-#ifdef CONFIG_FSCACHE_STATS
-	remove_proc_entry("fs/fscache/stats", NULL);
-#endif
-	remove_proc_entry("fs/fscache/cookies", NULL);
 	remove_proc_entry("fs/fscache", NULL);
 }
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index cb9dd0a93e0d..13e90b940bd2 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -5,7 +5,7 @@
  * Written by David Howells (dhowells@redhat.com)
  */
 
-#define FSCACHE_DEBUG_LEVEL THREAD
+#define FSCACHE_DEBUG_LEVEL CACHE
 #include <linux/module.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -14,15 +14,10 @@
 /*
  * operation counters
  */
-atomic_t fscache_n_op_pend;
-atomic_t fscache_n_op_run;
-atomic_t fscache_n_op_enqueue;
-atomic_t fscache_n_op_deferred_release;
-atomic_t fscache_n_op_initialised;
-atomic_t fscache_n_op_release;
-atomic_t fscache_n_op_gc;
-atomic_t fscache_n_op_cancelled;
-atomic_t fscache_n_op_rejected;
+atomic_t fscache_n_volumes;
+atomic_t fscache_n_volumes_collision;
+atomic_t fscache_n_volumes_nomem;
+atomic_t fscache_n_cookies;
 
 atomic_t fscache_n_retrievals;
 atomic_t fscache_n_retrievals_ok;
@@ -62,36 +57,15 @@ atomic_t fscache_n_updates_run;
 atomic_t fscache_n_relinquishes;
 atomic_t fscache_n_relinquishes_null;
 atomic_t fscache_n_relinquishes_retire;
+atomic_t fscache_n_relinquishes_dropped;
 
-atomic_t fscache_n_cookie_index;
-atomic_t fscache_n_cookie_data;
-atomic_t fscache_n_cookie_special;
-
-atomic_t fscache_n_object_alloc;
-atomic_t fscache_n_object_no_alloc;
-atomic_t fscache_n_object_lookups;
-atomic_t fscache_n_object_lookups_negative;
-atomic_t fscache_n_object_lookups_positive;
-atomic_t fscache_n_object_lookups_timed_out;
-atomic_t fscache_n_object_created;
-atomic_t fscache_n_object_avail;
-atomic_t fscache_n_object_dead;
-
-atomic_t fscache_n_cop_alloc_object;
-atomic_t fscache_n_cop_lookup_object;
-atomic_t fscache_n_cop_lookup_complete;
-atomic_t fscache_n_cop_grab_object;
-atomic_t fscache_n_cop_invalidate_object;
-atomic_t fscache_n_cop_update_object;
-atomic_t fscache_n_cop_drop_object;
-atomic_t fscache_n_cop_put_object;
-atomic_t fscache_n_cop_sync_cache;
-atomic_t fscache_n_cop_attr_changed;
-
-atomic_t fscache_n_cache_no_space_reject;
-atomic_t fscache_n_cache_stale_objects;
-atomic_t fscache_n_cache_retired_objects;
-atomic_t fscache_n_cache_culled_objects;
+atomic_t fscache_n_resizes;
+atomic_t fscache_n_resizes_null;
+
+atomic_t fscache_n_read;
+EXPORT_SYMBOL(fscache_n_read);
+atomic_t fscache_n_write;
+EXPORT_SYMBOL(fscache_n_write);
 
 /*
  * display the general statistics
@@ -99,17 +73,12 @@ atomic_t fscache_n_cache_culled_objects;
 int fscache_stats_show(struct seq_file *m, void *v)
 {
 	seq_puts(m, "FS-Cache statistics\n");
-
-	seq_printf(m, "Cookies: idx=%u dat=%u spc=%u\n",
-		   atomic_read(&fscache_n_cookie_index),
-		   atomic_read(&fscache_n_cookie_data),
-		   atomic_read(&fscache_n_cookie_special));
-
-	seq_printf(m, "Objects: alc=%u nal=%u avl=%u ded=%u\n",
-		   atomic_read(&fscache_n_object_alloc),
-		   atomic_read(&fscache_n_object_no_alloc),
-		   atomic_read(&fscache_n_object_avail),
-		   atomic_read(&fscache_n_object_dead));
+	seq_printf(m, "Cookies: n=%d v=%d vcol=%u voom=%u\n",
+		   atomic_read(&fscache_n_cookies),
+		   atomic_read(&fscache_n_volumes),
+		   atomic_read(&fscache_n_volumes_collision),
+		   atomic_read(&fscache_n_volumes_nomem)
+		   );
 
 	seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u nbf=%u"
 		   " oom=%u\n",
@@ -120,13 +89,6 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_acquires_nobufs),
 		   atomic_read(&fscache_n_acquires_oom));
 
-	seq_printf(m, "Lookups: n=%u neg=%u pos=%u crt=%u tmo=%u\n",
-		   atomic_read(&fscache_n_object_lookups),
-		   atomic_read(&fscache_n_object_lookups_negative),
-		   atomic_read(&fscache_n_object_lookups_positive),
-		   atomic_read(&fscache_n_object_created),
-		   atomic_read(&fscache_n_object_lookups_timed_out));
-
 	seq_printf(m, "Invals : n=%u run=%u\n",
 		   atomic_read(&fscache_n_invalidates),
 		   atomic_read(&fscache_n_invalidates_run));
@@ -136,66 +98,15 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_updates_null),
 		   atomic_read(&fscache_n_updates_run));
 
-	seq_printf(m, "Relinqs: n=%u nul=%u rtr=%u\n",
+	seq_printf(m, "Relinqs: n=%u rtr=%u drop=%u\n",
 		   atomic_read(&fscache_n_relinquishes),
-		   atomic_read(&fscache_n_relinquishes_null),
-		   atomic_read(&fscache_n_relinquishes_retire));
-
-	seq_printf(m, "Retrvls: n=%u ok=%u wt=%u nod=%u nbf=%u"
-		   " int=%u oom=%u\n",
-		   atomic_read(&fscache_n_retrievals),
-		   atomic_read(&fscache_n_retrievals_ok),
-		   atomic_read(&fscache_n_retrievals_wait),
-		   atomic_read(&fscache_n_retrievals_nodata),
-		   atomic_read(&fscache_n_retrievals_nobufs),
-		   atomic_read(&fscache_n_retrievals_intr),
-		   atomic_read(&fscache_n_retrievals_nomem));
-	seq_printf(m, "Retrvls: ops=%u owt=%u abt=%u\n",
-		   atomic_read(&fscache_n_retrieval_ops),
-		   atomic_read(&fscache_n_retrieval_op_waits),
-		   atomic_read(&fscache_n_retrievals_object_dead));
-
-	seq_printf(m, "Stores : n=%u ok=%u agn=%u nbf=%u int=%u oom=%u\n",
-		   atomic_read(&fscache_n_stores),
-		   atomic_read(&fscache_n_stores_ok),
-		   atomic_read(&fscache_n_stores_again),
-		   atomic_read(&fscache_n_stores_nobufs),
-		   atomic_read(&fscache_n_stores_intr),
-		   atomic_read(&fscache_n_stores_oom));
-	seq_printf(m, "Stores : ops=%u owt=%u abt=%u\n",
-		   atomic_read(&fscache_n_store_ops),
-		   atomic_read(&fscache_n_store_op_waits),
-		   atomic_read(&fscache_n_stores_object_dead));
-
-	seq_printf(m, "Ops    : pend=%u run=%u enq=%u can=%u rej=%u\n",
-		   atomic_read(&fscache_n_op_pend),
-		   atomic_read(&fscache_n_op_run),
-		   atomic_read(&fscache_n_op_enqueue),
-		   atomic_read(&fscache_n_op_cancelled),
-		   atomic_read(&fscache_n_op_rejected));
-	seq_printf(m, "Ops    : ini=%u dfr=%u rel=%u gc=%u\n",
-		   atomic_read(&fscache_n_op_initialised),
-		   atomic_read(&fscache_n_op_deferred_release),
-		   atomic_read(&fscache_n_op_release),
-		   atomic_read(&fscache_n_op_gc));
-
-	seq_printf(m, "CacheOp: alo=%d luo=%d luc=%d gro=%d\n",
-		   atomic_read(&fscache_n_cop_alloc_object),
-		   atomic_read(&fscache_n_cop_lookup_object),
-		   atomic_read(&fscache_n_cop_lookup_complete),
-		   atomic_read(&fscache_n_cop_grab_object));
-	seq_printf(m, "CacheOp: inv=%d upo=%d dro=%d pto=%d atc=%d syn=%d\n",
-		   atomic_read(&fscache_n_cop_invalidate_object),
-		   atomic_read(&fscache_n_cop_update_object),
-		   atomic_read(&fscache_n_cop_drop_object),
-		   atomic_read(&fscache_n_cop_put_object),
-		   atomic_read(&fscache_n_cop_attr_changed),
-		   atomic_read(&fscache_n_cop_sync_cache));
-	seq_printf(m, "CacheEv: nsp=%d stl=%d rtr=%d cul=%d\n",
-		   atomic_read(&fscache_n_cache_no_space_reject),
-		   atomic_read(&fscache_n_cache_stale_objects),
-		   atomic_read(&fscache_n_cache_retired_objects),
-		   atomic_read(&fscache_n_cache_culled_objects));
+		   atomic_read(&fscache_n_relinquishes_retire),
+		   atomic_read(&fscache_n_relinquishes_dropped));
+
+	seq_printf(m, "IO     : rd=%u wr=%u\n",
+		   atomic_read(&fscache_n_read),
+		   atomic_read(&fscache_n_write));
+
 	netfs_stats_show(m);
 	return 0;
 }
diff --git a/fs/fscache/volume.c b/fs/fscache/volume.c
new file mode 100644
index 000000000000..d1e57ce95b72
--- /dev/null
+++ b/fs/fscache/volume.c
@@ -0,0 +1,449 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Volume-level cache cookie handling.
+ *
+ * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#define FSCACHE_DEBUG_LEVEL COOKIE
+#include <linux/export.h>
+#include <linux/slab.h>
+#include "internal.h"
+
+#define fscache_volume_hash_shift 10
+static struct hlist_bl_head fscache_volume_hash[1 << fscache_volume_hash_shift];
+static atomic_t fscache_volume_debug_id;
+static LIST_HEAD(fscache_volumes);
+
+static void fscache_create_volume_work(struct work_struct *work);
+
+struct fscache_volume *fscache_get_volume(struct fscache_volume *volume,
+					  enum fscache_volume_trace where)
+{
+	int ref;
+
+	__refcount_inc(&volume->ref, &ref);
+	trace_fscache_volume(volume->debug_id, ref + 1, where);
+	return volume;
+}
+
+static void fscache_see_volume(struct fscache_volume *volume,
+			       enum fscache_volume_trace where)
+{
+	int ref = refcount_read(&volume->ref);
+
+	trace_fscache_volume(volume->debug_id, ref, where);
+}
+
+/*
+ * Pin the cache behind a volume so that we can access it.
+ */
+static void __fscache_begin_volume_access(struct fscache_volume *volume,
+					  enum fscache_access_trace why)
+{
+	int n_accesses;
+
+	n_accesses = atomic_inc_return(&volume->n_accesses);
+	smp_mb__after_atomic();
+	trace_fscache_access_volume(volume->debug_id, refcount_read(&volume->ref),
+				    n_accesses, why);
+}
+
+/*
+ * If the cache behind a volume is live, pin it so that we can access it.
+ */
+bool fscache_begin_volume_access(struct fscache_volume *volume,
+				 enum fscache_access_trace why)
+{
+	if (!fscache_cache_is_live(volume->cache))
+		return false;
+	__fscache_begin_volume_access(volume, why);
+	if (!fscache_cache_is_live(volume->cache)) {
+		fscache_end_volume_access(volume, fscache_access_unlive);
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Mark the end of an access on a volume.
+ */
+void fscache_end_volume_access(struct fscache_volume *volume,
+			       enum fscache_access_trace why)
+{
+	int n_accesses;
+
+	smp_mb__before_atomic();
+	n_accesses = atomic_dec_return(&volume->n_accesses);
+	trace_fscache_access_volume(volume->debug_id, refcount_read(&volume->ref),
+				    n_accesses, why);
+	if (n_accesses == 0)
+		wake_up_var(&volume->n_accesses);
+}
+EXPORT_SYMBOL(fscache_end_volume_access);
+
+static long fscache_compare_volume(const struct fscache_volume *a,
+				   const struct fscache_volume *b)
+{
+	size_t klen;
+
+	if (a->key_hash != b->key_hash)
+		return (long)a->key_hash - (long)b->key_hash;
+	if (a->cache != b->cache)
+		return (long)a->cache    - (long)b->cache;
+	if (a->key[0] != b->key[0])
+		return (long)a->key[0]   - (long)b->key[0];
+
+	klen = round_up(a->key[0] + 1, sizeof(unsigned int));
+	return memcmp(a->key, b->key, klen);
+}
+
+static bool fscache_is_acquire_pending(struct fscache_volume *volume)
+{
+	return test_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &volume->flags);
+}
+
+static void fscache_wait_on_volume_collision(struct fscache_volume *candidate,
+					     unsigned int collidee_debug_id)
+{
+	wait_var_event_timeout(&candidate->flags,
+			       fscache_is_acquire_pending(candidate), 20 * HZ);
+	if (!fscache_is_acquire_pending(candidate)) {
+		pr_notice("Potential volume collision new=%08x old=%08x",
+			  candidate->debug_id, collidee_debug_id);
+		fscache_stat(&fscache_n_volumes_collision);
+		wait_var_event(&candidate->flags, fscache_is_acquire_pending(candidate));
+	}
+}
+
+/*
+ * Attempt to insert the new volume into the hash.  If there's a collision, we
+ * wait for the old volume to complete if it's being relinquished and an error
+ * otherwise.
+ */
+static struct fscache_volume *fscache_hash_volume(struct fscache_volume *candidate)
+{
+	struct fscache_volume *cursor;
+	struct hlist_bl_head *h;
+	struct hlist_bl_node *p;
+	unsigned int bucket, collidee_debug_id = 0;
+
+	bucket = candidate->key_hash & (ARRAY_SIZE(fscache_volume_hash) - 1);
+	h = &fscache_volume_hash[bucket];
+
+	hlist_bl_lock(h);
+	hlist_bl_for_each_entry(cursor, p, h, hash_link) {
+		if (fscache_compare_volume(candidate, cursor) == 0) {
+			if (!test_bit(FSCACHE_VOLUME_RELINQUISHED, &cursor->flags))
+				goto collision;
+			fscache_see_volume(cursor, fscache_volume_get_hash_collision);
+			set_bit(FSCACHE_VOLUME_COLLIDED_WITH, &cursor->flags);
+			set_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &candidate->flags);
+			collidee_debug_id = cursor->debug_id;
+			break;
+		}
+	}
+
+	hlist_bl_add_head(&candidate->hash_link, h);
+	hlist_bl_unlock(h);
+
+	if (test_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &candidate->flags))
+		fscache_wait_on_volume_collision(candidate, collidee_debug_id);
+	return candidate;
+
+collision:
+	fscache_see_volume(cursor, fscache_volume_collision);
+	pr_err("Cache volume already in use\n");
+	hlist_bl_unlock(h);
+	return NULL;
+}
+
+/*
+ * Allocate and initialise a volume representation cookie.
+ */
+static struct fscache_volume *fscache_alloc_volume(const char *volume_key,
+						   const char *cache_name,
+						   u64 coherency_data)
+{
+	struct fscache_volume *volume;
+	struct fscache_cache *cache;
+	size_t klen, hlen;
+	char *key;
+
+	cache = fscache_lookup_cache(cache_name, false);
+	if (!cache)
+		return NULL;
+
+	volume = kzalloc(sizeof(*volume), GFP_KERNEL);
+	if (!volume)
+		goto err_cache;
+
+	volume->cache = cache;
+	volume->coherency = coherency_data;
+	INIT_LIST_HEAD(&volume->proc_link);
+	INIT_WORK(&volume->work, fscache_create_volume_work);
+	refcount_set(&volume->ref, 1);
+	spin_lock_init(&volume->lock);
+
+	/* Stick the length on the front of the key and pad it out to make
+	 * hashing easier.
+	 */
+	klen = strlen(volume_key);
+	hlen = round_up(1 + klen + 1, sizeof(unsigned int));
+	key = kzalloc(hlen, GFP_KERNEL);
+	if (!key)
+		goto err_vol;
+	key[0] = klen;
+	memcpy(key + 1, volume_key, klen);
+
+	volume->key = key;
+	volume->key_hash = fscache_hash(0, (unsigned int *)key,
+					hlen / sizeof(unsigned int));
+
+	volume->debug_id = atomic_inc_return(&fscache_volume_debug_id);
+	down_write(&fscache_addremove_sem);
+	atomic_inc(&cache->n_volumes);
+	list_add_tail(&volume->proc_link, &fscache_volumes);
+	fscache_see_volume(volume, fscache_volume_new_acquire);
+	fscache_stat(&fscache_n_volumes);
+	up_write(&fscache_addremove_sem);
+	_leave(" = v=%x", volume->debug_id);
+	return volume;
+
+err_vol:
+	kfree(volume);
+err_cache:
+	fscache_put_cache(cache, fscache_cache_put_alloc_volume);
+	fscache_stat(&fscache_n_volumes_nomem);
+	return NULL;
+}
+
+/*
+ * Create a volume's representation on disk.  Have a volume ref and a cache
+ * access we have to release.
+ */
+static void fscache_create_volume_work(struct work_struct *work)
+{
+	const struct fscache_cache_ops *ops;
+	struct fscache_volume *volume =
+		container_of(work, struct fscache_volume, work);
+
+	fscache_see_volume(volume, fscache_volume_see_create_work);
+
+	ops = volume->cache->ops;
+	if (ops->acquire_volume)
+		ops->acquire_volume(volume);
+	fscache_end_cache_access(volume->cache,
+				 fscache_access_acquire_volume_end);
+
+	clear_bit_unlock(FSCACHE_VOLUME_CREATING, &volume->flags);
+	wake_up_bit(&volume->flags, FSCACHE_VOLUME_CREATING);
+	fscache_put_volume(volume, fscache_volume_put_create_work);
+}
+
+/*
+ * Dispatch a worker thread to create a volume's representation on disk.
+ */
+void fscache_create_volume(struct fscache_volume *volume, bool wait)
+{
+	if (test_and_set_bit(FSCACHE_VOLUME_CREATING, &volume->flags))
+		goto maybe_wait;
+	if (volume->cache_priv)
+		goto no_wait; /* We raced */
+	if (!fscache_begin_cache_access(volume->cache,
+					fscache_access_acquire_volume))
+		goto no_wait;
+
+	fscache_get_volume(volume, fscache_volume_get_create_work);
+	if (!schedule_work(&volume->work))
+		fscache_put_volume(volume, fscache_volume_put_create_work);
+
+maybe_wait:
+	if (wait) {
+		fscache_see_volume(volume, fscache_volume_wait_create_work);
+		wait_on_bit(&volume->flags, FSCACHE_VOLUME_CREATING,
+			    TASK_UNINTERRUPTIBLE);
+	}
+	return;
+no_wait:
+	clear_bit_unlock(FSCACHE_VOLUME_CREATING, &volume->flags);
+	wake_up_bit(&volume->flags, FSCACHE_VOLUME_CREATING);
+}
+
+/*
+ * Acquire a volume representation cookie and link it to a (proposed) cache.
+ */
+struct fscache_volume *__fscache_acquire_volume(const char *volume_key,
+						const char *cache_name,
+						u64 coherency_data)
+{
+	struct fscache_volume *volume;
+
+	volume = fscache_alloc_volume(volume_key, cache_name, coherency_data);
+	if (!volume)
+		return NULL;
+
+	if (!fscache_hash_volume(volume)) {
+		fscache_put_volume(volume, fscache_volume_put_hash_collision);
+		return NULL;
+	}
+
+	fscache_create_volume(volume, false);
+	return volume;
+}
+EXPORT_SYMBOL(__fscache_acquire_volume);
+
+static void fscache_wake_pending_volume(struct fscache_volume *volume,
+					struct hlist_bl_head *h)
+{
+	struct fscache_volume *cursor;
+	struct hlist_bl_node *p;
+
+	hlist_bl_for_each_entry(cursor, p, h, hash_link) {
+		if (fscache_compare_volume(cursor, volume) == 0) {
+			fscache_see_volume(cursor, fscache_volume_see_hash_wake);
+			clear_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &cursor->flags);
+			wake_up_bit(&cursor->flags, FSCACHE_VOLUME_ACQUIRE_PENDING);
+			return;
+		}
+	}
+}
+
+/*
+ * Remove a volume cookie from the hash table.
+ */
+static void fscache_unhash_volume(struct fscache_volume *volume)
+{
+	struct hlist_bl_head *h;
+	unsigned int bucket;
+
+	bucket = volume->key_hash & (ARRAY_SIZE(fscache_volume_hash) - 1);
+	h = &fscache_volume_hash[bucket];
+
+	hlist_bl_lock(h);
+	hlist_bl_del(&volume->hash_link);
+	if (test_bit(FSCACHE_VOLUME_COLLIDED_WITH, &volume->flags))
+		fscache_wake_pending_volume(volume, h);
+	hlist_bl_unlock(h);
+}
+
+/*
+ * Drop a cache's volume attachments.
+ */
+static void fscache_free_volume(struct fscache_volume *volume)
+{
+	struct fscache_cache *cache = volume->cache;
+
+	if (volume->cache_priv) {
+		__fscache_begin_volume_access(volume, fscache_access_relinquish_volume);
+		if (volume->cache_priv) {
+			const struct fscache_cache_ops *ops = cache->ops;
+			if (ops->free_volume)
+				ops->free_volume(volume);
+		}
+		fscache_end_volume_access(volume, fscache_access_relinquish_volume_end);
+	}
+
+	down_write(&fscache_addremove_sem);
+	list_del_init(&volume->proc_link);
+	atomic_dec(&volume->cache->n_volumes);
+	up_write(&fscache_addremove_sem);
+
+	if (!hlist_bl_unhashed(&volume->hash_link))
+		fscache_unhash_volume(volume);
+
+	trace_fscache_volume(volume->debug_id, 0, fscache_volume_free);
+	kfree(volume->key);
+	kfree(volume);
+	fscache_stat_d(&fscache_n_volumes);
+	fscache_put_cache(cache, fscache_cache_put_volume);
+}
+
+/*
+ * Drop a reference to a volume cookie.
+ */
+void fscache_put_volume(struct fscache_volume *volume,
+			enum fscache_volume_trace where)
+{
+	if (volume) {
+		unsigned int debug_id = volume->debug_id;
+		bool zero;
+		int ref;
+
+		zero = __refcount_dec_and_test(&volume->ref, &ref);
+		trace_fscache_volume(debug_id, ref - 1, where);
+		if (zero)
+			fscache_free_volume(volume);
+	}
+}
+
+/*
+ * Relinquish a volume representation cookie.
+ */
+void __fscache_relinquish_volume(struct fscache_volume *volume,
+				 u64 coherency_data,
+				 bool invalidate)
+{
+	if (WARN_ON(test_and_set_bit(FSCACHE_VOLUME_RELINQUISHED, &volume->flags)))
+		return;
+
+	if (invalidate)
+		set_bit(FSCACHE_VOLUME_INVALIDATE, &volume->flags);
+
+	fscache_put_volume(volume, fscache_volume_put_relinquish);
+}
+EXPORT_SYMBOL(__fscache_relinquish_volume);
+
+#ifdef CONFIG_PROC_FS
+/*
+ * Generate a list of volumes in /proc/fs/fscache/volumes
+ */
+static int fscache_volumes_seq_show(struct seq_file *m, void *v)
+{
+	struct fscache_volume *volume;
+
+	if (v == &fscache_volumes) {
+		seq_puts(m,
+			 "VOLUME   REF   nCOOK ACC FL CACHE           KEY\n"
+			 "======== ===== ===== === == =============== ================\n");
+		return 0;
+	}
+
+	volume = list_entry(v, struct fscache_volume, proc_link);
+	seq_printf(m,
+		   "%08x %5d %5d %3d %02lx %-15.15s %s\n",
+		   volume->debug_id,
+		   refcount_read(&volume->ref),
+		   atomic_read(&volume->n_cookies),
+		   atomic_read(&volume->n_accesses),
+		   volume->flags,
+		   volume->cache->name ?: "-",
+		   volume->key + 1);
+	return 0;
+}
+
+static void *fscache_volumes_seq_start(struct seq_file *m, loff_t *_pos)
+	__acquires(&fscache_addremove_sem)
+{
+	down_read(&fscache_addremove_sem);
+	return seq_list_start_head(&fscache_volumes, *_pos);
+}
+
+static void *fscache_volumes_seq_next(struct seq_file *m, void *v, loff_t *_pos)
+{
+	return seq_list_next(v, &fscache_volumes, _pos);
+}
+
+static void fscache_volumes_seq_stop(struct seq_file *m, void *v)
+	__releases(&fscache_addremove_sem)
+{
+	up_read(&fscache_addremove_sem);
+}
+
+const struct seq_operations fscache_volumes_seq_ops = {
+	.start  = fscache_volumes_seq_start,
+	.next   = fscache_volumes_seq_next,
+	.stop   = fscache_volumes_seq_stop,
+	.show   = fscache_volumes_seq_show,
+};
+#endif /* CONFIG_PROC_FS */
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 90a7c92fca98..657e54b4cd90 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /* General filesystem caching backing cache interface
  *
- * Copyright (C) 2004-2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2004-2007, 2021 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  *
  * NOTE!!! See:
@@ -15,64 +15,38 @@
 #define _LINUX_FSCACHE_CACHE_H
 
 #include <linux/fscache.h>
-#include <linux/sched.h>
-#include <linux/workqueue.h>
-
-#define NR_MAXCACHES BITS_PER_LONG
 
 struct fscache_cache;
 struct fscache_cache_ops;
-struct cachefiles_object;
+enum fscache_cache_trace;
 enum fscache_cookie_trace;
-
-enum fscache_obj_ref_trace {
-	fscache_obj_get_add_to_deps,
-	fscache_obj_get_queue,
-	fscache_obj_put_alloc_fail,
-	fscache_obj_put_attach_fail,
-	fscache_obj_put_drop_obj,
-	fscache_obj_put_enq_dep,
-	fscache_obj_put_queue,
-	fscache_obj_put_work,
-	fscache_obj_ref__nr_traces
-};
-
-/*
- * cache tag definition
- */
-struct fscache_cache_tag {
-	struct list_head	link;
-	struct fscache_cache	*cache;		/* cache referred to by this tag */
-	unsigned long		flags;
-#define FSCACHE_TAG_RESERVED	0		/* T if tag is reserved for a cache */
-	atomic_t		usage;		/* Number of using netfs's */
-	refcount_t		ref;		/* Reference count on structure */
-	char			name[];		/* tag name */
+enum fscache_access_trace;
+
+enum fscache_cache_state {
+	FSCACHE_CACHE_IS_NOT_PRESENT,	/* No cache is present for this name */
+	FSCACHE_CACHE_IS_PREPARING,	/* A cache is preparing to come live */
+	FSCACHE_CACHE_IS_ACTIVE,	/* Attached cache is active and can be used */
+	FSCACHE_CACHE_GOT_IOERROR,	/* Attached cache stopped on I/O error */
+	FSCACHE_CACHE_IS_WITHDRAWN,	/* Attached cache is being withdrawn */
+#define NR__FSCACHE_CACHE_STATE (FSCACHE_CACHE_IS_WITHDRAWN + 1)
 };
 
 /*
- * cache definition
+ * Cache cookie.
  */
 struct fscache_cache {
 	const struct fscache_cache_ops *ops;
-	struct fscache_cache_tag *tag;		/* tag representing this cache */
-	struct kobject		*kobj;		/* system representation of this cache */
-	struct list_head	link;		/* link in list of caches */
-	size_t			max_index_size;	/* maximum size of index data */
-	char			identifier[36];	/* cache label */
-
-	/* node management */
-	struct list_head	object_list;	/* list of data/index objects */
-	spinlock_t		object_list_lock;
+	struct list_head	cache_link;	/* Link in cache list */
+	void			*cache_priv;	/* Private cache data (or NULL) */
+	refcount_t		ref;
+	atomic_t		n_volumes;	/* Number of active volumes; */
+	atomic_t		n_accesses;	/* Number of in-progress accesses on the cache */
 	atomic_t		object_count;	/* no. of live objects in this cache */
-	struct cachefiles_object	*fsdef;		/* object for the fsdef index */
-	unsigned long		flags;
-#define FSCACHE_IOERROR		0	/* cache stopped on I/O error */
-#define FSCACHE_CACHE_WITHDRAWN	1	/* cache has been withdrawn */
+	unsigned int		debug_id;
+	enum fscache_cache_state state;
+	char			*name;
 };
 
-extern wait_queue_head_t fscache_cache_cleared_wq;
-
 /*
  * cache operations
  */
@@ -80,265 +54,79 @@ struct fscache_cache_ops {
 	/* name of cache provider */
 	const char *name;
 
-	/* allocate an object record for a cookie */
-	struct cachefiles_object *(*alloc_object)(struct fscache_cache *cache,
-					       struct fscache_cookie *cookie);
-
-	/* look up the object for a cookie
-	 * - return -ETIMEDOUT to be requeued
-	 */
-	int (*lookup_object)(struct cachefiles_object *object);
+	/* Acquire a volume */
+	void (*acquire_volume)(struct fscache_volume *volume);
 
-	/* finished looking up */
-	void (*lookup_complete)(struct cachefiles_object *object);
+	/* Free the cache's data attached to a volume */
+	void (*free_volume)(struct fscache_volume *volume);
 
-	/* increment the usage count on this object (may fail if unmounting) */
-	struct cachefiles_object *(*grab_object)(struct cachefiles_object *object,
-					      enum fscache_obj_ref_trace why);
+	/* Look up a cookie in the cache */
+	bool (*lookup_cookie)(struct fscache_cookie *cookie);
 
-	/* pin an object in the cache */
-	int (*pin_object)(struct cachefiles_object *object);
-
-	/* unpin an object in the cache */
-	void (*unpin_object)(struct cachefiles_object *object);
-
-	/* store the updated auxiliary data on an object */
-	void (*update_object)(struct cachefiles_object *object);
+	/* Withdraw an object without any cookie access counts held */
+	void (*withdraw_cookie)(struct fscache_cookie *cookie);
 
 	/* Invalidate an object */
-	void (*invalidate_object)(struct cachefiles_object *object);
-
-	/* discard the resources pinned by an object and effect retirement if
-	 * necessary */
-	void (*drop_object)(struct cachefiles_object *object);
-
-	/* dispose of a reference to an object */
-	void (*put_object)(struct cachefiles_object *object,
-			   enum fscache_obj_ref_trace why);
-
-	/* sync a cache */
-	void (*sync_cache)(struct fscache_cache *cache);
-
-	/* reserve space for an object's data and associated metadata */
-	int (*reserve_space)(struct cachefiles_object *object, loff_t i_size);
+	bool (*invalidate_cookie)(struct fscache_cookie *cookie,
+				  unsigned int flags);
 
 	/* Begin an operation for the netfs lib */
-	int (*begin_operation)(struct netfs_cache_resources *cres);
+	bool (*begin_operation)(struct netfs_cache_resources *cres,
+				enum fscache_want_stage want_stage);
 };
 
-extern struct fscache_cookie fscache_fsdef_index;
-
-/*
- * Event list for fscache_object::{event_mask,events}
- */
-enum {
-	FSCACHE_OBJECT_EV_NEW_CHILD,	/* T if object has a new child */
-	FSCACHE_OBJECT_EV_PARENT_READY,	/* T if object's parent is ready */
-	FSCACHE_OBJECT_EV_UPDATE,	/* T if object should be updated */
-	FSCACHE_OBJECT_EV_INVALIDATE,	/* T if cache requested object invalidation */
-	FSCACHE_OBJECT_EV_CLEARED,	/* T if accessors all gone */
-	FSCACHE_OBJECT_EV_ERROR,	/* T if fatal error occurred during processing */
-	FSCACHE_OBJECT_EV_KILL,		/* T if netfs relinquished or cache withdrew object */
-	NR_FSCACHE_OBJECT_EVENTS
-};
-
-#define FSCACHE_OBJECT_EVENTS_MASK ((1UL << NR_FSCACHE_OBJECT_EVENTS) - 1)
-
-/*
- * States for object state machine.
- */
-struct fscache_transition {
-	unsigned long events;
-	const struct fscache_state *transit_to;
-};
-
-struct fscache_state {
-	char name[24];
-	char short_name[8];
-	const struct fscache_state *(*work)(struct cachefiles_object *object,
-					    int event);
-	const struct fscache_transition transitions[];
-};
-
-/*
- * on-disk cache file or index handle
- */
-struct cachefiles_object {
-	const struct fscache_state *state;	/* Object state machine state */
-	const struct fscache_transition *oob_table; /* OOB state transition table */
-	int			debug_id;	/* debugging ID */
-	int			n_children;	/* number of child objects */
-	int			n_ops;		/* number of extant ops on object */
-	int			n_obj_ops;	/* number of object ops outstanding on object */
-	spinlock_t		lock;		/* state and operations lock */
-
-	unsigned long		lookup_jif;	/* time at which lookup started */
-	unsigned long		oob_event_mask;	/* OOB events this object is interested in */
-	unsigned long		event_mask;	/* events this object is interested in */
-	unsigned long		events;		/* events to be processed by this object
-						 * (order is important - using fls) */
-
-	unsigned long		flags;
-#define FSCACHE_OBJECT_LOCK		0	/* T if object is busy being processed */
-#define FSCACHE_OBJECT_WAITING		2	/* T if object is waiting on its parent */
-#define FSCACHE_OBJECT_IS_LIVE		3	/* T if object is not withdrawn or relinquished */
-#define FSCACHE_OBJECT_IS_LOOKED_UP	4	/* T if object has been looked up */
-#define FSCACHE_OBJECT_IS_AVAILABLE	5	/* T if object has become active */
-#define FSCACHE_OBJECT_RETIRED		6	/* T if object was retired on relinquishment */
-#define FSCACHE_OBJECT_KILLED_BY_CACHE	7	/* T if object was killed by the cache */
-#define FSCACHE_OBJECT_RUN_AFTER_DEAD	8	/* T if object has been dispatched after death */
-
-	struct list_head	cache_link;	/* link in cache->object_list */
-	struct hlist_node	cookie_link;	/* link in cookie->backing_objects */
-	struct fscache_cache	*cache;		/* cache that supplied this object */
-	struct fscache_cookie	*cookie;	/* netfs's file/index object */
-	struct cachefiles_object	*parent;	/* parent object */
-	struct work_struct	work;		/* attention scheduling record */
-	struct list_head	dependents;	/* FIFO of dependent objects */
-	struct list_head	dep_link;	/* link in parent's dependents list */
-
-	char				*d_name;	/* Filename */
-	struct file			*file;		/* The file representing this object */
-	loff_t				i_size;		/* object size */
-	atomic_t			usage;		/* object usage count */
-	uint8_t				type;		/* object type */
-	bool				new;		/* T if object new */
-	u8				d_name_len;	/* Length of filename */
-	u8				key_hash;
-};
-
-extern void fscache_object_init(struct cachefiles_object *, struct fscache_cookie *,
-				struct fscache_cache *);
-extern void fscache_object_destroy(struct cachefiles_object *);
-
-extern void fscache_object_lookup_negative(struct cachefiles_object *object);
-extern void fscache_obtained_object(struct cachefiles_object *object);
-
-static inline bool fscache_object_is_live(struct cachefiles_object *object)
+static inline enum fscache_cache_state fscache_cache_state(const struct fscache_cache *cache)
 {
-	return test_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	return smp_load_acquire(&cache->state);
 }
 
-static inline bool fscache_object_is_dying(struct cachefiles_object *object)
+static inline bool fscache_cache_is_live(const struct fscache_cache *cache)
 {
-	return !fscache_object_is_live(object);
+	return fscache_cache_state(cache) == FSCACHE_CACHE_IS_ACTIVE;
 }
 
-static inline bool fscache_object_is_available(struct cachefiles_object *object)
+static inline void fscache_set_cache_state(struct fscache_cache *cache,
+					   enum fscache_cache_state new_state)
 {
-	return test_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
-}
+	smp_store_release(&cache->state, new_state);
 
-static inline bool fscache_cache_is_broken(struct cachefiles_object *object)
-{
-	return test_bit(FSCACHE_IOERROR, &object->cache->flags);
 }
 
-static inline bool fscache_object_is_active(struct cachefiles_object *object)
-{
-	return fscache_object_is_available(object) &&
-		fscache_object_is_live(object) &&
-		!fscache_cache_is_broken(object);
-}
-
-/**
- * fscache_object_destroyed - Note destruction of an object in a cache
- * @cache: The cache from which the object came
- *
- * Note the destruction and deallocation of an object record in a cache.
- */
-static inline void fscache_object_destroyed(struct fscache_cache *cache)
+static inline bool fscache_set_cache_state_maybe(struct fscache_cache *cache,
+						 enum fscache_cache_state old_state,
+						 enum fscache_cache_state new_state)
 {
-	if (atomic_dec_and_test(&cache->object_count))
-		wake_up_all(&fscache_cache_cleared_wq);
-}
-
-/**
- * fscache_object_lookup_error - Note an object encountered an error
- * @object: The object on which the error was encountered
- *
- * Note that an object encountered a fatal error (usually an I/O error) and
- * that it should be withdrawn as soon as possible.
- */
-static inline void fscache_object_lookup_error(struct cachefiles_object *object)
-{
-	set_bit(FSCACHE_OBJECT_EV_ERROR, &object->events);
-}
-
-static inline void __fscache_use_cookie(struct fscache_cookie *cookie)
-{
-	atomic_inc(&cookie->n_active);
-}
-
-/**
- * fscache_use_cookie - Request usage of cookie attached to an object
- * @object: Object description
- * 
- * Request usage of the cookie attached to an object.  NULL is returned if the
- * relinquishment had reduced the cookie usage count to 0.
- */
-static inline bool fscache_use_cookie(struct cachefiles_object *object)
-{
-	struct fscache_cookie *cookie = object->cookie;
-	return atomic_inc_not_zero(&cookie->n_active) != 0;
-}
-
-static inline bool __fscache_unuse_cookie(struct fscache_cookie *cookie)
-{
-	return atomic_dec_and_test(&cookie->n_active);
-}
-
-static inline void __fscache_wake_unused_cookie(struct fscache_cookie *cookie)
-{
-	wake_up_var(&cookie->n_active);
-}
-
-/**
- * fscache_unuse_cookie - Cease usage of cookie attached to an object
- * @object: Object description
- * 
- * Cease usage of the cookie attached to an object.  When the users count
- * reaches zero then the cookie relinquishment will be permitted to proceed.
- */
-static inline void fscache_unuse_cookie(struct cachefiles_object *object)
-{
-	struct fscache_cookie *cookie = object->cookie;
-	if (__fscache_unuse_cookie(cookie))
-		__fscache_wake_unused_cookie(cookie);
+	return try_cmpxchg_release(&cache->state, &old_state, new_state);
 }
 
 /*
  * out-of-line cache backend functions
  */
-extern __printf(3, 4)
-void fscache_init_cache(struct fscache_cache *cache,
-			const struct fscache_cache_ops *ops,
-			const char *idfmt, ...);
-
+extern struct rw_semaphore fscache_addremove_sem;
+extern struct fscache_cache *fscache_acquire_cache(const char *name);
 extern int fscache_add_cache(struct fscache_cache *cache,
-			     struct cachefiles_object *fsdef,
-			     const char *tagname);
+			     const struct fscache_cache_ops *ops,
+			     void *cache_priv);
+extern void fscache_put_cache(struct fscache_cache *cache,
+			      enum fscache_cache_trace where);
 extern void fscache_withdraw_cache(struct fscache_cache *cache);
+extern void fscache_withdraw_cookie(struct fscache_cookie *cookie);
 
 extern void fscache_io_error(struct fscache_cache *cache);
 
-extern bool fscache_object_sleep_till_congested(signed long *timeoutp);
-
-extern void fscache_object_retrying_stale(struct cachefiles_object *object);
-
-enum fscache_why_object_killed {
-	FSCACHE_OBJECT_IS_STALE,
-	FSCACHE_OBJECT_NO_SPACE,
-	FSCACHE_OBJECT_WAS_RETIRED,
-	FSCACHE_OBJECT_WAS_CULLED,
-};
-extern void fscache_object_mark_killed(struct cachefiles_object *object,
-				       enum fscache_why_object_killed why);
+extern void fscache_end_volume_access(struct fscache_volume *volume,
+				      enum fscache_access_trace why);
 
 extern struct fscache_cookie *fscache_get_cookie(struct fscache_cookie *cookie,
 						 enum fscache_cookie_trace where);
 extern void fscache_put_cookie(struct fscache_cookie *cookie,
 			       enum fscache_cookie_trace where);
+extern void fscache_end_cookie_access(struct fscache_cookie *cookie,
+				      enum fscache_access_trace why);
+extern void fscache_set_cookie_stage(struct fscache_cookie *cookie,
+				     enum fscache_cookie_stage stage);
+extern bool fscache_wait_for_operation(struct netfs_cache_resources *cred,
+				       enum fscache_want_stage stage);
 
 /*
  * Find the key on a cookie.
@@ -362,4 +150,47 @@ static inline void *fscache_get_aux(struct fscache_cookie *cookie)
 		return cookie->aux;
 }
 
+/**
+ * fscache_cookie_lookup_negative - Note negative lookup
+ * @cookie: The cookie that was being looked up
+ *
+ * Note that some part of the metadata path in the cache doesn't exist and so
+ * we can release any waiting readers in the certain knowledge that there's
+ * nothing for them to actually read.
+ */
+static inline void fscache_cookie_lookup_negative(struct fscache_cookie *cookie)
+{
+	set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
+	fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_CREATING);
+}
+
+static inline struct fscache_cookie *fscache_cres_cookie(struct netfs_cache_resources *cres)
+{
+	return cres->cache_priv;
+}
+
+/**
+ * fscache_end_operation - End an fscache I/O operation.
+ * @cres: The resources to dispose of.
+ */
+static inline
+void fscache_end_operation(struct netfs_cache_resources *cres)
+{
+	const struct netfs_cache_ops *ops = fscache_operation_valid(cres);
+	if (ops)
+		ops->end_operation(cres);
+}
+
+#ifdef CONFIG_FSCACHE_STATS
+extern atomic_t fscache_n_read;
+extern atomic_t fscache_n_write;
+#define fscache_count_read() atomic_inc(&fscache_n_read)
+#define fscache_count_write() atomic_inc(&fscache_n_write)
+#else
+#define fscache_count_read() do {} while(0)
+#define fscache_count_write() do {} while(0)
+#endif
+
+extern struct workqueue_struct *fscache_wq;
+
 #endif /* _LINUX_FSCACHE_CACHE_H */
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 1dba014e848f..aeee14f5663a 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /* General filesystem caching interface
  *
- * Copyright (C) 2004-2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2004-2007, 2021 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  *
  * NOTE!!! See:
@@ -23,76 +23,110 @@
 #include <linux/netfs.h>
 
 #if defined(CONFIG_FSCACHE) || defined(CONFIG_FSCACHE_MODULE)
+#define __fscache_available (1)
 #define fscache_available() (1)
+#define fscache_volume_valid(volume) (volume)
 #define fscache_cookie_valid(cookie) (cookie)
 #define fscache_resources_valid(cres) ((cres)->cache_priv)
 #else
+#define __fscache_available (0)
 #define fscache_available() (0)
+#define fscache_volume_valid(volume) (0)
 #define fscache_cookie_valid(cookie) (0)
 #define fscache_resources_valid(cres) (false)
 #endif
 
-struct fscache_cache_tag;
 struct fscache_cookie;
-struct fscache_netfs;
-struct netfs_read_request;
 
-enum fscache_cookie_type {
-	FSCACHE_COOKIE_TYPE_INDEX,
-	FSCACHE_COOKIE_TYPE_DATAFILE,
+#define FSCACHE_ADV_SINGLE_CHUNK	0x01 /* The object is a single chunk of data */
+#define FSCACHE_ADV_WRITE_CACHE		0x00 /* Do cache if written to locally */
+#define FSCACHE_ADV_WRITE_NOCACHE	0x02 /* Don't cache if written to locally */
+#define FSCACHE_ADV_FALLBACK_IO		0x04 /* Going to use the fallback I/O API (dangerous) */
+
+enum fscache_want_stage {
+	FSCACHE_WANT_PARAMS,
+	FSCACHE_WANT_WRITE,
+	FSCACHE_WANT_READ,
 };
 
-#define FSCACHE_ADV_SINGLE_CHUNK	0x01 /* The object is a single chunk of data */
+/*
+ * Data object state.
+ */
+enum fscache_cookie_stage {
+	FSCACHE_COOKIE_STAGE_QUIESCENT,		/* The cookie is uncached */
+	FSCACHE_COOKIE_STAGE_LOOKING_UP,	/* The cache object is being looked up */
+	FSCACHE_COOKIE_STAGE_CREATING,		/* The cache object is being created */
+	FSCACHE_COOKIE_STAGE_ACTIVE,		/* The cache is active, readable and writable */
+	FSCACHE_COOKIE_STAGE_INVALIDATING,	/* The cache is being invalidated */
+	FSCACHE_COOKIE_STAGE_FAILED,		/* The cache failed, withdraw to clear */
+	FSCACHE_COOKIE_STAGE_WITHDRAWING,	/* The cookie is being withdrawn */
+	FSCACHE_COOKIE_STAGE_RELINQUISHING,	/* The cookie is being relinquished */
+	FSCACHE_COOKIE_STAGE_DROPPED,		/* The cookie has been dropped */
+#define FSCACHE_COOKIE_STAGE__NR (FSCACHE_COOKIE_STAGE_DROPPED + 1)
+} __attribute__((mode(byte)));
 
 /*
- * fscache cached network filesystem type
- * - name, version and ops must be filled in before registration
- * - all other fields will be set during registration
+ * Volume representation cookie.
  */
-struct fscache_netfs {
-	uint32_t			version;	/* indexing version */
-	const char			*name;		/* filesystem name */
-	struct fscache_cookie		*primary_index;
+struct fscache_volume {
+	refcount_t			ref;
+	atomic_t			n_cookies;	/* Number of data cookies in volume */
+	atomic_t			n_accesses;	/* Number of cache accesses in progress */
+	unsigned int			debug_id;
+	unsigned int			key_hash;	/* Hash of key string */
+	char				*key;		/* Volume ID, eg. "afs@example.com@1234" */
+	struct list_head		proc_link;	/* Link in /proc/fs/fscache/volumes */
+	struct hlist_bl_node		hash_link;	/* Link in hash table */
+	struct work_struct		work;
+	struct fscache_cache		*cache;		/* The cache in which this resides */
+	void				*cache_priv;	/* Cache private data */
+	u64				coherency;	/* Coherency data */
+	spinlock_t			lock;
+	unsigned long			flags;
+#define FSCACHE_VOLUME_RELINQUISHED	0	/* Volume is being cleaned up */
+#define FSCACHE_VOLUME_INVALIDATE	1	/* Volume was invalidated */
+#define FSCACHE_VOLUME_COLLIDED_WITH	2	/* Volume was collided with */
+#define FSCACHE_VOLUME_ACQUIRE_PENDING	3	/* Volume is waiting to complete acquisition */
+#define FSCACHE_VOLUME_CREATING		4	/* Volume is being created on disk */
 };
 
 /*
- * data file or index object cookie
+ * Data file representation cookie.
  * - a file will only appear in one cache
  * - a request to cache a file may or may not be honoured, subject to
  *   constraints such as disk space
  * - indices are created on disk just-in-time
  */
 struct fscache_cookie {
-	refcount_t			ref;		/* number of users of this cookie */
-	atomic_t			n_children;	/* number of children of this cookie */
-	atomic_t			n_active;	/* number of active users of netfs ptrs */
+	refcount_t			ref;
+	atomic_t			n_active;	/* number of active users of cookie */
+	atomic_t			n_accesses;	/* Number of cache accesses in progress */
 	unsigned int			debug_id;
+	unsigned int			inval_counter;	/* Number of invalidations made */
 	spinlock_t			lock;
-	struct hlist_head		backing_objects; /* object(s) backing this file/index */
-	struct fscache_cookie		*parent;	/* parent of this entry */
-	struct fscache_cache_tag	*preferred_cache; /* The preferred cache or NULL */
+	struct fscache_volume		*volume;	/* Parent volume of this file. */
+	void				*cache_priv;	/* Cache-side representation */
 	struct hlist_bl_node		hash_link;	/* Link in hash table */
 	struct list_head		proc_link;	/* Link in proc list */
-	char				type_name[8];	/* Cookie type name */
+	struct work_struct		work;		/* Commit/relinq/withdraw work */
 	loff_t				object_size;	/* Size of the netfs object */
 
 	unsigned long			flags;
-#define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */
-#define FSCACHE_COOKIE_NO_DATA_YET	1	/* T if new object with no cached data yet */
-#define FSCACHE_COOKIE_UNAVAILABLE	2	/* T if cookie is unavailable (error, etc) */
-#define FSCACHE_COOKIE_INVALIDATING	3	/* T if cookie is being invalidated */
-#define FSCACHE_COOKIE_RELINQUISHED	4	/* T if cookie has been relinquished */
-#define FSCACHE_COOKIE_ENABLED		5	/* T if cookie is enabled */
-#define FSCACHE_COOKIE_ENABLEMENT_LOCK	6	/* T if cookie is being en/disabled */
-#define FSCACHE_COOKIE_AUX_UPDATED	8	/* T if the auxiliary data was updated */
-#define FSCACHE_COOKIE_ACQUIRED		9	/* T if cookie is in use */
-#define FSCACHE_COOKIE_RELINQUISHING	10	/* T if cookie is being relinquished */
-
-	enum fscache_cookie_type	type:8;
-	u8				advice;		/* FSCACHE_COOKIE_ADV_* */
+#define FSCACHE_COOKIE_RELINQUISHED	0		/* T if cookie has been relinquished */
+#define FSCACHE_COOKIE_RETIRED		1		/* T if this cookie has retired on relinq */
+#define FSCACHE_COOKIE_IS_CACHING	2		/* T if this cookie is cached */
+#define FSCACHE_COOKIE_NO_DATA_TO_READ	3		/* T if this cookie has nothing to read */
+#define FSCACHE_COOKIE_NEEDS_UPDATE	4		/* T if attrs have been updated */
+#define FSCACHE_COOKIE_HAS_BEEN_CACHED	5		/* T if cookie needs withdraw-on-relinq */
+#define FSCACHE_COOKIE_NACC_ELEVATED	8		/* T if n_accesses is incremented */
+#define FSCACHE_COOKIE_DO_RELINQUISH	9		/* T if this cookie needs relinquishment */
+#define FSCACHE_COOKIE_DO_WITHDRAW	10		/* T if this cookie needs withdrawing */
+
+	enum fscache_cookie_stage	stage;
+	u8				advice;		/* FSCACHE_ADV_* */
 	u8				key_len;	/* Length of index key */
 	u8				aux_len;	/* Length of auxiliary data */
-	u32				key_hash;	/* Hash of parent, type, key, len */
+	u32				key_hash;	/* Hash of volume, key, len */
 	union {
 		void			*key;		/* Index key */
 		u8			inline_key[16];	/* - If the key is short enough */
@@ -103,12 +137,6 @@ struct fscache_cookie {
 	};
 };
 
-static inline bool fscache_cookie_enabled(struct fscache_cookie *cookie)
-{
-	return (fscache_cookie_valid(cookie) &&
-		test_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags));
-}
-
 /*
  * slow-path functions for when there is actually caching available, and the
  * netfs does actually have a valid token
@@ -116,195 +144,172 @@ static inline bool fscache_cookie_enabled(struct fscache_cookie *cookie)
  * - these are undefined symbols when FS-Cache is not configured and the
  *   optimiser takes care of not using them
  */
-extern int __fscache_register_netfs(struct fscache_netfs *);
-extern void __fscache_unregister_netfs(struct fscache_netfs *);
-extern struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *);
-extern void __fscache_release_cache_tag(struct fscache_cache_tag *);
+extern struct fscache_volume *__fscache_acquire_volume(const char *, const char *, u64);
+extern void __fscache_relinquish_volume(struct fscache_volume *, u64, bool);
 
 extern struct fscache_cookie *__fscache_acquire_cookie(
-	struct fscache_cookie *,
-	enum fscache_cookie_type,
-	const char *,
+	struct fscache_volume *,
 	u8,
-	struct fscache_cache_tag *,
 	const void *, size_t,
 	const void *, size_t,
-	loff_t, bool);
-extern void __fscache_relinquish_cookie(struct fscache_cookie *, const void *, bool);
-extern void __fscache_update_cookie(struct fscache_cookie *, const void *);
+	loff_t);
+extern void __fscache_use_cookie(struct fscache_cookie *, bool);
+extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const loff_t *);
+extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool);
+extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_invalidate(struct fscache_cookie *);
-extern void __fscache_wait_on_invalidate(struct fscache_cookie *);
 #ifdef FSCACHE_USE_NEW_IO_API
-extern int __fscache_begin_operation(struct netfs_cache_resources *, struct fscache_cookie *,
-				     bool);
+extern int __fscache_begin_read_operation(struct netfs_cache_resources *, struct fscache_cookie *);
 #endif
 #ifdef FSCACHE_USE_FALLBACK_IO_API
 extern int __fscache_fallback_read_page(struct fscache_cookie *, struct page *);
 extern int __fscache_fallback_write_page(struct fscache_cookie *, struct page *);
 #endif
-extern void __fscache_disable_cookie(struct fscache_cookie *, const void *, bool);
-extern void __fscache_enable_cookie(struct fscache_cookie *, const void *, loff_t,
-				    bool (*)(void *), void *);
 
 /**
- * fscache_register_netfs - Register a filesystem as desiring caching services
- * @netfs: The description of the filesystem
- *
- * Register a filesystem as desiring caching services if they're available.
- *
- * See Documentation/filesystems/caching/netfs-api.rst for a complete
- * description.
+ * fscache_acquire_volume - Register a volume as desiring caching services
+ * @volume_key: An identification string for the volume
+ * @cache_name: The name of the cache to use (or NULL for the default)
+ * @coherency_data: Piece of arbitrary coherency data to check
+ *
+ * Register a volume as desiring caching services if they're available.  The
+ * caller must provide an identifier for the volume and may also indicate which
+ * cache it should be in.  If a preexisting volume entry is found in the cache,
+ * the coherency data must match otherwise the entry will be invalidated.
  */
 static inline
-int fscache_register_netfs(struct fscache_netfs *netfs)
+struct fscache_volume *fscache_acquire_volume(const char *volume_key,
+					      const char *cache_name,
+					      u64 coherency_data)
 {
-	if (fscache_available())
-		return __fscache_register_netfs(netfs);
-	else
-		return 0;
+	if (!fscache_available())
+		return NULL;
+	return __fscache_acquire_volume(volume_key, cache_name, coherency_data);
 }
 
 /**
- * fscache_unregister_netfs - Indicate that a filesystem no longer desires
- * caching services
- * @netfs: The description of the filesystem
- *
- * Indicate that a filesystem no longer desires caching services for the
- * moment.
- *
- * See Documentation/filesystems/caching/netfs-api.rst for a complete
- * description.
+ * fscache_relinquish_volume - Cease caching a volume
+ * @volume: The volume cookie
+ * @coherency_data: Piece of arbitrary coherency data to set
+ * @invalidate: True if the volume should be invalidated
+ *
+ * Indicate that a filesystem no longer desires caching services for a volume.
+ * The caller must have relinquished all file cookies prior to calling this.
+ * The coherency data stored is updated.
  */
 static inline
-void fscache_unregister_netfs(struct fscache_netfs *netfs)
+void fscache_relinquish_volume(struct fscache_volume *volume,
+			       u64 coherency_data,
+			       bool invalidate)
 {
-	if (fscache_available())
-		__fscache_unregister_netfs(netfs);
+	if (fscache_volume_valid(volume))
+		__fscache_relinquish_volume(volume, coherency_data, invalidate);
 }
 
 /**
- * fscache_lookup_cache_tag - Look up a cache tag
- * @name: The name of the tag to search for
+ * fscache_acquire_cookie - Acquire a cookie to represent a cache object
+ * @volume: The volume in which to locate/create this cookie
+ * @advice: Advice flags (FSCACHE_COOKIE_ADV_*)
+ * @index_key: The index key for this cookie
+ * @index_key_len: Size of the index key
+ * @aux_data: The auxiliary data for the cookie (may be NULL)
+ * @aux_data_len: Size of the auxiliary data buffer
+ * @object_size: The initial size of object
  *
- * Acquire a specific cache referral tag that can be used to select a specific
- * cache in which to cache an index.
+ * Acquire a cookie to represent a data file within the given cache volume.
  *
  * See Documentation/filesystems/caching/netfs-api.rst for a complete
  * description.
  */
 static inline
-struct fscache_cache_tag *fscache_lookup_cache_tag(const char *name)
+struct fscache_cookie *fscache_acquire_cookie(struct fscache_volume *volume,
+					      u8 advice,
+					      const void *index_key,
+					      size_t index_key_len,
+					      const void *aux_data,
+					      size_t aux_data_len,
+					      loff_t object_size)
 {
-	if (fscache_available())
-		return __fscache_lookup_cache_tag(name);
-	else
+	if (!fscache_volume_valid(volume))
 		return NULL;
+	return __fscache_acquire_cookie(volume, advice,
+					index_key, index_key_len,
+					aux_data, aux_data_len,
+					object_size);
 }
 
 /**
- * fscache_release_cache_tag - Release a cache tag
- * @tag: The tag to release
+ * fscache_use_cookie - Request usage of cookie attached to an object
+ * @object: Object description
+ * @will_modify: If cache is expected to be modified locally
  *
- * Release a reference to a cache referral tag previously looked up.
- *
- * See Documentation/filesystems/caching/netfs-api.rst for a complete
- * description.
+ * Request usage of the cookie attached to an object.  The caller should tell
+ * the cache if the object's contents are about to be modified locally and then
+ * the cache can apply the policy that has been set to handle this case.
  */
-static inline
-void fscache_release_cache_tag(struct fscache_cache_tag *tag)
+static inline void fscache_use_cookie(struct fscache_cookie *cookie,
+				      bool will_modify)
 {
-	if (fscache_available())
-		__fscache_release_cache_tag(tag);
+	if (fscache_cookie_valid(cookie))
+		__fscache_use_cookie(cookie, will_modify);
 }
 
 /**
- * fscache_acquire_cookie - Acquire a cookie to represent a cache object
- * @parent: The cookie that's to be the parent of this one
- * @type: Type of the cookie
- * @type_name: Name of cookie type (max 7 chars)
- * @advice: Advice flags (FSCACHE_COOKIE_ADV_*)
- * @preferred_cache: The cache to use (or NULL)
- * @index_key: The index key for this cookie
- * @index_key_len: Size of the index key
- * @aux_data: The auxiliary data for the cookie (may be NULL)
- * @aux_data_len: Size of the auxiliary data buffer
- * @netfs_data: An arbitrary piece of data to be kept in the cookie to
- * represent the cache object to the netfs
- * @object_size: The initial size of object
- * @enable: Whether or not to enable a data cookie immediately
+ * fscache_unuse_cookie - Cease usage of cookie attached to an object
+ * @object: Object description
+ * @aux_data: Updated auxiliary data (or NULL)
+ * @object_size: Revised size of the object (or NULL)
  *
- * This function is used to inform FS-Cache about part of an index hierarchy
- * that can be used to locate files.  This is done by requesting a cookie for
- * each index in the path to the file.
- *
- * See Documentation/filesystems/caching/netfs-api.rst for a complete
- * description.
+ * Cease usage of the cookie attached to an object.  When the users count
+ * reaches zero then the cookie relinquishment will be permitted to proceed.
  */
-static inline
-struct fscache_cookie *fscache_acquire_cookie(
-	struct fscache_cookie *parent,
-	enum fscache_cookie_type type,
-	const char *type_name,
-	u8 advice,
-	struct fscache_cache_tag *preferred_cache,
-	const void *index_key,
-	size_t index_key_len,
-	const void *aux_data,
-	size_t aux_data_len,
-	loff_t object_size,
-	bool enable)
+static inline void fscache_unuse_cookie(struct fscache_cookie *cookie,
+					const void *aux_data,
+					const loff_t *object_size)
 {
-	if (fscache_cookie_valid(parent) && fscache_cookie_enabled(parent))
-		return __fscache_acquire_cookie(parent, type, type_name, advice,
-						preferred_cache,
-						index_key, index_key_len,
-						aux_data, aux_data_len,
-						object_size, enable);
-	else
-		return NULL;
+	if (fscache_cookie_valid(cookie))
+		__fscache_unuse_cookie(cookie, aux_data, object_size);
 }
 
 /**
  * fscache_relinquish_cookie - Return the cookie to the cache, maybe discarding
  * it
  * @cookie: The cookie being returned
- * @aux_data: The updated auxiliary data for the cookie (may be NULL)
  * @retire: True if the cache object the cookie represents is to be discarded
  *
  * This function returns a cookie to the cache, forcibly discarding the
- * associated cache object if retire is set to true.  The opportunity is
- * provided to update the auxiliary data in the cache before the object is
- * disconnected.
+ * associated cache object if retire is set to true.
  *
  * See Documentation/filesystems/caching/netfs-api.rst for a complete
  * description.
  */
 static inline
-void fscache_relinquish_cookie(struct fscache_cookie *cookie,
-			       const void *aux_data,
-			       bool retire)
+void fscache_relinquish_cookie(struct fscache_cookie *cookie, bool retire)
 {
 	if (fscache_cookie_valid(cookie))
-		__fscache_relinquish_cookie(cookie, aux_data, retire);
+		__fscache_relinquish_cookie(cookie, retire);
 }
 
 /**
  * fscache_update_cookie - Request that a cache object be updated
  * @cookie: The cookie representing the cache object
  * @aux_data: The updated auxiliary data for the cookie (may be NULL)
+ * @object_size: The current size of the object (may be NULL)
  *
  * Request an update of the index data for the cache object associated with the
  * cookie.  The auxiliary data on the cookie will be updated first if @aux_data
- * is set.
+ * is set and the object size will be updated and the object possibly trimmed
+ * if @object_size is set.
  *
  * See Documentation/filesystems/caching/netfs-api.rst for a complete
  * description.
  */
 static inline
-void fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data)
+void fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data,
+			   const loff_t *object_size)
 {
-	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
-		__fscache_update_cookie(cookie, aux_data);
+	if (fscache_cookie_valid(cookie))
+		__fscache_update_cookie(cookie, aux_data, object_size);
 }
 
 /**
@@ -352,24 +357,20 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie)
 static inline
 void fscache_invalidate(struct fscache_cookie *cookie)
 {
-	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
+	if (fscache_cookie_valid(cookie))
 		__fscache_invalidate(cookie);
 }
 
 /**
- * fscache_wait_on_invalidate - Wait for invalidation to complete
- * @cookie: The cookie representing the cache object
- *
- * Wait for the invalidation of an object to complete.
+ * fscache_operation_valid - Return true if operations resources are usable
+ * @cres: The resources to check.
  *
- * See Documentation/filesystems/caching/netfs-api.rst for a complete
- * description.
+ * Returns a pointer to the operations table if usable or NULL if not.
  */
 static inline
-void fscache_wait_on_invalidate(struct fscache_cookie *cookie)
+const struct netfs_cache_ops *fscache_operation_valid(const struct netfs_cache_resources *cres)
 {
-	if (fscache_cookie_valid(cookie))
-		__fscache_wait_on_invalidate(cookie);
+	return fscache_resources_valid(cres) ? cres->ops : NULL;
 }
 
 #ifdef FSCACHE_USE_NEW_IO_API
@@ -395,23 +396,11 @@ static inline
 int fscache_begin_read_operation(struct netfs_cache_resources *cres,
 				 struct fscache_cookie *cookie)
 {
-	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
-		return __fscache_begin_operation(cres, cookie, false);
+	if (fscache_cookie_valid(cookie))
+		return __fscache_begin_read_operation(cres, cookie);
 	return -ENOBUFS;
 }
 
-/**
- * fscache_operation_valid - Return true if operations resources are usable
- * @cres: The resources to check.
- *
- * Returns a pointer to the operations table if usable or NULL if not.
- */
-static inline
-const struct netfs_cache_ops *fscache_operation_valid(const struct netfs_cache_resources *cres)
-{
-	return fscache_resources_valid(cres) ? cres->ops : NULL;
-}
-
 /**
  * fscache_read - Start a read from the cache.
  * @cres: The cache resources to use
@@ -478,60 +467,6 @@ int fscache_write(struct netfs_cache_resources *cres,
 
 #endif /* FSCACHE_USE_NEW_IO_API */
 
-/**
- * fscache_disable_cookie - Disable a cookie
- * @cookie: The cookie representing the cache object
- * @aux_data: The updated auxiliary data for the cookie (may be NULL)
- * @invalidate: Invalidate the backing object
- *
- * Disable a cookie from accepting further alloc, read, write, invalidate,
- * update or acquire operations.  Outstanding operations can still be waited
- * upon and pages can still be uncached and the cookie relinquished.
- *
- * This will not return until all outstanding operations have completed.
- *
- * If @invalidate is set, then the backing object will be invalidated and
- * detached, otherwise it will just be detached.
- *
- * If @aux_data is set, then auxiliary data will be updated from that.
- */
-static inline
-void fscache_disable_cookie(struct fscache_cookie *cookie,
-			    const void *aux_data,
-			    bool invalidate)
-{
-	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
-		__fscache_disable_cookie(cookie, aux_data, invalidate);
-}
-
-/**
- * fscache_enable_cookie - Reenable a cookie
- * @cookie: The cookie representing the cache object
- * @aux_data: The updated auxiliary data for the cookie (may be NULL)
- * @object_size: Current size of object
- * @can_enable: A function to permit enablement once lock is held
- * @data: Data for can_enable()
- *
- * Reenable a previously disabled cookie, allowing it to accept further alloc,
- * read, write, invalidate, update or acquire operations.  An attempt will be
- * made to immediately reattach the cookie to a backing object.  If @aux_data
- * is set, the auxiliary data attached to the cookie will be updated.
- *
- * The can_enable() function is called (if not NULL) once the enablement lock
- * is held to rule on whether enablement is still permitted to go ahead.
- */
-static inline
-void fscache_enable_cookie(struct fscache_cookie *cookie,
-			   const void *aux_data,
-			   loff_t object_size,
-			   bool (*can_enable)(void *data),
-			   void *data)
-{
-	if (fscache_cookie_valid(cookie) && !fscache_cookie_enabled(cookie))
-		__fscache_enable_cookie(cookie, aux_data, object_size,
-					can_enable, data);
-}
-
 #ifdef FSCACHE_USE_FALLBACK_IO_API
 
 /**
@@ -549,7 +484,7 @@ void fscache_enable_cookie(struct fscache_cookie *cookie,
 static inline
 int fscache_fallback_read_page(struct fscache_cookie *cookie, struct page *page)
 {
-	if (fscache_cookie_enabled(cookie))
+	if (fscache_cookie_valid(cookie))
 		return __fscache_fallback_read_page(cookie, page);
 	return -ENOBUFS;
 }
@@ -569,7 +504,7 @@ int fscache_fallback_read_page(struct fscache_cookie *cookie, struct page *page)
 static inline
 int fscache_fallback_write_page(struct fscache_cookie *cookie, struct page *page)
 {
-	if (fscache_cookie_enabled(cookie))
+	if (fscache_cookie_valid(cookie))
 		return __fscache_fallback_write_page(cookie, page);
 	return -ENOBUFS;
 }
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 47df44550ad6..d98adabce92e 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -19,9 +19,25 @@
 #define __CACHEFILES_DECLARE_TRACE_ENUMS_ONCE_ONLY
 
 enum cachefiles_obj_ref_trace {
-	cachefiles_obj_put_wait_retry = fscache_obj_ref__nr_traces,
-	cachefiles_obj_put_wait_timeo,
-	cachefiles_obj_ref__nr_traces
+	cachefiles_obj_get_ioreq,
+	cachefiles_obj_new,
+	cachefiles_obj_put_alloc_fail,
+	cachefiles_obj_put_detach,
+	cachefiles_obj_put_ioreq,
+	cachefiles_obj_see_clean_commit,
+	cachefiles_obj_see_clean_delete,
+	cachefiles_obj_see_clean_drop_tmp,
+	cachefiles_obj_see_lookup_cookie,
+	cachefiles_obj_see_lookup_failed,
+	cachefiles_obj_see_withdraw_cookie,
+	cachefiles_obj_see_withdrawal,
+};
+
+enum fscache_why_object_killed {
+	FSCACHE_OBJECT_IS_STALE,
+	FSCACHE_OBJECT_NO_SPACE,
+	FSCACHE_OBJECT_WAS_RETIRED,
+	FSCACHE_OBJECT_WAS_CULLED,
 };
 
 enum cachefiles_coherency_trace {
@@ -40,6 +56,8 @@ enum cachefiles_coherency_trace {
 enum cachefiles_trunc_trace {
 	cachefiles_trunc_invalidate,
 	cachefiles_trunc_set_size,
+	cachefiles_trunc_dio_adjust,
+	cachefiles_trunc_shrink,
 };
 
 #endif
@@ -54,16 +72,18 @@ enum cachefiles_trunc_trace {
 	E_(FSCACHE_OBJECT_WAS_CULLED,	"was_culled")
 
 #define cachefiles_obj_ref_traces					\
-	EM(fscache_obj_get_add_to_deps,		"GET add_to_deps")	\
-	EM(fscache_obj_get_queue,		"GET queue")		\
-	EM(fscache_obj_put_alloc_fail,		"PUT alloc_fail")	\
-	EM(fscache_obj_put_attach_fail,		"PUT attach_fail")	\
-	EM(fscache_obj_put_drop_obj,		"PUT drop_obj")		\
-	EM(fscache_obj_put_enq_dep,		"PUT enq_dep")		\
-	EM(fscache_obj_put_queue,		"PUT queue")		\
-	EM(fscache_obj_put_work,		"PUT work")		\
-	EM(cachefiles_obj_put_wait_retry,	"PUT wait_retry")	\
-	E_(cachefiles_obj_put_wait_timeo,	"PUT wait_timeo")
+	EM(cachefiles_obj_get_ioreq,		"GET ioreq")		\
+	EM(cachefiles_obj_new,			"NEW obj")		\
+	EM(cachefiles_obj_put_alloc_fail,	"PUT alloc_fail")	\
+	EM(cachefiles_obj_put_detach,		"PUT detach")		\
+	EM(cachefiles_obj_put_ioreq,		"PUT ioreq")		\
+	EM(cachefiles_obj_see_clean_commit,	"SEE clean_commit")	\
+	EM(cachefiles_obj_see_clean_delete,	"SEE clean_delete")	\
+	EM(cachefiles_obj_see_clean_drop_tmp,	"SEE clean_drop_tmp")	\
+	EM(cachefiles_obj_see_lookup_cookie,	"SEE lookup_cookie")	\
+	EM(cachefiles_obj_see_lookup_failed,	"SEE lookup_failed")	\
+	EM(cachefiles_obj_see_withdraw_cookie,	"SEE withdraw_cookie")	\
+	E_(cachefiles_obj_see_withdrawal,	"SEE withdrawal")
 
 #define cachefiles_coherency_traces					\
 	EM(cachefiles_coherency_check_aux,	"BAD aux ")		\
@@ -79,7 +99,9 @@ enum cachefiles_trunc_trace {
 
 #define cachefiles_trunc_traces						\
 	EM(cachefiles_trunc_invalidate,		"INVAL ")		\
-	E_(cachefiles_trunc_set_size,		"SETSIZ")
+	EM(cachefiles_trunc_set_size,		"SETSIZ")		\
+	EM(cachefiles_trunc_dio_adjust,		"DIOADJ")		\
+	E_(cachefiles_trunc_shrink,		"SHRINK")
 
 /*
  * Export enum symbols via userspace.
@@ -105,12 +127,12 @@ cachefiles_trunc_traces;
 
 
 TRACE_EVENT(cachefiles_ref,
-	    TP_PROTO(struct cachefiles_object *obj,
-		     struct fscache_cookie *cookie,
-		     enum cachefiles_obj_ref_trace why,
-		     int usage),
+	    TP_PROTO(unsigned int object_debug_id,
+		     unsigned int cookie_debug_id,
+		     int usage,
+		     enum cachefiles_obj_ref_trace why),
 
-	    TP_ARGS(obj, cookie, why, usage),
+	    TP_ARGS(object_debug_id, cookie_debug_id, usage, why),
 
 	    /* Note that obj may be NULL */
 	    TP_STRUCT__entry(
@@ -121,8 +143,8 @@ TRACE_EVENT(cachefiles_ref,
 			     ),
 
 	    TP_fast_assign(
-		    __entry->obj	= obj->debug_id;
-		    __entry->cookie	= cookie->debug_id;
+		    __entry->obj	= object_debug_id;
+		    __entry->cookie	= cookie_debug_id;
 		    __entry->usage	= usage;
 		    __entry->why	= why;
 			   ),
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 412f016f6975..0d9789745a91 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -19,18 +19,76 @@
 #ifndef __FSCACHE_DECLARE_TRACE_ENUMS_ONCE_ONLY
 #define __FSCACHE_DECLARE_TRACE_ENUMS_ONCE_ONLY
 
+enum fscache_cache_trace {
+	fscache_cache_collision,
+	fscache_cache_get_acquire,
+	fscache_cache_new_acquire,
+	fscache_cache_put_alloc_volume,
+	fscache_cache_put_cache,
+	fscache_cache_put_volume,
+	fscache_cache_put_withdraw,
+};
+
+enum fscache_volume_trace {
+	fscache_volume_collision,
+	fscache_volume_get_cookie,
+	fscache_volume_get_create_work,
+	fscache_volume_get_hash_collision,
+	fscache_volume_free,
+	fscache_volume_new_acquire,
+	fscache_volume_put_cookie,
+	fscache_volume_put_create_work,
+	fscache_volume_put_hash_collision,
+	fscache_volume_put_relinquish,
+	fscache_volume_see_create_work,
+	fscache_volume_see_hash_wake,
+	fscache_volume_wait_create_work,
+};
+
 enum fscache_cookie_trace {
 	fscache_cookie_collision,
 	fscache_cookie_discard,
-	fscache_cookie_get_acquire_parent,
 	fscache_cookie_get_attach_object,
-	fscache_cookie_get_reacquire,
-	fscache_cookie_get_register_netfs,
-	fscache_cookie_put_acquire_nobufs,
-	fscache_cookie_put_dup_netfs,
-	fscache_cookie_put_relinquish,
+	fscache_cookie_get_end_access,
+	fscache_cookie_get_hash_collision,
+	fscache_cookie_get_inval_work,
+	fscache_cookie_get_use_work,
+	fscache_cookie_get_withdraw,
+	fscache_cookie_new_acquire,
+	fscache_cookie_put_hash_collision,
 	fscache_cookie_put_object,
-	fscache_cookie_put_parent,
+	fscache_cookie_put_over_queued,
+	fscache_cookie_put_relinquish,
+	fscache_cookie_put_withdrawn,
+	fscache_cookie_put_work,
+	fscache_cookie_see_active,
+	fscache_cookie_see_relinquish,
+	fscache_cookie_see_withdraw,
+	fscache_cookie_see_work,
+};
+
+enum fscache_access_trace {
+	fscache_access_acquire_volume,
+	fscache_access_acquire_volume_end,
+	fscache_access_cache_pin,
+	fscache_access_cache_unpin,
+	fscache_access_invalidate_cookie,
+	fscache_access_invalidate_cookie_end,
+	fscache_access_io_end,
+	fscache_access_io_no_data_yet,
+	fscache_access_io_not_live,
+	fscache_access_io_read,
+	fscache_access_io_resize,
+	fscache_access_io_wait,
+	fscache_access_io_write,
+	fscache_access_lookup_cookie,
+	fscache_access_lookup_cookie_end,
+	fscache_access_relinquish_cookie,
+	fscache_access_relinquish_cookie_end,
+	fscache_access_relinquish_defer,
+	fscache_access_relinquish_volume,
+	fscache_access_relinquish_volume_end,
+	fscache_access_unlive,
 };
 
 #endif
@@ -38,18 +96,73 @@ enum fscache_cookie_trace {
 /*
  * Declare tracing information enums and their string mappings for display.
  */
+#define fscache_cache_traces						\
+	EM(fscache_cache_collision,		"*COLLIDE*")		\
+	EM(fscache_cache_get_acquire,		"GET acq  ")		\
+	EM(fscache_cache_new_acquire,		"NEW acq  ")		\
+	EM(fscache_cache_put_alloc_volume,	"PUT alvol")		\
+	EM(fscache_cache_put_cache,		"PUT cache")		\
+	EM(fscache_cache_put_volume,		"PUT vol  ")		\
+	E_(fscache_cache_put_withdraw,		"PUT withd")
+
+#define fscache_volume_traces						\
+	EM(fscache_volume_collision,		"*COLLIDE*")		\
+	EM(fscache_volume_get_cookie,		"GET cook ")		\
+	EM(fscache_volume_get_create_work,	"GET creat")		\
+	EM(fscache_volume_get_hash_collision,	"GET hcoll")		\
+	EM(fscache_volume_free,			"FREE     ")		\
+	EM(fscache_volume_new_acquire,		"NEW acq  ")		\
+	EM(fscache_volume_put_cookie,		"PUT cook ")		\
+	EM(fscache_volume_put_create_work,	"PUT creat")		\
+	EM(fscache_volume_put_hash_collision,	"PUT hcoll")		\
+	EM(fscache_volume_put_relinquish,	"PUT relnq")		\
+	EM(fscache_volume_see_create_work,	"SEE creat")		\
+	EM(fscache_volume_see_hash_wake,	"SEE hwake")		\
+	E_(fscache_volume_wait_create_work,	"WAIT crea")
+
 #define fscache_cookie_traces						\
-	EM(fscache_cookie_collision,		"*COLLISION*")		\
-	EM(fscache_cookie_discard,		"DISCARD")		\
-	EM(fscache_cookie_get_acquire_parent,	"GET prn")		\
-	EM(fscache_cookie_get_attach_object,	"GET obj")		\
-	EM(fscache_cookie_get_reacquire,	"GET raq")		\
-	EM(fscache_cookie_get_register_netfs,	"GET net")		\
-	EM(fscache_cookie_put_acquire_nobufs,	"PUT nbf")		\
-	EM(fscache_cookie_put_dup_netfs,	"PUT dnt")		\
-	EM(fscache_cookie_put_relinquish,	"PUT rlq")		\
-	EM(fscache_cookie_put_object,		"PUT obj")		\
-	E_(fscache_cookie_put_parent,		"PUT prn")
+	EM(fscache_cookie_collision,		"*COLLIDE*")		\
+	EM(fscache_cookie_discard,		"DISCARD  ")		\
+	EM(fscache_cookie_get_attach_object,	"GET attch")		\
+	EM(fscache_cookie_get_hash_collision,	"GET hcoll")		\
+	EM(fscache_cookie_get_end_access,	"GQ  endac")		\
+	EM(fscache_cookie_get_inval_work,	"GQ  inval")		\
+	EM(fscache_cookie_get_use_work,		"GQ  use  ")		\
+	EM(fscache_cookie_get_withdraw,		"GQ  wthdr")		\
+	EM(fscache_cookie_new_acquire,		"NEW acq  ")		\
+	EM(fscache_cookie_put_hash_collision,	"PUT hcoll")		\
+	EM(fscache_cookie_put_object,		"PUT obj  ")		\
+	EM(fscache_cookie_put_over_queued,	"PQ  overq")		\
+	EM(fscache_cookie_put_relinquish,	"PUT relnq")		\
+	EM(fscache_cookie_put_withdrawn,	"PUT wthdn")		\
+	EM(fscache_cookie_put_work,		"PQ  work ")		\
+	EM(fscache_cookie_see_active,		"-   active")		\
+	EM(fscache_cookie_see_relinquish,	"-   x-rlq")		\
+	EM(fscache_cookie_see_withdraw,		"-   x-wth")		\
+	E_(fscache_cookie_see_work,		"-   work ")
+
+#define fscache_access_traces		\
+	EM(fscache_access_acquire_volume,	"BEGIN acq_vol")	\
+	EM(fscache_access_acquire_volume_end,	"END   acq_vol")	\
+	EM(fscache_access_cache_pin,		"PIN   cache  ")	\
+	EM(fscache_access_cache_unpin,		"UNPIN cache  ")	\
+	EM(fscache_access_invalidate_cookie,	"BEGIN inval  ")	\
+	EM(fscache_access_invalidate_cookie_end,"END   inval  ")	\
+	EM(fscache_access_io_end,		"END   io     ")	\
+	EM(fscache_access_io_no_data_yet,	"END   io_nody")	\
+	EM(fscache_access_io_not_live,		"END   io_notl")	\
+	EM(fscache_access_io_read,		"BEGIN io_read")	\
+	EM(fscache_access_io_resize,		"BEGIN io_resz")	\
+	EM(fscache_access_io_wait,		"WAIT  io    ")		\
+	EM(fscache_access_io_write,		"BEGIN io_writ")	\
+	EM(fscache_access_lookup_cookie,	"BEGIN lookup ")	\
+	EM(fscache_access_lookup_cookie_end,	"END   lookup ")	\
+	EM(fscache_access_relinquish_cookie,	"BEGIN relinq ")	\
+	EM(fscache_access_relinquish_cookie_end,"END   relinq ")	\
+	EM(fscache_access_relinquish_defer,	"DEFER relinq ")	\
+	EM(fscache_access_relinquish_volume,	"BEGIN rlq_vol")	\
+	EM(fscache_access_relinquish_volume_end,"END   rlq_vol")	\
+	E_(fscache_access_unlive,		"END   unlive ")
 
 /*
  * Export enum symbols via userspace.
@@ -59,7 +172,10 @@ enum fscache_cookie_trace {
 #define EM(a, b) TRACE_DEFINE_ENUM(a);
 #define E_(a, b) TRACE_DEFINE_ENUM(a);
 
+fscache_cache_traces;
+fscache_volume_traces;
 fscache_cookie_traces;
+fscache_access_traces;
 
 /*
  * Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -71,6 +187,56 @@ fscache_cookie_traces;
 #define E_(a, b)	{ a, b }
 
 
+TRACE_EVENT(fscache_cache,
+	    TP_PROTO(unsigned int cache_debug_id,
+		     int usage,
+		     enum fscache_cache_trace where),
+
+	    TP_ARGS(cache_debug_id, usage, where),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,		cache		)
+		    __field(int,			usage		)
+		    __field(enum fscache_cache_trace,	where		)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->cache	= cache_debug_id;
+		    __entry->usage	= usage;
+		    __entry->where	= where;
+			   ),
+
+	    TP_printk("C=%08x %s r=%d",
+		      __entry->cache,
+		      __print_symbolic(__entry->where, fscache_cache_traces),
+		      __entry->usage)
+	    );
+
+TRACE_EVENT(fscache_volume,
+	    TP_PROTO(unsigned int volume_debug_id,
+		     int usage,
+		     enum fscache_volume_trace where),
+
+	    TP_ARGS(volume_debug_id, usage, where),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,		volume		)
+		    __field(int,			usage		)
+		    __field(enum fscache_volume_trace,	where		)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->volume	= volume_debug_id;
+		    __entry->usage	= usage;
+		    __entry->where	= where;
+			   ),
+
+	    TP_printk("V=%08x %s u=%d",
+		      __entry->volume,
+		      __print_symbolic(__entry->where, fscache_volume_traces),
+		      __entry->usage)
+	    );
+
 TRACE_EVENT(fscache_cookie,
 	    TP_PROTO(unsigned int cookie_debug_id,
 		     int ref,
@@ -80,189 +246,160 @@ TRACE_EVENT(fscache_cookie,
 
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		cookie		)
-		    __field(enum fscache_cookie_trace,	where		)
 		    __field(int,			ref		)
+		    __field(enum fscache_cookie_trace,	where		)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->cookie	= cookie_debug_id;
-		    __entry->where	= where;
 		    __entry->ref	= ref;
+		    __entry->where	= where;
 			   ),
 
-	    TP_printk("%s c=%08x r=%d",
+	    TP_printk("c=%08x %s r=%d",
+		      __entry->cookie,
 		      __print_symbolic(__entry->where, fscache_cookie_traces),
-		      __entry->cookie, __entry->ref)
+		      __entry->ref)
 	    );
 
-TRACE_EVENT(fscache_netfs,
-	    TP_PROTO(struct fscache_netfs *netfs),
+TRACE_EVENT(fscache_access_cache,
+	    TP_PROTO(unsigned int cache_debug_id,
+		     int ref,
+		     int n_accesses,
+		     enum fscache_access_trace why),
 
-	    TP_ARGS(netfs),
+	    TP_ARGS(cache_debug_id, ref, n_accesses, why),
 
 	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __array(char,			name, 8		)
+		    __field(unsigned int,		cache		)
+		    __field(int,			ref		)
+		    __field(int,			n_accesses	)
+		    __field(enum fscache_access_trace,	why		)
 			     ),
 
 	    TP_fast_assign(
-		    __entry->cookie		= netfs->primary_index->debug_id;
-		    strncpy(__entry->name, netfs->name, 8);
-		    __entry->name[7]		= 0;
+		    __entry->cache	= cache_debug_id;
+		    __entry->ref	= ref;
+		    __entry->n_accesses	= n_accesses;
+		    __entry->why	= why;
 			   ),
 
-	    TP_printk("c=%08x n=%s",
-		      __entry->cookie, __entry->name)
+	    TP_printk("C=%08x %s r=%d a=%d",
+		      __entry->cache,
+		      __print_symbolic(__entry->why, fscache_access_traces),
+		      __entry->ref,
+		      __entry->n_accesses)
 	    );
 
-TRACE_EVENT(fscache_acquire,
-	    TP_PROTO(struct fscache_cookie *cookie),
+TRACE_EVENT(fscache_access_volume,
+	    TP_PROTO(unsigned int volume_debug_id,
+		     int ref,
+		     int n_accesses,
+		     enum fscache_access_trace why),
 
-	    TP_ARGS(cookie),
+	    TP_ARGS(volume_debug_id, ref, n_accesses, why),
 
 	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		parent		)
-		    __array(char,			name, 8		)
-		    __field(int,			p_ref		)
-		    __field(int,			p_n_children	)
-		    __field(u8,				p_flags		)
+		    __field(unsigned int,		volume		)
+		    __field(int,			ref		)
+		    __field(int,			n_accesses	)
+		    __field(enum fscache_access_trace,	why		)
 			     ),
 
 	    TP_fast_assign(
-		    __entry->cookie		= cookie->debug_id;
-		    __entry->parent		= cookie->parent->debug_id;
-		    __entry->p_ref		= refcount_read(&cookie->parent->ref);
-		    __entry->p_n_children	= atomic_read(&cookie->parent->n_children);
-		    __entry->p_flags		= cookie->parent->flags;
-		    memcpy(__entry->name, cookie->type_name, 8);
-		    __entry->name[7]		= 0;
+		    __entry->volume	= volume_debug_id;
+		    __entry->ref	= ref;
+		    __entry->n_accesses	= n_accesses;
+		    __entry->why	= why;
 			   ),
 
-	    TP_printk("c=%08x p=%08x pr=%d pc=%d pf=%02x n=%s",
-		      __entry->cookie, __entry->parent, __entry->p_ref,
-		      __entry->p_n_children, __entry->p_flags, __entry->name)
+	    TP_printk("V=%08x %s r=%d a=%d",
+		      __entry->volume,
+		      __print_symbolic(__entry->why, fscache_access_traces),
+		      __entry->ref,
+		      __entry->n_accesses)
 	    );
 
-TRACE_EVENT(fscache_relinquish,
-	    TP_PROTO(struct fscache_cookie *cookie, bool retire),
+TRACE_EVENT(fscache_access,
+	    TP_PROTO(unsigned int cookie_debug_id,
+		     int ref,
+		     int n_accesses,
+		     enum fscache_access_trace why),
 
-	    TP_ARGS(cookie, retire),
+	    TP_ARGS(cookie_debug_id, ref, n_accesses, why),
 
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		parent		)
 		    __field(int,			ref		)
-		    __field(int,			n_children	)
-		    __field(int,			n_active	)
-		    __field(u8,				flags		)
-		    __field(bool,			retire		)
+		    __field(int,			n_accesses	)
+		    __field(enum fscache_access_trace,	why		)
 			     ),
 
 	    TP_fast_assign(
-		    __entry->cookie	= cookie->debug_id;
-		    __entry->parent	= cookie->parent->debug_id;
-		    __entry->ref	= refcount_read(&cookie->ref);
-		    __entry->n_children	= atomic_read(&cookie->n_children);
-		    __entry->n_active	= atomic_read(&cookie->n_active);
-		    __entry->flags	= cookie->flags;
-		    __entry->retire	= retire;
+		    __entry->cookie	= cookie_debug_id;
+		    __entry->ref	= ref;
+		    __entry->n_accesses	= n_accesses;
+		    __entry->why	= why;
 			   ),
 
-	    TP_printk("c=%08x r=%d p=%08x Nc=%d Na=%d f=%02x r=%u",
-		      __entry->cookie, __entry->ref,
-		      __entry->parent, __entry->n_children, __entry->n_active,
-		      __entry->flags, __entry->retire)
+	    TP_printk("c=%08x %s r=%d a=%d",
+		      __entry->cookie,
+		      __print_symbolic(__entry->why, fscache_access_traces),
+		      __entry->ref,
+		      __entry->n_accesses)
 	    );
 
-TRACE_EVENT(fscache_enable,
+TRACE_EVENT(fscache_acquire,
 	    TP_PROTO(struct fscache_cookie *cookie),
 
 	    TP_ARGS(cookie),
 
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		cookie		)
-		    __field(int,			ref		)
-		    __field(int,			n_children	)
-		    __field(int,			n_active	)
-		    __field(u8,				flags		)
+		    __field(unsigned int,		volume		)
+		    __field(int,			v_ref		)
+		    __field(int,			v_n_cookies	)
+		    __field(struct fscache_cookie *,	cookie_p	)
 			     ),
 
 	    TP_fast_assign(
-		    __entry->cookie	= cookie->debug_id;
-		    __entry->ref	= refcount_read(&cookie->ref);
-		    __entry->n_children	= atomic_read(&cookie->n_children);
-		    __entry->n_active	= atomic_read(&cookie->n_active);
-		    __entry->flags	= cookie->flags;
+		    __entry->cookie		= cookie->debug_id;
+		    __entry->volume		= cookie->volume->debug_id;
+		    __entry->v_ref		= refcount_read(&cookie->volume->ref);
+		    __entry->v_n_cookies	= atomic_read(&cookie->volume->n_cookies);
 			   ),
 
-	    TP_printk("c=%08x r=%d Nc=%d Na=%d f=%02x",
-		      __entry->cookie, __entry->ref,
-		      __entry->n_children, __entry->n_active, __entry->flags)
+	    TP_printk("c=%08x V=%08x vr=%d vc=%d",
+		      __entry->cookie,
+		      __entry->volume, __entry->v_ref, __entry->v_n_cookies)
 	    );
 
-TRACE_EVENT(fscache_disable,
-	    TP_PROTO(struct fscache_cookie *cookie),
+TRACE_EVENT(fscache_relinquish,
+	    TP_PROTO(struct fscache_cookie *cookie, bool retire),
 
-	    TP_ARGS(cookie),
+	    TP_ARGS(cookie, retire),
 
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		cookie		)
+		    __field(unsigned int,		volume		)
 		    __field(int,			ref		)
-		    __field(int,			n_children	)
 		    __field(int,			n_active	)
 		    __field(u8,				flags		)
+		    __field(bool,			retire		)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->cookie	= cookie->debug_id;
+		    __entry->volume	= cookie->volume->debug_id;
 		    __entry->ref	= refcount_read(&cookie->ref);
-		    __entry->n_children	= atomic_read(&cookie->n_children);
 		    __entry->n_active	= atomic_read(&cookie->n_active);
 		    __entry->flags	= cookie->flags;
+		    __entry->retire	= retire;
 			   ),
 
-	    TP_printk("c=%08x r=%d Nc=%d Na=%d f=%02x",
-		      __entry->cookie, __entry->ref,
-		      __entry->n_children, __entry->n_active, __entry->flags)
-	    );
-
-TRACE_EVENT(fscache_osm,
-	    TP_PROTO(struct cachefiles_object *object,
-		     const struct fscache_state *state,
-		     bool wait, bool oob, s8 event_num),
-
-	    TP_ARGS(object, state, wait, oob, event_num),
-
-	    TP_STRUCT__entry(
-		    __field(unsigned int,		cookie		)
-		    __field(unsigned int,		object		)
-		    __array(char,			state, 8	)
-		    __field(bool,			wait		)
-		    __field(bool,			oob		)
-		    __field(s8,				event_num	)
-			     ),
-
-	    TP_fast_assign(
-		    __entry->cookie		= object->cookie->debug_id;
-		    __entry->object		= object->debug_id;
-		    __entry->wait		= wait;
-		    __entry->oob		= oob;
-		    __entry->event_num		= event_num;
-		    memcpy(__entry->state, state->short_name, 8);
-			   ),
-
-	    TP_printk("c=%08x o=%08d %s %s%sev=%d",
-		      __entry->cookie,
-		      __entry->object,
-		      __entry->state,
-		      __print_symbolic(__entry->wait,
-				       { true,  "WAIT" },
-				       { false, "WORK" }),
-		      __print_symbolic(__entry->oob,
-				       { true,  " OOB " },
-				       { false, " " }),
-		      __entry->event_num)
+	    TP_printk("c=%08x V=%08x r=%d U=%d f=%02x rt=%u",
+		      __entry->cookie, __entry->volume, __entry->ref,
+		      __entry->n_active, __entry->flags, __entry->retire)
 	    );
 
 #endif /* _TRACE_FSCACHE_H */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 33/67] cachefiles: Trace decisions in cachefiles_prepare_read()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (31 preceding siblings ...)
  2021-10-18 14:59 ` [PATCH 32/67] fscache: Replace the object management state machine David Howells
@ 2021-10-18 14:59 ` David Howells
  2021-10-18 14:59 ` [PATCH 34/67] cachefiles: Make cachefiles_write_prepare() check for space David Howells
                   ` (37 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:59 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add a tracepoint to log decisions made in cachefiles_prepare_read() about
what it's going to request for the next subrequest and why.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/io.c                |   22 +++++++++++--
 fs/cachefiles/main.c              |    2 +
 include/trace/events/cachefiles.h |   64 +++++++++++++++++++++++++++++++++++++
 3 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index e5c29c0decea..c05f64cdfd0e 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -303,6 +303,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subrequest *subreq,
 						      loff_t i_size)
 {
+	enum cachefiles_prepare_read_trace why;
 	struct netfs_read_request *rreq = subreq->rreq;
 	struct netfs_cache_resources *cres = &rreq->cache_resources;
 	struct cachefiles_object *object;
@@ -312,26 +313,31 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 	struct file *file = cachefiles_cres_file(cres);
 	enum netfs_read_source ret = NETFS_DOWNLOAD_FROM_SERVER;
 	loff_t off, to;
+	ino_t ino = file ? file_inode(file)->i_ino : 0;
 
 	_enter("%zx @%llx/%llx", subreq->len, subreq->start, i_size);
 
 	if (subreq->start >= i_size) {
 		ret = NETFS_FILL_WITH_ZEROES;
+		why = cachefiles_trace_read_after_eof;
 		goto out_no_object;
 	}
 
 	if (test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags)) {
 		__set_bit(NETFS_SREQ_WRITE_TO_CACHE, &subreq->flags);
+		why = cachefiles_trace_read_no_data;
 		goto out_no_object;
 	}
 
 	/* The object and the file may be being created in the background. */
 	if (!file) {
+		why = cachefiles_trace_read_no_file;
 		if (!fscache_wait_for_operation(cres, FSCACHE_WANT_READ))
 			goto out_no_object;
 		file = cachefiles_cres_file(cres);
 		if (!file)
 			goto out_no_object;
+		ino = file_inode(file)->i_ino;
 	}
 
 	object = cachefiles_cres_object(cres);
@@ -340,23 +346,31 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 
 	off = vfs_llseek(file, subreq->start, SEEK_DATA);
 	if (off < 0 && off >= (loff_t)-MAX_ERRNO) {
-		if (off == (loff_t)-ENXIO)
+		if (off == (loff_t)-ENXIO) {
+			why = cachefiles_trace_read_seek_nxio;
 			goto download_and_store;
+		}
+		why = cachefiles_trace_read_seek_error;
 		goto out;
 	}
 
-	if (off >= subreq->start + subreq->len)
+	if (off >= subreq->start + subreq->len) {
+		why = cachefiles_trace_read_found_hole;
 		goto download_and_store;
+	}
 
 	if (off > subreq->start) {
 		off = round_up(off, cache->bsize);
 		subreq->len = off - subreq->start;
+		why = cachefiles_trace_read_found_part;
 		goto download_and_store;
 	}
 
 	to = vfs_llseek(file, subreq->start, SEEK_HOLE);
-	if (to < 0 && to >= (loff_t)-MAX_ERRNO)
+	if (to < 0 && to >= (loff_t)-MAX_ERRNO) {
+		why = cachefiles_trace_read_seek_error;
 		goto out;
+	}
 
 	if (to < subreq->start + subreq->len) {
 		if (subreq->start + subreq->len >= i_size)
@@ -366,6 +380,7 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 		subreq->len = to - subreq->start;
 	}
 
+	why = cachefiles_trace_read_have_data;
 	ret = NETFS_READ_FROM_CACHE;
 	goto out;
 
@@ -375,6 +390,7 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 out:
 	cachefiles_end_secure(cache, saved_cred);
 out_no_object:
+	trace_cachefiles_prep_read(subreq, ret, why, ino);
 	return ret;
 }
 
diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
index dc7731812b98..522fda828563 100644
--- a/fs/cachefiles/main.c
+++ b/fs/cachefiles/main.c
@@ -18,6 +18,8 @@
 #include <linux/statfs.h>
 #include <linux/sysctl.h>
 #include <linux/miscdevice.h>
+#include <linux/netfs.h>
+#include <trace/events/netfs.h>
 #define CREATE_TRACE_POINTS
 #include "internal.h"
 
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index d98adabce92e..d63e5fb46d27 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -60,6 +60,17 @@ enum cachefiles_trunc_trace {
 	cachefiles_trunc_shrink,
 };
 
+enum cachefiles_prepare_read_trace {
+	cachefiles_trace_read_after_eof,
+	cachefiles_trace_read_found_hole,
+	cachefiles_trace_read_found_part,
+	cachefiles_trace_read_have_data,
+	cachefiles_trace_read_no_data,
+	cachefiles_trace_read_no_file,
+	cachefiles_trace_read_seek_error,
+	cachefiles_trace_read_seek_nxio,
+};
+
 #endif
 
 /*
@@ -103,6 +114,17 @@ enum cachefiles_trunc_trace {
 	EM(cachefiles_trunc_dio_adjust,		"DIOADJ")		\
 	E_(cachefiles_trunc_shrink,		"SHRINK")
 
+#define cachefiles_prepare_read_traces					\
+	EM(cachefiles_trace_read_after_eof,	"after-eof ")		\
+	EM(cachefiles_trace_read_found_hole,	"found-hole")		\
+	EM(cachefiles_trace_read_found_part,	"found-part")		\
+	EM(cachefiles_trace_read_have_data,	"have-data ")		\
+	EM(cachefiles_trace_read_no_data,	"no-data   ")		\
+	EM(cachefiles_trace_read_no_file,	"no-file   ")		\
+	EM(cachefiles_trace_read_seek_error,	"seek-error")		\
+	E_(cachefiles_trace_read_seek_nxio,	"seek-enxio")
+
+
 /*
  * Export enum symbols via userspace.
  */
@@ -115,6 +137,7 @@ cachefiles_obj_kill_traces;
 cachefiles_obj_ref_traces;
 cachefiles_coherency_traces;
 cachefiles_trunc_traces;
+cachefiles_prepare_read_traces;
 
 /*
  * Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -324,6 +347,47 @@ TRACE_EVENT(cachefiles_coherency,
 		      __entry->content)
 	    );
 
+TRACE_EVENT(cachefiles_prep_read,
+	    TP_PROTO(struct netfs_read_subrequest *sreq,
+		     enum netfs_read_source source,
+		     enum cachefiles_prepare_read_trace why,
+		     ino_t cache_inode),
+
+	    TP_ARGS(sreq, source, why, cache_inode),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,		rreq		)
+		    __field(unsigned short,		index		)
+		    __field(unsigned short,		flags		)
+		    __field(enum netfs_read_source,	source		)
+		    __field(enum cachefiles_prepare_read_trace,	why	)
+		    __field(size_t,			len		)
+		    __field(loff_t,			start		)
+		    __field(unsigned int,		netfs_inode	)
+		    __field(unsigned int,		cache_inode	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->rreq	= sreq->rreq->debug_id;
+		    __entry->index	= sreq->debug_index;
+		    __entry->flags	= sreq->flags;
+		    __entry->source	= source;
+		    __entry->why	= why;
+		    __entry->len	= sreq->len;
+		    __entry->start	= sreq->start;
+		    __entry->netfs_inode = sreq->rreq->inode->i_ino;
+		    __entry->cache_inode = cache_inode;
+			   ),
+
+	    TP_printk("R=%08x[%u] %s %s f=%02x s=%llx %zx ni=%x b=%x",
+		      __entry->rreq, __entry->index,
+		      __print_symbolic(__entry->source, netfs_sreq_sources),
+		      __print_symbolic(__entry->why, cachefiles_prepare_read_traces),
+		      __entry->flags,
+		      __entry->start, __entry->len,
+		      __entry->netfs_inode, __entry->cache_inode)
+	    );
+
 TRACE_EVENT(cachefiles_read,
 	    TP_PROTO(struct cachefiles_object *obj,
 		     struct inode *backer,



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 34/67] cachefiles: Make cachefiles_write_prepare() check for space
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (32 preceding siblings ...)
  2021-10-18 14:59 ` [PATCH 33/67] cachefiles: Trace decisions in cachefiles_prepare_read() David Howells
@ 2021-10-18 14:59 ` David Howells
  2021-10-18 14:59 ` [PATCH 35/67] fscache: Automatically close a file that's been unused for a while David Howells
                   ` (36 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:59 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Make the cachefiles_write_prepare() function check that there's sufficient
space to fully satisfy a proposed write.

If we already checked for allocated data during read preparation, then this
fact can be used to skip the more thorough checks here.

If there's enough space in the cache, we just allow the write.

If we're uncertain, then we use SEEK_DATA/SEEK_HOLE to check if the block
is already fully allocated - and if it is, we just allow the write.

However, if there's insufficient space for the whole write and there's
partially allocated data in the file, we punch out that data and disallow
the write.  This frees up some space and removes old data from the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/io.c     |   82 ++++++++++++++++++++++++++++++++++++++++++++----
 fs/netfs/read_helper.c |    2 +
 include/linux/netfs.h  |    3 +-
 3 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index c05f64cdfd0e..350243b45dd5 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -9,6 +9,7 @@
 #include <linux/slab.h>
 #include <linux/file.h>
 #include <linux/uio.h>
+#include <linux/falloc.h>
 #include <linux/sched/mm.h>
 #include <trace/events/fscache.h>
 #include "internal.h"
@@ -385,8 +386,7 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 	goto out;
 
 download_and_store:
-	if (cachefiles_has_space(cache, 0, (subreq->len + PAGE_SIZE - 1) / PAGE_SIZE) == 0)
-		__set_bit(NETFS_SREQ_WRITE_TO_CACHE, &subreq->flags);
+	__set_bit(NETFS_SREQ_WRITE_TO_CACHE, &subreq->flags);
 out:
 	cachefiles_end_secure(cache, saved_cred);
 out_no_object:
@@ -397,17 +397,87 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 /*
  * Prepare for a write to occur.
  */
-static int cachefiles_prepare_write(struct netfs_cache_resources *cres,
-				    loff_t *_start, size_t *_len, loff_t i_size)
+static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
+				      loff_t *_start, size_t *_len, loff_t i_size,
+				      bool no_space_allocated_yet)
 {
-	loff_t start = *_start;
+	struct cachefiles_object *object = cachefiles_cres_object(cres);
+	struct cachefiles_cache *cache = object->volume->cache;
+	struct file *file = cachefiles_cres_file(cres);
+	loff_t start = *_start, pos;
 	size_t len = *_len, down;
+	int ret;
 
 	/* Round to DIO size */
 	down = start - round_down(start, PAGE_SIZE);
 	*_start = start - down;
 	*_len = round_up(down + len, PAGE_SIZE);
-	return 0;
+
+	/* We need to work out whether there's sufficient disk space to perform
+	 * the write - but we can skip that check if we have space already
+	 * allocated.
+	 */
+	if (no_space_allocated_yet)
+		goto check_space;
+
+	pos = vfs_llseek(file, *_start, SEEK_DATA);
+	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO) {
+		if (pos == -ENXIO)
+			goto check_space; /* Unallocated tail */
+		return pos;
+	}
+	if ((u64)pos >= (u64)*_start + *_len)
+		goto check_space; /* Unallocated region */
+
+	/* We have a block that's at least partially filled - if we're low on
+	 * space, we need to see if it's fully allocated.  If it's not, we may
+	 * want to cull it.
+	 */
+	if (cachefiles_has_space(cache, 0, *_len / PAGE_SIZE) == 0)
+		return 0; /* Enough space to simply overwrite the whole block */
+
+	pos = vfs_llseek(file, *_start, SEEK_HOLE);
+	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO)
+		return pos;
+	if ((u64)pos >= (u64)*_start + *_len)
+		return 0; /* Fully allocated */
+
+	/* Partially allocated, but insufficient space: cull. */
+	ret = vfs_fallocate(file, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+			    *_start, *_len);
+	if (ret < 0) {
+		cachefiles_io_error_obj(object,
+					"CacheFiles: fallocate failed (%d)\n", ret);
+		ret = -EIO;
+	}
+
+	return ret;
+
+check_space:
+	return cachefiles_has_space(cache, 0, *_len / PAGE_SIZE);
+}
+
+static int cachefiles_prepare_write(struct netfs_cache_resources *cres,
+				    loff_t *_start, size_t *_len, loff_t i_size,
+				    bool no_space_allocated_yet)
+{
+	struct cachefiles_object *object = cachefiles_cres_object(cres);
+	struct cachefiles_cache *cache = object->volume->cache;
+	const struct cred *saved_cred;
+	int ret;
+
+	if (!cachefiles_cres_file(cres)) {
+		if (!fscache_wait_for_operation(cres, FSCACHE_WANT_WRITE))
+			return -ENOBUFS;
+		if (!cachefiles_cres_file(cres))
+			return -ENOBUFS;
+	}
+
+	cachefiles_begin_secure(cache, &saved_cred);
+	ret = __cachefiles_prepare_write(cres, _start, _len, i_size,
+					 no_space_allocated_yet);
+	cachefiles_end_secure(cache, saved_cred);
+	return ret;
 }
 
 /*
diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c
index dfc60c79a9f3..80f8e334371d 100644
--- a/fs/netfs/read_helper.c
+++ b/fs/netfs/read_helper.c
@@ -323,7 +323,7 @@ static void netfs_rreq_do_write_to_cache(struct netfs_read_request *rreq)
 		}
 
 		ret = cres->ops->prepare_write(cres, &subreq->start, &subreq->len,
-					       rreq->i_size);
+					       rreq->i_size, true);
 		if (ret < 0) {
 			trace_netfs_failure(rreq, subreq, ret, netfs_fail_prepare_write);
 			trace_netfs_sreq(subreq, netfs_sreq_trace_write_skip);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 014fb502fd91..99137486d351 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -220,7 +220,8 @@ struct netfs_cache_ops {
 	 * actually do.
 	 */
 	int (*prepare_write)(struct netfs_cache_resources *cres,
-			     loff_t *_start, size_t *_len, loff_t i_size);
+			     loff_t *_start, size_t *_len, loff_t i_size,
+			     bool no_space_allocated_yet);
 
 	/* Prepare a write operation for the fallback fscache API, working out
 	 * whether we can cache a page or not.



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 35/67] fscache: Automatically close a file that's been unused for a while
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (33 preceding siblings ...)
  2021-10-18 14:59 ` [PATCH 34/67] cachefiles: Make cachefiles_write_prepare() check for space David Howells
@ 2021-10-18 14:59 ` David Howells
  2021-10-18 15:00 ` [PATCH 36/67] fscache: Add stats for the cookie commit LRU David Howells
                   ` (35 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 14:59 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

When a cache object becomes fully unused, schedule it to be committed and
closed after a few seconds if not re-used.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c            |  108 +++++++++++++++++++++++++++++++++++++++-
 fs/fscache/io.c                |    2 +
 include/linux/fscache.h        |    5 +-
 include/trace/events/fscache.h |    8 +++
 4 files changed, 120 insertions(+), 3 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 90a16e6d6917..dfc61b2e105d 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -15,6 +15,8 @@
 
 struct kmem_cache *fscache_cookie_jar;
 
+static void fscache_cookie_lru_timed_out(struct timer_list *timer);
+static void fscache_cookie_lru_worker(struct work_struct *work);
 static void fscache_cookie_worker(struct work_struct *work);
 static void fscache_drop_cookie(struct fscache_cookie *cookie);
 static void fscache_lookup_cookie(struct fscache_cookie *cookie);
@@ -24,7 +26,12 @@ static void fscache_invalidate_cookie(struct fscache_cookie *cookie);
 static struct hlist_bl_head fscache_cookie_hash[1 << fscache_cookie_hash_shift];
 static LIST_HEAD(fscache_cookies);
 static DEFINE_RWLOCK(fscache_cookies_lock);
-static const char fscache_cookie_stages[FSCACHE_COOKIE_STAGE__NR] = "-LCAIFWRD";
+static LIST_HEAD(fscache_cookie_lru);
+static DEFINE_SPINLOCK(fscache_cookie_lru_lock);
+static DEFINE_TIMER(fscache_cookie_lru_timer, fscache_cookie_lru_timed_out);
+static DECLARE_WORK(fscache_cookie_lru_work, fscache_cookie_lru_worker);
+static const char fscache_cookie_stages[FSCACHE_COOKIE_STAGE__NR] = "-LCAIFMWRD";
+unsigned int fscache_lru_cookie_timeout = 10 * HZ;
 
 void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 {
@@ -49,6 +56,11 @@ void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
 
 static void fscache_free_cookie(struct fscache_cookie *cookie)
 {
+	if (WARN_ON_ONCE(!list_empty(&cookie->commit_link))) {
+		spin_lock(&fscache_cookie_lru_lock);
+		list_del_init(&cookie->commit_link);
+		spin_unlock(&fscache_cookie_lru_lock);
+	}
 	write_lock(&fscache_cookies_lock);
 	list_del(&cookie->proc_link);
 	write_unlock(&fscache_cookies_lock);
@@ -265,6 +277,7 @@ static struct fscache_cookie *fscache_alloc_cookie(
 	cookie->debug_id = atomic_inc_return(&fscache_cookie_debug_id);
 	cookie->stage = FSCACHE_COOKIE_STAGE_QUIESCENT;
 	spin_lock_init(&cookie->lock);
+	INIT_LIST_HEAD(&cookie->commit_link);
 	INIT_WORK(&cookie->work, fscache_cookie_worker);
 
 	write_lock(&fscache_cookies_lock);
@@ -448,6 +461,7 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 
 	atomic_inc(&cookie->n_active);
 
+again:
 	stage = cookie->stage;
 	switch (stage) {
 	case FSCACHE_COOKIE_STAGE_QUIESCENT:
@@ -474,6 +488,13 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 	case FSCACHE_COOKIE_STAGE_WITHDRAWING:
 		break;
 
+	case FSCACHE_COOKIE_STAGE_COMMITTING:
+		spin_unlock(&cookie->lock);
+		wait_var_event(&cookie->stage,
+			       READ_ONCE(cookie->stage) != FSCACHE_COOKIE_STAGE_COMMITTING);
+		spin_lock(&cookie->lock);
+		goto again;
+
 	case FSCACHE_COOKIE_STAGE_DROPPED:
 	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
 		WARN(1, "Can't use cookie in stage %u\n", cookie->stage);
@@ -497,7 +518,19 @@ void __fscache_unuse_cookie(struct fscache_cookie *cookie,
 {
 	if (aux_data || object_size)
 		__fscache_update_cookie(cookie, aux_data, object_size);
-	atomic_dec(&cookie->n_active);
+	cookie->unused_at = jiffies;
+	if (atomic_dec_return(&cookie->n_active) == 0) {
+		if (test_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags)) {
+			spin_lock(&fscache_cookie_lru_lock);
+			if (list_empty(&cookie->commit_link)) {
+				fscache_get_cookie(cookie, fscache_cookie_get_lru);
+				list_move_tail(&cookie->commit_link, &fscache_cookie_lru);
+			}
+			spin_unlock(&fscache_cookie_lru_lock);
+			timer_reduce(&fscache_cookie_lru_timer,
+				     jiffies + fscache_lru_cookie_timeout);
+		}
+	}
 }
 EXPORT_SYMBOL(__fscache_unuse_cookie);
 
@@ -532,6 +565,7 @@ static void __fscache_cookie_worker(struct fscache_cookie *cookie)
 	case FSCACHE_COOKIE_STAGE_FAILED:
 		break;
 
+	case FSCACHE_COOKIE_STAGE_COMMITTING:
 	case FSCACHE_COOKIE_STAGE_RELINQUISHING:
 	case FSCACHE_COOKIE_STAGE_WITHDRAWING:
 		if (test_and_clear_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags) &&
@@ -541,6 +575,8 @@ static void __fscache_cookie_worker(struct fscache_cookie *cookie)
 			fscache_see_cookie(cookie, fscache_cookie_see_relinquish);
 			fscache_drop_cookie(cookie);
 			break;
+		} else if (cookie->stage == FSCACHE_COOKIE_STAGE_COMMITTING) {
+			fscache_see_cookie(cookie, fscache_cookie_see_committing);
 		} else {
 			fscache_see_cookie(cookie, fscache_cookie_see_withdraw);
 		}
@@ -550,6 +586,7 @@ static void __fscache_cookie_worker(struct fscache_cookie *cookie)
 	case FSCACHE_COOKIE_STAGE_DROPPED:
 		clear_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
 		clear_bit(FSCACHE_COOKIE_DO_WITHDRAW, &cookie->flags);
+		clear_bit(FSCACHE_COOKIE_DO_COMMIT, &cookie->flags);
 		set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
 		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_QUIESCENT);
 		break;
@@ -578,6 +615,72 @@ static void __fscache_withdraw_cookie(struct fscache_cookie *cookie)
 		__fscache_end_cookie_access(cookie);
 }
 
+static void fscache_cookie_lru_do_one(struct fscache_cookie *cookie)
+{
+	fscache_see_cookie(cookie, fscache_cookie_see_lru_do_one);
+
+	spin_lock(&cookie->lock);
+	if (cookie->stage != FSCACHE_COOKIE_STAGE_ACTIVE ||
+	    time_before(jiffies, cookie->unused_at + fscache_lru_cookie_timeout) ||
+	    atomic_read(&cookie->n_active) > 0) {
+		spin_unlock(&cookie->lock);
+	} else {
+		__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_COMMITTING);
+		set_bit(FSCACHE_COOKIE_DO_COMMIT, &cookie->flags);
+		spin_unlock(&cookie->lock);
+		_debug("lru c=%x", cookie->debug_id);
+		__fscache_withdraw_cookie(cookie);
+	}
+
+	fscache_put_cookie(cookie, fscache_cookie_put_lru);
+}
+
+static void fscache_cookie_lru_worker(struct work_struct *work)
+{
+	struct fscache_cookie *cookie;
+	unsigned long unused_at;
+
+	spin_lock(&fscache_cookie_lru_lock);
+
+	while (!list_empty(&fscache_cookie_lru)) {
+		cookie = list_first_entry(&fscache_cookie_lru,
+					  struct fscache_cookie, commit_link);
+		unused_at = cookie->unused_at + fscache_lru_cookie_timeout;
+		if (time_before(jiffies, unused_at)) {
+			timer_reduce(&fscache_cookie_lru_timer, unused_at);
+			break;
+		}
+
+		list_del_init(&cookie->commit_link);
+		spin_unlock(&fscache_cookie_lru_lock);
+		fscache_cookie_lru_do_one(cookie);
+		spin_lock(&fscache_cookie_lru_lock);
+	}
+
+	spin_unlock(&fscache_cookie_lru_lock);
+}
+
+static void fscache_cookie_lru_timed_out(struct timer_list *timer)
+{
+	queue_work(fscache_wq, &fscache_cookie_lru_work);
+}
+
+static void fscache_cookie_drop_from_lru(struct fscache_cookie *cookie)
+{
+	bool need_put = false;
+
+	if (!list_empty(&cookie->commit_link)) {
+		spin_lock(&fscache_cookie_lru_lock);
+		if (!list_empty(&cookie->commit_link)) {
+			list_del_init(&cookie->commit_link);
+			need_put = true;
+		}
+		spin_unlock(&fscache_cookie_lru_lock);
+		if (need_put)
+			fscache_put_cookie(cookie, fscache_cookie_put_lru);
+	}
+}
+
 /*
  * Ask the cache to effect invalidation of a cookie.
  */
@@ -687,6 +790,7 @@ static void fscache_drop_cookie(struct fscache_cookie *cookie)
 
 static void fscache_drop_withdraw_cookie(struct fscache_cookie *cookie)
 {
+	fscache_cookie_drop_from_lru(cookie);
 	__fscache_withdraw_cookie(cookie);
 }
 
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 2b1c9f433530..ef4b0606019d 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -43,6 +43,7 @@ bool fscache_wait_for_operation(struct netfs_cache_resources *cres,
 			goto ready; /* There can be no content */
 		fallthrough;
 	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+	case FSCACHE_COOKIE_STAGE_COMMITTING:
 		wait_var_event(&cookie->stage, READ_ONCE(cookie->stage) != stage);
 		goto again;
 
@@ -92,6 +93,7 @@ static int fscache_begin_operation(struct netfs_cache_resources *cres,
 
 	switch (stage) {
 	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
+	case FSCACHE_COOKIE_STAGE_COMMITTING:
 		goto wait_and_validate;
 	case FSCACHE_COOKIE_STAGE_INVALIDATING:
 	case FSCACHE_COOKIE_STAGE_CREATING:
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index aeee14f5663a..c6ee09661351 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -59,6 +59,7 @@ enum fscache_cookie_stage {
 	FSCACHE_COOKIE_STAGE_ACTIVE,		/* The cache is active, readable and writable */
 	FSCACHE_COOKIE_STAGE_INVALIDATING,	/* The cache is being invalidated */
 	FSCACHE_COOKIE_STAGE_FAILED,		/* The cache failed, withdraw to clear */
+	FSCACHE_COOKIE_STAGE_COMMITTING,	/* The cookie is being committed */
 	FSCACHE_COOKIE_STAGE_WITHDRAWING,	/* The cookie is being withdrawn */
 	FSCACHE_COOKIE_STAGE_RELINQUISHING,	/* The cookie is being relinquished */
 	FSCACHE_COOKIE_STAGE_DROPPED,		/* The cookie has been dropped */
@@ -108,9 +109,10 @@ struct fscache_cookie {
 	void				*cache_priv;	/* Cache-side representation */
 	struct hlist_bl_node		hash_link;	/* Link in hash table */
 	struct list_head		proc_link;	/* Link in proc list */
+	struct list_head		commit_link;	/* Link in commit queue */
 	struct work_struct		work;		/* Commit/relinq/withdraw work */
 	loff_t				object_size;	/* Size of the netfs object */
-
+	unsigned long			unused_at;	/* Time at which unused (jiffies) */
 	unsigned long			flags;
 #define FSCACHE_COOKIE_RELINQUISHED	0		/* T if cookie has been relinquished */
 #define FSCACHE_COOKIE_RETIRED		1		/* T if this cookie has retired on relinq */
@@ -121,6 +123,7 @@ struct fscache_cookie {
 #define FSCACHE_COOKIE_NACC_ELEVATED	8		/* T if n_accesses is incremented */
 #define FSCACHE_COOKIE_DO_RELINQUISH	9		/* T if this cookie needs relinquishment */
 #define FSCACHE_COOKIE_DO_WITHDRAW	10		/* T if this cookie needs withdrawing */
+#define FSCACHE_COOKIE_DO_COMMIT	11		/* T if this cookie needs committing */
 
 	enum fscache_cookie_stage	stage;
 	u8				advice;		/* FSCACHE_ADV_* */
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 0d9789745a91..50f28a2a4ca8 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -52,16 +52,20 @@ enum fscache_cookie_trace {
 	fscache_cookie_get_end_access,
 	fscache_cookie_get_hash_collision,
 	fscache_cookie_get_inval_work,
+	fscache_cookie_get_lru,
 	fscache_cookie_get_use_work,
 	fscache_cookie_get_withdraw,
 	fscache_cookie_new_acquire,
 	fscache_cookie_put_hash_collision,
+	fscache_cookie_put_lru,
 	fscache_cookie_put_object,
 	fscache_cookie_put_over_queued,
 	fscache_cookie_put_relinquish,
 	fscache_cookie_put_withdrawn,
 	fscache_cookie_put_work,
 	fscache_cookie_see_active,
+	fscache_cookie_see_lru_do_one,
+	fscache_cookie_see_committing,
 	fscache_cookie_see_relinquish,
 	fscache_cookie_see_withdraw,
 	fscache_cookie_see_work,
@@ -127,16 +131,20 @@ enum fscache_access_trace {
 	EM(fscache_cookie_get_hash_collision,	"GET hcoll")		\
 	EM(fscache_cookie_get_end_access,	"GQ  endac")		\
 	EM(fscache_cookie_get_inval_work,	"GQ  inval")		\
+	EM(fscache_cookie_get_lru,		"GET lru  ")		\
 	EM(fscache_cookie_get_use_work,		"GQ  use  ")		\
 	EM(fscache_cookie_get_withdraw,		"GQ  wthdr")		\
 	EM(fscache_cookie_new_acquire,		"NEW acq  ")		\
 	EM(fscache_cookie_put_hash_collision,	"PUT hcoll")		\
+	EM(fscache_cookie_put_lru,		"PUT lru  ")		\
 	EM(fscache_cookie_put_object,		"PUT obj  ")		\
 	EM(fscache_cookie_put_over_queued,	"PQ  overq")		\
 	EM(fscache_cookie_put_relinquish,	"PUT relnq")		\
 	EM(fscache_cookie_put_withdrawn,	"PUT wthdn")		\
 	EM(fscache_cookie_put_work,		"PQ  work ")		\
 	EM(fscache_cookie_see_active,		"-   active")		\
+	EM(fscache_cookie_see_lru_do_one,	"-   lrudo")		\
+	EM(fscache_cookie_see_committing,	"-   x-cmt")		\
 	EM(fscache_cookie_see_relinquish,	"-   x-rlq")		\
 	EM(fscache_cookie_see_withdraw,		"-   x-wth")		\
 	E_(fscache_cookie_see_work,		"-   work ")



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 36/67] fscache: Add stats for the cookie commit LRU
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (34 preceding siblings ...)
  2021-10-18 14:59 ` [PATCH 35/67] fscache: Automatically close a file that's been unused for a while David Howells
@ 2021-10-18 15:00 ` David Howells
  2021-10-18 15:00 ` [PATCH 37/67] fscache: Move fscache_update_cookie() complete inline David Howells
                   ` (34 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:00 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add some stats to indicate the state of the cookie commit LRU, including an
indication of how many are currently on it, how many have been expired,
removed (withdrawn/reused) or dropped (relinquished) from it and how long
till the next reap happens.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c   |   10 +++++++++-
 fs/fscache/internal.h |    5 +++++
 fs/fscache/stats.c    |   12 ++++++++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index dfc61b2e105d..c6b553609f33 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -28,7 +28,7 @@ static LIST_HEAD(fscache_cookies);
 static DEFINE_RWLOCK(fscache_cookies_lock);
 static LIST_HEAD(fscache_cookie_lru);
 static DEFINE_SPINLOCK(fscache_cookie_lru_lock);
-static DEFINE_TIMER(fscache_cookie_lru_timer, fscache_cookie_lru_timed_out);
+DEFINE_TIMER(fscache_cookie_lru_timer, fscache_cookie_lru_timed_out);
 static DECLARE_WORK(fscache_cookie_lru_work, fscache_cookie_lru_worker);
 static const char fscache_cookie_stages[FSCACHE_COOKIE_STAGE__NR] = "-LCAIFMWRD";
 unsigned int fscache_lru_cookie_timeout = 10 * HZ;
@@ -60,6 +60,8 @@ static void fscache_free_cookie(struct fscache_cookie *cookie)
 		spin_lock(&fscache_cookie_lru_lock);
 		list_del_init(&cookie->commit_link);
 		spin_unlock(&fscache_cookie_lru_lock);
+		fscache_stat_d(&fscache_n_cookies_lru);
+		fscache_stat(&fscache_n_cookies_lru_removed);
 	}
 	write_lock(&fscache_cookies_lock);
 	list_del(&cookie->proc_link);
@@ -525,6 +527,7 @@ void __fscache_unuse_cookie(struct fscache_cookie *cookie,
 			if (list_empty(&cookie->commit_link)) {
 				fscache_get_cookie(cookie, fscache_cookie_get_lru);
 				list_move_tail(&cookie->commit_link, &fscache_cookie_lru);
+				fscache_stat(&fscache_n_cookies_lru);
 			}
 			spin_unlock(&fscache_cookie_lru_lock);
 			timer_reduce(&fscache_cookie_lru_timer,
@@ -624,10 +627,12 @@ static void fscache_cookie_lru_do_one(struct fscache_cookie *cookie)
 	    time_before(jiffies, cookie->unused_at + fscache_lru_cookie_timeout) ||
 	    atomic_read(&cookie->n_active) > 0) {
 		spin_unlock(&cookie->lock);
+		fscache_stat(&fscache_n_cookies_lru_removed);
 	} else {
 		__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_COMMITTING);
 		set_bit(FSCACHE_COOKIE_DO_COMMIT, &cookie->flags);
 		spin_unlock(&cookie->lock);
+		fscache_stat(&fscache_n_cookies_lru_expired);
 		_debug("lru c=%x", cookie->debug_id);
 		__fscache_withdraw_cookie(cookie);
 	}
@@ -652,6 +657,7 @@ static void fscache_cookie_lru_worker(struct work_struct *work)
 		}
 
 		list_del_init(&cookie->commit_link);
+		fscache_stat_d(&fscache_n_cookies_lru);
 		spin_unlock(&fscache_cookie_lru_lock);
 		fscache_cookie_lru_do_one(cookie);
 		spin_lock(&fscache_cookie_lru_lock);
@@ -673,6 +679,8 @@ static void fscache_cookie_drop_from_lru(struct fscache_cookie *cookie)
 		spin_lock(&fscache_cookie_lru_lock);
 		if (!list_empty(&cookie->commit_link)) {
 			list_del_init(&cookie->commit_link);
+			fscache_stat_d(&fscache_n_cookies_lru);
+			fscache_stat(&fscache_n_cookies_lru_dropped);
 			need_put = true;
 		}
 		spin_unlock(&fscache_cookie_lru_lock);
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index f74f7bdea633..62e6a5bbef8e 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -32,6 +32,7 @@ struct fscache_cache *fscache_lookup_cache(const char *name, bool is_cache);
  */
 extern struct kmem_cache *fscache_cookie_jar;
 extern const struct seq_operations fscache_cookies_seq_ops;
+extern struct timer_list fscache_cookie_lru_timer;
 
 extern void fscache_print_cookie(struct fscache_cookie *cookie, char prefix);
 extern bool fscache_begin_cookie_access(struct fscache_cookie *cookie,
@@ -70,6 +71,10 @@ extern atomic_t fscache_n_volumes;
 extern atomic_t fscache_n_volumes_collision;
 extern atomic_t fscache_n_volumes_nomem;
 extern atomic_t fscache_n_cookies;
+extern atomic_t fscache_n_cookies_lru;
+extern atomic_t fscache_n_cookies_lru_expired;
+extern atomic_t fscache_n_cookies_lru_removed;
+extern atomic_t fscache_n_cookies_lru_dropped;
 
 extern atomic_t fscache_n_retrievals;
 extern atomic_t fscache_n_retrievals_ok;
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 13e90b940bd2..5700e5712018 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -18,6 +18,10 @@ atomic_t fscache_n_volumes;
 atomic_t fscache_n_volumes_collision;
 atomic_t fscache_n_volumes_nomem;
 atomic_t fscache_n_cookies;
+atomic_t fscache_n_cookies_lru;
+atomic_t fscache_n_cookies_lru_expired;
+atomic_t fscache_n_cookies_lru_removed;
+atomic_t fscache_n_cookies_lru_dropped;
 
 atomic_t fscache_n_retrievals;
 atomic_t fscache_n_retrievals_ok;
@@ -89,6 +93,14 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_acquires_nobufs),
 		   atomic_read(&fscache_n_acquires_oom));
 
+	seq_printf(m, "LRU    : n=%u exp=%u rmv=%u drp=%u at=%ld\n",
+		   atomic_read(&fscache_n_cookies_lru),
+		   atomic_read(&fscache_n_cookies_lru_expired),
+		   atomic_read(&fscache_n_cookies_lru_removed),
+		   atomic_read(&fscache_n_cookies_lru_dropped),
+		   timer_pending(&fscache_cookie_lru_timer) ?
+		   fscache_cookie_lru_timer.expires - jiffies : 0);
+
 	seq_printf(m, "Invals : n=%u run=%u\n",
 		   atomic_read(&fscache_n_invalidates),
 		   atomic_read(&fscache_n_invalidates_run));



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 37/67] fscache: Move fscache_update_cookie() complete inline
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (35 preceding siblings ...)
  2021-10-18 15:00 ` [PATCH 36/67] fscache: Add stats for the cookie commit LRU David Howells
@ 2021-10-18 15:00 ` David Howells
  2021-10-18 15:00 ` [PATCH 38/67] fscache: Remove more obsolete stats David Howells
                   ` (33 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:00 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel


---

 fs/fscache/cookie.c           |   18 -----------------
 fs/fscache/internal.h         |   18 -----------------
 fs/fscache/stats.c            |    9 +++------
 include/linux/fscache-cache.h |   11 ----------
 include/linux/fscache.h       |   43 ++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 45 insertions(+), 54 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index c6b553609f33..94976f90dc71 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -748,24 +748,6 @@ void __fscache_invalidate(struct fscache_cookie *cookie)
 }
 EXPORT_SYMBOL(__fscache_invalidate);
 
-/*
- * Update the index entries backing a cookie.  The writeback is done lazily.
- */
-void __fscache_update_cookie(struct fscache_cookie *cookie,
-			     const void *aux_data, const loff_t *object_size)
-{
-	fscache_stat(&fscache_n_updates);
-
-	spin_lock(&cookie->lock);
-
-	fscache_update_aux(cookie, aux_data, object_size);
-	set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
-
-	spin_unlock(&cookie->lock);
-	_leave("");
-}
-EXPORT_SYMBOL(__fscache_update_cookie);
-
 /*
  * Remove a cookie from the hash table.
  */
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 62e6a5bbef8e..1cb1effa7cba 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -107,10 +107,6 @@ extern atomic_t fscache_n_acquires_oom;
 extern atomic_t fscache_n_invalidates;
 extern atomic_t fscache_n_invalidates_run;
 
-extern atomic_t fscache_n_updates;
-extern atomic_t fscache_n_updates_null;
-extern atomic_t fscache_n_updates_run;
-
 extern atomic_t fscache_n_relinquishes;
 extern atomic_t fscache_n_relinquishes_null;
 extern atomic_t fscache_n_relinquishes_retire;
@@ -152,20 +148,6 @@ bool fscache_begin_volume_access(struct fscache_volume *volume,
 				 enum fscache_access_trace why);
 void fscache_create_volume(struct fscache_volume *volume, bool wait);
 
-/*
- * Update the auxiliary data on a cookie.
- */
-static inline
-void fscache_update_aux(struct fscache_cookie *cookie,
-			const void *aux_data, const loff_t *object_size)
-{
-	void *p = fscache_get_aux(cookie);
-
-	if (aux_data && p)
-		memcpy(p, aux_data, cookie->aux_len);
-	if (object_size)
-		cookie->object_size = *object_size;
-}
 
 /*****************************************************************************/
 /*
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 5700e5712018..a16473df8be0 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -55,8 +55,7 @@ atomic_t fscache_n_invalidates;
 atomic_t fscache_n_invalidates_run;
 
 atomic_t fscache_n_updates;
-atomic_t fscache_n_updates_null;
-atomic_t fscache_n_updates_run;
+EXPORT_SYMBOL(fscache_n_updates);
 
 atomic_t fscache_n_relinquishes;
 atomic_t fscache_n_relinquishes_null;
@@ -105,10 +104,8 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_invalidates),
 		   atomic_read(&fscache_n_invalidates_run));
 
-	seq_printf(m, "Updates: n=%u nul=%u run=%u\n",
-		   atomic_read(&fscache_n_updates),
-		   atomic_read(&fscache_n_updates_null),
-		   atomic_read(&fscache_n_updates_run));
+	seq_printf(m, "Updates: n=%u\n",
+		   atomic_read(&fscache_n_updates));
 
 	seq_printf(m, "Relinqs: n=%u rtr=%u drop=%u\n",
 		   atomic_read(&fscache_n_relinquishes),
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 657e54b4cd90..bf0d3e862915 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -139,17 +139,6 @@ static inline void *fscache_get_key(struct fscache_cookie *cookie)
 		return cookie->key;
 }
 
-/*
- * Find the auxiliary data on a cookie.
- */
-static inline void *fscache_get_aux(struct fscache_cookie *cookie)
-{
-	if (cookie->aux_len <= sizeof(cookie->inline_aux))
-		return cookie->inline_aux;
-	else
-		return cookie->aux;
-}
-
 /**
  * fscache_cookie_lookup_negative - Note negative lookup
  * @cookie: The cookie that was being looked up
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index c6ee09661351..41e579ff65ee 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -159,7 +159,6 @@ extern struct fscache_cookie *__fscache_acquire_cookie(
 extern void __fscache_use_cookie(struct fscache_cookie *, bool);
 extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool);
-extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_invalidate(struct fscache_cookie *);
 #ifdef FSCACHE_USE_NEW_IO_API
 extern int __fscache_begin_read_operation(struct netfs_cache_resources *, struct fscache_cookie *);
@@ -293,6 +292,48 @@ void fscache_relinquish_cookie(struct fscache_cookie *cookie, bool retire)
 		__fscache_relinquish_cookie(cookie, retire);
 }
 
+/*
+ * Find the auxiliary data on a cookie.
+ */
+static inline void *fscache_get_aux(struct fscache_cookie *cookie)
+{
+	if (cookie->aux_len <= sizeof(cookie->inline_aux))
+		return cookie->inline_aux;
+	else
+		return cookie->aux;
+}
+
+/*
+ * Update the auxiliary data on a cookie.
+ */
+static inline
+void fscache_update_aux(struct fscache_cookie *cookie,
+			const void *aux_data, const loff_t *object_size)
+{
+	void *p = fscache_get_aux(cookie);
+
+	if (aux_data && p)
+		memcpy(p, aux_data, cookie->aux_len);
+	if (object_size)
+		cookie->object_size = *object_size;
+}
+
+#ifdef CONFIG_FSCACHE_STATS
+extern atomic_t fscache_n_updates;
+#endif
+
+static inline
+void __fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data,
+			     const loff_t *object_size)
+{
+#ifdef CONFIG_FSCACHE_STATS
+	atomic_inc(&fscache_n_updates);
+#endif
+	fscache_update_aux(cookie, aux_data, object_size);
+	smp_wmb();
+	set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
+}
+
 /**
  * fscache_update_cookie - Request that a cache object be updated
  * @cookie: The cookie representing the cache object



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 38/67] fscache: Remove more obsolete stats
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (36 preceding siblings ...)
  2021-10-18 15:00 ` [PATCH 37/67] fscache: Move fscache_update_cookie() complete inline David Howells
@ 2021-10-18 15:00 ` David Howells
  2021-10-18 15:00 ` [PATCH 39/67] fscache: Note the object size during invalidation David Howells
                   ` (32 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:00 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Remove some more stats that have become obsolete.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/internal.h |    3 ---
 fs/fscache/stats.c    |   13 +++----------
 2 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 1cb1effa7cba..65fb4e88f655 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -101,14 +101,11 @@ extern atomic_t fscache_n_acquires;
 extern atomic_t fscache_n_acquires_null;
 extern atomic_t fscache_n_acquires_no_cache;
 extern atomic_t fscache_n_acquires_ok;
-extern atomic_t fscache_n_acquires_nobufs;
 extern atomic_t fscache_n_acquires_oom;
 
 extern atomic_t fscache_n_invalidates;
-extern atomic_t fscache_n_invalidates_run;
 
 extern atomic_t fscache_n_relinquishes;
-extern atomic_t fscache_n_relinquishes_null;
 extern atomic_t fscache_n_relinquishes_retire;
 extern atomic_t fscache_n_relinquishes_dropped;
 
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index a16473df8be0..44b3b901e191 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -6,7 +6,6 @@
  */
 
 #define FSCACHE_DEBUG_LEVEL CACHE
-#include <linux/module.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include "internal.h"
@@ -48,17 +47,14 @@ atomic_t fscache_n_acquires;
 atomic_t fscache_n_acquires_null;
 atomic_t fscache_n_acquires_no_cache;
 atomic_t fscache_n_acquires_ok;
-atomic_t fscache_n_acquires_nobufs;
 atomic_t fscache_n_acquires_oom;
 
 atomic_t fscache_n_invalidates;
-atomic_t fscache_n_invalidates_run;
 
 atomic_t fscache_n_updates;
 EXPORT_SYMBOL(fscache_n_updates);
 
 atomic_t fscache_n_relinquishes;
-atomic_t fscache_n_relinquishes_null;
 atomic_t fscache_n_relinquishes_retire;
 atomic_t fscache_n_relinquishes_dropped;
 
@@ -83,13 +79,11 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_volumes_nomem)
 		   );
 
-	seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u nbf=%u"
-		   " oom=%u\n",
+	seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u oom=%u\n",
 		   atomic_read(&fscache_n_acquires),
 		   atomic_read(&fscache_n_acquires_null),
 		   atomic_read(&fscache_n_acquires_no_cache),
 		   atomic_read(&fscache_n_acquires_ok),
-		   atomic_read(&fscache_n_acquires_nobufs),
 		   atomic_read(&fscache_n_acquires_oom));
 
 	seq_printf(m, "LRU    : n=%u exp=%u rmv=%u drp=%u at=%ld\n",
@@ -100,9 +94,8 @@ int fscache_stats_show(struct seq_file *m, void *v)
 		   timer_pending(&fscache_cookie_lru_timer) ?
 		   fscache_cookie_lru_timer.expires - jiffies : 0);
 
-	seq_printf(m, "Invals : n=%u run=%u\n",
-		   atomic_read(&fscache_n_invalidates),
-		   atomic_read(&fscache_n_invalidates_run));
+	seq_printf(m, "Invals : n=%u\n",
+		   atomic_read(&fscache_n_invalidates));
 
 	seq_printf(m, "Updates: n=%u\n",
 		   atomic_read(&fscache_n_updates));



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 39/67] fscache: Note the object size during invalidation
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (37 preceding siblings ...)
  2021-10-18 15:00 ` [PATCH 38/67] fscache: Remove more obsolete stats David Howells
@ 2021-10-18 15:00 ` David Howells
  2021-10-18 15:00 ` [PATCH 40/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
                   ` (31 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:00 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Note the size of the cache object during invalidation.  For AFS, we set
this to the file size when invalidating a file due to third-party
interference.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/inode.c          |    2 +-
 fs/fscache/cookie.c     |    3 ++-
 include/linux/fscache.h |    7 ++++---
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index c2f4afff9837..c21c1352b149 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -565,7 +565,7 @@ static void afs_zap_data(struct afs_vnode *vnode)
 	_enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
 
 #ifdef CONFIG_AFS_FSCACHE
-	fscache_invalidate(vnode->cache);
+	fscache_invalidate(vnode->cache, i_size_read(&vnode->vfs_inode));
 #endif
 
 	/* nuke all the non-dirty pages that aren't locked, mapped or being
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 94976f90dc71..6b49c2321256 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -704,7 +704,7 @@ static void fscache_invalidate_cookie(struct fscache_cookie *cookie)
 /*
  * Invalidate an object.  Callable with spinlocks held.
  */
-void __fscache_invalidate(struct fscache_cookie *cookie)
+void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size)
 {
 	bool is_caching;
 
@@ -718,6 +718,7 @@ void __fscache_invalidate(struct fscache_cookie *cookie)
 
 	spin_lock(&cookie->lock);
 	set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
+	cookie->object_size = new_size;
 	cookie->inval_counter++;
 
 	switch (cookie->stage) {
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 41e579ff65ee..fa7eef2674bf 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -159,7 +159,7 @@ extern struct fscache_cookie *__fscache_acquire_cookie(
 extern void __fscache_use_cookie(struct fscache_cookie *, bool);
 extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool);
-extern void __fscache_invalidate(struct fscache_cookie *);
+extern void __fscache_invalidate(struct fscache_cookie *, loff_t);
 #ifdef FSCACHE_USE_NEW_IO_API
 extern int __fscache_begin_read_operation(struct netfs_cache_resources *, struct fscache_cookie *);
 #endif
@@ -388,6 +388,7 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie)
 /**
  * fscache_invalidate - Notify cache that an object needs invalidation
  * @cookie: The cookie representing the cache object
+ * @size: The revised size of the object.
  *
  * Notify the cache that an object is needs to be invalidated and that it
  * should abort any retrievals or stores it is doing on the cache.  The object
@@ -399,10 +400,10 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie)
  * description.
  */
 static inline
-void fscache_invalidate(struct fscache_cookie *cookie)
+void fscache_invalidate(struct fscache_cookie *cookie, loff_t size)
 {
 	if (fscache_cookie_valid(cookie))
-		__fscache_invalidate(cookie);
+		__fscache_invalidate(cookie, size);
 }
 
 /**



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 40/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (38 preceding siblings ...)
  2021-10-18 15:00 ` [PATCH 39/67] fscache: Note the object size during invalidation David Howells
@ 2021-10-18 15:00 ` David Howells
  2021-10-18 15:00 ` [PATCH 41/67] afs: Render cache cookie key as big endian David Howells
                   ` (30 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:00 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Use an inode flag, I_PINNING_FSCACHE_WB, to indicate that a cookie is
pinned in use by that inode for the purposes of writeback.

Pinning is necessary because the in-use pin from the open file is released
before the writeback takes place, but if the resources aren't pinned, the
dirty data can't be written to the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/io.c         |   38 ++++++++++++++++++++++++++++++++++++++
 include/linux/fscache.h |   41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index ef4b0606019d..25976413fe34 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -214,3 +214,41 @@ int __fscache_fallback_write_page(struct fscache_cookie *cookie, struct page *pa
 	return ret;
 }
 EXPORT_SYMBOL(__fscache_fallback_write_page);
+
+/**
+ * fscache_set_page_dirty - Mark page dirty and pin a cache object for writeback
+ * @page: The page being dirtied
+ * @cookie: The cookie referring to the cache object
+ *
+ * Set the dirty flag on a page and pin an in-use cache object in memory when
+ * dirtying a page so that writeback can later write to it.  This is intended
+ * to be called from the filesystem's ->set_page_dirty() method.
+ *
+ *  Returns 1 if PG_dirty was set on the page, 0 otherwise.
+ */
+int fscache_set_page_dirty(struct page *page, struct fscache_cookie *cookie)
+{
+	struct inode *inode = page->mapping->host;
+	bool need_use = false;
+
+	_enter("");
+
+	if (!__set_page_dirty_nobuffers(page))
+		return 0;
+	if (!fscache_cookie_valid(cookie))
+		return 1;
+
+	if (!(inode->i_state & I_PINNING_FSCACHE_WB)) {
+		spin_lock(&inode->i_lock);
+		if (!(inode->i_state & I_PINNING_FSCACHE_WB)) {
+			inode->i_state |= I_PINNING_FSCACHE_WB;
+			need_use = true;
+		}
+		spin_unlock(&inode->i_lock);
+
+		if (need_use)
+			fscache_use_cookie(cookie, true);
+	}
+	return 1;
+}
+EXPORT_SYMBOL(fscache_set_page_dirty);
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index fa7eef2674bf..abf5413c3151 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -512,6 +512,47 @@ int fscache_write(struct netfs_cache_resources *cres,
 
 #endif /* FSCACHE_USE_NEW_IO_API */
 
+#if __fscache_available
+extern int fscache_set_page_dirty(struct page *page, struct fscache_cookie *cookie);
+#else
+#define fscache_set_page_dirty(PAGE, COOKIE) (__set_page_dirty_nobuffers((PAGE)))
+#endif
+
+/**
+ * fscache_unpin_writeback - Unpin writeback resources
+ * @wbc: The writeback control
+ * @cookie: The cookie referring to the cache object
+ *
+ * Unpin the writeback resources pinned by fscache_set_page_dirty().  This is
+ * intended to be called by the netfs's ->write_inode() method.
+ */
+static inline void fscache_unpin_writeback(struct writeback_control *wbc,
+					   struct fscache_cookie *cookie)
+{
+	if (wbc->unpinned_fscache_wb)
+		fscache_unuse_cookie(cookie, NULL, NULL);
+}
+
+/**
+ * fscache_clear_inode_writeback - Clear writeback resources pinned by an inode
+ * @cookie: The cookie referring to the cache object
+ * @inode: The inode to clean up
+ * @aux: Auxiliary data to apply to the inode
+ *
+ * Clear any writeback resources held by an inode when the inode is evicted.
+ * This must be called before clear_inode() is called.
+ */
+static inline void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
+						 struct inode *inode,
+						 const void *aux)
+{
+	if (inode->i_state & I_PINNING_FSCACHE_WB) {
+		loff_t i_size = i_size_read(inode);
+		fscache_unuse_cookie(cookie, aux, &i_size);
+	}
+
+}
+
 #ifdef FSCACHE_USE_FALLBACK_IO_API
 
 /**



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 41/67] afs: Render cache cookie key as big endian
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (39 preceding siblings ...)
  2021-10-18 15:00 ` [PATCH 40/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
@ 2021-10-18 15:00 ` David Howells
  2021-10-18 15:01 ` [PATCH 42/67] cachefiles: Use tmpfile/link David Howells
                   ` (29 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:00 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Marc Dionne, linux-afs, dhowells, Trond Myklebust,
	Anna Schumaker, Steve French, Dominique Martinet, Jeff Layton,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-nfs, linux-cifs, ceph-devel, v9fs-developer,
	linux-fsdevel, linux-kernel

Render the parts of the cookie key for an afs inode cookie as big endian.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
---

 fs/afs/file.c     |    2 +-
 fs/afs/inode.c    |   16 ++++++++--------
 fs/afs/internal.h |    8 +++++++-
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index b4666da93b54..5e29d433960d 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -189,7 +189,7 @@ int afs_release(struct inode *inode, struct file *file)
 
 	if ((file->f_mode & FMODE_WRITE)) {
 		i_size = i_size_read(&vnode->vfs_inode);
-		aux.data_version = vnode->status.data_version;
+		afs_set_cache_aux(vnode, &aux);
 		fscache_unuse_cookie(afs_vnode_cache(vnode), &aux, &i_size);
 	} else {
 		fscache_unuse_cookie(afs_vnode_cache(vnode), NULL, NULL);
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index c21c1352b149..842570e4470f 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -413,9 +413,9 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
 {
 #ifdef CONFIG_AFS_FSCACHE
 	struct {
-		u32 vnode_id;
-		u32 unique;
-		u32 vnode_id_ext[2];	/* Allow for a 96-bit key */
+		__be32 vnode_id;
+		__be32 unique;
+		__be32 vnode_id_ext[2];	/* Allow for a 96-bit key */
 	} __packed key;
 	struct afs_vnode_cache_aux aux;
 
@@ -424,11 +424,11 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
 		return;
 	}
 
-	key.vnode_id		= vnode->fid.vnode;
-	key.unique		= vnode->fid.unique;
-	key.vnode_id_ext[0]	= vnode->fid.vnode >> 32;
-	key.vnode_id_ext[1]	= vnode->fid.vnode_hi;
-	aux.data_version	= vnode->status.data_version;
+	key.vnode_id		= htonl(vnode->fid.vnode);
+	key.unique		= htonl(vnode->fid.unique);
+	key.vnode_id_ext[0]	= htonl(vnode->fid.vnode >> 32);
+	key.vnode_id_ext[1]	= htonl(vnode->fid.vnode_hi);
+	afs_set_cache_aux(vnode, &aux);
 
 	vnode->cache = fscache_acquire_cookie(
 		vnode->volume->cache,
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 8e168c3fa5d1..3180ba6bd46d 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -869,9 +869,15 @@ struct afs_operation {
  * Cache auxiliary data.
  */
 struct afs_vnode_cache_aux {
-	u64			data_version;
+	__be64			data_version;
 } __packed;
 
+static inline void afs_set_cache_aux(struct afs_vnode *vnode,
+				     struct afs_vnode_cache_aux *aux)
+{
+	aux->data_version = cpu_to_be64(vnode->status.data_version);
+}
+
 /*
  * We use page->private to hold the amount of the page that we've written to,
  * splitting the field into two parts.  However, we need to represent a range



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 42/67] cachefiles: Use tmpfile/link
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (40 preceding siblings ...)
  2021-10-18 15:00 ` [PATCH 41/67] afs: Render cache cookie key as big endian David Howells
@ 2021-10-18 15:01 ` David Howells
  2021-10-18 15:01 ` [PATCH 43/67] fscache: Rewrite invalidation David Howells
                   ` (28 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:01 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Make cachefiles use a temporary file (created with vfs_tmpfile()) for a new
file rather than creating it immediately.  This means we don't have to wait
exclusively on the directory's inode lock at this point.

The directory entry creation is then deferred to the point at which the
file is committed.  Indeed, if the file is deleted before that point, it
can just be abandoned without ever modifying the directory.

Invalidation is achieved by simply closing the old file and creating a new
tmpfile.  Any in-progress ops hold the old file open till they've finished.
We don't need to cancel them and can just deal with reissuing a read to the
server upon completion (that will be a separate patch).

Note: This would be easier if linkat() could be given a flag to indicate
the destination should be overwritten or if RENAME_EXCHANGE could be
applied to tmpfiles, effectively unlinking the destination.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c         |   85 +++++++++-----
 fs/cachefiles/internal.h          |    9 ++
 fs/cachefiles/namei.c             |  219 ++++++++++++++++++++++++++-----------
 include/trace/events/cachefiles.h |   48 ++++++--
 4 files changed, 251 insertions(+), 110 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index d186a68ff810..a114b59e5b29 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -197,6 +197,9 @@ static void cachefiles_commit_object(struct cachefiles_object *object,
 		update = true;
 	if (update)
 		cachefiles_update_object(object);
+
+	if (test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags))
+		cachefiles_commit_tmpfile(cache, object);
 }
 
 /*
@@ -206,9 +209,14 @@ static void cachefiles_clean_up_object(struct cachefiles_object *object,
 				       struct cachefiles_cache *cache)
 {
 	if (test_bit(FSCACHE_COOKIE_RETIRED, &object->cookie->flags)) {
-		cachefiles_see_object(object, cachefiles_obj_see_clean_delete);
-		_debug("- inval object OBJ%x", object->debug_id);
-		cachefiles_delete_object(object, FSCACHE_OBJECT_WAS_RETIRED);
+		if (!test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags)) {
+			cachefiles_see_object(object, cachefiles_obj_see_clean_delete);
+			_debug("- inval object OBJ%x", object->debug_id);
+			cachefiles_delete_object(object, FSCACHE_OBJECT_WAS_RETIRED);
+		} else {
+			cachefiles_see_object(object, cachefiles_obj_see_clean_drop_tmp);
+			_debug("- inval object OBJ%x tmpfile", object->debug_id);
+		}
 	} else {
 		cachefiles_see_object(object, cachefiles_obj_see_clean_commit);
 		cachefiles_commit_object(object, cache);
@@ -372,41 +380,58 @@ static bool cachefiles_invalidate_cookie(struct fscache_cookie *cookie,
 					 unsigned int flags)
 {
 	struct cachefiles_object *object = cookie->cache_priv;
-	struct cachefiles_cache *cache = object->volume->cache;
-	const struct cred *saved_cred;
-	struct file *file = object->file;
-	uint64_t ni_size = cookie->object_size;
-	int ret;
+	struct file *new_file, *old_file;
+	bool old_tmpfile;
 
-	_enter("{OBJ%x},[%llu]",
-	       object->debug_id, (unsigned long long)ni_size);
+	_enter("o=%x,[%llu]", object->debug_id, object->cookie->object_size);
 
-	if (file) {
-		ASSERT(d_is_reg(file->f_path.dentry));
+	old_tmpfile = test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
 
-		cachefiles_begin_secure(cache, &saved_cred);
-		trace_cachefiles_trunc(object, file_inode(file),
-				       i_size_read(file_inode(file)), 0,
-				       cachefiles_trunc_invalidate);
-		ret = vfs_truncate(&file->f_path, 0);
-		if (ret == 0) {
-			ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);
-			trace_cachefiles_trunc(object, file_inode(file),
-					       0, ni_size,
-					       cachefiles_trunc_set_size);
-			ret = vfs_truncate(&file->f_path, ni_size);
-		}
-		cachefiles_end_secure(cache, saved_cred);
+	if (!object->file) {
+		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_ACTIVE);
+		_leave(" = t [light]");
+		return true;
+	}
+
+	new_file = cachefiles_create_tmpfile(object);
+	if (IS_ERR(new_file))
+		goto failed;
+
+	/* Substitute the VFS target */
+	_debug("sub");
+	spin_lock(&object->lock);
+
+	old_file = object->file;
+	object->file = new_file;
+	set_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
+	set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags);
+
+	spin_unlock(&object->lock);
+	_debug("subbed");
+
+	/* Allow I/O to take place again */
+	fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_ACTIVE);
 
-		if (ret != 0) {
-			if (ret == -EIO)
-				cachefiles_io_error_obj(object,
-							"Invalidate failed");
-			return false;
+	if (old_file) {
+		if (!old_tmpfile) {
+			struct cachefiles_volume *volume = object->volume;
+			struct dentry *fan = volume->fanout[(u8)object->key_hash];
+
+			inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
+			cachefiles_bury_object(volume->cache, object, fan,
+					       old_file->f_path.dentry,
+					       FSCACHE_OBJECT_INVALIDATED);
 		}
+		fput(old_file);
 	}
 
+	_leave(" = t");
 	return true;
+
+failed:
+	fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_FAILED);
+	_leave(" = f");
+	return false;
 }
 
 const struct fscache_cache_ops cachefiles_cache_ops = {
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index d8a70ecbe94a..6cc22c85c8f2 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -59,6 +59,7 @@ struct cachefiles_object {
 	u8				key_hash;	/* Hash of object key */
 	unsigned long			flags;
 #define CACHEFILES_OBJECT_IS_NEW	0		/* Set if object is new */
+#define CACHEFILES_OBJECT_USING_TMPFILE	1		/* Have an unlinked tmpfile */
 };
 
 extern struct kmem_cache *cachefiles_object_jar;
@@ -171,6 +172,11 @@ extern bool cachefiles_cook_key(struct cachefiles_object *object);
  * namei.c
  */
 extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object);
+extern int cachefiles_bury_object(struct cachefiles_cache *cache,
+				  struct cachefiles_object *object,
+				  struct dentry *dir,
+				  struct dentry *rep,
+				  enum fscache_why_object_killed why);
 extern int cachefiles_delete_object(struct cachefiles_object *object,
 				    enum fscache_why_object_killed why);
 extern bool cachefiles_walk_to_object(struct cachefiles_object *object);
@@ -183,6 +189,9 @@ extern int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 
 extern int cachefiles_check_in_use(struct cachefiles_cache *cache,
 				   struct dentry *dir, char *filename);
+extern struct file *cachefiles_create_tmpfile(struct cachefiles_object *object);
+extern bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
+				      struct cachefiles_object *object);
 
 /*
  * security.c
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index f7e73aba9104..0edf1276768b 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -80,11 +80,11 @@ static void cachefiles_mark_object_inactive(struct cachefiles_object *object)
  * - directory backed objects are stuffed into the graveyard for userspace to
  *   delete
  */
-static int cachefiles_bury_object(struct cachefiles_cache *cache,
-				  struct cachefiles_object *object,
-				  struct dentry *dir,
-				  struct dentry *rep,
-				  enum fscache_why_object_killed why)
+int cachefiles_bury_object(struct cachefiles_cache *cache,
+			   struct cachefiles_object *object,
+			   struct dentry *dir,
+			   struct dentry *rep,
+			   enum fscache_why_object_killed why)
 {
 	struct dentry *grave, *trap;
 	struct path path, path_to_graveyard;
@@ -302,83 +302,73 @@ static int cachefiles_open_file(struct cachefiles_object *object,
 {
 	struct cachefiles_cache *cache = object->volume->cache;
 	struct dentry *dentry;
-	struct inode *dinode = d_backing_inode(fan), *inode;
 	struct file *file;
-	struct path fan_path, path;
+	struct path path;
 	int ret;
 
 	_enter("%pd %s", fan, object->d_name);
 
-	inode_lock_nested(dinode, I_MUTEX_PARENT);
-
-	dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+	dentry = lookup_positive_unlocked(object->d_name, fan, object->d_name_len);
 	trace_cachefiles_lookup(object, dentry);
-	if (IS_ERR(dentry)) {
-		ret = PTR_ERR(dentry);
-		goto error_unlock;
-	}
-
-	if (d_is_negative(dentry)) {
+	if (dentry == ERR_PTR(-ENOENT)) {
+		set_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
 		fscache_cookie_lookup_negative(object->cookie);
 
 		ret = cachefiles_has_space(cache, 1, 0);
 		if (ret < 0)
-			goto error_dput;
-
-		fan_path.mnt = cache->mnt;
-		fan_path.dentry = fan;
-		ret = security_path_mknod(&fan_path, dentry, S_IFREG, 0);
-		if (ret < 0)
-			goto error_dput;
-		ret = vfs_create(&init_user_ns, dinode, dentry, S_IFREG, true);
-		trace_cachefiles_create(object, dentry, ret);
-		if (ret < 0)
-			goto error_dput;
+			goto error;
 
-		inode = d_backing_inode(dentry);
-		_debug("create -> %pd{ino=%lu}", dentry, inode->i_ino);
-		set_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
+		file = cachefiles_create_tmpfile(object);
+		if (IS_ERR(file)) {
+			ret = PTR_ERR(file);
+			goto error;
+		}
 
-	} else if (!d_is_reg(dentry)) {
-		inode = d_backing_inode(dentry);
-		pr_err("inode %lu is not a file\n", inode->i_ino);
-		ret = -EIO;
-		goto error_dput;
-	} else {
-		inode = d_backing_inode(dentry);
-		_debug("file -> %pd positive", dentry);
+		set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags);
+		set_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
+		_debug("create -> %pD{ino=%lu}", file, file_inode(file)->i_ino);
+		goto out;
 	}
 
-	inode_unlock(dinode);
-
-	/* We need to open a file interface onto a data file now as we can't do
-	 * it on demand because writeback called from do_exit() sees
-	 * current->fs == NULL - which breaks d_path() called from ext4 open.
-	 */
-	path.mnt = cache->mnt;
-	path.dentry = dentry;
-	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
-				   inode, cache->cache_cred);
-	dput(dentry);
-	if (IS_ERR(file)) {
-		ret = PTR_ERR(file);
+	if (IS_ERR(dentry)) {
+		ret = PTR_ERR(dentry);
 		goto error;
 	}
-	if (unlikely(!file->f_op->read_iter) ||
-	    unlikely(!file->f_op->write_iter)) {
-		pr_notice("Cache does not support read_iter and write_iter\n");
+
+	if (!d_is_reg(dentry)) {
+		pr_err("%pd is not a file\n", dentry);
+		dput(dentry);
 		ret = -EIO;
-		goto error_fput;
+		goto error;
+	} else {
+		clear_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
+
+		/* We need to open a file interface onto a data file now as we
+		 * can't do it on demand because writeback called from
+		 * do_exit() sees current->fs == NULL - which breaks d_path()
+		 * called from ext4 open.
+		 */
+		path.mnt = cache->mnt;
+		path.dentry = dentry;
+		file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
+					   d_backing_inode(dentry), cache->cache_cred);
+		dput(dentry);
+		if (IS_ERR(file)) {
+			ret = PTR_ERR(file);
+			goto error;
+		}
+		if (unlikely(!file->f_op->read_iter) ||
+		    unlikely(!file->f_op->write_iter)) {
+			pr_notice("Cache does not support read_iter and write_iter\n");
+			ret = -EIO;
+			goto error_fput;
+		}
+		_debug("file -> %pd positive", dentry);
 	}
 
+out:
 	object->file = file;
 	return 0;
-
-error_dput:
-	dput(dentry);
-error_unlock:
-	inode_unlock(dinode);
-	return ret;
 error_fput:
 	fput(file);
 error:
@@ -458,9 +448,11 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 
 	/* we need to create the subdir if it doesn't exist yet */
 	if (d_is_negative(subdir)) {
-		ret = cachefiles_has_space(cache, 1, 0);
-		if (ret < 0)
-			goto mkdir_error;
+		if (cache->store) {
+			ret = cachefiles_has_space(cache, 1, 0);
+			if (ret < 0)
+				goto mkdir_error;
+		}
 
 		_debug("attempt mkdir");
 
@@ -498,7 +490,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 	if (!(d_backing_inode(subdir)->i_opflags & IOP_XATTR) ||
 	    !d_backing_inode(subdir)->i_op->lookup ||
 	    !d_backing_inode(subdir)->i_op->mkdir ||
-	    !d_backing_inode(subdir)->i_op->create ||
 	    !d_backing_inode(subdir)->i_op->rename ||
 	    !d_backing_inode(subdir)->i_op->rmdir ||
 	    !d_backing_inode(subdir)->i_op->unlink)
@@ -687,3 +678,101 @@ int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir,
 	//_leave(" = 0");
 	return ret;
 }
+
+/*
+ * Create a temporary file and leave it unattached and un-xattr'd until the
+ * time comes to discard the object from memory.
+ */
+struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
+{
+	struct cachefiles_volume *volume = object->volume;
+	struct cachefiles_cache *cache = volume->cache;
+	const struct cred *saved_cred;
+	struct dentry *fan = volume->fanout[(u8)object->key_hash];
+	struct file *file;
+	struct path path;
+	uint64_t ni_size = object->cookie->object_size;
+	long ret;
+
+	ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);
+
+	cachefiles_begin_secure(cache, &saved_cred);
+
+	path.mnt = cache->mnt,
+	path.dentry = vfs_tmpfile(&init_user_ns, fan, S_IFREG, O_RDWR);
+	if (IS_ERR(path.dentry)) {
+		if (PTR_ERR(path.dentry) == -EIO)
+			cachefiles_io_error_obj(object, "Failed to create tmpfile");
+		file = ERR_CAST(path.dentry);
+		goto out;
+	}
+
+	trace_cachefiles_tmpfile(object, d_backing_inode(path.dentry));
+
+	if (ni_size > 0) {
+		trace_cachefiles_trunc(object, d_backing_inode(path.dentry), 0, ni_size,
+				       cachefiles_trunc_expand_tmpfile);
+		ret = vfs_truncate(&path, ni_size);
+		if (ret < 0) {
+			file = ERR_PTR(ret);
+			goto out_dput;
+		}
+	}
+
+	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
+				   d_backing_inode(path.dentry), cache->cache_cred);
+	if (IS_ERR(file))
+		goto out_dput;
+	if (unlikely(!file->f_op->read_iter) ||
+	    unlikely(!file->f_op->write_iter)) {
+		fput(file);
+		pr_notice("Cache does not support read_iter and write_iter\n");
+		file = ERR_PTR(-EINVAL);
+	}
+
+out_dput:
+	dput(path.dentry);
+out:
+	cachefiles_end_secure(cache, saved_cred);
+	return file;
+}
+
+/*
+ * Attempt to link a temporary file into its rightful place in the cache.
+ */
+bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
+			       struct cachefiles_object *object)
+{
+	struct cachefiles_volume *volume = object->volume;
+	struct dentry *dentry, *fan = volume->fanout[(u8)object->key_hash];
+	bool success = false;
+	int ret;
+
+	_enter(",%pD", object->file);
+
+	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
+	dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+	if (IS_ERR(dentry)) {
+		_debug("lookup fail %ld", PTR_ERR(dentry));
+		goto out_unlock;
+	}
+
+	ret = vfs_link(object->file->f_path.dentry, &init_user_ns,
+		       d_inode(fan), dentry, NULL);
+	if (ret < 0) {
+		_debug("link fail %d", ret);
+	} else {
+		trace_cachefiles_link(object, file_inode(object->file));
+		spin_lock(&object->lock);
+		/* TODO: Do we want to switch the file pointer to the new dentry? */
+		clear_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
+		spin_unlock(&object->lock);
+		success = true;
+	}
+
+	dput(dentry);
+out_unlock:
+	inode_unlock(d_inode(fan));
+	_leave(" = %u", success);
+	return success;
+}
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index d63e5fb46d27..c0632ee8cf69 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -35,6 +35,7 @@ enum cachefiles_obj_ref_trace {
 
 enum fscache_why_object_killed {
 	FSCACHE_OBJECT_IS_STALE,
+	FSCACHE_OBJECT_INVALIDATED,
 	FSCACHE_OBJECT_NO_SPACE,
 	FSCACHE_OBJECT_WAS_RETIRED,
 	FSCACHE_OBJECT_WAS_CULLED,
@@ -54,9 +55,8 @@ enum cachefiles_coherency_trace {
 };
 
 enum cachefiles_trunc_trace {
-	cachefiles_trunc_invalidate,
-	cachefiles_trunc_set_size,
 	cachefiles_trunc_dio_adjust,
+	cachefiles_trunc_expand_tmpfile,
 	cachefiles_trunc_shrink,
 };
 
@@ -78,6 +78,7 @@ enum cachefiles_prepare_read_trace {
  */
 #define cachefiles_obj_kill_traces				\
 	EM(FSCACHE_OBJECT_IS_STALE,	"stale")		\
+	EM(FSCACHE_OBJECT_INVALIDATED,	"inval")		\
 	EM(FSCACHE_OBJECT_NO_SPACE,	"no_space")		\
 	EM(FSCACHE_OBJECT_WAS_RETIRED,	"was_retired")		\
 	E_(FSCACHE_OBJECT_WAS_CULLED,	"was_culled")
@@ -109,9 +110,8 @@ enum cachefiles_prepare_read_trace {
 	E_(cachefiles_coherency_set_ok,		"SET ok  ")
 
 #define cachefiles_trunc_traces						\
-	EM(cachefiles_trunc_invalidate,		"INVAL ")		\
-	EM(cachefiles_trunc_set_size,		"SETSIZ")		\
 	EM(cachefiles_trunc_dio_adjust,		"DIOADJ")		\
+	EM(cachefiles_trunc_expand_tmpfile,	"EXPTMP")		\
 	E_(cachefiles_trunc_shrink,		"SHRINK")
 
 #define cachefiles_prepare_read_traces					\
@@ -200,26 +200,44 @@ TRACE_EVENT(cachefiles_lookup,
 		      __entry->obj, __entry->ino, __entry->error)
 	    );
 
-TRACE_EVENT(cachefiles_create,
-	    TP_PROTO(struct cachefiles_object *obj,
-		     struct dentry *de, int ret),
+TRACE_EVENT(cachefiles_tmpfile,
+	    TP_PROTO(struct cachefiles_object *obj, struct inode *backer),
 
-	    TP_ARGS(obj, de, ret),
+	    TP_ARGS(obj, backer),
 
 	    TP_STRUCT__entry(
-		    __field(unsigned int,		obj	)
-		    __field(struct dentry *,		de	)
-		    __field(int,			ret	)
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->obj	= obj->debug_id;
-		    __entry->de		= de;
-		    __entry->ret	= ret;
+		    __entry->backer	= backer->i_ino;
+			   ),
+
+	    TP_printk("o=%08x b=%08x",
+		      __entry->obj,
+		      __entry->backer)
+	    );
+
+TRACE_EVENT(cachefiles_link,
+	    TP_PROTO(struct cachefiles_object *obj, struct inode *backer),
+
+	    TP_ARGS(obj, backer),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj->debug_id;
+		    __entry->backer	= backer->i_ino;
 			   ),
 
-	    TP_printk("o=%08x d=%p r=%u",
-		      __entry->obj, __entry->de, __entry->ret)
+	    TP_printk("o=%08x b=%08x",
+		      __entry->obj,
+		      __entry->backer)
 	    );
 
 TRACE_EVENT(cachefiles_unlink,



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 43/67] fscache: Rewrite invalidation
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (41 preceding siblings ...)
  2021-10-18 15:01 ` [PATCH 42/67] cachefiles: Use tmpfile/link David Howells
@ 2021-10-18 15:01 ` David Howells
  2021-10-18 15:01 ` [PATCH 44/67] fscache: disable cookie when doing an invalidation for DIO write David Howells
                   ` (27 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:01 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Rewrite the fscache_invalidate() function.  The following changes are made
to fscache:

 (1) Invalidation is now ignored or allowed to proceed depending on the
     'stage' a non-index cookie is in with respect to the backing object.

 (2) If invalidation is proceeds, it pins the object and holds a cookie
     access count for the duration to prevent the cache from going away.

 (3) The fscache_object struct is given an invalidation counter that is
     incremented every time fscache_invalidate() is called, even if the
     cookie is at a stage in which it cannot be applied.  The counter,
     however, can be noted and applied retroactively later.

 (4) The invalidation counter is noted in the operation struct when a cache
     operation is begun and can be checked on operation completion to find
     out if any consequent metadata changes should be dropped.

 (5) New operations aren't allowed to proceed if the object is being
     invalidated.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/inode.c                 |    4 +---
 fs/afs/internal.h              |    9 +++++++++
 fs/cachefiles/interface.c      |    3 +--
 fs/cachefiles/io.c             |    4 ++--
 fs/fscache/cookie.c            |   14 +++++++-------
 fs/fscache/io.c                |    1 +
 include/linux/fscache-cache.h  |    3 +--
 include/linux/fscache.h        |   16 ++++++++++++----
 include/linux/netfs.h          |    1 +
 include/trace/events/fscache.h |   19 +++++++++++++++++++
 10 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 842570e4470f..32038ccbce67 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -564,9 +564,7 @@ static void afs_zap_data(struct afs_vnode *vnode)
 {
 	_enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode);
 
-#ifdef CONFIG_AFS_FSCACHE
-	fscache_invalidate(vnode->cache, i_size_read(&vnode->vfs_inode));
-#endif
+	afs_invalidate_cache(vnode, 0);
 
 	/* nuke all the non-dirty pages that aren't locked, mapped or being
 	 * written back in a regular file and completely discard the pages in a
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 3180ba6bd46d..6c591b7c55f1 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -878,6 +878,15 @@ static inline void afs_set_cache_aux(struct afs_vnode *vnode,
 	aux->data_version = cpu_to_be64(vnode->status.data_version);
 }
 
+static inline void afs_invalidate_cache(struct afs_vnode *vnode, unsigned int flags)
+{
+	struct afs_vnode_cache_aux aux;
+
+	afs_set_cache_aux(vnode, &aux);
+	fscache_invalidate(afs_vnode_cache(vnode), &aux,
+			   i_size_read(&vnode->vfs_inode), flags);
+}
+
 /*
  * We use page->private to hold the amount of the page that we've written to,
  * splitting the field into two parts.  However, we need to represent a range
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index a114b59e5b29..96a703d5f62c 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -376,8 +376,7 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 /*
  * Invalidate the storage associated with a cookie.
  */
-static bool cachefiles_invalidate_cookie(struct fscache_cookie *cookie,
-					 unsigned int flags)
+static bool cachefiles_invalidate_cookie(struct fscache_cookie *cookie)
 {
 	struct cachefiles_object *object = cookie->cache_priv;
 	struct file *new_file, *old_file;
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 350243b45dd5..67ea9f44931f 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -134,7 +134,7 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 	ki->iocb.ki_ioprio	= get_current_ioprio();
 	ki->skipped		= skipped;
 	ki->object		= object;
-	ki->inval_counter	= object->cookie->inval_counter;
+	ki->inval_counter	= cres->inval_counter;
 	ki->term_func		= term_func;
 	ki->term_func_priv	= term_func_priv;
 	ki->was_async		= true;
@@ -239,7 +239,7 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 	ki->iocb.ki_hint	= ki_hint_validate(file_write_hint(file));
 	ki->iocb.ki_ioprio	= get_current_ioprio();
 	ki->object		= object;
-	ki->inval_counter	= object->cookie->inval_counter;
+	ki->inval_counter	= cres->inval_counter;
 	ki->start		= start_pos;
 	ki->len			= len;
 	ki->term_func		= term_func;
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 6b49c2321256..8731188a5ac7 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -694,17 +694,16 @@ static void fscache_cookie_drop_from_lru(struct fscache_cookie *cookie)
  */
 static void fscache_invalidate_cookie(struct fscache_cookie *cookie)
 {
-	if (cookie->volume->cache->ops->invalidate_cookie(cookie, 0))
-		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_ACTIVE);
-	else
-		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_FAILED);
+	cookie->volume->cache->ops->invalidate_cookie(cookie);
 	fscache_end_cookie_access(cookie, fscache_access_invalidate_cookie_end);
 }
 
 /*
- * Invalidate an object.  Callable with spinlocks held.
+ * Invalidate an object.
  */
-void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size)
+void __fscache_invalidate(struct fscache_cookie *cookie,
+			  const void *aux_data, loff_t new_size,
+			  unsigned int flags)
 {
 	bool is_caching;
 
@@ -718,8 +717,9 @@ void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size)
 
 	spin_lock(&cookie->lock);
 	set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
-	cookie->object_size = new_size;
+	fscache_update_aux(cookie, aux_data, &new_size);
 	cookie->inval_counter++;
+	trace_fscache_invalidate(cookie, new_size);
 
 	switch (cookie->stage) {
 	case FSCACHE_COOKIE_STAGE_INVALIDATING: /* is_still_valid will catch it */
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 25976413fe34..ad9798f0c850 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -81,6 +81,7 @@ static int fscache_begin_operation(struct netfs_cache_resources *cres,
 	cres->cache_priv	= cookie;
 	cres->cache_priv2	= NULL;
 	cres->debug_id		= cookie->debug_id;
+	cres->inval_counter	= cookie->inval_counter;
 
 	if (!fscache_begin_cookie_access(cookie, why))
 		return -ENOBUFS;
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index bf0d3e862915..22064611d182 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -67,8 +67,7 @@ struct fscache_cache_ops {
 	void (*withdraw_cookie)(struct fscache_cookie *cookie);
 
 	/* Invalidate an object */
-	bool (*invalidate_cookie)(struct fscache_cookie *cookie,
-				  unsigned int flags);
+	bool (*invalidate_cookie)(struct fscache_cookie *cookie);
 
 	/* Begin an operation for the netfs lib */
 	bool (*begin_operation)(struct netfs_cache_resources *cres,
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index abf5413c3151..a29bd81996ea 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -159,7 +159,7 @@ extern struct fscache_cookie *__fscache_acquire_cookie(
 extern void __fscache_use_cookie(struct fscache_cookie *, bool);
 extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool);
-extern void __fscache_invalidate(struct fscache_cookie *, loff_t);
+extern void __fscache_invalidate(struct fscache_cookie *, const void *, loff_t, unsigned int);
 #ifdef FSCACHE_USE_NEW_IO_API
 extern int __fscache_begin_read_operation(struct netfs_cache_resources *, struct fscache_cookie *);
 #endif
@@ -388,22 +388,30 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie)
 /**
  * fscache_invalidate - Notify cache that an object needs invalidation
  * @cookie: The cookie representing the cache object
+ * @aux_data: The updated auxiliary data for the cookie (may be NULL)
  * @size: The revised size of the object.
+ * @flags: Invalidation flags (FSCACHE_INVAL_*)
  *
  * Notify the cache that an object is needs to be invalidated and that it
  * should abort any retrievals or stores it is doing on the cache.  The object
  * is then marked non-caching until such time as the invalidation is complete.
  *
- * This can be called with spinlocks held.
+ * FSCACHE_INVAL_LIGHT indicates that if the object has been invalidated and
+ * replaced by a temporary object, the temporary object need not be replaced
+ * again.  This is primarily intended for use with FSCACHE_ADV_SINGLE_CHUNK.
+ *
+ * FSCACHE_INVAL_DIO_WRITE indicates that this is due to a direct I/O write and
+ * may cause caching to be suspended on this cookie.
  *
  * See Documentation/filesystems/caching/netfs-api.rst for a complete
  * description.
  */
 static inline
-void fscache_invalidate(struct fscache_cookie *cookie, loff_t size)
+void fscache_invalidate(struct fscache_cookie *cookie,
+			const void *aux_data, loff_t size, unsigned int flags)
 {
 	if (fscache_cookie_valid(cookie))
-		__fscache_invalidate(cookie, size);
+		__fscache_invalidate(cookie, aux_data, size, flags);
 }
 
 /**
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 99137486d351..3c70eef56599 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -103,6 +103,7 @@ struct netfs_cache_resources {
 	void				*cache_priv;
 	void				*cache_priv2;
 	unsigned int			debug_id;	/* Cookie debug ID */
+	unsigned int			inval_counter;	/* object->inval_counter at begin_op */
 };
 
 /*
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 50f28a2a4ca8..adde3bb61f0f 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -410,6 +410,25 @@ TRACE_EVENT(fscache_relinquish,
 		      __entry->n_active, __entry->flags, __entry->retire)
 	    );
 
+TRACE_EVENT(fscache_invalidate,
+	    TP_PROTO(struct fscache_cookie *cookie, loff_t new_size),
+
+	    TP_ARGS(cookie, new_size),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,		cookie		)
+		    __field(loff_t,			new_size	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->cookie	= cookie->debug_id;
+		    __entry->new_size	= new_size;
+			   ),
+
+	    TP_printk("c=%08x sz=%llx",
+		      __entry->cookie, __entry->new_size)
+	    );
+
 #endif /* _TRACE_FSCACHE_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 44/67] fscache: disable cookie when doing an invalidation for DIO write
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (42 preceding siblings ...)
  2021-10-18 15:01 ` [PATCH 43/67] fscache: Rewrite invalidation David Howells
@ 2021-10-18 15:01 ` David Howells
  2021-10-18 15:01 ` [PATCH 45/67] cachefiles: Simplify the file lookup/creation/check code David Howells
                   ` (26 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:01 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Jeff Layton, dhowells, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Jeff Layton, Matthew Wilcox,
	Alexander Viro, Omar Sandoval, Linus Torvalds, linux-afs,
	linux-nfs, linux-cifs, ceph-devel, v9fs-developer, linux-fsdevel,
	linux-kernel

From: Jeff Layton <jlayton@kernel.org>

O_DIRECT I/O is probably a good indicator that we don't need to be
caching this file at the moment. Disable the cookie by treating it
as we would a NULL cookie after the invalidation completes. Reenable
when the last unuse is done.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/file.c           |    2 +-
 fs/fscache/cookie.c     |    5 +++++
 include/linux/fscache.h |   15 ++++++++++-----
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 5e29d433960d..5e674a503c1b 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -357,7 +357,7 @@ static bool afs_is_cache_enabled(struct inode *inode)
 {
 	struct fscache_cookie *cookie = afs_vnode_cache(AFS_FS_I(inode));
 
-	return fscache_cookie_valid(cookie) && cookie->cache_priv;
+	return fscache_cookie_enabled(cookie) && cookie->cache_priv;
 }
 
 static int afs_begin_cache_operation(struct netfs_read_request *rreq)
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 8731188a5ac7..70bfbd269652 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -522,6 +522,7 @@ void __fscache_unuse_cookie(struct fscache_cookie *cookie,
 		__fscache_update_cookie(cookie, aux_data, object_size);
 	cookie->unused_at = jiffies;
 	if (atomic_dec_return(&cookie->n_active) == 0) {
+		clear_bit(FSCACHE_COOKIE_DISABLED, &cookie->flags);
 		if (test_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags)) {
 			spin_lock(&fscache_cookie_lru_lock);
 			if (list_empty(&cookie->commit_link)) {
@@ -715,6 +716,10 @@ void __fscache_invalidate(struct fscache_cookie *cookie,
 		 "Trying to invalidate relinquished cookie\n"))
 		return;
 
+	if ((flags & FSCACHE_INVAL_DIO_WRITE) &&
+	    test_and_set_bit(FSCACHE_COOKIE_DISABLED, &cookie->flags))
+		return;
+
 	spin_lock(&cookie->lock);
 	set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
 	fscache_update_aux(cookie, aux_data, &new_size);
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index a29bd81996ea..08663ad7feed 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -28,12 +28,14 @@
 #define fscache_volume_valid(volume) (volume)
 #define fscache_cookie_valid(cookie) (cookie)
 #define fscache_resources_valid(cres) ((cres)->cache_priv)
+#define fscache_cookie_enabled(cookie) (cookie && !test_bit(FSCACHE_COOKIE_DISABLED, &cookie->flags))
 #else
 #define __fscache_available (0)
 #define fscache_available() (0)
 #define fscache_volume_valid(volume) (0)
 #define fscache_cookie_valid(cookie) (0)
 #define fscache_resources_valid(cres) (false)
+#define fscache_cookie_enabled(cookie) (0)
 #endif
 
 struct fscache_cookie;
@@ -43,6 +45,8 @@ struct fscache_cookie;
 #define FSCACHE_ADV_WRITE_NOCACHE	0x02 /* Don't cache if written to locally */
 #define FSCACHE_ADV_FALLBACK_IO		0x04 /* Going to use the fallback I/O API (dangerous) */
 
+#define FSCACHE_INVAL_DIO_WRITE		0x01 /* Invalidate due to DIO write */
+
 enum fscache_want_stage {
 	FSCACHE_WANT_PARAMS,
 	FSCACHE_WANT_WRITE,
@@ -120,6 +124,7 @@ struct fscache_cookie {
 #define FSCACHE_COOKIE_NO_DATA_TO_READ	3		/* T if this cookie has nothing to read */
 #define FSCACHE_COOKIE_NEEDS_UPDATE	4		/* T if attrs have been updated */
 #define FSCACHE_COOKIE_HAS_BEEN_CACHED	5		/* T if cookie needs withdraw-on-relinq */
+#define FSCACHE_COOKIE_DISABLED		6		/* T if cookie has been disabled */
 #define FSCACHE_COOKIE_NACC_ELEVATED	8		/* T if n_accesses is incremented */
 #define FSCACHE_COOKIE_DO_RELINQUISH	9		/* T if this cookie needs relinquishment */
 #define FSCACHE_COOKIE_DO_WITHDRAW	10		/* T if this cookie needs withdrawing */
@@ -352,7 +357,7 @@ static inline
 void fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data,
 			   const loff_t *object_size)
 {
-	if (fscache_cookie_valid(cookie))
+	if (fscache_cookie_enabled(cookie))
 		__fscache_update_cookie(cookie, aux_data, object_size);
 }
 
@@ -410,7 +415,7 @@ static inline
 void fscache_invalidate(struct fscache_cookie *cookie,
 			const void *aux_data, loff_t size, unsigned int flags)
 {
-	if (fscache_cookie_valid(cookie))
+	if (fscache_cookie_enabled(cookie))
 		__fscache_invalidate(cookie, aux_data, size, flags);
 }
 
@@ -449,7 +454,7 @@ static inline
 int fscache_begin_read_operation(struct netfs_cache_resources *cres,
 				 struct fscache_cookie *cookie)
 {
-	if (fscache_cookie_valid(cookie))
+	if (fscache_cookie_enabled(cookie))
 		return __fscache_begin_read_operation(cres, cookie);
 	return -ENOBUFS;
 }
@@ -578,7 +583,7 @@ static inline void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
 static inline
 int fscache_fallback_read_page(struct fscache_cookie *cookie, struct page *page)
 {
-	if (fscache_cookie_valid(cookie))
+	if (fscache_cookie_enabled(cookie))
 		return __fscache_fallback_read_page(cookie, page);
 	return -ENOBUFS;
 }
@@ -598,7 +603,7 @@ int fscache_fallback_read_page(struct fscache_cookie *cookie, struct page *page)
 static inline
 int fscache_fallback_write_page(struct fscache_cookie *cookie, struct page *page)
 {
-	if (fscache_cookie_valid(cookie))
+	if (fscache_cookie_enabled(cookie))
 		return __fscache_fallback_write_page(cookie, page);
 	return -ENOBUFS;
 }



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 45/67] cachefiles: Simplify the file lookup/creation/check code
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (43 preceding siblings ...)
  2021-10-18 15:01 ` [PATCH 44/67] fscache: disable cookie when doing an invalidation for DIO write David Howells
@ 2021-10-18 15:01 ` David Howells
  2021-10-18 15:01 ` [PATCH 46/67] fscache: Provide resize operation David Howells
                   ` (25 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:01 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Simplify the code that does file lookup, creation and checking.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c         |    4 -
 fs/cachefiles/internal.h          |   11 +
 fs/cachefiles/io.c                |    2 
 fs/cachefiles/namei.c             |  276 ++++++++++++++++++++-----------------
 fs/cachefiles/xattr.c             |    6 -
 include/trace/events/cachefiles.h |    2 
 6 files changed, 160 insertions(+), 141 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 96a703d5f62c..38ae34b7aaf4 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -79,7 +79,7 @@ static bool cachefiles_lookup_cookie(struct fscache_cookie *cookie)
 
 	/* look up the key, creating any missing bits */
 	cachefiles_begin_secure(cache, &saved_cred);
-	success = cachefiles_walk_to_object(object);
+	success = cachefiles_look_up_object(object);
 	cachefiles_end_secure(cache, saved_cred);
 
 	if (!success)
@@ -222,7 +222,7 @@ static void cachefiles_clean_up_object(struct cachefiles_object *object,
 		cachefiles_commit_object(object, cache);
 	}
 
-	cachefiles_unmark_inode_in_use(object);
+	cachefiles_unmark_inode_in_use(object, object->file);
 	if (object->file) {
 		fput(object->file);
 		object->file = NULL;
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 6cc22c85c8f2..a2d2ed2f19eb 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -58,8 +58,7 @@ struct cachefiles_object {
 	u8				d_name_len;	/* Length of filename */
 	u8				key_hash;	/* Hash of object key */
 	unsigned long			flags;
-#define CACHEFILES_OBJECT_IS_NEW	0		/* Set if object is new */
-#define CACHEFILES_OBJECT_USING_TMPFILE	1		/* Have an unlinked tmpfile */
+#define CACHEFILES_OBJECT_USING_TMPFILE	0		/* Have an unlinked tmpfile */
 };
 
 extern struct kmem_cache *cachefiles_object_jar;
@@ -171,7 +170,8 @@ extern bool cachefiles_cook_key(struct cachefiles_object *object);
 /*
  * namei.c
  */
-extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object);
+extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
+					   struct file *file);
 extern int cachefiles_bury_object(struct cachefiles_cache *cache,
 				  struct cachefiles_object *object,
 				  struct dentry *dir,
@@ -179,7 +179,7 @@ extern int cachefiles_bury_object(struct cachefiles_cache *cache,
 				  enum fscache_why_object_killed why);
 extern int cachefiles_delete_object(struct cachefiles_object *object,
 				    enum fscache_why_object_killed why);
-extern bool cachefiles_walk_to_object(struct cachefiles_object *object);
+extern bool cachefiles_look_up_object(struct cachefiles_object *object);
 extern struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 					       struct dentry *dir,
 					       const char *name);
@@ -224,7 +224,8 @@ void cachefiles_withdraw_volume(struct cachefiles_volume *volume);
  * xattr.c
  */
 extern int cachefiles_set_object_xattr(struct cachefiles_object *object);
-extern int cachefiles_check_auxdata(struct cachefiles_object *object);
+extern int cachefiles_check_auxdata(struct cachefiles_object *object,
+				    struct file *file);
 extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 					  struct dentry *dentry);
 
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 67ea9f44931f..9b3b55a94e66 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -525,7 +525,7 @@ bool cachefiles_begin_operation(struct netfs_cache_resources *cres,
 
 	if (!cachefiles_cres_file(cres)) {
 		cres->ops = &cachefiles_netfs_cache_ops;
-		if (object) {
+		if (object->file) {
 			spin_lock(&object->lock);
 			if (!cres->cache_priv2 && object->file)
 				cres->cache_priv2 = get_file(object->file);
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 0edf1276768b..989df918570b 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -21,9 +21,10 @@
 /*
  * Mark the backing file as being a cache file if it's not already in use so.
  */
-static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
+static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object,
+					 struct dentry *dentry)
 {
-	struct inode *inode = file_inode(object->file);
+	struct inode *inode = d_backing_inode(dentry);
 	bool can_use = false;
 
 	_enter(",%x", object->debug_id);
@@ -45,9 +46,10 @@ static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object)
 /*
  * Unmark a backing inode.
  */
-void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
+void cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
+				    struct file *file)
 {
-	struct inode *inode = file_inode(object->file);
+	struct inode *inode = file_inode(file);
 
 	if (!inode)
 		return;
@@ -61,10 +63,11 @@ void cachefiles_unmark_inode_in_use(struct cachefiles_object *object)
 /*
  * Mark an object as being inactive.
  */
-static void cachefiles_mark_object_inactive(struct cachefiles_object *object)
+static void cachefiles_mark_object_inactive(struct cachefiles_object *object,
+					    struct file *file)
 {
 	struct cachefiles_cache *cache = object->volume->cache;
-	blkcnt_t i_blocks = file_inode(object->file)->i_blocks;
+	blkcnt_t i_blocks = file_inode(file)->i_blocks;
 
 	/* This object can now be culled, so we need to let the daemon know
 	 * that there is something it can remove if it needs to.
@@ -232,191 +235,180 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 	return 0;
 }
 
+static int cachefiles_unlink(struct cachefiles_object *object,
+			     struct dentry *fan, struct dentry *dentry,
+			     enum fscache_why_object_killed why)
+{
+	struct path path = {
+		.mnt	= object->volume->cache->mnt,
+		.dentry	= fan,
+	};
+	int ret;
+
+	trace_cachefiles_unlink(object, dentry, why);
+	ret = security_path_unlink(&path, dentry);
+	if (ret == 0)
+		ret = vfs_unlink(&init_user_ns, d_backing_inode(fan), dentry, NULL);
+	return ret;
+}
+
 /*
- * delete an object representation from the cache
+ * Delete a cache file.
  */
 int cachefiles_delete_object(struct cachefiles_object *object,
 			     enum fscache_why_object_killed why)
 {
 	struct cachefiles_volume *volume = object->volume;
+	struct dentry *dentry = object->file->f_path.dentry;
 	struct dentry *fan = volume->fanout[(u8)object->key_hash];
+	int ret;
 
 	_enter(",OBJ%x{%pD}", object->debug_id, object->file);
 
+	/* Stop the dentry being negated if it's only pinned by a file struct. */
+	dget(dentry);
+
 	inode_lock_nested(d_backing_inode(fan), I_MUTEX_PARENT);
-	return cachefiles_bury_object(volume->cache, object, fan,
-				      object->file->f_path.dentry, why);
+	ret = cachefiles_unlink(object, fan, dentry, why);
+	inode_unlock(d_backing_inode(fan));
+	dput(dentry);
+
+	if (ret < 0 && ret != -ENOENT)
+		cachefiles_io_error(volume->cache, "Unlink failed");
+	return ret;
 }
 
 /*
- * Check the attributes on a file we've just opened and delete it if it's out
- * of date.
+ * Create a new file.
  */
-static int cachefiles_check_open_object(struct cachefiles_object *object,
-					struct dentry *fan)
+static bool cachefiles_create_file(struct cachefiles_object *object)
 {
+	struct file *file;
 	int ret;
 
-	if (!cachefiles_mark_inode_in_use(object))
-		return -EBUSY;
-
-	_enter("%pD", object->file);
-
-	ret = cachefiles_check_auxdata(object);
-	if (ret == -ESTALE)
-		goto stale;
+	ret = cachefiles_has_space(object->volume->cache, 1, 0);
 	if (ret < 0)
-		goto error_unmark;
+		return false;
 
-	/* Always update the atime on an object we've just looked up (this is
-	 * used to keep track of culling, and atimes are only updated by read,
-	 * write and readdir but not lookup or open).
-	 */
-	touch_atime(&object->file->f_path);
-	return 0;
-
-stale:
-	set_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
-	fscache_cookie_lookup_negative(object->cookie);
-	cachefiles_unmark_inode_in_use(object);
-	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
-	ret = cachefiles_bury_object(object->volume->cache, object, fan,
-				     object->file->f_path.dentry,
-				     FSCACHE_OBJECT_IS_STALE);
-	if (ret < 0)
-		return ret;
-	cachefiles_mark_object_inactive(object);
-	_debug("redo lookup");
-	return -ESTALE;
+	file = cachefiles_create_tmpfile(object);
+	if (IS_ERR(file))
+		return false;
 
-error_unmark:
-	cachefiles_unmark_inode_in_use(object);
-	return ret;
+	set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags);
+	set_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
+	_debug("create -> %pD{ino=%lu}", file, file_inode(file)->i_ino);
+	object->file = file;
+	return true;
 }
 
 /*
- * Look up a file, creating it if necessary.
+ * Open an existing file, checking its attributes and replacing it if it is
+ * stale.
  */
-static int cachefiles_open_file(struct cachefiles_object *object,
-				struct dentry *fan)
+static bool cachefiles_open_file(struct cachefiles_object *object,
+				 struct dentry *dentry)
 {
 	struct cachefiles_cache *cache = object->volume->cache;
-	struct dentry *dentry;
 	struct file *file;
 	struct path path;
 	int ret;
 
-	_enter("%pd %s", fan, object->d_name);
+	_enter("%pd", dentry);
 
-	dentry = lookup_positive_unlocked(object->d_name, fan, object->d_name_len);
-	trace_cachefiles_lookup(object, dentry);
-	if (dentry == ERR_PTR(-ENOENT)) {
-		set_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
-		fscache_cookie_lookup_negative(object->cookie);
-
-		ret = cachefiles_has_space(cache, 1, 0);
-		if (ret < 0)
-			goto error;
+	if (!cachefiles_mark_inode_in_use(object, dentry))
+		return false;
 
-		file = cachefiles_create_tmpfile(object);
-		if (IS_ERR(file)) {
-			ret = PTR_ERR(file);
-			goto error;
-		}
+	/* We need to open a file interface onto a data file now as we can't do
+	 * it on demand because writeback called from do_exit() sees
+	 * current->fs == NULL - which breaks d_path() called from ext4 open.
+	 */
+	path.mnt = cache->mnt;
+	path.dentry = dentry;
+	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
+				   d_backing_inode(dentry), cache->cache_cred);
+	if (IS_ERR(file))
+		goto error;
 
-		set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags);
-		set_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
-		_debug("create -> %pD{ino=%lu}", file, file_inode(file)->i_ino);
-		goto out;
+	if (unlikely(!file->f_op->read_iter) ||
+	    unlikely(!file->f_op->write_iter)) {
+		pr_notice("Cache does not support read_iter and write_iter\n");
+		goto error_fput;
 	}
+	_debug("file -> %pd positive", dentry);
 
-	if (IS_ERR(dentry)) {
-		ret = PTR_ERR(dentry);
-		goto error;
-	}
+	ret = cachefiles_check_auxdata(object, file);
+	if (ret < 0)
+		goto check_failed;
 
-	if (!d_is_reg(dentry)) {
-		pr_err("%pd is not a file\n", dentry);
-		dput(dentry);
-		ret = -EIO;
-		goto error;
-	} else {
-		clear_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
+	object->file = file;
 
-		/* We need to open a file interface onto a data file now as we
-		 * can't do it on demand because writeback called from
-		 * do_exit() sees current->fs == NULL - which breaks d_path()
-		 * called from ext4 open.
-		 */
-		path.mnt = cache->mnt;
-		path.dentry = dentry;
-		file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
-					   d_backing_inode(dentry), cache->cache_cred);
+	/* Always update the atime on an object we've just looked up (this is
+	 * used to keep track of culling, and atimes are only updated by read,
+	 * write and readdir but not lookup or open).
+	 */
+	touch_atime(&file->f_path);
+	dput(dentry);
+	return true;
+
+check_failed:
+	fscache_cookie_lookup_negative(object->cookie);
+	cachefiles_unmark_inode_in_use(object, file);
+	cachefiles_mark_object_inactive(object, file);
+	if (ret == -ESTALE) {
+		fput(file);
 		dput(dentry);
-		if (IS_ERR(file)) {
-			ret = PTR_ERR(file);
-			goto error;
-		}
-		if (unlikely(!file->f_op->read_iter) ||
-		    unlikely(!file->f_op->write_iter)) {
-			pr_notice("Cache does not support read_iter and write_iter\n");
-			ret = -EIO;
-			goto error_fput;
-		}
-		_debug("file -> %pd positive", dentry);
+		return cachefiles_create_file(object);
 	}
-
-out:
-	object->file = file;
-	return 0;
 error_fput:
 	fput(file);
 error:
-	return ret;
+	dput(dentry);
+	return false;
 }
 
 /*
  * walk from the parent object to the child object through the backing
  * filesystem, creating directories as we go
  */
-bool cachefiles_walk_to_object(struct cachefiles_object *object)
+bool cachefiles_look_up_object(struct cachefiles_object *object)
 {
-	struct cachefiles_volume *volume = object->cookie->volume->cache_priv;
-	struct dentry *fan;
+	struct cachefiles_volume *volume = object->volume;
+	struct dentry *dentry, *fan = volume->fanout[(u8)object->key_hash];
 	int ret;
 
 	_enter("OBJ%x,%s,", object->debug_id, object->d_name);
 
-lookup_again:
-	/* Open path "cache/vol/fanout/file". */
-	fan = volume->fanout[(u8)object->key_hash];
-	ret = cachefiles_open_file(object, fan);
-	if (ret < 0)
-		goto lookup_error;
+	/* Look up path "cache/vol/fanout/file". */
+	dentry = lookup_positive_unlocked(object->d_name, fan, object->d_name_len);
+	trace_cachefiles_lookup(object, dentry);
+	if (IS_ERR(dentry)) {
+		if (dentry == ERR_PTR(-ENOENT))
+			goto new_file;
+		if (dentry == ERR_PTR(-EIO))
+			cachefiles_io_error_obj(object, "Lookup failed");
+		return false;
+	}
 
-	if (!test_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags)) {
-		ret = cachefiles_check_open_object(object, fan);
+	if (!d_is_reg(dentry)) {
+		pr_err("%pd is not a file\n", dentry);
+		inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
+		ret = cachefiles_bury_object(volume->cache, object, fan, dentry,
+					     FSCACHE_OBJECT_IS_WEIRD);
+		dput(dentry);
 		if (ret < 0)
-			goto check_error;
-	} else {
-		ret = -EBUSY;
-		if (!cachefiles_mark_inode_in_use(object))
-			goto check_error;
+			return false;
+		goto new_file;
 	}
 
-	clear_bit(CACHEFILES_OBJECT_IS_NEW, &object->flags);
+	if (!cachefiles_open_file(object, dentry))
+		return false;
+
 	_leave(" = t [%lu]", file_inode(object->file)->i_ino);
 	return true;
 
-check_error:
-	fput(object->file);
-	object->file = NULL;
-	if (ret == -ESTALE)
-		goto lookup_again;
-lookup_error:
-	if (ret == -EIO)
-		cachefiles_io_error_obj(object, "Lookup failed");
-	return false;
+new_file:
+	fscache_cookie_lookup_negative(object->cookie);
+	return cachefiles_create_file(object);
 }
 
 /*
@@ -709,6 +701,11 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 
 	trace_cachefiles_tmpfile(object, d_backing_inode(path.dentry));
 
+	if (!cachefiles_mark_inode_in_use(object, path.dentry)) {
+		file = ERR_PTR(-EBUSY);
+		goto out_dput;
+	}
+
 	if (ni_size > 0) {
 		trace_cachefiles_trunc(object, d_backing_inode(path.dentry), 0, ni_size,
 				       cachefiles_trunc_expand_tmpfile);
@@ -757,6 +754,24 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 		goto out_unlock;
 	}
 
+	if (!d_is_negative(dentry)) {
+		if (d_backing_inode(dentry) == file_inode(object->file)) {
+			success = true;
+			goto out_dput;
+		}
+
+		ret = cachefiles_unlink(object, fan, dentry, FSCACHE_OBJECT_IS_STALE);
+		if (ret < 0)
+			goto out_dput;
+
+		dput(dentry);
+		dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+		if (IS_ERR(dentry)) {
+			_debug("lookup fail %ld", PTR_ERR(dentry));
+			goto out_unlock;
+		}
+	}
+
 	ret = vfs_link(object->file->f_path.dentry, &init_user_ns,
 		       d_inode(fan), dentry, NULL);
 	if (ret < 0) {
@@ -770,6 +785,7 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 		success = true;
 	}
 
+out_dput:
 	dput(dentry);
 out_unlock:
 	inode_unlock(d_inode(fan));
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index b77bbb6c4a17..50b2a4588946 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -74,10 +74,10 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 /*
  * check the consistency between the backing cache and the FS-Cache cookie
  */
-int cachefiles_check_auxdata(struct cachefiles_object *object)
+int cachefiles_check_auxdata(struct cachefiles_object *object, struct file *file)
 {
 	struct cachefiles_xattr *buf;
-	struct dentry *dentry = object->file->f_path.dentry;
+	struct dentry *dentry = file->f_path.dentry;
 	unsigned int len = object->cookie->aux_len, tlen;
 	const void *p = fscache_get_aux(object->cookie);
 	enum cachefiles_coherency_trace why;
@@ -105,7 +105,7 @@ int cachefiles_check_auxdata(struct cachefiles_object *object)
 		ret = 0;
 	}
 
-	trace_cachefiles_coherency(object, file_inode(object->file)->i_ino, 0, why);
+	trace_cachefiles_coherency(object, file_inode(file)->i_ino, 0, why);
 	kfree(buf);
 	return ret;
 }
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index c0632ee8cf69..a7b31b248f2d 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -35,6 +35,7 @@ enum cachefiles_obj_ref_trace {
 
 enum fscache_why_object_killed {
 	FSCACHE_OBJECT_IS_STALE,
+	FSCACHE_OBJECT_IS_WEIRD,
 	FSCACHE_OBJECT_INVALIDATED,
 	FSCACHE_OBJECT_NO_SPACE,
 	FSCACHE_OBJECT_WAS_RETIRED,
@@ -78,6 +79,7 @@ enum cachefiles_prepare_read_trace {
  */
 #define cachefiles_obj_kill_traces				\
 	EM(FSCACHE_OBJECT_IS_STALE,	"stale")		\
+	EM(FSCACHE_OBJECT_IS_WEIRD,	"weird")		\
 	EM(FSCACHE_OBJECT_INVALIDATED,	"inval")		\
 	EM(FSCACHE_OBJECT_NO_SPACE,	"no_space")		\
 	EM(FSCACHE_OBJECT_WAS_RETIRED,	"was_retired")		\



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 46/67] fscache: Provide resize operation
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (44 preceding siblings ...)
  2021-10-18 15:01 ` [PATCH 45/67] cachefiles: Simplify the file lookup/creation/check code David Howells
@ 2021-10-18 15:01 ` David Howells
  2021-10-18 15:02 ` [PATCH 47/67] cachefiles: Put more information in the xattr attached to the cache file David Howells
                   ` (24 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:01 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Provide a cache operation to resize an object.  This is intended to be run
synchronously rather than being deferred as it really needs to run inside
the inode lock on the netfs inode from ->setattr() to correctly order with
respect to other truncates and writes.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c      |   93 ++++++++++++++++++++++++----------------
 fs/fscache/io.c                |   25 +++++++++++
 fs/fscache/stats.c             |    6 ++-
 include/linux/fscache-cache.h  |    4 ++
 include/linux/fscache.h        |   18 ++++++++
 include/trace/events/fscache.h |   23 ++++++++++
 6 files changed, 130 insertions(+), 39 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 38ae34b7aaf4..f90f6ddd07a5 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -9,6 +9,7 @@
 #include <linux/mount.h>
 #include <linux/xattr.h>
 #include <linux/file.h>
+#include <linux/falloc.h>
 #include <trace/events/fscache.h>
 #include "internal.h"
 
@@ -134,55 +135,72 @@ struct cachefiles_object *cachefiles_grab_object(struct cachefiles_object *objec
 }
 
 /*
- * update the auxiliary data for an object object on disk
+ * Shorten the backing object to discard any dirty data and free up
+ * any unused granules.
  */
-static void cachefiles_update_object(struct cachefiles_object *object)
+static bool cachefiles_shorten_object(struct cachefiles_object *object,
+				      struct file *file, loff_t new_size)
 {
 	struct cachefiles_cache *cache = object->volume->cache;
-	const struct cred *saved_cred;
-	struct file *file = object->file;
-	loff_t object_size, i_size;
+	struct inode *inode = file_inode(file);
+	loff_t i_size, dio_size;
 	int ret;
 
-	_enter("{OBJ%x}", object->debug_id);
+	dio_size = round_up(new_size, CACHEFILES_DIO_BLOCK_SIZE);
+	i_size = i_size_read(inode);
 
-	cachefiles_begin_secure(cache, &saved_cred);
+	trace_cachefiles_trunc(object, inode, i_size, dio_size,
+			       cachefiles_trunc_shrink);
+	ret = vfs_truncate(&file->f_path, dio_size);
+	if (ret < 0) {
+		cachefiles_io_error_obj(object, "Trunc-to-size failed %d", ret);
+		cachefiles_remove_object_xattr(cache, file->f_path.dentry);
+		return false;
+	}
 
-	object_size = object->cookie->object_size;
-	i_size = i_size_read(file_inode(file));
-	if (i_size > object_size) {
-		_debug("trunc %llx -> %llx", i_size, object_size);
-		trace_cachefiles_trunc(object, file_inode(file),
-				       i_size, object_size,
-				       cachefiles_trunc_shrink);
-		ret = vfs_truncate(&file->f_path, object_size);
+	if (new_size < dio_size) {
+		trace_cachefiles_trunc(object, inode, dio_size, new_size,
+				       cachefiles_trunc_dio_adjust);
+		ret = vfs_fallocate(file, FALLOC_FL_ZERO_RANGE,
+				    new_size, dio_size);
 		if (ret < 0) {
-			cachefiles_io_error_obj(object, "Trunc-to-size failed");
+			cachefiles_io_error_obj(object, "Trunc-to-dio-size failed %d", ret);
 			cachefiles_remove_object_xattr(cache, file->f_path.dentry);
-			goto out;
-		}
-
-		object_size = round_up(object_size, CACHEFILES_DIO_BLOCK_SIZE);
-		i_size = i_size_read(file_inode(file));
-		_debug("trunc %llx -> %llx", i_size, object_size);
-		if (i_size < object_size) {
-			trace_cachefiles_trunc(object, file_inode(file),
-					       i_size, object_size,
-					       cachefiles_trunc_dio_adjust);
-			ret = vfs_truncate(&file->f_path, object_size);
-			if (ret < 0) {
-				cachefiles_io_error_obj(object, "Trunc-to-dio-size failed");
-				cachefiles_remove_object_xattr(cache, file->f_path.dentry);
-				goto out;
-			}
+			return false;
 		}
 	}
 
-	cachefiles_set_object_xattr(object);
+	return true;
+}
 
-out:
-	cachefiles_end_secure(cache, saved_cred);
-	_leave("");
+/*
+ * Resize the backing object.
+ */
+static void cachefiles_resize_cookie(struct netfs_cache_resources *cres,
+				     loff_t new_size)
+{
+	struct cachefiles_object *object = cachefiles_cres_object(cres);
+	struct cachefiles_cache *cache = object->volume->cache;
+	struct fscache_cookie *cookie = object->cookie;
+	const struct cred *saved_cred;
+	struct file *file = cachefiles_cres_file(cres);
+	loff_t old_size = cookie->object_size;
+
+	_enter("%llu->%llu", old_size, new_size);
+
+	if (new_size < old_size) {
+		cachefiles_begin_secure(cache, &saved_cred);
+		cachefiles_shorten_object(object, file, new_size);
+		cachefiles_end_secure(cache, saved_cred);
+		object->cookie->object_size = new_size;
+		return;
+	}
+
+	/* The file is being expanded.  We don't need to do anything
+	 * particularly.  cookie->initial_size doesn't change and so the point
+	 * at which we have to download before doesn't change.
+	 */
+	cookie->object_size = new_size;
 }
 
 /*
@@ -196,7 +214,7 @@ static void cachefiles_commit_object(struct cachefiles_object *object,
 	if (test_and_clear_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags))
 		update = true;
 	if (update)
-		cachefiles_update_object(object);
+		cachefiles_set_object_xattr(object);
 
 	if (test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags))
 		cachefiles_commit_tmpfile(cache, object);
@@ -440,5 +458,6 @@ const struct fscache_cache_ops cachefiles_cache_ops = {
 	.lookup_cookie		= cachefiles_lookup_cookie,
 	.withdraw_cookie	= cachefiles_withdraw_cookie,
 	.invalidate_cookie	= cachefiles_invalidate_cookie,
+	.resize_cookie		= cachefiles_resize_cookie,
 	.begin_operation	= cachefiles_begin_operation,
 };
diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index ad9798f0c850..8b1a865a0847 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -253,3 +253,28 @@ int fscache_set_page_dirty(struct page *page, struct fscache_cookie *cookie)
 	return 1;
 }
 EXPORT_SYMBOL(fscache_set_page_dirty);
+
+/*
+ * Change the size of a backing object.
+ */
+void __fscache_resize_cookie(struct fscache_cookie *cookie, loff_t new_size)
+{
+	struct netfs_cache_resources cres;
+
+	trace_fscache_resize(cookie, new_size);
+	if (fscache_begin_operation(&cres, cookie, FSCACHE_WANT_WRITE,
+				    fscache_access_io_resize) == 0) {
+		fscache_stat(&fscache_n_resizes);
+		set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
+
+		/* We cannot defer a resize as we need to do it inside the
+		 * netfs's inode lock so that we're serialised with respect to
+		 * writes.
+		 */
+		cookie->volume->cache->ops->resize_cookie(&cres, new_size);
+		fscache_end_operation(&cres);
+	} else {
+		fscache_stat(&fscache_n_resizes_null);
+	}
+}
+EXPORT_SYMBOL(__fscache_resize_cookie);
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 44b3b901e191..b1174d0dac3e 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -97,8 +97,10 @@ int fscache_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "Invals : n=%u\n",
 		   atomic_read(&fscache_n_invalidates));
 
-	seq_printf(m, "Updates: n=%u\n",
-		   atomic_read(&fscache_n_updates));
+	seq_printf(m, "Updates: n=%u rsz=%u rsn=%u\n",
+		   atomic_read(&fscache_n_updates),
+		   atomic_read(&fscache_n_resizes),
+		   atomic_read(&fscache_n_resizes_null));
 
 	seq_printf(m, "Relinqs: n=%u rtr=%u drop=%u\n",
 		   atomic_read(&fscache_n_relinquishes),
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 22064611d182..2e7265e24df6 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -66,6 +66,10 @@ struct fscache_cache_ops {
 	/* Withdraw an object without any cookie access counts held */
 	void (*withdraw_cookie)(struct fscache_cookie *cookie);
 
+	/* Change the size of a data object */
+	void (*resize_cookie)(struct netfs_cache_resources *cres,
+			      loff_t new_size);
+
 	/* Invalidate an object */
 	bool (*invalidate_cookie)(struct fscache_cookie *cookie);
 
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 08663ad7feed..8148193045cd 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -164,6 +164,7 @@ extern struct fscache_cookie *__fscache_acquire_cookie(
 extern void __fscache_use_cookie(struct fscache_cookie *, bool);
 extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool);
+extern void __fscache_resize_cookie(struct fscache_cookie *, loff_t);
 extern void __fscache_invalidate(struct fscache_cookie *, const void *, loff_t, unsigned int);
 #ifdef FSCACHE_USE_NEW_IO_API
 extern int __fscache_begin_read_operation(struct netfs_cache_resources *, struct fscache_cookie *);
@@ -361,6 +362,23 @@ void fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data,
 		__fscache_update_cookie(cookie, aux_data, object_size);
 }
 
+/**
+ * fscache_resize_cookie - Request that a cache object be resized
+ * @cookie: The cookie representing the cache object
+ * @new_size: The new size of the object (may be NULL)
+ *
+ * Request that the size of an object be changed.
+ *
+ * See Documentation/filesystems/caching/netfs-api.txt for a complete
+ * description.
+ */
+static inline
+void fscache_resize_cookie(struct fscache_cookie *cookie, loff_t new_size)
+{
+	if (fscache_cookie_enabled(cookie))
+		__fscache_resize_cookie(cookie, new_size);
+}
+
 /**
  * fscache_pin_cookie - Pin a data-storage cache object in its cache
  * @cookie: The cookie representing the cache object
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index adde3bb61f0f..9554b43e6fdf 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -429,6 +429,29 @@ TRACE_EVENT(fscache_invalidate,
 		      __entry->cookie, __entry->new_size)
 	    );
 
+TRACE_EVENT(fscache_resize,
+	    TP_PROTO(struct fscache_cookie *cookie, loff_t new_size),
+
+	    TP_ARGS(cookie, new_size),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,		cookie		)
+		    __field(loff_t,			old_size	)
+		    __field(loff_t,			new_size	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->cookie	= cookie->debug_id;
+		    __entry->old_size	= cookie->object_size;
+		    __entry->new_size	= new_size;
+			   ),
+
+	    TP_printk("c=%08x os=%08llx sz=%08llx",
+		      __entry->cookie,
+		      __entry->old_size,
+		      __entry->new_size)
+	    );
+
 #endif /* _TRACE_FSCACHE_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 47/67] cachefiles: Put more information in the xattr attached to the cache file
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (45 preceding siblings ...)
  2021-10-18 15:01 ` [PATCH 46/67] fscache: Provide resize operation David Howells
@ 2021-10-18 15:02 ` David Howells
  2021-10-18 15:02 ` [PATCH 48/67] fscache: Implement "will_modify" parameter on fscache_use_cookie() David Howells
                   ` (23 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:02 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Put some more information into the xattr attached to a cache file including
the size of the object (i_size may get rounded to a block size for DIO
purposes), the point after which the server has no data and a content
mapping type.

Note that the new cache and the old cache will see each other's cache files
as being incoherent and discard them.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c |    1 +
 fs/cachefiles/internal.h  |   11 +++++++++++
 fs/cachefiles/xattr.c     |   25 +++++++++++++++++++------
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index f90f6ddd07a5..751b0fec4591 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -420,6 +420,7 @@ static bool cachefiles_invalidate_cookie(struct fscache_cookie *cookie)
 
 	old_file = object->file;
 	object->file = new_file;
+	object->content_info = CACHEFILES_CONTENT_NO_DATA;
 	set_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags);
 	set_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags);
 
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index a2d2ed2f19eb..1d3e37bca087 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -31,6 +31,16 @@ extern unsigned cachefiles_debug;
 
 #define cachefiles_gfp (__GFP_RECLAIM | __GFP_NORETRY | __GFP_NOMEMALLOC)
 
+enum cachefiles_content {
+	/* These values are saved on disk */
+	CACHEFILES_CONTENT_NO_DATA	= 0, /* No content stored */
+	CACHEFILES_CONTENT_SINGLE	= 1, /* Content is monolithic, all is present */
+	CACHEFILES_CONTENT_ALL		= 2, /* Content is all present, no map */
+	CACHEFILES_CONTENT_BACKFS_MAP	= 3, /* Content is piecemeal, mapped through backing fs */
+	CACHEFILES_CONTENT_DIRTY	= 4, /* Content is dirty (only seen on disk) */
+	nr__cachefiles_content
+};
+
 /*
  * Cached volume representation.
  */
@@ -59,6 +69,7 @@ struct cachefiles_object {
 	u8				key_hash;	/* Hash of object key */
 	unsigned long			flags;
 #define CACHEFILES_OBJECT_USING_TMPFILE	0		/* Have an unlinked tmpfile */
+	enum cachefiles_content		content_info:8;	/* Info about content presence */
 };
 
 extern struct kmem_cache *cachefiles_object_jar;
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 50b2a4588946..ba3d050a5174 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -18,8 +18,11 @@
 #define CACHEFILES_COOKIE_TYPE_DATA 1
 
 struct cachefiles_xattr {
-	uint8_t				type;
-	uint8_t				data[];
+	__be64	object_size;	/* Actual size of the object */
+	__be64	zero_point;	/* Size after which server has no data not written by us */
+	__u8	type;		/* Type of object */
+	__u8	content;	/* Content presence (enum cachefiles_content) */
+	__u8	data[];		/* netfs coherency data */
 } __packed;
 
 static const char cachefiles_xattr_cache[] =
@@ -46,7 +49,10 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	if (!buf)
 		return -ENOMEM;
 
-	buf->type = CACHEFILES_COOKIE_TYPE_DATA;
+	buf->object_size	= cpu_to_be64(object->cookie->object_size);
+	buf->zero_point		= 0;
+	buf->type		= CACHEFILES_COOKIE_TYPE_DATA;
+	buf->content		= object->content_info;
 	if (len > 0)
 		memcpy(buf->data, fscache_get_aux(object->cookie), len);
 
@@ -54,7 +60,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 			   buf, sizeof(struct cachefiles_xattr) + len, 0);
 	if (ret < 0) {
 		trace_cachefiles_coherency(object, file_inode(file)->i_ino,
-					   0,
+					   buf->content,
 					   cachefiles_coherency_set_fail);
 		if (ret != -ENOMEM)
 			cachefiles_io_error_obj(
@@ -62,7 +68,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 				"Failed to set xattr with error %d", ret);
 	} else {
 		trace_cachefiles_coherency(object, file_inode(file)->i_ino,
-					   0,
+					   buf->content,
 					   cachefiles_coherency_set_ok);
 	}
 
@@ -100,12 +106,19 @@ int cachefiles_check_auxdata(struct cachefiles_object *object, struct file *file
 		why = cachefiles_coherency_check_type;
 	} else if (memcmp(buf->data, p, len) != 0) {
 		why = cachefiles_coherency_check_aux;
+	} else if (be64_to_cpu(buf->object_size) != object->cookie->object_size) {
+		why = cachefiles_coherency_check_objsize;
+	} else if (buf->content == CACHEFILES_CONTENT_DIRTY) {
+		// TODO: Begin conflict resolution
+		pr_warn("Dirty object in cache\n");
+		why = cachefiles_coherency_check_dirty;
 	} else {
 		why = cachefiles_coherency_check_ok;
 		ret = 0;
 	}
 
-	trace_cachefiles_coherency(object, file_inode(file)->i_ino, 0, why);
+	trace_cachefiles_coherency(object, file_inode(file)->i_ino,
+				   buf->content, why);
 	kfree(buf);
 	return ret;
 }



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 48/67] fscache: Implement "will_modify" parameter on fscache_use_cookie()
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (46 preceding siblings ...)
  2021-10-18 15:02 ` [PATCH 47/67] cachefiles: Put more information in the xattr attached to the cache file David Howells
@ 2021-10-18 15:02 ` David Howells
  2021-10-18 15:02 ` [PATCH 49/67] fscache: Add support for writing to the cache David Howells
                   ` (22 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:02 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Implement the "will_modify" parameter passed to fscache_use_cookie().

Setting this to true will henceforth cause the affected object to be marked
as dirty on disk, subject to conflict resolution in the event that power
failure or a crash occurs or the filesystem operates in disconnected mode.

The dirty flag is removed when the cache object is discarded from memory.

A cache hook is provided to prepare for writing - and this can be used to
mark the object on disk.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c         |    3 +++
 fs/cachefiles/internal.h          |    2 +-
 fs/cachefiles/xattr.c             |   20 ++++++++++++++++++++
 fs/fscache/cookie.c               |   30 +++++++++++++++++++++++++++++-
 include/linux/fscache-cache.h     |    3 +++
 include/linux/fscache.h           |    2 ++
 include/trace/events/cachefiles.h |    4 ++--
 7 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 751b0fec4591..96f30eba585a 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -211,6 +211,8 @@ static void cachefiles_commit_object(struct cachefiles_object *object,
 {
 	bool update = false;
 
+	if (test_and_clear_bit(FSCACHE_COOKIE_LOCAL_WRITE, &object->cookie->flags))
+		update = true;
 	if (test_and_clear_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &object->cookie->flags))
 		update = true;
 	if (update)
@@ -461,4 +463,5 @@ const struct fscache_cache_ops cachefiles_cache_ops = {
 	.invalidate_cookie	= cachefiles_invalidate_cookie,
 	.resize_cookie		= cachefiles_resize_cookie,
 	.begin_operation	= cachefiles_begin_operation,
+	.prepare_to_write	= cachefiles_prepare_to_write,
 };
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 1d3e37bca087..83cf2ca3a763 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -239,7 +239,7 @@ extern int cachefiles_check_auxdata(struct cachefiles_object *object,
 				    struct file *file);
 extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 					  struct dentry *dentry);
-
+extern void cachefiles_prepare_to_write(struct fscache_cookie *cookie);
 
 /*
  * error handling
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index ba3d050a5174..30adca42883b 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -53,6 +53,8 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	buf->zero_point		= 0;
 	buf->type		= CACHEFILES_COOKIE_TYPE_DATA;
 	buf->content		= object->content_info;
+	if (test_bit(FSCACHE_COOKIE_LOCAL_WRITE, &object->cookie->flags))
+		buf->content	= CACHEFILES_CONTENT_DIRTY;
 	if (len > 0)
 		memcpy(buf->data, fscache_get_aux(object->cookie), len);
 
@@ -145,3 +147,21 @@ int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 	_leave(" = %d", ret);
 	return ret;
 }
+
+/*
+ * Stick a marker on the cache object to indicate that it's dirty.
+ */
+void cachefiles_prepare_to_write(struct fscache_cookie *cookie)
+{
+	const struct cred *saved_cred;
+	struct cachefiles_object *object = cookie->cache_priv;
+	struct cachefiles_cache *cache = object->volume->cache;
+
+	_enter("c=%08x", object->cookie->debug_id);
+
+	if (!test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags)) {
+		cachefiles_begin_secure(cache, &saved_cred);
+		cachefiles_set_object_xattr(object);
+		cachefiles_end_secure(cache, saved_cred);
+	}
+}
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 70bfbd269652..1420027cfe97 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -402,12 +402,20 @@ struct fscache_cookie *__fscache_acquire_cookie(
 }
 EXPORT_SYMBOL(__fscache_acquire_cookie);
 
+/*
+ * Prepare a cache object to be written to.
+ */
+static void fscache_prepare_to_write(struct fscache_cookie *cookie)
+{
+	cookie->volume->cache->ops->prepare_to_write(cookie);
+}
+
 /*
  * Look up a cookie to the cache.
  */
 static void fscache_lookup_cookie(struct fscache_cookie *cookie)
 {
-	bool changed_stage = false, need_withdraw = false;
+	bool changed_stage = false, need_withdraw = false, prep_write = false;
 
 	_enter("");
 
@@ -429,6 +437,7 @@ static void fscache_lookup_cookie(struct fscache_cookie *cookie)
 
 	spin_lock(&cookie->lock);
 	if (cookie->stage != FSCACHE_COOKIE_STAGE_RELINQUISHING) {
+		prep_write = test_bit(FSCACHE_COOKIE_LOCAL_WRITE, &cookie->flags);
 		__fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_ACTIVE);
 		fscache_see_cookie(cookie, fscache_cookie_see_active);
 		changed_stage = true;
@@ -436,6 +445,8 @@ static void fscache_lookup_cookie(struct fscache_cookie *cookie)
 	spin_unlock(&cookie->lock);
 	if (changed_stage)
 		wake_up_cookie_stage(cookie);
+	if (prep_write)
+		fscache_prepare_to_write(cookie);
 
 out:
 	fscache_end_cookie_access(cookie, fscache_access_lookup_cookie_end);
@@ -467,6 +478,10 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 	stage = cookie->stage;
 	switch (stage) {
 	case FSCACHE_COOKIE_STAGE_QUIESCENT:
+		if (will_modify) {
+			set_bit(FSCACHE_COOKIE_LOCAL_WRITE, &cookie->flags);
+			set_bit(FSCACHE_COOKIE_DO_PREP_TO_WRITE, &cookie->flags);
+		}
 		if (!fscache_begin_volume_access(cookie->volume,
 						 fscache_access_lookup_cookie))
 			break;
@@ -484,8 +499,18 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 
 	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
 	case FSCACHE_COOKIE_STAGE_CREATING:
+		if (will_modify)
+			set_bit(FSCACHE_COOKIE_LOCAL_WRITE, &cookie->flags);
+		break;
 	case FSCACHE_COOKIE_STAGE_ACTIVE:
 	case FSCACHE_COOKIE_STAGE_INVALIDATING:
+		if (will_modify &&
+		    !test_and_set_bit(FSCACHE_COOKIE_LOCAL_WRITE, &cookie->flags)) {
+			set_bit(FSCACHE_COOKIE_DO_PREP_TO_WRITE, &cookie->flags);
+			queue = true;
+		}
+		break;
+
 	case FSCACHE_COOKIE_STAGE_FAILED:
 	case FSCACHE_COOKIE_STAGE_WITHDRAWING:
 		break;
@@ -551,6 +576,8 @@ static void __fscache_cookie_worker(struct fscache_cookie *cookie)
 again:
 	switch (READ_ONCE(cookie->stage)) {
 	case FSCACHE_COOKIE_STAGE_ACTIVE:
+		if (test_and_clear_bit(FSCACHE_COOKIE_DO_PREP_TO_WRITE, &cookie->flags))
+			fscache_prepare_to_write(cookie);
 		break;
 
 	case FSCACHE_COOKIE_STAGE_LOOKING_UP:
@@ -591,6 +618,7 @@ static void __fscache_cookie_worker(struct fscache_cookie *cookie)
 		clear_bit(FSCACHE_COOKIE_NEEDS_UPDATE, &cookie->flags);
 		clear_bit(FSCACHE_COOKIE_DO_WITHDRAW, &cookie->flags);
 		clear_bit(FSCACHE_COOKIE_DO_COMMIT, &cookie->flags);
+		clear_bit(FSCACHE_COOKIE_DO_PREP_TO_WRITE, &cookie->flags);
 		set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
 		fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_QUIESCENT);
 		break;
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 2e7265e24df6..889ae37fae0f 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -76,6 +76,9 @@ struct fscache_cache_ops {
 	/* Begin an operation for the netfs lib */
 	bool (*begin_operation)(struct netfs_cache_resources *cres,
 				enum fscache_want_stage want_stage);
+
+	/* Prepare to write to a live cache object */
+	void (*prepare_to_write)(struct fscache_cookie *cookie);
 };
 
 static inline enum fscache_cache_state fscache_cache_state(const struct fscache_cache *cache)
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 8148193045cd..8ab691e52cc5 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -125,10 +125,12 @@ struct fscache_cookie {
 #define FSCACHE_COOKIE_NEEDS_UPDATE	4		/* T if attrs have been updated */
 #define FSCACHE_COOKIE_HAS_BEEN_CACHED	5		/* T if cookie needs withdraw-on-relinq */
 #define FSCACHE_COOKIE_DISABLED		6		/* T if cookie has been disabled */
+#define FSCACHE_COOKIE_LOCAL_WRITE	7		/* T if cookie has been modified locally */
 #define FSCACHE_COOKIE_NACC_ELEVATED	8		/* T if n_accesses is incremented */
 #define FSCACHE_COOKIE_DO_RELINQUISH	9		/* T if this cookie needs relinquishment */
 #define FSCACHE_COOKIE_DO_WITHDRAW	10		/* T if this cookie needs withdrawing */
 #define FSCACHE_COOKIE_DO_COMMIT	11		/* T if this cookie needs committing */
+#define FSCACHE_COOKIE_DO_PREP_TO_WRITE	12		/* T if cookie needs write preparation */
 
 	enum fscache_cookie_stage	stage;
 	u8				advice;		/* FSCACHE_ADV_* */
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index a7b31b248f2d..11447b20fc83 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -340,7 +340,7 @@ TRACE_EVENT(cachefiles_mark_inactive,
 TRACE_EVENT(cachefiles_coherency,
 	    TP_PROTO(struct cachefiles_object *obj,
 		     ino_t ino,
-		     int content,
+		     enum cachefiles_content content,
 		     enum cachefiles_coherency_trace why),
 
 	    TP_ARGS(obj, ino, content, why),
@@ -349,7 +349,7 @@ TRACE_EVENT(cachefiles_coherency,
 	    TP_STRUCT__entry(
 		    __field(unsigned int,			obj	)
 		    __field(enum cachefiles_coherency_trace,	why	)
-		    __field(int,				content	)
+		    __field(enum cachefiles_content,		content	)
 		    __field(u64,				ino	)
 			     ),
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 49/67] fscache: Add support for writing to the cache
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (47 preceding siblings ...)
  2021-10-18 15:02 ` [PATCH 48/67] fscache: Implement "will_modify" parameter on fscache_use_cookie() David Howells
@ 2021-10-18 15:02 ` David Howells
  2021-10-18 15:03 ` [PATCH 50/67] fscache: Make fscache_clear_page_bits() conditional on cookie David Howells
                   ` (21 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:02 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add a pair of helpers for use by a netfs to write data to the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/io.c         |  100 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fscache.h |   58 +++++++++++++++++++++++++++
 2 files changed, 158 insertions(+)

diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 8b1a865a0847..3910cba65545 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -11,6 +11,7 @@
 #include <linux/uio.h>
 #include <linux/bvec.h>
 #include <linux/slab.h>
+#include <linux/uio.h>
 #include "internal.h"
 
 /**
@@ -278,3 +279,102 @@ void __fscache_resize_cookie(struct fscache_cookie *cookie, loff_t new_size)
 	}
 }
 EXPORT_SYMBOL(__fscache_resize_cookie);
+
+struct fscache_write_request {
+	struct netfs_cache_resources cache_resources;
+	struct address_space	*mapping;
+	loff_t			start;
+	size_t			len;
+	netfs_io_terminated_t	term_func;
+	void			*term_func_priv;
+};
+
+void __fscache_clear_page_bits(struct address_space *mapping,
+			       loff_t start, size_t len)
+{
+	pgoff_t first = start / PAGE_SIZE;
+	pgoff_t last = (start + len - 1) / PAGE_SIZE;
+	struct page *page;
+
+	if (len) {
+		XA_STATE(xas, &mapping->i_pages, first);
+
+		rcu_read_lock();
+		xas_for_each(&xas, page, last) {
+			end_page_fscache(page);
+		}
+		rcu_read_unlock();
+	}
+}
+EXPORT_SYMBOL(__fscache_clear_page_bits);
+
+/*
+ * Deal with the completion of writing the data to the cache.
+ */
+static void fscache_wreq_done(void *priv, ssize_t transferred_or_error,
+			      bool was_async)
+{
+	struct fscache_write_request *wreq = priv;
+
+	fscache_clear_page_bits(wreq->mapping, wreq->start, wreq->len);
+
+	if (wreq->term_func)
+		wreq->term_func(wreq->term_func_priv, transferred_or_error,
+				was_async);
+	fscache_end_operation(&wreq->cache_resources);
+	kfree(wreq);
+}
+
+void __fscache_write_to_cache(struct fscache_cookie *cookie,
+			      struct address_space *mapping,
+			      loff_t start, size_t len, loff_t i_size,
+			      netfs_io_terminated_t term_func,
+			      void *term_func_priv)
+{
+	struct fscache_write_request *wreq;
+	struct netfs_cache_resources *cres;
+	struct iov_iter iter;
+	int ret = -ENOBUFS;
+
+	if (!fscache_cookie_valid(cookie) || len == 0)
+		goto abandon;
+
+	_enter("%llx,%zx", start, len);
+
+	wreq = kzalloc(sizeof(struct fscache_write_request), GFP_NOFS);
+	if (!wreq)
+		goto abandon;
+	wreq->mapping		= mapping;
+	wreq->start		= start;
+	wreq->len		= len;
+	wreq->term_func		= term_func;
+	wreq->term_func_priv	= term_func_priv;
+
+	cres = &wreq->cache_resources;
+	if (fscache_begin_operation(cres, cookie, FSCACHE_WANT_WRITE,
+				    fscache_access_io_write) < 0)
+		goto abandon_free;
+
+	ret = cres->ops->prepare_write(cres, &start, &len, i_size, false);
+	if (ret < 0)
+		goto abandon_end;
+
+	/* TODO: Consider clearing page bits now for space the write isn't
+	 * covering.  This is more complicated than it appears when THPs are
+	 * taken into account.
+	 */
+
+	iov_iter_xarray(&iter, WRITE, &mapping->i_pages, start, len);
+	fscache_write(cres, start, &iter, fscache_wreq_done, wreq);
+	return;
+
+abandon_end:
+	return fscache_wreq_done(wreq, ret, false);
+abandon_free:
+	kfree(wreq);
+abandon:
+	fscache_clear_page_bits(mapping, start, len);
+	if (term_func)
+		term_func(term_func_priv, ret, false);
+}
+EXPORT_SYMBOL(__fscache_write_to_cache);
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 8ab691e52cc5..fe4d588641da 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -176,6 +176,10 @@ extern int __fscache_fallback_read_page(struct fscache_cookie *, struct page *);
 extern int __fscache_fallback_write_page(struct fscache_cookie *, struct page *);
 #endif
 
+extern void __fscache_write_to_cache(struct fscache_cookie *, struct address_space *,
+				     loff_t, size_t, loff_t, netfs_io_terminated_t, void *);
+extern void __fscache_clear_page_bits(struct address_space *, loff_t, size_t);
+
 /**
  * fscache_acquire_volume - Register a volume as desiring caching services
  * @volume_key: An identification string for the volume
@@ -543,6 +547,60 @@ int fscache_write(struct netfs_cache_resources *cres,
 	return ops->write(cres, start_pos, iter, term_func, term_func_priv);
 }
 
+/**
+ * fscache_clear_page_bits - Clear the PG_fscache bits from a set of pages
+ * @mapping: The netfs inode to use as the source
+ * @start: The start position in @mapping
+ * @len: The amount of data to unlock
+ *
+ * Clear the PG_fscache flag from a sequence of pages and wake up anyone who's
+ * waiting.
+ */
+static inline void fscache_clear_page_bits(struct address_space *mapping,
+					   loff_t start, size_t len)
+{
+	if (fscache_available())
+		__fscache_clear_page_bits(mapping, start, len);
+}
+
+/**
+ * fscache_write_to_cache - Save a write to the cache and clear PG_fscache
+ * @cookie: The cookie representing the cache object
+ * @mapping: The netfs inode to use as the source
+ * @start: The start position in @mapping
+ * @len: The amount of data to write back
+ * @i_size: The new size of the inode
+ * @term_func: The function to call upon completion
+ * @term_func_priv: The private data for @term_func
+ *
+ * Helper function for a netfs to write dirty data from an inode into the cache
+ * object that's backing it.
+ *
+ * @start and @len describe the range of the data.  This does not need to be
+ * page-aligned, but to satisfy DIO requirements, the cache may expand it up to
+ * the page boundaries on either end.  All the pages covering the range must be
+ * marked with PG_fscache.
+ *
+ * If given, @term_func will be called upon completion and supplied with
+ * @term_func_priv.  Note that the PG_fscache flags will have been cleared by
+ * this point, so the netfs must retain its own pin on the mapping.
+ */
+static inline void fscache_write_to_cache(struct fscache_cookie *cookie,
+					  struct address_space *mapping,
+					  loff_t start, size_t len, loff_t i_size,
+					  netfs_io_terminated_t term_func,
+					  void *term_func_priv)
+{
+	if (fscache_available()) {
+		__fscache_write_to_cache(cookie, mapping, start, len, i_size,
+					 term_func, term_func_priv);
+	} else {
+		fscache_clear_page_bits(mapping, start, len);
+		if (term_func)
+			term_func(term_func_priv, -ENOBUFS, false);
+	}
+
+}
 #endif /* FSCACHE_USE_NEW_IO_API */
 
 #if __fscache_available



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 50/67] fscache: Make fscache_clear_page_bits() conditional on cookie
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (48 preceding siblings ...)
  2021-10-18 15:02 ` [PATCH 49/67] fscache: Add support for writing to the cache David Howells
@ 2021-10-18 15:03 ` David Howells
  2021-10-18 15:03 ` [PATCH 51/67] fscache: Make fscache_write_to_cache() " David Howells
                   ` (20 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:03 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Make fscache_clear_page_bits() conditional on cookie not being NULL, rather
than merely conditional on CONFIG_FSCACHE=[ym].  The problem with the
latter is if a filesystem, say afs, has CONFIG_AFS_FSCACHE=n but calls into
this function - in which it linkage will fail if CONFIG_FSCACHE is less
than CONFIG_AFS.  Analogous problems can affect other filesystems, e.g. 9p.

Making fscache_clear_page_bits() conditional on the cookie achieves two
things:

 (1) If cookie optimised down to constant NULL, the rest of the function is
     thrown away and the slow path is never called.

 (2) __fscache_clear_page_bits() isn't called if there's no cookie - and
     so, in such a case, the pages won't iterated over attempting to clear
     PG_fscache bits that haven't been set.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/io.c         |    5 +++--
 include/linux/fscache.h |    8 +++++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/fscache/io.c b/fs/fscache/io.c
index 3910cba65545..bc8d1ac0e85c 100644
--- a/fs/fscache/io.c
+++ b/fs/fscache/io.c
@@ -316,7 +316,8 @@ static void fscache_wreq_done(void *priv, ssize_t transferred_or_error,
 {
 	struct fscache_write_request *wreq = priv;
 
-	fscache_clear_page_bits(wreq->mapping, wreq->start, wreq->len);
+	fscache_clear_page_bits(fscache_cres_cookie(&wreq->cache_resources),
+				wreq->mapping, wreq->start, wreq->len);
 
 	if (wreq->term_func)
 		wreq->term_func(wreq->term_func_priv, transferred_or_error,
@@ -373,7 +374,7 @@ void __fscache_write_to_cache(struct fscache_cookie *cookie,
 abandon_free:
 	kfree(wreq);
 abandon:
-	fscache_clear_page_bits(mapping, start, len);
+	fscache_clear_page_bits(cookie, mapping, start, len);
 	if (term_func)
 		term_func(term_func_priv, ret, false);
 }
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index fe4d588641da..847c076d05a6 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -549,6 +549,7 @@ int fscache_write(struct netfs_cache_resources *cres,
 
 /**
  * fscache_clear_page_bits - Clear the PG_fscache bits from a set of pages
+ * @cookie: The cookie representing the cache object
  * @mapping: The netfs inode to use as the source
  * @start: The start position in @mapping
  * @len: The amount of data to unlock
@@ -556,10 +557,11 @@ int fscache_write(struct netfs_cache_resources *cres,
  * Clear the PG_fscache flag from a sequence of pages and wake up anyone who's
  * waiting.
  */
-static inline void fscache_clear_page_bits(struct address_space *mapping,
+static inline void fscache_clear_page_bits(struct fscache_cookie *cookie,
+					   struct address_space *mapping,
 					   loff_t start, size_t len)
 {
-	if (fscache_available())
+	if (fscache_cookie_valid(cookie))
 		__fscache_clear_page_bits(mapping, start, len);
 }
 
@@ -595,7 +597,7 @@ static inline void fscache_write_to_cache(struct fscache_cookie *cookie,
 		__fscache_write_to_cache(cookie, mapping, start, len, i_size,
 					 term_func, term_func_priv);
 	} else {
-		fscache_clear_page_bits(mapping, start, len);
+		fscache_clear_page_bits(cookie, mapping, start, len);
 		if (term_func)
 			term_func(term_func_priv, -ENOBUFS, false);
 	}



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 51/67] fscache: Make fscache_write_to_cache() conditional on cookie
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (49 preceding siblings ...)
  2021-10-18 15:03 ` [PATCH 50/67] fscache: Make fscache_clear_page_bits() conditional on cookie David Howells
@ 2021-10-18 15:03 ` David Howells
  2021-10-18 15:03 ` [PATCH 52/67] afs: Copy local writes to the cache when writing to the server David Howells
                   ` (19 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:03 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Make fscache_write_to_cache() conditional on cookie not being NULL, rather
than merely conditional on CONFIG_FSCACHE=[ym].  The problem with the
latter is if a filesystem, say afs, has CONFIG_AFS_FSCACHE=n but calls into
this function - linkage will fail if CONFIG_FSCACHE is less than
CONFIG_AFS.  Analogous problems can affect other filesystems, e.g. 9p.

Making fscache_write_to_cache() conditional on the cookie achieves two
things:

 (1) If cookie optimises down to constant NULL, term_func is called
     directly and may be inlined and the slow path is never called.

 (2) __fscache_write_to_cache() isn't called if cookie is dynamically NULL
     - and so, in such a case, term_func is called immediately.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/fscache.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 847c076d05a6..ba192567d099 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -593,7 +593,7 @@ static inline void fscache_write_to_cache(struct fscache_cookie *cookie,
 					  netfs_io_terminated_t term_func,
 					  void *term_func_priv)
 {
-	if (fscache_available()) {
+	if (fscache_cookie_valid(cookie)) {
 		__fscache_write_to_cache(cookie, mapping, start, len, i_size,
 					 term_func, term_func_priv);
 	} else {



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 52/67] afs: Copy local writes to the cache when writing to the server
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (50 preceding siblings ...)
  2021-10-18 15:03 ` [PATCH 51/67] fscache: Make fscache_write_to_cache() " David Howells
@ 2021-10-18 15:03 ` David Howells
  2021-10-18 15:04 ` [PATCH 53/67] afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr David Howells
                   ` (18 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:03 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Marc Dionne, linux-afs, dhowells, Trond Myklebust,
	Anna Schumaker, Steve French, Dominique Martinet, Jeff Layton,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-nfs, linux-cifs, ceph-devel, v9fs-developer,
	linux-fsdevel, linux-kernel

When writing to the server from afs_writepage() or afs_writepages(), copy
the data to the cache object too.

To make this possible, the cookie must have its active users count
incremented when the page is dirtied and kept incremented until we manage
to clean up all the pages.  This allows the writeback to take place after
the last file struct is released.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
---

 fs/afs/file.c     |    6 ++++
 fs/afs/inode.c    |    8 +++--
 fs/afs/internal.h |    5 +++
 fs/afs/super.c    |    1 +
 fs/afs/write.c    |   79 +++++++++++++++++++++++++++++++++++++++++++++--------
 5 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 5e674a503c1b..4af7ef607cf6 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -402,6 +402,12 @@ static void afs_readahead(struct readahead_control *ractl)
 	netfs_readahead(ractl, &afs_req_ops, NULL);
 }
 
+int afs_write_inode(struct inode *inode, struct writeback_control *wbc)
+{
+	fscache_unpin_writeback(wbc, afs_vnode_cache(AFS_FS_I(inode)));
+	return 0;
+}
+
 /*
  * Adjust the dirty region of the page on truncation or full invalidation,
  * getting rid of the markers altogether if the region is entirely invalidated.
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 32038ccbce67..311df66c64d0 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -761,9 +761,8 @@ int afs_drop_inode(struct inode *inode)
  */
 void afs_evict_inode(struct inode *inode)
 {
-	struct afs_vnode *vnode;
-
-	vnode = AFS_FS_I(inode);
+	struct afs_vnode_cache_aux aux;
+	struct afs_vnode *vnode = AFS_FS_I(inode);
 
 	_enter("{%llx:%llu.%d}",
 	       vnode->fid.vid,
@@ -775,6 +774,9 @@ void afs_evict_inode(struct inode *inode)
 	ASSERTCMP(inode->i_ino, ==, vnode->fid.vnode);
 
 	truncate_inode_pages_final(&inode->i_data);
+
+	afs_set_cache_aux(vnode, &aux);
+	fscache_clear_inode_writeback(afs_vnode_cache(vnode), inode, &aux);
 	clear_inode(inode);
 
 	while (!list_empty(&vnode->wb_keys)) {
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 6c591b7c55f1..07d34291bf4f 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1072,6 +1072,7 @@ extern int afs_release(struct inode *, struct file *);
 extern int afs_fetch_data(struct afs_vnode *, struct afs_read *);
 extern struct afs_read *afs_alloc_read(gfp_t);
 extern void afs_put_read(struct afs_read *);
+extern int afs_write_inode(struct inode *, struct writeback_control *);
 
 static inline struct afs_read *afs_get_read(struct afs_read *req)
 {
@@ -1519,7 +1520,11 @@ extern int afs_check_volume_status(struct afs_volume *, struct afs_operation *);
 /*
  * write.c
  */
+#ifdef CONFIG_AFS_FSCACHE
 extern int afs_set_page_dirty(struct page *);
+#else
+#define afs_set_page_dirty __set_page_dirty_nobuffers
+#endif
 extern int afs_write_begin(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned flags,
 			struct page **pagep, void **fsdata);
diff --git a/fs/afs/super.c b/fs/afs/super.c
index d110def8aa8e..af7cbd9949c5 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -55,6 +55,7 @@ int afs_net_id;
 static const struct super_operations afs_super_ops = {
 	.statfs		= afs_statfs,
 	.alloc_inode	= afs_alloc_inode,
+	.write_inode	= afs_write_inode,
 	.drop_inode	= afs_drop_inode,
 	.destroy_inode	= afs_destroy_inode,
 	.free_inode	= afs_free_inode,
diff --git a/fs/afs/write.c b/fs/afs/write.c
index e1cb19cb6314..a540da6ad423 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -12,17 +12,29 @@
 #include <linux/writeback.h>
 #include <linux/pagevec.h>
 #include <linux/netfs.h>
-#include <linux/fscache.h>
 #include "internal.h"
 
+static void afs_write_to_cache(struct afs_vnode *vnode, loff_t start, size_t len,
+			       loff_t i_size);
+
+#ifdef CONFIG_AFS_FSCACHE
 /*
- * mark a page as having been made dirty and thus needing writeback
+ * Mark a page as having been made dirty and thus needing writeback.  We also
+ * need to pin the cache object to write back to.
  */
 int afs_set_page_dirty(struct page *page)
 {
-	_enter("");
-	return __set_page_dirty_nobuffers(page);
+	return fscache_set_page_dirty(page, afs_vnode_cache(AFS_FS_I(page->mapping->host)));
+}
+static void afs_set_page_fscache(struct page *page)
+{
+	set_page_fscache(page);
+}
+#else
+static void afs_set_page_fscache(struct page *page)
+{
 }
+#endif
 
 /*
  * Prepare to perform part of a write to a page.  Note that len may extend
@@ -114,7 +126,7 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 	unsigned long priv;
 	unsigned int f, from = offset_in_thp(page, pos);
 	unsigned int t, to = from + copied;
-	loff_t i_size, maybe_i_size;
+	loff_t i_size, write_end_pos;
 
 	_enter("{%llx:%llu},{%lx}",
 	       vnode->fid.vid, vnode->fid.vnode, page->index);
@@ -132,15 +144,16 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 	if (copied == 0)
 		goto out;
 
-	maybe_i_size = pos + copied;
+	write_end_pos = pos + copied;
 
 	i_size = i_size_read(&vnode->vfs_inode);
-	if (maybe_i_size > i_size) {
+	if (write_end_pos > i_size) {
 		write_seqlock(&vnode->cb_lock);
 		i_size = i_size_read(&vnode->vfs_inode);
-		if (maybe_i_size > i_size)
-			afs_set_i_size(vnode, maybe_i_size);
+		if (write_end_pos > i_size)
+			afs_set_i_size(vnode, write_end_pos);
 		write_sequnlock(&vnode->cb_lock);
+		fscache_update_cookie(afs_vnode_cache(vnode), NULL, &write_end_pos);
 	}
 
 	if (PagePrivate(page)) {
@@ -482,7 +495,8 @@ static void afs_extend_writeback(struct address_space *mapping,
 				put_page(page);
 				break;
 			}
-			if (!PageDirty(page) || PageWriteback(page)) {
+			if (!PageDirty(page) || PageWriteback(page) ||
+			    PageFsCache(page)) {
 				unlock_page(page);
 				put_page(page);
 				break;
@@ -530,6 +544,7 @@ static void afs_extend_writeback(struct address_space *mapping,
 				BUG();
 			if (test_set_page_writeback(page))
 				BUG();
+			afs_set_page_fscache(page);
 
 			*_count -= thp_nr_pages(page);
 			unlock_page(page);
@@ -564,6 +579,7 @@ static ssize_t afs_write_back_from_locked_page(struct address_space *mapping,
 
 	if (test_set_page_writeback(page))
 		BUG();
+	afs_set_page_fscache(page);
 
 	count -= thp_nr_pages(page);
 
@@ -603,12 +619,19 @@ static ssize_t afs_write_back_from_locked_page(struct address_space *mapping,
 	if (start < i_size) {
 		_debug("write back %x @%llx [%llx]", len, start, i_size);
 
+		/* Speculatively write to the cache.  We have to fix this up
+		 * later if the store fails.
+		 */
+		afs_write_to_cache(vnode, start, len, i_size);
+
 		iov_iter_xarray(&iter, WRITE, &mapping->i_pages, start, len);
 		ret = afs_store_data(vnode, &iter, start, false);
 	} else {
 		_debug("write discard %x @%llx [%llx]", len, start, i_size);
 
 		/* The dirty region was entirely beyond the EOF. */
+		fscache_clear_page_bits(afs_vnode_cache(vnode),
+					mapping, start, len);
 		afs_pages_written_back(vnode, start, len);
 		ret = 0;
 	}
@@ -666,6 +689,10 @@ int afs_writepage(struct page *page, struct writeback_control *wbc)
 
 	_enter("{%lx},", page->index);
 
+#ifdef CONFIG_AFS_FSCACHE
+	wait_on_page_fscache(page);
+#endif
+
 	start = page->index * PAGE_SIZE;
 	ret = afs_write_back_from_locked_page(page->mapping, wbc, page,
 					      start, LLONG_MAX - start);
@@ -728,10 +755,14 @@ static int afs_writepages_region(struct address_space *mapping,
 			continue;
 		}
 
-		if (PageWriteback(page)) {
+		if (PageWriteback(page) || PageFsCache(page)) {
 			unlock_page(page);
-			if (wbc->sync_mode != WB_SYNC_NONE)
+			if (wbc->sync_mode != WB_SYNC_NONE) {
 				wait_on_page_writeback(page);
+#ifdef CONFIG_AFS_FSCACHE
+				wait_on_page_fscache(page);
+#endif
+			}
 			put_page(page);
 			continue;
 		}
@@ -984,3 +1015,27 @@ int afs_launder_page(struct page *page)
 	wait_on_page_fscache(page);
 	return ret;
 }
+
+/*
+ * Deal with the completion of writing the data to the cache.
+ */
+static void afs_write_to_cache_done(void *priv, ssize_t transferred_or_error,
+				    bool was_async)
+{
+	struct afs_vnode *vnode = priv;
+
+	if (IS_ERR_VALUE(transferred_or_error) &&
+	    transferred_or_error != -ENOBUFS)
+		afs_invalidate_cache(vnode, 0);
+}
+
+/*
+ * Save the write to the cache also.
+ */
+static void afs_write_to_cache(struct afs_vnode *vnode,
+			       loff_t start, size_t len, loff_t i_size)
+{
+	fscache_write_to_cache(afs_vnode_cache(vnode),
+			       vnode->vfs_inode.i_mapping, start, len, i_size,
+			       afs_write_to_cache_done, vnode);
+}



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 53/67] afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (51 preceding siblings ...)
  2021-10-18 15:03 ` [PATCH 52/67] afs: Copy local writes to the cache when writing to the server David Howells
@ 2021-10-18 15:04 ` David Howells
  2021-10-18 15:04 ` [PATCH 54/67] afs: Add O_DIRECT read support David Howells
                   ` (17 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:04 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Marc Dionne, linux-afs, dhowells, Trond Myklebust,
	Anna Schumaker, Steve French, Dominique Martinet, Jeff Layton,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-nfs, linux-cifs, ceph-devel, v9fs-developer,
	linux-fsdevel, linux-kernel

Invoke fscache_resize_cookie() to adjust the size of the backing cache
object when setattr is called with ATTR_SIZE.  This discards any data that
then lies beyond the revised EOF and frees up space.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
---

 fs/afs/inode.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 311df66c64d0..c4af4fda37dd 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -829,6 +829,9 @@ static void afs_setattr_edit_file(struct afs_operation *op)
 
 		if (size < i_size)
 			truncate_pagecache(inode, size);
+		if (size != i_size)
+			fscache_resize_cookie(afs_vnode_cache(vp->vnode),
+					      vp->scb.status.size);
 	}
 }
 
@@ -872,6 +875,8 @@ int afs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 			attr->ia_valid &= ~ATTR_SIZE;
 	}
 
+	fscache_use_cookie(afs_vnode_cache(vnode), true);
+
 	/* flush any dirty data outstanding on a regular file */
 	if (S_ISREG(vnode->vfs_inode.i_mode))
 		filemap_write_and_wait(vnode->vfs_inode.i_mapping);
@@ -903,6 +908,7 @@ int afs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 
 out_unlock:
 	up_write(&vnode->validate_lock);
+	fscache_unuse_cookie(afs_vnode_cache(vnode), NULL, NULL);
 	_leave(" = %d", ret);
 	return ret;
 }



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 54/67] afs: Add O_DIRECT read support
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (52 preceding siblings ...)
  2021-10-18 15:04 ` [PATCH 53/67] afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr David Howells
@ 2021-10-18 15:04 ` David Howells
  2021-10-18 15:05 ` [PATCH 55/67] afs: Skip truncation on the server of data we haven't written yet David Howells
                   ` (16 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:04 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Marc Dionne, linux-afs, dhowells, Trond Myklebust,
	Anna Schumaker, Steve French, Dominique Martinet, Jeff Layton,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-nfs, linux-cifs, ceph-devel, v9fs-developer,
	linux-fsdevel, linux-kernel

Add synchronous O_DIRECT read support to AFS (no AIO yet).  It can
theoretically handle reads up to the maximum size describable by loff_t -
and given an iterator with sufficiently capacity to handle that and given
support on the server.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
---

 fs/afs/file.c      |   59 +++++++++++++++++++++++++++++++++++++++++++
 fs/afs/fsclient.c  |   18 +++++++++----
 fs/afs/internal.h  |    2 +
 fs/afs/write.c     |   72 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/afs/yfsclient.c |   12 ++++++---
 5 files changed, 153 insertions(+), 10 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 4af7ef607cf6..5db1e7d29ad5 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -28,6 +28,7 @@ static ssize_t afs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter);
 static void afs_vm_open(struct vm_area_struct *area);
 static void afs_vm_close(struct vm_area_struct *area);
 static vm_fault_t afs_vm_map_pages(struct vm_fault *vmf, pgoff_t start_pgoff, pgoff_t end_pgoff);
+static ssize_t afs_direct_IO(struct kiocb *iocb, struct iov_iter *iter);
 
 const struct file_operations afs_file_operations = {
 	.open		= afs_open,
@@ -56,6 +57,7 @@ const struct address_space_operations afs_fs_aops = {
 	.launder_page	= afs_launder_page,
 	.releasepage	= afs_releasepage,
 	.invalidatepage	= afs_invalidatepage,
+	.direct_IO	= afs_direct_IO,
 	.write_begin	= afs_write_begin,
 	.write_end	= afs_write_end,
 	.writepage	= afs_writepage,
@@ -601,3 +603,60 @@ static ssize_t afs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 
 	return generic_file_read_iter(iocb, iter);
 }
+
+/*
+ * Direct file read operation for an AFS file.
+ *
+ * TODO: To support AIO, the pages in the iterator have to be copied and
+ * refs taken on them.  Then -EIOCBQUEUED needs to be returned.
+ * iocb->ki_complete must then be called upon completion of the operation.
+ */
+static ssize_t afs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct file *file = iocb->ki_filp;
+	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
+	struct afs_read *req;
+	ssize_t ret, transferred;
+
+	_enter("%llx,%zx", iocb->ki_pos, iov_iter_count(iter));
+
+	req = afs_alloc_read(GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->vnode	= vnode;
+	req->key	= key_get(afs_file_key(file));
+	req->pos	= iocb->ki_pos;
+	req->len	= iov_iter_count(iter);
+	req->iter	= iter;
+
+	task_io_account_read(req->len);
+
+	// TODO nfs_start_io_direct(inode);
+	ret = afs_fetch_data(vnode, req);
+	if (ret == 0)
+		transferred = req->actual_len;
+	afs_put_read(req);
+
+	// TODO nfs_end_io_direct(inode);
+
+	if (ret == 0)
+		ret = transferred;
+
+	BUG_ON(ret == -EIOCBQUEUED); // TODO
+	//if (iocb->ki_complete)
+	//	iocb->ki_complete(iocb, ret, 0); // only if ret == -EIOCBQUEUED
+
+	_leave(" = %zu", ret);
+	return ret;
+}
+
+/*
+ * Do direct I/O.
+ */
+static ssize_t afs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+	if (iov_iter_rw(iter) == READ)
+		return afs_file_direct_read(iocb, iter);
+	return afs_file_direct_write(iocb, iter);
+}
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 4943413d9c5f..290d4d391a0e 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -439,7 +439,7 @@ static void afs_fs_fetch_data64(struct afs_operation *op)
 	bp[3] = htonl(vp->fid.unique);
 	bp[4] = htonl(upper_32_bits(req->pos));
 	bp[5] = htonl(lower_32_bits(req->pos));
-	bp[6] = 0;
+	bp[6] = htonl(upper_32_bits(req->len));
 	bp[7] = htonl(lower_32_bits(req->len));
 
 	trace_afs_make_fs_call(call, &vp->fid);
@@ -1057,6 +1057,7 @@ static void afs_fs_store_data64(struct afs_operation *op)
 	struct afs_vnode_param *vp = &op->file[0];
 	struct afs_call *call;
 	__be32 *bp;
+	u32 mask = 0;
 
 	_enter(",%x,{%llx:%llu},,",
 	       key_serial(op->key), vp->fid.vid, vp->fid.vnode);
@@ -1069,6 +1070,9 @@ static void afs_fs_store_data64(struct afs_operation *op)
 
 	call->write_iter = op->store.write_iter;
 
+	if (op->flags & AFS_OPERATION_SET_MTIME)
+		mask |= AFS_SET_MTIME;
+
 	/* marshall the parameters */
 	bp = call->request;
 	*bp++ = htonl(FSSTOREDATA64);
@@ -1076,8 +1080,8 @@ static void afs_fs_store_data64(struct afs_operation *op)
 	*bp++ = htonl(vp->fid.vnode);
 	*bp++ = htonl(vp->fid.unique);
 
-	*bp++ = htonl(AFS_SET_MTIME); /* mask */
-	*bp++ = htonl(op->mtime.tv_sec); /* mtime */
+	*bp++ = htonl(mask);
+	*bp++ = htonl(op->mtime.tv_sec);
 	*bp++ = 0; /* owner */
 	*bp++ = 0; /* group */
 	*bp++ = 0; /* unix mode */
@@ -1102,6 +1106,7 @@ void afs_fs_store_data(struct afs_operation *op)
 	struct afs_vnode_param *vp = &op->file[0];
 	struct afs_call *call;
 	__be32 *bp;
+	u32 mask = 0;
 
 	_enter(",%x,{%llx:%llu},,",
 	       key_serial(op->key), vp->fid.vid, vp->fid.vnode);
@@ -1122,6 +1127,9 @@ void afs_fs_store_data(struct afs_operation *op)
 
 	call->write_iter = op->store.write_iter;
 
+	if (op->flags & AFS_OPERATION_SET_MTIME)
+		mask |= AFS_SET_MTIME;
+
 	/* marshall the parameters */
 	bp = call->request;
 	*bp++ = htonl(FSSTOREDATA);
@@ -1129,8 +1137,8 @@ void afs_fs_store_data(struct afs_operation *op)
 	*bp++ = htonl(vp->fid.vnode);
 	*bp++ = htonl(vp->fid.unique);
 
-	*bp++ = htonl(AFS_SET_MTIME); /* mask */
-	*bp++ = htonl(op->mtime.tv_sec); /* mtime */
+	*bp++ = htonl(mask);
+	*bp++ = htonl(op->mtime.tv_sec);
 	*bp++ = 0; /* owner */
 	*bp++ = 0; /* group */
 	*bp++ = 0; /* unix mode */
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 07d34291bf4f..d3fb9b3845bc 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -863,6 +863,7 @@ struct afs_operation {
 #define AFS_OPERATION_TRIED_ALL		0x0400	/* Set if we've tried all the fileservers */
 #define AFS_OPERATION_RETRY_SERVER	0x0800	/* Set if we should retry the current server */
 #define AFS_OPERATION_DIR_CONFLICT	0x1000	/* Set if we detected a 3rd-party dir change */
+#define AFS_OPERATION_SET_MTIME		0x2000	/* Set if we should try to store the mtime */
 };
 
 /*
@@ -1538,6 +1539,7 @@ extern int afs_fsync(struct file *, loff_t, loff_t, int);
 extern vm_fault_t afs_page_mkwrite(struct vm_fault *vmf);
 extern void afs_prune_wb_keys(struct afs_vnode *);
 extern int afs_launder_page(struct page *);
+extern ssize_t afs_file_direct_write(struct kiocb *, struct iov_iter *);
 
 /*
  * xattr.c
diff --git a/fs/afs/write.c b/fs/afs/write.c
index a540da6ad423..281f0e93e2c6 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -406,7 +406,7 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
 	op->store.i_size = max(pos + size, i_size);
 	op->store.laundering = laundering;
 	op->mtime = vnode->vfs_inode.i_mtime;
-	op->flags |= AFS_OPERATION_UNINTR;
+	op->flags |= AFS_OPERATION_SET_MTIME | AFS_OPERATION_UNINTR;
 	op->ops = &afs_store_data_operation;
 
 try_next_key:
@@ -1039,3 +1039,73 @@ static void afs_write_to_cache(struct afs_vnode *vnode,
 			       vnode->vfs_inode.i_mapping, start, len, i_size,
 			       afs_write_to_cache_done, vnode);
 }
+
+static void afs_dio_store_data_success(struct afs_operation *op)
+{
+	struct afs_vnode *vnode = op->file[0].vnode;
+
+	op->ctime = op->file[0].scb.status.mtime_client;
+	afs_vnode_commit_status(op, &op->file[0]);
+	if (op->error == 0) {
+		afs_stat_v(vnode, n_stores);
+		atomic_long_add(op->store.size, &afs_v2net(vnode)->n_store_bytes);
+	}
+}
+
+static const struct afs_operation_ops afs_dio_store_data_operation = {
+	.issue_afs_rpc	= afs_fs_store_data,
+	.issue_yfs_rpc	= yfs_fs_store_data,
+	.success	= afs_dio_store_data_success,
+};
+
+/*
+ * Direct file write operation for an AFS file.
+ *
+ * TODO: To support AIO, the pages in the iterator have to be copied and
+ * refs taken on them.  Then -EIOCBQUEUED needs to be returned.
+ * iocb->ki_complete must then be called upon completion of the operation.
+ */
+ssize_t afs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct file *file = iocb->ki_filp;
+	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
+	struct afs_operation *op;
+	loff_t size = iov_iter_count(iter), i_size;
+	ssize_t ret;
+
+	_enter("%s{%llx:%llu.%u},%llx,%llx",
+	       vnode->volume->name,
+	       vnode->fid.vid,
+	       vnode->fid.vnode,
+	       vnode->fid.unique,
+	       size, iocb->ki_pos);
+
+	op = afs_alloc_operation(afs_file_key(file), vnode->volume);
+	if (IS_ERR(op))
+		return -ENOMEM;
+
+	i_size = i_size_read(&vnode->vfs_inode);
+
+	afs_op_set_vnode(op, 0, vnode);
+	op->file[0].dv_delta	= 1;
+	op->file[0].set_size	= true;
+	op->store.write_iter	= iter;
+	op->store.pos		= iocb->ki_pos;
+	op->store.size		= size;
+	op->store.i_size	= max(iocb->ki_pos + size, i_size);
+	op->ops			= &afs_dio_store_data_operation;
+
+	//if (!is_sync_kiocb(iocb)) {
+
+	ret = afs_do_sync_operation(op);
+	if (ret == 0)
+		ret = size;
+
+	afs_invalidate_cache(vnode, FSCACHE_INVAL_DIO_WRITE);
+
+	//if (iocb->ki_complete)
+	//	iocb->ki_complete(iocb, ret, 0); // only if ret == -EIOCBQUEUED
+
+	_leave(" = %zd", ret);
+	return ret;
+}
diff --git a/fs/afs/yfsclient.c b/fs/afs/yfsclient.c
index 2b35cba8ad62..a8c4e230002d 100644
--- a/fs/afs/yfsclient.c
+++ b/fs/afs/yfsclient.c
@@ -95,12 +95,16 @@ static __be32 *xdr_encode_YFSStoreStatus_mode(__be32 *bp, mode_t mode)
 	return bp + xdr_size(x);
 }
 
-static __be32 *xdr_encode_YFSStoreStatus_mtime(__be32 *bp, const struct timespec64 *t)
+static __be32 *xdr_encode_YFSStoreStatus_mtime(__be32 *bp, struct afs_operation *op)
 {
 	struct yfs_xdr_YFSStoreStatus *x = (void *)bp;
-	s64 mtime = linux_to_yfs_time(t);
+	s64 mtime = linux_to_yfs_time(&op->mtime);
+	u32 mask = 0;
 
-	x->mask		= htonl(AFS_SET_MTIME);
+	if (op->flags & AFS_OPERATION_SET_MTIME)
+		mask |= AFS_SET_MTIME;
+
+	x->mask		= htonl(mask);
 	x->mode		= htonl(0);
 	x->mtime_client	= u64_to_xdr(mtime);
 	x->owner	= u64_to_xdr(0);
@@ -1103,7 +1107,7 @@ void yfs_fs_store_data(struct afs_operation *op)
 	bp = xdr_encode_u32(bp, YFSSTOREDATA64);
 	bp = xdr_encode_u32(bp, 0); /* RPC flags */
 	bp = xdr_encode_YFSFid(bp, &vp->fid);
-	bp = xdr_encode_YFSStoreStatus_mtime(bp, &op->mtime);
+	bp = xdr_encode_YFSStoreStatus_mtime(bp, op);
 	bp = xdr_encode_u64(bp, op->store.pos);
 	bp = xdr_encode_u64(bp, op->store.size);
 	bp = xdr_encode_u64(bp, op->store.i_size);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 55/67] afs: Skip truncation on the server of data we haven't written yet
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (53 preceding siblings ...)
  2021-10-18 15:04 ` [PATCH 54/67] afs: Add O_DIRECT read support David Howells
@ 2021-10-18 15:05 ` David Howells
  2021-10-18 15:05 ` [PATCH 56/67] afs: Make afs_write_begin() return the THP subpage David Howells
                   ` (15 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:05 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Don't send a truncation RPC to the server if we're only shortening data
that's in the pagecache and is beyond the server's EOF.

Also don't automatically force writeback on setattr, but do wait to store
RPCs that are in the region to be removed on a shortening truncation.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/inode.c |   45 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index c4af4fda37dd..4c66a2b86add 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -848,42 +848,67 @@ static const struct afs_operation_ops afs_setattr_operation = {
 int afs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 		struct iattr *attr)
 {
+	const unsigned int supported =
+		ATTR_SIZE | ATTR_MODE | ATTR_UID | ATTR_GID |
+		ATTR_MTIME | ATTR_MTIME_SET | ATTR_TIMES_SET | ATTR_TOUCH;
 	struct afs_operation *op;
 	struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
+	struct inode *inode = &vnode->vfs_inode;
+	loff_t i_size;
 	int ret;
 
 	_enter("{%llx:%llu},{n=%pd},%x",
 	       vnode->fid.vid, vnode->fid.vnode, dentry,
 	       attr->ia_valid);
 
-	if (!(attr->ia_valid & (ATTR_SIZE | ATTR_MODE | ATTR_UID | ATTR_GID |
-				ATTR_MTIME | ATTR_MTIME_SET | ATTR_TIMES_SET |
-				ATTR_TOUCH))) {
+	if (!(attr->ia_valid & supported)) {
 		_leave(" = 0 [unsupported]");
 		return 0;
 	}
 
+	i_size = i_size_read(inode);
 	if (attr->ia_valid & ATTR_SIZE) {
-		if (!S_ISREG(vnode->vfs_inode.i_mode))
+		if (!S_ISREG(inode->i_mode))
 			return -EISDIR;
 
-		ret = inode_newsize_ok(&vnode->vfs_inode, attr->ia_size);
+		ret = inode_newsize_ok(inode, attr->ia_size);
 		if (ret)
 			return ret;
 
-		if (attr->ia_size == i_size_read(&vnode->vfs_inode))
+		if (attr->ia_size == i_size)
 			attr->ia_valid &= ~ATTR_SIZE;
 	}
 
 	fscache_use_cookie(afs_vnode_cache(vnode), true);
 
-	/* flush any dirty data outstanding on a regular file */
-	if (S_ISREG(vnode->vfs_inode.i_mode))
-		filemap_write_and_wait(vnode->vfs_inode.i_mapping);
-
 	/* Prevent any new writebacks from starting whilst we do this. */
 	down_write(&vnode->validate_lock);
 
+	if ((attr->ia_valid & ATTR_SIZE) && S_ISREG(inode->i_mode)) {
+		loff_t size = attr->ia_size;
+
+		/* Wait for any outstanding writes to the server to complete */
+		loff_t from = min(size, i_size);
+		loff_t to = max(size, i_size);
+		ret = filemap_fdatawait_range(inode->i_mapping, from, to);
+		if (ret < 0)
+			goto out_unlock;
+
+		/* Don't talk to the server if we're just shortening in-memory
+		 * writes that haven't gone to the server yet.
+		 */
+		if (!(attr->ia_valid & (supported & ~ATTR_SIZE & ~ATTR_MTIME)) &&
+		    attr->ia_size < i_size &&
+		    attr->ia_size > vnode->status.size) {
+			truncate_pagecache(inode, attr->ia_size);
+			fscache_resize_cookie(afs_vnode_cache(vnode),
+					      attr->ia_size);
+			i_size_write(inode, attr->ia_size);
+			ret = 0;
+			goto out_unlock;
+		}
+	}
+
 	op = afs_alloc_operation(((attr->ia_valid & ATTR_FILE) ?
 				  afs_file_key(attr->ia_file) : NULL),
 				 vnode->volume);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 56/67] afs: Make afs_write_begin() return the THP subpage
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (54 preceding siblings ...)
  2021-10-18 15:05 ` [PATCH 55/67] afs: Skip truncation on the server of data we haven't written yet David Howells
@ 2021-10-18 15:05 ` David Howells
  2021-10-18 15:05 ` [PATCH 57/67] cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ David Howells
                   ` (14 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:05 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

generic_perform_write() can't handle a THP, so we have to return the
subpage of that THP from afs_write_begin() and then convert it back into
the head on entry to afs_write_end().

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/write.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index 281f0e93e2c6..5cd417e95029 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -90,7 +90,7 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 			goto flush_conflicting_write;
 	}
 
-	*_page = page;
+	*_page = find_subpage(page, pos / PAGE_SIZE);
 	_leave(" = 0");
 	return 0;
 
@@ -120,9 +120,10 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
  */
 int afs_write_end(struct file *file, struct address_space *mapping,
 		  loff_t pos, unsigned len, unsigned copied,
-		  struct page *page, void *fsdata)
+		  struct page *subpage, void *fsdata)
 {
 	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
+	struct page *page = thp_head(subpage);
 	unsigned long priv;
 	unsigned int f, from = offset_in_thp(page, pos);
 	unsigned int t, to = from + copied;



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 57/67] cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (55 preceding siblings ...)
  2021-10-18 15:05 ` [PATCH 56/67] afs: Make afs_write_begin() return the THP subpage David Howells
@ 2021-10-18 15:05 ` David Howells
  2021-10-18 15:06 ` [PATCH 58/67] NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie David Howells
                   ` (13 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:05 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Drive the FSCACHE_COOKIE_NO_DATA_TO_READ bit to skip reads on cache files
that can't have any data available to read.  This needs clearing once we've
written some data and then released the netfs page that contained it.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/file.c           |    1 +
 fs/cachefiles/io.c      |    1 +
 fs/fscache/cookie.c     |    2 ++
 include/linux/fscache.h |   16 ++++++++++++++++
 4 files changed, 20 insertions(+)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 5db1e7d29ad5..7fe57f210259 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -506,6 +506,7 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags)
 			return false;
 		wait_on_page_fscache(page);
 	}
+	fscache_note_page_release(afs_vnode_cache(vnode));
 #endif
 
 	if (PagePrivate(page)) {
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 9b3b55a94e66..5e3579800689 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -195,6 +195,7 @@ static void cachefiles_write_complete(struct kiocb *iocb, long ret, long ret2)
 	__sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE);
 	__sb_end_write(inode->i_sb, SB_FREEZE_WRITE);
 
+	set_bit(FSCACHE_COOKIE_HAVE_DATA, &ki->object->cookie->flags);
 	if (ki->term_func)
 		ki->term_func(ki->term_func_priv, ret, ki->was_async);
 	cachefiles_put_kiocb(ki);
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 1420027cfe97..369f9258bb50 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -263,6 +263,8 @@ static struct fscache_cookie *fscache_alloc_cookie(
 	cookie->key_len		= index_key_len;
 	cookie->aux_len		= aux_data_len;
 	cookie->object_size	= object_size;
+	if (object_size == 0)
+		__set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
 
 	if (fscache_set_key(cookie, index_key, index_key_len) < 0)
 		goto nomem;
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index ba192567d099..d18b7d3564ab 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -131,6 +131,7 @@ struct fscache_cookie {
 #define FSCACHE_COOKIE_DO_WITHDRAW	10		/* T if this cookie needs withdrawing */
 #define FSCACHE_COOKIE_DO_COMMIT	11		/* T if this cookie needs committing */
 #define FSCACHE_COOKIE_DO_PREP_TO_WRITE	12		/* T if cookie needs write preparation */
+#define FSCACHE_COOKIE_HAVE_DATA	13		/* T if this cookie has data stored */
 
 	enum fscache_cookie_stage	stage;
 	u8				advice;		/* FSCACHE_ADV_* */
@@ -643,7 +644,22 @@ static inline void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
 		loff_t i_size = i_size_read(inode);
 		fscache_unuse_cookie(cookie, aux, &i_size);
 	}
+}
 
+/**
+ * fscache_note_page_release - Note that a netfs page got released
+ * @cookie: The cookie corresponding to the file
+ *
+ * Note that a page that has been copied to the cache has been released.  This
+ * means that future reads will need to look in the cache to see if it's there.
+ */
+static inline
+void fscache_note_page_release(struct fscache_cookie *cookie)
+{
+	if (cookie &&
+	    test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
+	    test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
+		clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
 }
 
 #ifdef FSCACHE_USE_FALLBACK_IO_API



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 58/67] NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (56 preceding siblings ...)
  2021-10-18 15:05 ` [PATCH 57/67] cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ David Howells
@ 2021-10-18 15:06 ` David Howells
  2021-10-18 15:06 ` [PATCH 59/67] NFS: Convert fscache_enable_cookie and fscache_disable_cookie David Howells
                   ` (12 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:06 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Dave Wysochanski, Trond Myklebust, Anna Schumaker, linux-nfs,
	dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

From: Dave Wysochanski <dwysocha@redhat.com>

The new FS-Cache netfs API changes the cookie API slightly.

The changes to fscache_acquire_cookie are:
* remove struct fscache_cookie_def
* add 'type' of cookie (was member of fscache_cookie_def)
* add 'name' of cookie (was member of fscache_cookie_def)
* add 'advice' flags (tells cache how to handle object); set to 0
* add 'preferred_cache' tag (if NULL, derive from parent)
* remove 'netfs_data'
* remove 'enable' (See fscache_use_cookie())

The changes to fscache_relinquish_cookie are:
* remove 'aux_data'

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
---

 fs/nfs/fscache-index.c |   68 ------------------------------------------------
 fs/nfs/fscache.c       |   44 ++++++++++++++++++++-----------
 fs/nfs/fscache.h       |    3 --
 3 files changed, 29 insertions(+), 86 deletions(-)

diff --git a/fs/nfs/fscache-index.c b/fs/nfs/fscache-index.c
index 4bd5ce736193..b4fdacd955f3 100644
--- a/fs/nfs/fscache-index.c
+++ b/fs/nfs/fscache-index.c
@@ -44,71 +44,3 @@ void nfs_fscache_unregister(void)
 {
 	fscache_unregister_netfs(&nfs_fscache_netfs);
 }
-
-/*
- * Define the server object for FS-Cache.  This is used to describe a server
- * object to fscache_acquire_cookie().  It is keyed by the NFS protocol and
- * server address parameters.
- */
-const struct fscache_cookie_def nfs_fscache_server_index_def = {
-	.name		= "NFS.server",
-	.type 		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
-/*
- * Define the superblock object for FS-Cache.  This is used to describe a
- * superblock object to fscache_acquire_cookie().  It is keyed by all the NFS
- * parameters that might cause a separate superblock.
- */
-const struct fscache_cookie_def nfs_fscache_super_index_def = {
-	.name		= "NFS.super",
-	.type 		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
-/*
- * Consult the netfs about the state of an object
- * - This function can be absent if the index carries no state data
- * - The netfs data from the cookie being used as the target is
- *   presented, as is the auxiliary data
- */
-static
-enum fscache_checkaux nfs_fscache_inode_check_aux(void *cookie_netfs_data,
-						  const void *data,
-						  uint16_t datalen,
-						  loff_t object_size)
-{
-	struct nfs_fscache_inode_auxdata auxdata;
-	struct nfs_inode *nfsi = cookie_netfs_data;
-
-	if (datalen != sizeof(auxdata))
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	memset(&auxdata, 0, sizeof(auxdata));
-	auxdata.mtime_sec  = nfsi->vfs_inode.i_mtime.tv_sec;
-	auxdata.mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
-	auxdata.ctime_sec  = nfsi->vfs_inode.i_ctime.tv_sec;
-	auxdata.ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
-
-	if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
-		auxdata.change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
-
-	if (memcmp(data, &auxdata, datalen) != 0)
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	return FSCACHE_CHECKAUX_OKAY;
-}
-
-/*
- * Define the inode object for FS-Cache.  This is used to describe an inode
- * object to fscache_acquire_cookie().  It is keyed by the NFS file handle for
- * an inode.
- *
- * Coherency is managed by comparing the copies of i_size, i_mtime and i_ctime
- * held in the cache auxiliary data for the data storage object with those in
- * the inode struct in memory.
- */
-const struct fscache_cookie_def nfs_fscache_inode_object_def = {
-	.name		= "NFS.fh",
-	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
-	.check_aux	= nfs_fscache_inode_check_aux,
-};
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 68e266a37675..514d50d079a2 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -81,10 +81,15 @@ void nfs_fscache_get_client_cookie(struct nfs_client *clp)
 
 	/* create a cache index for looking up filehandles */
 	clp->fscache = fscache_acquire_cookie(nfs_fscache_netfs.primary_index,
-					      &nfs_fscache_server_index_def,
-					      &key, len,
-					      NULL, 0,
-					      clp, 0, true);
+					      FSCACHE_COOKIE_TYPE_INDEX,
+					      "NFS.server",
+					      0,    /* advice */
+					      NULL, /* preferred_cache */
+					      &key, /* index_key */
+					      len,
+					      NULL, /* aux_data */
+					      0,
+					      0);
 	dfprintk(FSCACHE, "NFS: get client cookie (0x%p/0x%p)\n",
 		 clp, clp->fscache);
 }
@@ -97,7 +102,7 @@ void nfs_fscache_release_client_cookie(struct nfs_client *clp)
 	dfprintk(FSCACHE, "NFS: releasing client cookie (0x%p/0x%p)\n",
 		 clp, clp->fscache);
 
-	fscache_relinquish_cookie(clp->fscache, NULL, false);
+	fscache_relinquish_cookie(clp->fscache, false);
 	clp->fscache = NULL;
 }
 
@@ -185,11 +190,15 @@ void nfs_fscache_get_super_cookie(struct super_block *sb, const char *uniq, int
 
 	/* create a cache index for looking up filehandles */
 	nfss->fscache = fscache_acquire_cookie(nfss->nfs_client->fscache,
-					       &nfs_fscache_super_index_def,
-					       &key->key,
+					       FSCACHE_COOKIE_TYPE_INDEX,
+					       "NFS.super",
+					       0,    /* advice */
+					       NULL, /* preferred_cache */
+					       &key->key,  /* index_key */
 					       sizeof(key->key) + ulen,
-					       NULL, 0,
-					       nfss, 0, true);
+					       NULL, /* aux_data */
+					       0,
+					       0);
 	dfprintk(FSCACHE, "NFS: get superblock cookie (0x%p/0x%p)\n",
 		 nfss, nfss->fscache);
 	return;
@@ -213,7 +222,7 @@ void nfs_fscache_release_super_cookie(struct super_block *sb)
 	dfprintk(FSCACHE, "NFS: releasing superblock cookie (0x%p/0x%p)\n",
 		 nfss, nfss->fscache);
 
-	fscache_relinquish_cookie(nfss->fscache, NULL, false);
+	fscache_relinquish_cookie(nfss->fscache, false);
 	nfss->fscache = NULL;
 
 	if (nfss->fscache_key) {
@@ -254,10 +263,15 @@ void nfs_fscache_init_inode(struct inode *inode)
 	nfs_fscache_update_auxdata(&auxdata, nfsi);
 
 	nfsi->fscache = fscache_acquire_cookie(NFS_SB(inode->i_sb)->fscache,
-					       &nfs_fscache_inode_object_def,
-					       nfsi->fh.data, nfsi->fh.size,
-					       &auxdata, sizeof(auxdata),
-					       nfsi, nfsi->vfs_inode.i_size, false);
+					       FSCACHE_COOKIE_TYPE_DATAFILE,
+					       "NFS.fh",
+					       0,             /* advice */
+					       NULL, /* preferred_cache */
+					       nfsi->fh.data, /* index_key */
+					       nfsi->fh.size,
+					       &auxdata,      /* aux_data */
+					       sizeof(auxdata),
+					       i_size_read(&nfsi->vfs_inode));
 }
 
 /*
@@ -272,7 +286,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
 	dfprintk(FSCACHE, "NFS: clear cookie (0x%p/0x%p)\n", nfsi, cookie);
 
 	nfs_fscache_update_auxdata(&auxdata, nfsi);
-	fscache_relinquish_cookie(cookie, &auxdata, false);
+	fscache_relinquish_cookie(cookie, false);
 	nfsi->fscache = NULL;
 }
 
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 679055720dae..1e30fcb45665 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -74,9 +74,6 @@ struct nfs_fscache_inode_auxdata {
  * fscache-index.c
  */
 extern struct fscache_netfs nfs_fscache_netfs;
-extern const struct fscache_cookie_def nfs_fscache_server_index_def;
-extern const struct fscache_cookie_def nfs_fscache_super_index_def;
-extern const struct fscache_cookie_def nfs_fscache_inode_object_def;
 
 extern int nfs_fscache_register(void);
 extern void nfs_fscache_unregister(void);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 59/67] NFS: Convert fscache_enable_cookie and fscache_disable_cookie
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (57 preceding siblings ...)
  2021-10-18 15:06 ` [PATCH 58/67] NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie David Howells
@ 2021-10-18 15:06 ` David Howells
  2021-10-18 15:07 ` [PATCH 60/67] NFS: Convert fscache invalidation and update aux_data and i_size David Howells
                   ` (11 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:06 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Dave Wysochanski, Trond Myklebust, Anna Schumaker, linux-nfs,
	dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

From: Dave Wysochanski <dwysocha@redhat.com>

The new FS-Cache API removes the fscache_enable_cookie() and
fscache_disable_cookie() in favor of the new APIs
fscache_use_cookie() and fscache_unuse_cookie(), so update these
areas as needed.

For NFS, we enable fscache on an inode only if the inode is open
readonly and not if the inode is open for write.  Use the new
APIs and change the existing logic slightly and make a decision
whether to "use" an inode based fscache cookie one time, by gating
the call to fscache_use_cookie() and fscache_unuse_cookie()
by the NFS_INO_FSCACHE flag on the nfs_inode.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
---

 fs/nfs/fscache.c |   24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 514d50d079a2..5e584f2e84a9 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -285,7 +285,10 @@ void nfs_fscache_clear_inode(struct inode *inode)
 
 	dfprintk(FSCACHE, "NFS: clear cookie (0x%p/0x%p)\n", nfsi, cookie);
 
-	nfs_fscache_update_auxdata(&auxdata, nfsi);
+	if (test_and_clear_bit(NFS_INO_FSCACHE, &NFS_I(inode)->flags)) {
+		nfs_fscache_update_auxdata(&auxdata, nfsi);
+		fscache_unuse_cookie(cookie, &auxdata, NULL);
+	}
 	fscache_relinquish_cookie(cookie, false);
 	nfsi->fscache = NULL;
 }
@@ -325,18 +328,17 @@ void nfs_fscache_open_file(struct inode *inode, struct file *filp)
 	if (!fscache_cookie_valid(cookie))
 		return;
 
-	nfs_fscache_update_auxdata(&auxdata, nfsi);
-
 	if (inode_is_open_for_write(inode)) {
-		dfprintk(FSCACHE, "NFS: nfsi 0x%p disabling cache\n", nfsi);
-		clear_bit(NFS_INO_FSCACHE, &nfsi->flags);
-		fscache_disable_cookie(cookie, &auxdata, true);
+		if (test_and_clear_bit(NFS_INO_FSCACHE, &nfsi->flags)) {
+			dfprintk(FSCACHE, "NFS: nfsi 0x%p disabling cache\n", nfsi);
+			nfs_fscache_update_auxdata(&auxdata, nfsi);
+			fscache_unuse_cookie(cookie, &auxdata, NULL);
+		}
 	} else {
-		dfprintk(FSCACHE, "NFS: nfsi 0x%p enabling cache\n", nfsi);
-		fscache_enable_cookie(cookie, &auxdata, nfsi->vfs_inode.i_size,
-				      nfs_fscache_can_enable, inode);
-		if (fscache_cookie_enabled(cookie))
-			set_bit(NFS_INO_FSCACHE, &NFS_I(inode)->flags);
+		if (!test_and_set_bit(NFS_INO_FSCACHE, &nfsi->flags)) {
+			dfprintk(FSCACHE, "NFS: nfsi 0x%p enabling cache\n", nfsi);
+			fscache_use_cookie(cookie, false);
+		}
 	}
 }
 EXPORT_SYMBOL_GPL(nfs_fscache_open_file);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 60/67] NFS: Convert fscache invalidation and update aux_data and i_size
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (58 preceding siblings ...)
  2021-10-18 15:06 ` [PATCH 59/67] NFS: Convert fscache_enable_cookie and fscache_disable_cookie David Howells
@ 2021-10-18 15:07 ` David Howells
  2021-10-18 15:08 ` [PATCH 61/67] nfs: Convert to new fscache volume/cookie API David Howells
                   ` (10 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:07 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Dave Wysochanski, Trond Myklebust, Anna Schumaker, linux-nfs,
	dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

From: Dave Wysochanski <dwysocha@redhat.com>

Convert nfs_fscache_invalidate to the new FS-Cache API.  Also,
now when invalidating an fscache cookie, be sure to pass the latest
i_size as well as aux_data to fscache.

A few APIs no longer exist so remove them.  We can call directly to
wait_on_page_fscache() because it checks whether a page is an fscache
page before waiting on it.

Since the current NFS fscache implementation is only enabled when a
file is open for read, handle writes with invalidation.  If a write
extends the size of the file and fscache is enabled, we must invalidate
the object with the new size.

We must also invalidate fscache when doing a direct write to avoid
issues seen specifically with NFSv4.x when direct writes are followed
by non-direct (buffered) reads.  Note that we cannot call
fscache_invalidate() while holding inode->i_lock.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
---

 fs/nfs/direct.c  |    2 ++
 fs/nfs/file.c    |    7 ++++++-
 fs/nfs/fscache.c |   20 --------------------
 fs/nfs/fscache.h |   32 ++++++++++++++++++++++----------
 fs/nfs/inode.c   |    4 ++--
 fs/nfs/write.c   |    1 +
 6 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 2e894fec036b..8b4839ef4b0c 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -59,6 +59,7 @@
 #include "internal.h"
 #include "iostat.h"
 #include "pnfs.h"
+#include "fscache.h"
 
 #define NFSDBG_FACILITY		NFSDBG_VFS
 
@@ -959,6 +960,7 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
 	} else {
 		result = requested;
 	}
+	nfs_fscache_invalidate(inode, FSCACHE_INVAL_DIO_WRITE);
 out_release:
 	nfs_direct_req_release(dreq);
 out:
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 209dac208477..0a7f1e9f1203 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -436,6 +436,7 @@ static int nfs_release_page(struct page *page, gfp_t gfp)
 		if (!(gfp & __GFP_DIRECT_RECLAIM) || !(gfp & __GFP_FS))
 			return false;
 		wait_on_page_fscache(page);
+		fscache_note_page_release(nfs_i_fscache(page->mapping->host));
 	}
 	return true;
 }
@@ -559,7 +560,11 @@ static vm_fault_t nfs_vm_page_mkwrite(struct vm_fault *vmf)
 	sb_start_pagefault(inode->i_sb);
 
 	/* make sure the cache has finished storing the page */
-	wait_on_page_fscache(page);
+	if (PageFsCache(page) &&
+	    wait_on_page_fscache_killable(vmf->page) < 0) {
+		ret = VM_FAULT_RETRY;
+		goto out;
+	}
 
 	wait_on_bit_action(&NFS_I(inode)->flags, NFS_INO_INVALIDATING,
 			nfs_wait_bit_killable, TASK_KILLABLE);
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 5e584f2e84a9..8250a029759e 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -234,19 +234,6 @@ void nfs_fscache_release_super_cookie(struct super_block *sb)
 	}
 }
 
-static void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
-				  struct nfs_inode *nfsi)
-{
-	memset(auxdata, 0, sizeof(*auxdata));
-	auxdata->mtime_sec  = nfsi->vfs_inode.i_mtime.tv_sec;
-	auxdata->mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
-	auxdata->ctime_sec  = nfsi->vfs_inode.i_ctime.tv_sec;
-	auxdata->ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
-
-	if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
-		auxdata->change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
-}
-
 /*
  * Initialise the per-inode cache cookie pointer for an NFS inode.
  */
@@ -293,13 +280,6 @@ void nfs_fscache_clear_inode(struct inode *inode)
 	nfsi->fscache = NULL;
 }
 
-static bool nfs_fscache_can_enable(void *data)
-{
-	struct inode *inode = data;
-
-	return !inode_is_open_for_write(inode);
-}
-
 /*
  * Enable or disable caching for a file that is being opened as appropriate.
  * The cookie is allocated when the inode is initialised, but is not enabled at
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 1e30fcb45665..cc94b257059b 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -13,6 +13,7 @@
 #include <linux/nfs4_mount.h>
 #define FSCACHE_USE_FALLBACK_IO_API
 #include <linux/fscache.h>
+#include <linux/iversion.h>
 
 #ifdef CONFIG_NFS_FSCACHE
 
@@ -118,20 +119,32 @@ static inline void nfs_readpage_to_fscache(struct inode *inode,
 		__nfs_readpage_to_fscache(inode, page);
 }
 
-/*
- * Invalidate the contents of fscache for this inode.  This will not sleep.
- */
-static inline void nfs_fscache_invalidate(struct inode *inode)
+static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
+				  struct nfs_inode *nfsi)
 {
-	fscache_invalidate(NFS_I(inode)->fscache);
+	memset(auxdata, 0, sizeof(*auxdata));
+	auxdata->mtime_sec  = nfsi->vfs_inode.i_mtime.tv_sec;
+	auxdata->mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
+	auxdata->ctime_sec  = nfsi->vfs_inode.i_ctime.tv_sec;
+	auxdata->ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
+
+	if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
+		auxdata->change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
 }
 
 /*
- * Wait for an object to finish being invalidated.
+ * Invalidate the contents of fscache for this inode.  This will not sleep.
  */
-static inline void nfs_fscache_wait_on_invalidate(struct inode *inode)
+static inline void nfs_fscache_invalidate(struct inode *inode, int flags)
 {
-	fscache_wait_on_invalidate(NFS_I(inode)->fscache);
+	struct nfs_fscache_inode_auxdata auxdata;
+	struct nfs_inode *nfsi = NFS_I(inode);
+
+	if (nfsi->fscache) {
+		nfs_fscache_update_auxdata(&auxdata, nfsi);
+		fscache_invalidate(nfsi->fscache, &auxdata,
+				   i_size_read(&nfsi->vfs_inode), flags);
+	}
 }
 
 /*
@@ -167,8 +180,7 @@ static inline void nfs_readpage_to_fscache(struct inode *inode,
 					   struct page *page) {}
 
 
-static inline void nfs_fscache_invalidate(struct inode *inode) {}
-static inline void nfs_fscache_wait_on_invalidate(struct inode *inode) {}
+static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
 
 static inline const char *nfs_server_fscache_state(struct nfs_server *server)
 {
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 853213b3a209..370c49b98646 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -209,7 +209,7 @@ void nfs_set_cache_invalid(struct inode *inode, unsigned long flags)
 	if (!nfs_has_xattr_cache(nfsi))
 		flags &= ~NFS_INO_INVALID_XATTR;
 	if (flags & NFS_INO_INVALID_DATA)
-		nfs_fscache_invalidate(inode);
+		nfs_fscache_invalidate(inode, 0);
 	if (inode->i_mapping->nrpages == 0)
 		flags &= ~(NFS_INO_INVALID_DATA|NFS_INO_DATA_INVAL_DEFER);
 	flags &= ~(NFS_INO_REVAL_PAGECACHE | NFS_INO_REVAL_FORCED);
@@ -1281,6 +1281,7 @@ static int nfs_invalidate_mapping(struct inode *inode, struct address_space *map
 {
 	int ret;
 
+	nfs_fscache_invalidate(inode, 0);
 	if (mapping->nrpages != 0) {
 		if (S_ISREG(inode->i_mode)) {
 			ret = nfs_sync_mapping(mapping);
@@ -1292,7 +1293,6 @@ static int nfs_invalidate_mapping(struct inode *inode, struct address_space *map
 			return ret;
 	}
 	nfs_inc_stats(inode, NFSIOS_DATAINVALIDATE);
-	nfs_fscache_wait_on_invalidate(inode);
 
 	dfprintk(PAGECACHE, "NFS: (%s/%Lu) data cache invalidated\n",
 			inode->i_sb->s_id,
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 466266a96b2a..cbbf400db126 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -293,6 +293,7 @@ static void nfs_grow_file(struct page *page, unsigned int offset, unsigned int c
 	nfs_inc_stats(inode, NFSIOS_EXTENDWRITE);
 out:
 	spin_unlock(&inode->i_lock);
+	nfs_fscache_invalidate(inode, 0);
 }
 
 /* A writeback failed: mark the page as bad, and invalidate the page cache */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 61/67] nfs: Convert to new fscache volume/cookie API
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (59 preceding siblings ...)
  2021-10-18 15:07 ` [PATCH 60/67] NFS: Convert fscache invalidation and update aux_data and i_size David Howells
@ 2021-10-18 15:08 ` David Howells
  2021-10-18 15:08 ` [PATCH 62/67] 9p: Use fscache indexing rewrite and reenable caching David Howells
                   ` (9 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:08 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Dave Wysochanski, Trond Myklebust, Anna Schumaker, linux-nfs,
	dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Change the nfs filesystem to support fscache's indexing rewrite and
reenable caching in nfs.

The following changes have been made:

 (1) The fscache_netfs struct is no more, and there's no need to register
     the filesystem as a whole.

 (2) The session cookie is now an fscache_volume cookie, allocated with
     fscache_acquire_volume().  That takes three parameters: a string
     representing the "volume" in the index, a string naming the cache to
     use (or NULL) and a u64 that conveys coherency metadata for the
     volume.

     For nfs, I've made it render the volume name string as:

        "nfs,<ver>,<family>,<address>,<port>,<fsidH>,<fsidL>*<,param>[,<uniq>]"

 (3) The fscache_cookie_def is no more and needed information is passed
     directly to fscache_acquire_cookie().  The cache no longer calls back
     into the filesystem, but rather metadata changes are indicated at
     other times.

     fscache_acquire_cookie() is passed the same keying and coherency
     information as before.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: linux-nfs@vger.kernel.org
---

 fs/nfs/Kconfig            |    2 
 fs/nfs/Makefile           |    2 
 fs/nfs/client.c           |    4 -
 fs/nfs/fscache-index.c    |   46 ---------
 fs/nfs/fscache.c          |  220 ++++++++++++++-------------------------------
 fs/nfs/fscache.h          |   56 -----------
 fs/nfs/inode.c            |    7 -
 fs/nfs/super.c            |    7 +
 include/linux/nfs_fs_sb.h |    9 --
 9 files changed, 74 insertions(+), 279 deletions(-)
 delete mode 100644 fs/nfs/fscache-index.c

diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index a8b73c90aa00..14a72224b657 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -170,7 +170,7 @@ config ROOT_NFS
 
 config NFS_FSCACHE
 	bool "Provide NFS client caching support"
-	depends on NFS_FS=m && FSCACHE_OLD || NFS_FS=y && FSCACHE_OLD=y
+	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
 	help
 	  Say Y here if you want NFS data to be cached locally on disc through
 	  the general filesystem cache manager
diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile
index 22d11fdc6deb..5f6db37f461e 100644
--- a/fs/nfs/Makefile
+++ b/fs/nfs/Makefile
@@ -12,7 +12,7 @@ nfs-y 			:= client.o dir.o file.o getroot.o inode.o super.o \
 			   export.o sysfs.o fs_context.o
 nfs-$(CONFIG_ROOT_NFS)	+= nfsroot.o
 nfs-$(CONFIG_SYSCTL)	+= sysctl.o
-nfs-$(CONFIG_NFS_FSCACHE) += fscache.o fscache-index.o
+nfs-$(CONFIG_NFS_FSCACHE) += fscache.o
 
 obj-$(CONFIG_NFS_V2) += nfsv2.o
 nfsv2-y := nfs2super.o proc.o nfs2xdr.o
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 23e165d5ec9c..8f35e26d8a29 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -183,8 +183,6 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init)
 	clp->cl_net = get_net(cl_init->net);
 
 	clp->cl_principal = "*";
-	nfs_fscache_get_client_cookie(clp);
-
 	return clp;
 
 error_cleanup:
@@ -238,8 +236,6 @@ static void pnfs_init_server(struct nfs_server *server)
  */
 void nfs_free_client(struct nfs_client *clp)
 {
-	nfs_fscache_release_client_cookie(clp);
-
 	/* -EIO all pending I/O */
 	if (!IS_ERR(clp->cl_rpcclient))
 		rpc_shutdown_client(clp->cl_rpcclient);
diff --git a/fs/nfs/fscache-index.c b/fs/nfs/fscache-index.c
deleted file mode 100644
index b4fdacd955f3..000000000000
--- a/fs/nfs/fscache-index.c
+++ /dev/null
@@ -1,46 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* NFS FS-Cache index structure definition
- *
- * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- */
-
-#include <linux/init.h>
-#include <linux/kernel.h>
-#include <linux/sched.h>
-#include <linux/mm.h>
-#include <linux/nfs_fs.h>
-#include <linux/nfs_fs_sb.h>
-#include <linux/in6.h>
-#include <linux/iversion.h>
-
-#include "internal.h"
-#include "fscache.h"
-
-#define NFSDBG_FACILITY		NFSDBG_FSCACHE
-
-/*
- * Define the NFS filesystem for FS-Cache.  Upon registration FS-Cache sticks
- * the cookie for the top-level index object for NFS into here.  The top-level
- * index can than have other cache objects inserted into it.
- */
-struct fscache_netfs nfs_fscache_netfs = {
-	.name		= "nfs",
-	.version	= 0,
-};
-
-/*
- * Register NFS for caching
- */
-int nfs_fscache_register(void)
-{
-	return fscache_register_netfs(&nfs_fscache_netfs);
-}
-
-/*
- * Unregister NFS for caching
- */
-void nfs_fscache_unregister(void)
-{
-	fscache_unregister_netfs(&nfs_fscache_netfs);
-}
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 8250a029759e..c15bdf3606eb 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -22,24 +22,18 @@
 
 #define NFSDBG_FACILITY		NFSDBG_FSCACHE
 
-static struct rb_root nfs_fscache_keys = RB_ROOT;
-static DEFINE_SPINLOCK(nfs_fscache_keys_lock);
+#define NFS_MAX_KEY_LEN 1000
 
-/*
- * Layout of the key for an NFS server cache object.
- */
-struct nfs_server_key {
-	struct {
-		uint16_t	nfsversion;		/* NFS protocol version */
-		uint32_t	minorversion;		/* NFSv4 minor version */
-		uint16_t	family;			/* address family */
-		__be16		port;			/* IP port */
-	} hdr;
-	union {
-		struct in_addr	ipv4_addr;	/* IPv4 address */
-		struct in6_addr ipv6_addr;	/* IPv6 address */
-	};
-} __packed;
+static bool nfs_append_int(char *key, int *_len, unsigned long long x)
+{
+	if (*_len > NFS_MAX_KEY_LEN)
+		return false;
+	if (x == 0)
+		key[(*_len)++] = ',';
+	else
+		*_len += sprintf(key + *_len, ",%llx", x);
+	return true;
+}
 
 /*
  * Get the per-client index cookie for an NFS client if the appropriate mount
@@ -47,68 +41,43 @@ struct nfs_server_key {
  * - We always try and get an index cookie for the client, but get filehandle
  *   cookies on a per-superblock basis, depending on the mount flags
  */
-void nfs_fscache_get_client_cookie(struct nfs_client *clp)
+static bool nfs_fscache_get_client_key(struct nfs_client *clp,
+				       char *key, int *_len)
 {
 	const struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *) &clp->cl_addr;
 	const struct sockaddr_in *sin = (struct sockaddr_in *) &clp->cl_addr;
-	struct nfs_server_key key;
-	uint16_t len = sizeof(key.hdr);
 
-	memset(&key, 0, sizeof(key));
-	key.hdr.nfsversion = clp->rpc_ops->version;
-	key.hdr.minorversion = clp->cl_minorversion;
-	key.hdr.family = clp->cl_addr.ss_family;
+	*_len += snprintf(key + *_len, NFS_MAX_KEY_LEN - *_len,
+			  ",%u.%u,%x",
+			  clp->rpc_ops->version,
+			  clp->cl_minorversion,
+			  clp->cl_addr.ss_family);
 
 	switch (clp->cl_addr.ss_family) {
 	case AF_INET:
-		key.hdr.port = sin->sin_port;
-		key.ipv4_addr = sin->sin_addr;
-		len += sizeof(key.ipv4_addr);
-		break;
+		if (!nfs_append_int(key, _len, sin->sin_port) ||
+		    !nfs_append_int(key, _len, sin->sin_addr.s_addr))
+			return false;
+		return true;
 
 	case AF_INET6:
-		key.hdr.port = sin6->sin6_port;
-		key.ipv6_addr = sin6->sin6_addr;
-		len += sizeof(key.ipv6_addr);
-		break;
+		if (!nfs_append_int(key, _len, sin6->sin6_port) ||
+		    !nfs_append_int(key, _len, sin6->sin6_addr.s6_addr32[0]) ||
+		    !nfs_append_int(key, _len, sin6->sin6_addr.s6_addr32[1]) ||
+		    !nfs_append_int(key, _len, sin6->sin6_addr.s6_addr32[2]) ||
+		    !nfs_append_int(key, _len, sin6->sin6_addr.s6_addr32[3]))
+			return false;
+		return true;
 
 	default:
 		printk(KERN_WARNING "NFS: Unknown network family '%d'\n",
 		       clp->cl_addr.ss_family);
-		clp->fscache = NULL;
-		return;
+		return false;
 	}
-
-	/* create a cache index for looking up filehandles */
-	clp->fscache = fscache_acquire_cookie(nfs_fscache_netfs.primary_index,
-					      FSCACHE_COOKIE_TYPE_INDEX,
-					      "NFS.server",
-					      0,    /* advice */
-					      NULL, /* preferred_cache */
-					      &key, /* index_key */
-					      len,
-					      NULL, /* aux_data */
-					      0,
-					      0);
-	dfprintk(FSCACHE, "NFS: get client cookie (0x%p/0x%p)\n",
-		 clp, clp->fscache);
 }
 
 /*
- * Dispose of a per-client cookie
- */
-void nfs_fscache_release_client_cookie(struct nfs_client *clp)
-{
-	dfprintk(FSCACHE, "NFS: releasing client cookie (0x%p/0x%p)\n",
-		 clp, clp->fscache);
-
-	fscache_relinquish_cookie(clp->fscache, false);
-	clp->fscache = NULL;
-}
-
-/*
- * Get the cache cookie for an NFS superblock.  We have to handle
- * uniquification here because the cache doesn't do it for us.
+ * Get the cache cookie for an NFS superblock.
  *
  * The default uniquifier is just an empty string, but it may be overridden
  * either by the 'fsc=xxx' option to mount, or by inheriting it from the parent
@@ -116,100 +85,53 @@ void nfs_fscache_release_client_cookie(struct nfs_client *clp)
  */
 void nfs_fscache_get_super_cookie(struct super_block *sb, const char *uniq, int ulen)
 {
-	struct nfs_fscache_key *key, *xkey;
 	struct nfs_server *nfss = NFS_SB(sb);
-	struct rb_node **p, *parent;
-	int diff;
+	unsigned int len = 3;
+	char *key;
 
-	nfss->fscache_key = NULL;
-	nfss->fscache = NULL;
-	if (!uniq) {
-		uniq = "";
-		ulen = 1;
+	if (uniq) {
+		nfss->fscache_uniq = kmemdup_nul(uniq, ulen, GFP_KERNEL);
+		if (!nfss->fscache_uniq)
+			return;
 	}
 
-	key = kzalloc(sizeof(*key) + ulen, GFP_KERNEL);
+	key = kmalloc(NFS_MAX_KEY_LEN + 24, GFP_KERNEL);
 	if (!key)
 		return;
 
-	key->nfs_client = nfss->nfs_client;
-	key->key.super.s_flags = sb->s_flags & NFS_SB_MASK;
-	key->key.nfs_server.flags = nfss->flags;
-	key->key.nfs_server.rsize = nfss->rsize;
-	key->key.nfs_server.wsize = nfss->wsize;
-	key->key.nfs_server.acregmin = nfss->acregmin;
-	key->key.nfs_server.acregmax = nfss->acregmax;
-	key->key.nfs_server.acdirmin = nfss->acdirmin;
-	key->key.nfs_server.acdirmax = nfss->acdirmax;
-	key->key.nfs_server.fsid = nfss->fsid;
-	key->key.rpc_auth.au_flavor = nfss->client->cl_auth->au_flavor;
-
-	key->key.uniq_len = ulen;
-	memcpy(key->key.uniquifier, uniq, ulen);
-
-	spin_lock(&nfs_fscache_keys_lock);
-	p = &nfs_fscache_keys.rb_node;
-	parent = NULL;
-	while (*p) {
-		parent = *p;
-		xkey = rb_entry(parent, struct nfs_fscache_key, node);
-
-		if (key->nfs_client < xkey->nfs_client)
-			goto go_left;
-		if (key->nfs_client > xkey->nfs_client)
-			goto go_right;
-
-		diff = memcmp(&key->key, &xkey->key, sizeof(key->key));
-		if (diff < 0)
-			goto go_left;
-		if (diff > 0)
-			goto go_right;
-
-		if (key->key.uniq_len == 0)
-			goto non_unique;
-		diff = memcmp(key->key.uniquifier,
-			      xkey->key.uniquifier,
-			      key->key.uniq_len);
-		if (diff < 0)
-			goto go_left;
-		if (diff > 0)
-			goto go_right;
-		goto non_unique;
-
-	go_left:
-		p = &(*p)->rb_left;
-		continue;
-	go_right:
-		p = &(*p)->rb_right;
+	memcpy(key, "nfs", 3);
+	if (!nfs_fscache_get_client_key(nfss->nfs_client, key, &len) ||
+	    !nfs_append_int(key, &len, nfss->fsid.major) ||
+	    !nfs_append_int(key, &len, nfss->fsid.minor) ||
+	    !nfs_append_int(key, &len, sb->s_flags & NFS_SB_MASK) ||
+	    !nfs_append_int(key, &len, nfss->flags) ||
+	    !nfs_append_int(key, &len, nfss->rsize) ||
+	    !nfs_append_int(key, &len, nfss->wsize) ||
+	    !nfs_append_int(key, &len, nfss->acregmin) ||
+	    !nfs_append_int(key, &len, nfss->acregmax) ||
+	    !nfs_append_int(key, &len, nfss->acdirmin) ||
+	    !nfs_append_int(key, &len, nfss->acdirmax) ||
+	    !nfs_append_int(key, &len, nfss->client->cl_auth->au_flavor))
+		goto out;
+
+	if (ulen > 0) {
+		if (ulen > NFS_MAX_KEY_LEN - len)
+			goto out;
+		key[len++] = ',';
+		memcpy(key + len, uniq, ulen);
+		len += ulen;
 	}
-
-	rb_link_node(&key->node, parent, p);
-	rb_insert_color(&key->node, &nfs_fscache_keys);
-	spin_unlock(&nfs_fscache_keys_lock);
-	nfss->fscache_key = key;
+	key[len] = 0;
 
 	/* create a cache index for looking up filehandles */
-	nfss->fscache = fscache_acquire_cookie(nfss->nfs_client->fscache,
-					       FSCACHE_COOKIE_TYPE_INDEX,
-					       "NFS.super",
-					       0,    /* advice */
+	nfss->fscache = fscache_acquire_volume(key,
 					       NULL, /* preferred_cache */
-					       &key->key,  /* index_key */
-					       sizeof(key->key) + ulen,
-					       NULL, /* aux_data */
-					       0,
-					       0);
+					       0 /* coherency_data */);
 	dfprintk(FSCACHE, "NFS: get superblock cookie (0x%p/0x%p)\n",
 		 nfss, nfss->fscache);
-	return;
 
-non_unique:
-	spin_unlock(&nfs_fscache_keys_lock);
+out:
 	kfree(key);
-	nfss->fscache_key = NULL;
-	nfss->fscache = NULL;
-	printk(KERN_WARNING "NFS:"
-	       " Cache request denied due to non-unique superblock keys\n");
 }
 
 /*
@@ -222,16 +144,9 @@ void nfs_fscache_release_super_cookie(struct super_block *sb)
 	dfprintk(FSCACHE, "NFS: releasing superblock cookie (0x%p/0x%p)\n",
 		 nfss, nfss->fscache);
 
-	fscache_relinquish_cookie(nfss->fscache, false);
+	fscache_relinquish_volume(nfss->fscache, 0, false);
 	nfss->fscache = NULL;
-
-	if (nfss->fscache_key) {
-		spin_lock(&nfs_fscache_keys_lock);
-		rb_erase(&nfss->fscache_key->node, &nfs_fscache_keys);
-		spin_unlock(&nfs_fscache_keys_lock);
-		kfree(nfss->fscache_key);
-		nfss->fscache_key = NULL;
-	}
+	kfree(nfss->fscache_uniq);
 }
 
 /*
@@ -250,10 +165,7 @@ void nfs_fscache_init_inode(struct inode *inode)
 	nfs_fscache_update_auxdata(&auxdata, nfsi);
 
 	nfsi->fscache = fscache_acquire_cookie(NFS_SB(inode->i_sb)->fscache,
-					       FSCACHE_COOKIE_TYPE_DATAFILE,
-					       "NFS.fh",
-					       0,             /* advice */
-					       NULL, /* preferred_cache */
+					       FSCACHE_ADV_FALLBACK_IO,
 					       nfsi->fh.data, /* index_key */
 					       nfsi->fh.size,
 					       &auxdata,      /* aux_data */
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index cc94b257059b..0cf2fbe30051 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -17,43 +17,6 @@
 
 #ifdef CONFIG_NFS_FSCACHE
 
-/*
- * set of NFS FS-Cache objects that form a superblock key
- */
-struct nfs_fscache_key {
-	struct rb_node		node;
-	struct nfs_client	*nfs_client;	/* the server */
-
-	/* the elements of the unique key - as used by nfs_compare_super() and
-	 * nfs_compare_mount_options() to distinguish superblocks */
-	struct {
-		struct {
-			unsigned long	s_flags;	/* various flags
-							 * (& NFS_MS_MASK) */
-		} super;
-
-		struct {
-			struct nfs_fsid fsid;
-			int		flags;
-			unsigned int	rsize;		/* read size */
-			unsigned int	wsize;		/* write size */
-			unsigned int	acregmin;	/* attr cache timeouts */
-			unsigned int	acregmax;
-			unsigned int	acdirmin;
-			unsigned int	acdirmax;
-		} nfs_server;
-
-		struct {
-			rpc_authflavor_t au_flavor;
-		} rpc_auth;
-
-		/* uniquifier - can be used if nfs_server.flags includes
-		 * NFS_MOUNT_UNSHARED  */
-		u8 uniq_len;
-		char uniquifier[0];
-	} key;
-};
-
 /*
  * Definition of the auxiliary data attached to NFS inode storage objects
  * within the cache.
@@ -71,20 +34,9 @@ struct nfs_fscache_inode_auxdata {
 	u64	change_attr;
 };
 
-/*
- * fscache-index.c
- */
-extern struct fscache_netfs nfs_fscache_netfs;
-
-extern int nfs_fscache_register(void);
-extern void nfs_fscache_unregister(void);
-
 /*
  * fscache.c
  */
-extern void nfs_fscache_get_client_cookie(struct nfs_client *);
-extern void nfs_fscache_release_client_cookie(struct nfs_client *);
-
 extern void nfs_fscache_get_super_cookie(struct super_block *, const char *, int);
 extern void nfs_fscache_release_super_cookie(struct super_block *);
 
@@ -120,7 +72,7 @@ static inline void nfs_readpage_to_fscache(struct inode *inode,
 }
 
 static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
-				  struct nfs_inode *nfsi)
+					      struct nfs_inode *nfsi)
 {
 	memset(auxdata, 0, sizeof(*auxdata));
 	auxdata->mtime_sec  = nfsi->vfs_inode.i_mtime.tv_sec;
@@ -158,12 +110,6 @@ static inline const char *nfs_server_fscache_state(struct nfs_server *server)
 }
 
 #else /* CONFIG_NFS_FSCACHE */
-static inline int nfs_fscache_register(void) { return 0; }
-static inline void nfs_fscache_unregister(void) {}
-
-static inline void nfs_fscache_get_client_cookie(struct nfs_client *clp) {}
-static inline void nfs_fscache_release_client_cookie(struct nfs_client *clp) {}
-
 static inline void nfs_fscache_release_super_cookie(struct super_block *sb) {}
 
 static inline void nfs_fscache_init_inode(struct inode *inode) {}
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 370c49b98646..1cfc8f5c9fe2 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -2361,10 +2361,6 @@ static int __init init_nfs_fs(void)
 	if (err < 0)
 		goto out9;
 
-	err = nfs_fscache_register();
-	if (err < 0)
-		goto out8;
-
 	err = nfsiod_start();
 	if (err)
 		goto out7;
@@ -2416,8 +2412,6 @@ static int __init init_nfs_fs(void)
 out6:
 	nfsiod_stop();
 out7:
-	nfs_fscache_unregister();
-out8:
 	unregister_pernet_subsys(&nfs_net_ops);
 out9:
 	nfs_sysfs_exit();
@@ -2432,7 +2426,6 @@ static void __exit exit_nfs_fs(void)
 	nfs_destroy_readpagecache();
 	nfs_destroy_inodecache();
 	nfs_destroy_nfspagecache();
-	nfs_fscache_unregister();
 	unregister_pernet_subsys(&nfs_net_ops);
 	rpc_proc_unregister(&init_net, "nfs");
 	unregister_nfs_fs();
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index e65c83494c05..e73d4adba50f 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1206,7 +1206,6 @@ static void nfs_get_cache_cookie(struct super_block *sb,
 	char *uniq = NULL;
 	int ulen = 0;
 
-	nfss->fscache_key = NULL;
 	nfss->fscache = NULL;
 
 	if (!ctx)
@@ -1216,9 +1215,9 @@ static void nfs_get_cache_cookie(struct super_block *sb,
 		struct nfs_server *mnt_s = NFS_SB(ctx->clone_data.sb);
 		if (!(mnt_s->options & NFS_OPTION_FSCACHE))
 			return;
-		if (mnt_s->fscache_key) {
-			uniq = mnt_s->fscache_key->key.uniquifier;
-			ulen = mnt_s->fscache_key->key.uniq_len;
+		if (mnt_s->fscache_uniq) {
+			uniq = mnt_s->fscache_uniq;
+			ulen = strlen(uniq);
 		}
 	} else {
 		if (!(ctx->options & NFS_OPTION_FSCACHE))
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 2a9acbfe00f0..77b2dba27bbb 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -120,11 +120,6 @@ struct nfs_client {
 	 * This is used to generate the mv0 callback address.
 	 */
 	char			cl_ipaddr[48];
-
-#ifdef CONFIG_NFS_FSCACHE
-	struct fscache_cookie	*fscache;	/* client index cache cookie */
-#endif
-
 	struct net		*cl_net;
 	struct list_head	pending_cb_stateids;
 };
@@ -194,8 +189,8 @@ struct nfs_server {
 	struct nfs_auth_info	auth_info;	/* parsed auth flavors */
 
 #ifdef CONFIG_NFS_FSCACHE
-	struct nfs_fscache_key	*fscache_key;	/* unique key for superblock */
-	struct fscache_cookie	*fscache;	/* superblock cookie */
+	struct fscache_volume	*fscache;	/* superblock cookie */
+	char			*fscache_uniq;	/* Uniquifier (or NULL) */
 #endif
 
 	u32			pnfs_blksize;	/* layout_blksize attr */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 62/67] 9p: Use fscache indexing rewrite and reenable caching
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (60 preceding siblings ...)
  2021-10-18 15:08 ` [PATCH 61/67] nfs: Convert to new fscache volume/cookie API David Howells
@ 2021-10-18 15:08 ` David Howells
  2021-10-18 15:08 ` [PATCH 63/67] 9p: Copy local writes to the cache when writing to the server David Howells
                   ` (8 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:08 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	v9fs-developer, dhowells, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Jeff Layton, Matthew Wilcox,
	Alexander Viro, Omar Sandoval, Linus Torvalds, linux-afs,
	linux-nfs, linux-cifs, ceph-devel, v9fs-developer, linux-fsdevel,
	linux-kernel

Change the 9p filesystem to take account of the changes to fscache's
indexing rewrite and reenable caching in 9p.

The following changes have been made:

 (1) The fscache_netfs struct is no more, and there's no need to register
     the filesystem as a whole.

 (2) The session cookie is now an fscache_volume cookie, allocated with
     fscache_acquire_volume().  That takes three parameters: a string
     representing the "volume" in the index, a string naming the cache to
     use (or NULL) and a u64 that conveys coherency metadata for the
     volume.

     For 9p, I've made it render the volume name string as:

	"9p,<devname>,<cachetag>"

     where the cachetag is replaced by the aname if it wasn't supplied.

     This probably needs rethinking a bit as the aname can have slashes in
     it.  It might be better to hash the cachetag and use the hash or I
     could substitute commas for the slashes or something.

 (3) The fscache_cookie_def is no more and needed information is passed
     directly to fscache_acquire_cookie().  The cache no longer calls back
     into the filesystem, but rather metadata changes are indicated at
     other times.

     fscache_acquire_cookie() is passed the same keying and coherency
     information as before.

 (4) The functions to set/reset/flush cookies are removed and
     fscache_use_cookie() and fscache_unuse_cookie() are used instead.

     fscache_use_cookie() is passed a flag to indicate if the cookie is
     opened for writing.  fscache_unuse_cookie() is passed updates for the
     metadata if we changed it (ie. if the file was opened for writing).

     These are called when the file is opened or closed.

 (5) wait_on_page_bit[_killable]() is replaced with the specific wait
     functions for the bits waited upon.

 (6) I've got rid of some of the 9p-specific cache helper functions and
     called things like fscache_relinquish_cookie() directly as they'll
     optimise away if v9fs_inode_cookie() returns an unconditional NULL
     (which will be the case if CONFIG_9P_FSCACHE=n).

 (7) v9fs_vfs_setattr() is made to call fscache_resize() to change the size
     of the cache object.

Notes:

 (A) We should call fscache_invalidate() if we detect that the server's
     copy of a file got changed by a third party, but I don't know where to
     do that.  We don't need to do that when allocating the cookie as we
     get a check-and-invalidate when we initially bind to the cache object.

 (B) The copy-to-cache-on-writeback side of things will be handled in
     separate patch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: v9fs-developer@lists.sourceforge.net
---

 fs/9p/Kconfig          |    2 -
 fs/9p/cache.c          |  184 ++++++------------------------------------------
 fs/9p/cache.h          |   23 +-----
 fs/9p/v9fs.c           |   14 +---
 fs/9p/v9fs.h           |   13 +++
 fs/9p/vfs_dir.c        |   13 +++
 fs/9p/vfs_file.c       |    7 +-
 fs/9p/vfs_inode.c      |   22 +++---
 fs/9p/vfs_inode_dotl.c |    3 +
 9 files changed, 71 insertions(+), 210 deletions(-)

diff --git a/fs/9p/Kconfig b/fs/9p/Kconfig
index b11c15c30bac..d7bc93447c85 100644
--- a/fs/9p/Kconfig
+++ b/fs/9p/Kconfig
@@ -14,7 +14,7 @@ config 9P_FS
 if 9P_FS
 config 9P_FSCACHE
 	bool "Enable 9P client caching support"
-	depends on 9P_FS=m && FSCACHE_OLD || 9P_FS=y && FSCACHE_OLD=y
+	depends on 9P_FS=m && FSCACHE || 9P_FS=y && FSCACHE=y
 	help
 	  Choose Y here to enable persistent, read-only local
 	  caching support for 9p clients using FS-Cache
diff --git a/fs/9p/cache.c b/fs/9p/cache.c
index 077f0a40aa01..99662920699d 100644
--- a/fs/9p/cache.c
+++ b/fs/9p/cache.c
@@ -16,89 +16,26 @@
 #include "v9fs.h"
 #include "cache.h"
 
-#define CACHETAG_LEN  11
-
-struct fscache_netfs v9fs_cache_netfs = {
-	.name 		= "9p",
-	.version 	= 0,
-};
-
-/*
- * v9fs_random_cachetag - Generate a random tag to be associated
- *			  with a new cache session.
- *
- * The value of jiffies is used for a fairly randomly cache tag.
- */
-
-static
-int v9fs_random_cachetag(struct v9fs_session_info *v9ses)
-{
-	v9ses->cachetag = kmalloc(CACHETAG_LEN, GFP_KERNEL);
-	if (!v9ses->cachetag)
-		return -ENOMEM;
-
-	return scnprintf(v9ses->cachetag, CACHETAG_LEN, "%lu", jiffies);
-}
-
-const struct fscache_cookie_def v9fs_cache_session_index_def = {
-	.name		= "9P.session",
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
-void v9fs_cache_session_get_cookie(struct v9fs_session_info *v9ses)
+void v9fs_cache_session_get_cookie(struct v9fs_session_info *v9ses,
+				   const char *dev_name)
 {
-	/* If no cache session tag was specified, we generate a random one. */
-	if (!v9ses->cachetag) {
-		if (v9fs_random_cachetag(v9ses) < 0) {
-			v9ses->fscache = NULL;
-			kfree(v9ses->cachetag);
-			v9ses->cachetag = NULL;
-			return;
-		}
-	}
+	char *name, *p;
 
-	v9ses->fscache = fscache_acquire_cookie(v9fs_cache_netfs.primary_index,
-						&v9fs_cache_session_index_def,
-						v9ses->cachetag,
-						strlen(v9ses->cachetag),
-						NULL, 0,
-						v9ses, 0, true);
-	p9_debug(P9_DEBUG_FSC, "session %p get cookie %p\n",
-		 v9ses, v9ses->fscache);
-}
-
-void v9fs_cache_session_put_cookie(struct v9fs_session_info *v9ses)
-{
-	p9_debug(P9_DEBUG_FSC, "session %p put cookie %p\n",
-		 v9ses, v9ses->fscache);
-	fscache_relinquish_cookie(v9ses->fscache, NULL, false);
-	v9ses->fscache = NULL;
-}
-
-static enum
-fscache_checkaux v9fs_cache_inode_check_aux(void *cookie_netfs_data,
-					    const void *buffer,
-					    uint16_t buflen,
-					    loff_t object_size)
-{
-	const struct v9fs_inode *v9inode = cookie_netfs_data;
-
-	if (buflen != sizeof(v9inode->qid.version))
-		return FSCACHE_CHECKAUX_OBSOLETE;
+	name = kasprintf(GFP_KERNEL, "9p,%s,%s",
+			 dev_name, v9ses->cachetag ?: v9ses->aname);
+	if (!name)
+		return;
 
-	if (memcmp(buffer, &v9inode->qid.version,
-		   sizeof(v9inode->qid.version)))
-		return FSCACHE_CHECKAUX_OBSOLETE;
+	for (p = name; *p; p++)
+		if (*p == '/')
+			*p = ';';
 
-	return FSCACHE_CHECKAUX_OKAY;
+	v9ses->fscache = fscache_acquire_volume(name, NULL, 0);
+	p9_debug(P9_DEBUG_FSC, "session %p get volume %p (%s)\n",
+		 v9ses, v9ses->fscache, name);
+	kfree(name);
 }
 
-const struct fscache_cookie_def v9fs_cache_inode_index_def = {
-	.name		= "9p.inode",
-	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
-	.check_aux	= v9fs_cache_inode_check_aux,
-};
-
 void v9fs_cache_inode_get_cookie(struct inode *inode)
 {
 	struct v9fs_inode *v9inode;
@@ -108,94 +45,19 @@ void v9fs_cache_inode_get_cookie(struct inode *inode)
 		return;
 
 	v9inode = V9FS_I(inode);
-	if (v9inode->fscache)
+	if (WARN_ON(v9inode->fscache))
 		return;
 
 	v9ses = v9fs_inode2v9ses(inode);
-	v9inode->fscache = fscache_acquire_cookie(v9ses->fscache,
-						  &v9fs_cache_inode_index_def,
-						  &v9inode->qid.path,
-						  sizeof(v9inode->qid.path),
-						  &v9inode->qid.version,
-						  sizeof(v9inode->qid.version),
-						  v9inode,
-						  i_size_read(&v9inode->vfs_inode),
-						  true);
+	v9inode->fscache =
+		fscache_acquire_cookie(v9fs_session_cache(v9ses),
+				       0,
+				       &v9inode->qid.path,
+				       sizeof(v9inode->qid.path),
+				       &v9inode->qid.version,
+				       sizeof(v9inode->qid.version),
+				       i_size_read(&v9inode->vfs_inode));
 
 	p9_debug(P9_DEBUG_FSC, "inode %p get cookie %p\n",
 		 inode, v9inode->fscache);
 }
-
-void v9fs_cache_inode_put_cookie(struct inode *inode)
-{
-	struct v9fs_inode *v9inode = V9FS_I(inode);
-
-	if (!v9inode->fscache)
-		return;
-	p9_debug(P9_DEBUG_FSC, "inode %p put cookie %p\n",
-		 inode, v9inode->fscache);
-
-	fscache_relinquish_cookie(v9inode->fscache, &v9inode->qid.version,
-				  false);
-	v9inode->fscache = NULL;
-}
-
-void v9fs_cache_inode_flush_cookie(struct inode *inode)
-{
-	struct v9fs_inode *v9inode = V9FS_I(inode);
-
-	if (!v9inode->fscache)
-		return;
-	p9_debug(P9_DEBUG_FSC, "inode %p flush cookie %p\n",
-		 inode, v9inode->fscache);
-
-	fscache_relinquish_cookie(v9inode->fscache, NULL, true);
-	v9inode->fscache = NULL;
-}
-
-void v9fs_cache_inode_set_cookie(struct inode *inode, struct file *filp)
-{
-	struct v9fs_inode *v9inode = V9FS_I(inode);
-
-	if (!v9inode->fscache)
-		return;
-
-	mutex_lock(&v9inode->fscache_lock);
-
-	if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
-		v9fs_cache_inode_flush_cookie(inode);
-	else
-		v9fs_cache_inode_get_cookie(inode);
-
-	mutex_unlock(&v9inode->fscache_lock);
-}
-
-void v9fs_cache_inode_reset_cookie(struct inode *inode)
-{
-	struct v9fs_inode *v9inode = V9FS_I(inode);
-	struct v9fs_session_info *v9ses;
-	struct fscache_cookie *old;
-
-	if (!v9inode->fscache)
-		return;
-
-	old = v9inode->fscache;
-
-	mutex_lock(&v9inode->fscache_lock);
-	fscache_relinquish_cookie(v9inode->fscache, NULL, true);
-
-	v9ses = v9fs_inode2v9ses(inode);
-	v9inode->fscache = fscache_acquire_cookie(v9ses->fscache,
-						  &v9fs_cache_inode_index_def,
-						  &v9inode->qid.path,
-						  sizeof(v9inode->qid.path),
-						  &v9inode->qid.version,
-						  sizeof(v9inode->qid.version),
-						  v9inode,
-						  i_size_read(&v9inode->vfs_inode),
-						  true);
-	p9_debug(P9_DEBUG_FSC, "inode %p revalidating cookie old %p new %p\n",
-		 inode, old, v9inode->fscache);
-
-	mutex_unlock(&v9inode->fscache_lock);
-}
diff --git a/fs/9p/cache.h b/fs/9p/cache.h
index cfafa89b972c..e485049eec85 100644
--- a/fs/9p/cache.h
+++ b/fs/9p/cache.h
@@ -13,21 +13,10 @@
 
 #ifdef CONFIG_9P_FSCACHE
 
-extern struct fscache_netfs v9fs_cache_netfs;
-extern const struct fscache_cookie_def v9fs_cache_session_index_def;
-extern const struct fscache_cookie_def v9fs_cache_inode_index_def;
-
-extern void v9fs_cache_session_get_cookie(struct v9fs_session_info *v9ses);
-extern void v9fs_cache_session_put_cookie(struct v9fs_session_info *v9ses);
+extern void v9fs_cache_session_get_cookie(struct v9fs_session_info *v9ses,
+					  const char *dev_name);
 
 extern void v9fs_cache_inode_get_cookie(struct inode *inode);
-extern void v9fs_cache_inode_put_cookie(struct inode *inode);
-extern void v9fs_cache_inode_flush_cookie(struct inode *inode);
-extern void v9fs_cache_inode_set_cookie(struct inode *inode, struct file *filp);
-extern void v9fs_cache_inode_reset_cookie(struct inode *inode);
-
-extern int __v9fs_cache_register(void);
-extern void __v9fs_cache_unregister(void);
 
 #else /* CONFIG_9P_FSCACHE */
 
@@ -35,13 +24,5 @@ static inline void v9fs_cache_inode_get_cookie(struct inode *inode)
 {
 }
 
-static inline void v9fs_cache_inode_put_cookie(struct inode *inode)
-{
-}
-
-static inline void v9fs_cache_inode_set_cookie(struct inode *inode, struct file *file)
-{
-}
-
 #endif /* CONFIG_9P_FSCACHE */
 #endif /* _9P_CACHE_H */
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 2e0fa7c932db..ab5b2069b78d 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -468,7 +468,8 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses,
 
 #ifdef CONFIG_9P_FSCACHE
 	/* register the session for caching */
-	v9fs_cache_session_get_cookie(v9ses);
+	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
+		v9fs_cache_session_get_cookie(v9ses, dev_name);
 #endif
 	spin_lock(&v9fs_sessionlist_lock);
 	list_add(&v9ses->slist, &v9fs_sessionlist);
@@ -501,8 +502,7 @@ void v9fs_session_close(struct v9fs_session_info *v9ses)
 	}
 
 #ifdef CONFIG_9P_FSCACHE
-	if (v9ses->fscache)
-		v9fs_cache_session_put_cookie(v9ses);
+	fscache_relinquish_volume(v9fs_session_cache(v9ses), 0, false);
 	kfree(v9ses->cachetag);
 #endif
 	kfree(v9ses->uname);
@@ -662,20 +662,12 @@ static int v9fs_cache_register(void)
 	ret = v9fs_init_inode_cache();
 	if (ret < 0)
 		return ret;
-#ifdef CONFIG_9P_FSCACHE
-	ret = fscache_register_netfs(&v9fs_cache_netfs);
-	if (ret < 0)
-		v9fs_destroy_inode_cache();
-#endif
 	return ret;
 }
 
 static void v9fs_cache_unregister(void)
 {
 	v9fs_destroy_inode_cache();
-#ifdef CONFIG_9P_FSCACHE
-	fscache_unregister_netfs(&v9fs_cache_netfs);
-#endif
 }
 
 /**
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index 92124b235a6d..cf4f645785e0 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -89,7 +89,7 @@ struct v9fs_session_info {
 	unsigned int cache;
 #ifdef CONFIG_9P_FSCACHE
 	char *cachetag;
-	struct fscache_cookie *fscache;
+	struct fscache_volume *fscache;
 #endif
 
 	char *uname;		/* user name to mount as */
@@ -109,7 +109,6 @@ struct v9fs_session_info {
 
 struct v9fs_inode {
 #ifdef CONFIG_9P_FSCACHE
-	struct mutex fscache_lock;
 	struct fscache_cookie *fscache;
 #endif
 	struct p9_qid qid;
@@ -133,6 +132,16 @@ static inline struct fscache_cookie *v9fs_inode_cookie(struct v9fs_inode *v9inod
 #endif
 }
 
+static inline struct fscache_volume *v9fs_session_cache(struct v9fs_session_info *v9ses)
+{
+#ifdef CONFIG_9P_FSCACHE
+	return v9ses->fscache;
+#else
+	return NULL;
+#endif
+}
+
+
 extern int v9fs_show_options(struct seq_file *m, struct dentry *root);
 
 struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *,
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index b6a5a0be444d..24f7fd29f102 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -19,6 +19,7 @@
 #include <linux/idr.h>
 #include <linux/slab.h>
 #include <linux/uio.h>
+#include <linux/fscache.h>
 #include <net/9p/9p.h>
 #include <net/9p/client.h>
 
@@ -205,8 +206,10 @@ static int v9fs_dir_readdir_dotl(struct file *file, struct dir_context *ctx)
 
 int v9fs_dir_release(struct inode *inode, struct file *filp)
 {
+	struct v9fs_inode *v9inode = V9FS_I(inode);
 	struct p9_fid *fid;
-
+	loff_t i_size;
+	
 	fid = filp->private_data;
 	p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
 		 inode, filp, fid ? fid->fid : -1);
@@ -216,6 +219,14 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
 		spin_unlock(&inode->i_lock);
 		p9_client_clunk(fid);
 	}
+
+	if ((filp->f_mode & FMODE_WRITE)) {
+		i_size = i_size_read(inode);
+		fscache_unuse_cookie(v9fs_inode_cookie(v9inode),
+				     &v9inode->qid.version, &i_size);
+	} else {
+		fscache_unuse_cookie(v9fs_inode_cookie(v9inode), NULL, NULL);
+	}
 	return 0;
 }
 
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 80052497f00f..461a51509300 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -95,7 +95,8 @@ int v9fs_file_open(struct inode *inode, struct file *file)
 	}
 	mutex_unlock(&v9inode->v_mutex);
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
-		v9fs_cache_inode_set_cookie(inode, file);
+		fscache_use_cookie(v9fs_inode_cookie(v9inode),
+				   file->f_mode & FMODE_WRITE);
 	v9fs_open_fid_add(inode, fid);
 	return 0;
 out_error:
@@ -544,12 +545,12 @@ v9fs_vm_page_mkwrite(struct vm_fault *vmf)
 	 */
 #ifdef CONFIG_9P_FSCACHE
 	if (PageFsCache(page) &&
-	    wait_on_page_bit_killable(page, PG_fscache) < 0)
+	    wait_on_page_fscache_killable(page) < 0)
 		return VM_FAULT_RETRY;
 #endif
 
 	if (PageWriteback(page) &&
-	    wait_on_page_bit_killable(page, PG_writeback) < 0)
+	    wait_on_page_writeback_killable(page) < 0)
 		return VM_FAULT_RETRY;
 
 	/* Update file times before taking page lock */
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 08f48b70a741..83db37bd4252 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -228,7 +228,6 @@ struct inode *v9fs_alloc_inode(struct super_block *sb)
 		return NULL;
 #ifdef CONFIG_9P_FSCACHE
 	v9inode->fscache = NULL;
-	mutex_init(&v9inode->fscache_lock);
 #endif
 	v9inode->writeback_fid = NULL;
 	v9inode->cache_validity = 0;
@@ -381,7 +380,7 @@ void v9fs_evict_inode(struct inode *inode)
 	clear_inode(inode);
 	filemap_fdatawrite(&inode->i_data);
 
-	v9fs_cache_inode_put_cookie(inode);
+	fscache_relinquish_cookie(v9fs_inode_cookie(v9inode), false);
 	/* clunk the fid stashed in writeback_fid */
 	if (v9inode->writeback_fid) {
 		p9_client_clunk(v9inode->writeback_fid);
@@ -862,7 +861,8 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
 
 	file->private_data = fid;
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
-		v9fs_cache_inode_set_cookie(d_inode(dentry), file);
+		fscache_use_cookie(v9fs_inode_cookie(v9inode),
+				   file->f_mode & FMODE_WRITE);
 	v9fs_open_fid_add(inode, fid);
 
 	file->f_mode |= FMODE_CREATED;
@@ -1065,6 +1065,8 @@ static int v9fs_vfs_setattr(struct user_namespace *mnt_userns,
 			    struct dentry *dentry, struct iattr *iattr)
 {
 	int retval, use_dentry = 0;
+	struct inode *inode = d_inode(dentry);
+	struct v9fs_inode *v9inode = V9FS_I(inode);
 	struct v9fs_session_info *v9ses;
 	struct p9_fid *fid = NULL;
 	struct p9_wstat wstat;
@@ -1110,7 +1112,7 @@ static int v9fs_vfs_setattr(struct user_namespace *mnt_userns,
 
 	/* Write all dirty data */
 	if (d_is_reg(dentry))
-		filemap_write_and_wait(d_inode(dentry)->i_mapping);
+		filemap_write_and_wait(inode->i_mapping);
 
 	retval = p9_client_wstat(fid, &wstat);
 
@@ -1121,13 +1123,15 @@ static int v9fs_vfs_setattr(struct user_namespace *mnt_userns,
 		return retval;
 
 	if ((iattr->ia_valid & ATTR_SIZE) &&
-	    iattr->ia_size != i_size_read(d_inode(dentry)))
-		truncate_setsize(d_inode(dentry), iattr->ia_size);
+	    iattr->ia_size != i_size_read(inode)) {
+		truncate_setsize(inode, iattr->ia_size);
+		fscache_resize_cookie(v9fs_inode_cookie(v9inode), iattr->ia_size);
+	}
 
-	v9fs_invalidate_inode_attr(d_inode(dentry));
+	v9fs_invalidate_inode_attr(inode);
 
-	setattr_copy(&init_user_ns, d_inode(dentry), iattr);
-	mark_inode_dirty(d_inode(dentry));
+	setattr_copy(&init_user_ns, inode, iattr);
+	mark_inode_dirty(inode);
 	return 0;
 }
 
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 01b9e1281a29..a5058c9a2ab3 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -346,7 +346,8 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
 		goto err_clunk_old_fid;
 	file->private_data = ofid;
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
-		v9fs_cache_inode_set_cookie(inode, file);
+		fscache_use_cookie(v9fs_inode_cookie(v9inode),
+				   file->f_mode & FMODE_WRITE);
 	v9fs_open_fid_add(inode, ofid);
 	file->f_mode |= FMODE_CREATED;
 out:



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 63/67] 9p: Copy local writes to the cache when writing to the server
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (61 preceding siblings ...)
  2021-10-18 15:08 ` [PATCH 62/67] 9p: Use fscache indexing rewrite and reenable caching David Howells
@ 2021-10-18 15:08 ` David Howells
  2021-10-18 15:08 ` [PATCH 64/67] netfs: Display the netfs inode number in the netfs_read tracepoint David Howells
                   ` (7 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:08 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	v9fs-developer, dhowells, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Jeff Layton, Matthew Wilcox,
	Alexander Viro, Omar Sandoval, Linus Torvalds, linux-afs,
	linux-nfs, linux-cifs, ceph-devel, v9fs-developer, linux-fsdevel,
	linux-kernel

When writing to the server from v9fs_vfs_writepage(), copy the data to the
cache object too.

To make this possible, the cookie must have its active users count
incremented when the page is dirtied and kept incremented until we manage
to clean up all the pages.  This allows the writeback to take place after
the last file struct is released.

This is done by taking a use on the cookie in v9fs_set_page_dirty() if we
haven't already done so (controlled by the I_PINNING_FSCACHE_WB flag) and
dropping the pin in v9fs_write_inode() if __writeback_single_inode() clears
all the outstanding dirty pages (conveyed by the unpinned_fscache_wb flag
in the writeback_control struct).

Inode eviction must also clear the flag after truncating away all the
outstanding pages.

In the future this will be handled more gracefully by netfslib.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: v9fs-developer@lists.sourceforge.net
---

 fs/9p/vfs_addr.c  |   53 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 fs/9p/vfs_inode.c |    2 ++
 fs/9p/vfs_super.c |    3 +++
 3 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index de857fa4629b..2d1577538cde 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -132,15 +132,17 @@ static void v9fs_vfs_readahead(struct readahead_control *ractl)
 
 static int v9fs_release_page(struct page *page, gfp_t gfp)
 {
+	struct inode *inode = page->mapping->host;
+	struct v9fs_inode *v9inode = V9FS_I(inode);
+
 	if (PagePrivate(page))
 		return 0;
-#ifdef CONFIG_9P_FSCACHE
 	if (PageFsCache(page)) {
 		if (!(gfp & __GFP_DIRECT_RECLAIM) || !(gfp & __GFP_FS))
 			return 0;
 		wait_on_page_fscache(page);
 	}
-#endif
+	fscache_note_page_release(v9fs_inode_cookie(v9inode));
 	return 1;
 }
 
@@ -157,10 +159,23 @@ static void v9fs_invalidate_page(struct page *page, unsigned int offset,
 	wait_on_page_fscache(page);
 }
 
+static void v9fs_write_to_cache_done(void *priv, ssize_t transferred_or_error,
+				     bool was_async)
+{
+	struct v9fs_inode *v9inode = priv;
+
+	if (IS_ERR_VALUE(transferred_or_error) &&
+	    transferred_or_error != -ENOBUFS)
+		fscache_invalidate(v9fs_inode_cookie(v9inode),
+				   &v9inode->qid.version,
+				   i_size_read(&v9inode->vfs_inode), 0);
+}
+
 static int v9fs_vfs_writepage_locked(struct page *page)
 {
 	struct inode *inode = page->mapping->host;
 	struct v9fs_inode *v9inode = V9FS_I(inode);
+	struct fscache_cookie *cookie = v9fs_inode_cookie(v9inode);
 	loff_t start = page_offset(page);
 	loff_t size = i_size_read(inode);
 	struct iov_iter from;
@@ -176,10 +191,21 @@ static int v9fs_vfs_writepage_locked(struct page *page)
 	/* We should have writeback_fid always set */
 	BUG_ON(!v9inode->writeback_fid);
 
+	wait_on_page_fscache(page);
+
 	set_page_writeback(page);
 
 	p9_client_write(v9inode->writeback_fid, start, &from, &err);
 
+	if (err == 0 &&
+	    fscache_cookie_enabled(cookie) &&
+	    test_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags)) {
+		set_page_fscache(page);
+		fscache_write_to_cache(v9fs_inode_cookie(v9inode),
+				       page->mapping, start, len, size,
+				       v9fs_write_to_cache_done, v9inode);
+	}
+
 	end_page_writeback(page);
 	return err;
 }
@@ -290,10 +316,12 @@ static int v9fs_write_begin(struct file *filp, struct address_space *mapping,
 
 static int v9fs_write_end(struct file *filp, struct address_space *mapping,
 			  loff_t pos, unsigned len, unsigned copied,
-			  struct page *page, void *fsdata)
+			  struct page *subpage, void *fsdata)
 {
+	struct page *page = thp_head(subpage);
 	loff_t last_pos = pos + copied;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
+	struct v9fs_inode *v9inode = V9FS_I(inode);
 
 	p9_debug(P9_DEBUG_VFS, "filp %p, mapping %p\n", filp, mapping);
 
@@ -313,6 +341,7 @@ static int v9fs_write_end(struct file *filp, struct address_space *mapping,
 	if (last_pos > inode->i_size) {
 		inode_add_bytes(inode, last_pos - inode->i_size);
 		i_size_write(inode, last_pos);
+		fscache_update_cookie(v9fs_inode_cookie(v9inode), NULL, &last_pos);
 	}
 	set_page_dirty(page);
 out:
@@ -322,11 +351,25 @@ static int v9fs_write_end(struct file *filp, struct address_space *mapping,
 	return copied;
 }
 
+#ifdef CONFIG_9P_FSCACHE
+/*
+ * Mark a page as having been made dirty and thus needing writeback.  We also
+ * need to pin the cache object to write back to.
+ */
+static int v9fs_set_page_dirty(struct page *page)
+{
+	struct v9fs_inode *v9inode = V9FS_I(page->mapping->host);
+
+	return fscache_set_page_dirty(page, v9fs_inode_cookie(v9inode));
+}
+#else
+#define v9fs_set_page_dirty __set_page_dirty_nobuffers
+#endif
 
 const struct address_space_operations v9fs_addr_operations = {
 	.readpage = v9fs_vfs_readpage,
 	.readahead = v9fs_vfs_readahead,
-	.set_page_dirty = __set_page_dirty_nobuffers,
+	.set_page_dirty = v9fs_set_page_dirty,
 	.writepage = v9fs_vfs_writepage,
 	.write_begin = v9fs_write_begin,
 	.write_end = v9fs_write_end,
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 83db37bd4252..a990c50cc27d 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -377,6 +377,8 @@ void v9fs_evict_inode(struct inode *inode)
 	struct v9fs_inode *v9inode = V9FS_I(inode);
 
 	truncate_inode_pages_final(&inode->i_data);
+	fscache_clear_inode_writeback(v9fs_inode_cookie(v9inode), inode,
+				      &v9inode->qid.version);
 	clear_inode(inode);
 	filemap_fdatawrite(&inode->i_data);
 
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index 5fce6e30bc5a..3721098e0992 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -24,6 +24,7 @@
 #include <linux/slab.h>
 #include <linux/statfs.h>
 #include <linux/magic.h>
+#include <linux/fscache.h>
 #include <net/9p/9p.h>
 #include <net/9p/client.h>
 
@@ -307,6 +308,7 @@ static int v9fs_write_inode(struct inode *inode,
 		__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
 		return ret;
 	}
+	fscache_unpin_writeback(wbc, v9fs_inode_cookie(v9inode));
 	return 0;
 }
 
@@ -330,6 +332,7 @@ static int v9fs_write_inode_dotl(struct inode *inode,
 		__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
 		return ret;
 	}
+	fscache_unpin_writeback(wbc, v9fs_inode_cookie(v9inode));
 	return 0;
 }
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 64/67] netfs: Display the netfs inode number in the netfs_read tracepoint
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (62 preceding siblings ...)
  2021-10-18 15:08 ` [PATCH 63/67] 9p: Copy local writes to the cache when writing to the server David Howells
@ 2021-10-18 15:08 ` David Howells
  2021-10-18 15:08 ` [PATCH 65/67] cachefiles: Add tracepoints to log errors from ops on the backing fs David Howells
                   ` (6 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:08 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Display the netfs inode number in the netfs_read tracepoint so that this
can be used to correlate with the cachefiles_prep_read tracepoint.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/trace/events/netfs.h |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 4d470bffd9f1..e6f4ebbb4c69 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -135,6 +135,7 @@ TRACE_EVENT(netfs_read,
 		    __field(loff_t,			start		)
 		    __field(size_t,			len		)
 		    __field(enum netfs_read_trace,	what		)
+		    __field(unsigned int,		netfs_inode	)
 			     ),
 
 	    TP_fast_assign(
@@ -143,12 +144,14 @@ TRACE_EVENT(netfs_read,
 		    __entry->start	= start;
 		    __entry->len	= len;
 		    __entry->what	= what;
+		    __entry->netfs_inode = rreq->inode->i_ino;
 			   ),
 
-	    TP_printk("R=%08x %s c=%08x s=%llx %zx",
+	    TP_printk("R=%08x %s c=%08x ni=%x s=%llx %zx",
 		      __entry->rreq,
 		      __print_symbolic(__entry->what, netfs_read_traces),
 		      __entry->cookie,
+		      __entry->netfs_inode,
 		      __entry->start, __entry->len)
 	    );
 



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 65/67] cachefiles: Add tracepoints to log errors from ops on the backing fs
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (63 preceding siblings ...)
  2021-10-18 15:08 ` [PATCH 64/67] netfs: Display the netfs inode number in the netfs_read tracepoint David Howells
@ 2021-10-18 15:08 ` David Howells
  2021-10-18 15:08 ` [PATCH 66/67] cachefiles: Add error injection support David Howells
                   ` (5 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:08 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add a couple of tracepoints to log errors that cachefiles gets from making
calls to manipulate the backing filesystem.  These calls are divided into
two categories - VFS manipulation (eg. mkdir) and data I/O - with a
separate tracepoint for each.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/daemon.c            |    2 +
 fs/cachefiles/interface.c         |   11 ++++
 fs/cachefiles/internal.h          |    1 
 fs/cachefiles/io.c                |   22 ++++++++-
 fs/cachefiles/namei.c             |   59 ++++++++++++++++++++---
 fs/cachefiles/xattr.c             |    8 +++
 include/trace/events/cachefiles.h |   94 +++++++++++++++++++++++++++++++++++++
 7 files changed, 187 insertions(+), 10 deletions(-)

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 6d31fba31ce9..2b926af39f43 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -692,6 +692,8 @@ int cachefiles_has_space(struct cachefiles_cache *cache,
 
 	ret = vfs_statfs(&path, &stats);
 	if (ret < 0) {
+		trace_cachefiles_vfs_error(NULL, d_inode(cache->store), ret,
+					   cachefiles_trace_statfs_error);
 		if (ret == -EIO)
 			cachefiles_io_error(cache, "statfs failed");
 		_leave(" = %d", ret);
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 96f30eba585a..115be503b23a 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -153,8 +153,10 @@ static bool cachefiles_shorten_object(struct cachefiles_object *object,
 			       cachefiles_trunc_shrink);
 	ret = vfs_truncate(&file->f_path, dio_size);
 	if (ret < 0) {
+		trace_cachefiles_io_error(object, file_inode(file), ret,
+					  cachefiles_trace_trunc_error);
 		cachefiles_io_error_obj(object, "Trunc-to-size failed %d", ret);
-		cachefiles_remove_object_xattr(cache, file->f_path.dentry);
+		cachefiles_remove_object_xattr(cache, object, file->f_path.dentry);
 		return false;
 	}
 
@@ -164,8 +166,10 @@ static bool cachefiles_shorten_object(struct cachefiles_object *object,
 		ret = vfs_fallocate(file, FALLOC_FL_ZERO_RANGE,
 				    new_size, dio_size);
 		if (ret < 0) {
+			trace_cachefiles_io_error(object, file_inode(file), ret,
+						  cachefiles_trace_fallocate_error);
 			cachefiles_io_error_obj(object, "Trunc-to-dio-size failed %d", ret);
-			cachefiles_remove_object_xattr(cache, file->f_path.dentry);
+			cachefiles_remove_object_xattr(cache, object, file->f_path.dentry);
 			return false;
 		}
 	}
@@ -384,6 +388,9 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 	inode_unlock(file_inode(file));
 	cachefiles_end_secure(cache, saved_cred);
 
+	if (ret < 0)
+		trace_cachefiles_io_error(NULL, file_inode(file), ret,
+					  cachefiles_trace_notify_change_error);
 	if (ret == -EIO) {
 		cachefiles_io_error_obj(object, "Size set failed");
 		ret = -ENOBUFS;
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 83cf2ca3a763..3dd3e13989d6 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -238,6 +238,7 @@ extern int cachefiles_set_object_xattr(struct cachefiles_object *object);
 extern int cachefiles_check_auxdata(struct cachefiles_object *object,
 				    struct file *file);
 extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
+					  struct cachefiles_object *object,
 					  struct dentry *dentry);
 extern void cachefiles_prepare_to_write(struct fscache_cookie *cookie);
 
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 5e3579800689..136d0ea0a7c9 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -44,9 +44,14 @@ static inline void cachefiles_put_kiocb(struct cachefiles_kiocb *ki)
 static void cachefiles_read_complete(struct kiocb *iocb, long ret, long ret2)
 {
 	struct cachefiles_kiocb *ki = container_of(iocb, struct cachefiles_kiocb, iocb);
+	struct inode *inode = file_inode(ki->iocb.ki_filp);
 
 	_enter("%ld,%ld", ret, ret2);
 
+	if (ret < 0)
+		trace_cachefiles_io_error(ki->object, inode, ret,
+					  cachefiles_trace_read_error);
+
 	if (ki->term_func) {
 		if (ret >= 0) {
 			if (ki->object->cookie->inval_counter == ki->inval_counter)
@@ -195,6 +200,10 @@ static void cachefiles_write_complete(struct kiocb *iocb, long ret, long ret2)
 	__sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE);
 	__sb_end_write(inode->i_sb, SB_FREEZE_WRITE);
 
+	if (ret < 0)
+		trace_cachefiles_io_error(ki->object, inode, ret,
+					  cachefiles_trace_write_error);
+
 	set_bit(FSCACHE_COOKIE_HAVE_DATA, &ki->object->cookie->flags);
 	if (ki->term_func)
 		ki->term_func(ki->term_func_priv, ret, ki->was_async);
@@ -352,6 +361,8 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 			why = cachefiles_trace_read_seek_nxio;
 			goto download_and_store;
 		}
+		trace_cachefiles_io_error(object, file_inode(file), off,
+					  cachefiles_trace_seek_error);
 		why = cachefiles_trace_read_seek_error;
 		goto out;
 	}
@@ -370,6 +381,8 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 
 	to = vfs_llseek(file, subreq->start, SEEK_HOLE);
 	if (to < 0 && to >= (loff_t)-MAX_ERRNO) {
+		trace_cachefiles_io_error(object, file_inode(file), to,
+					  cachefiles_trace_seek_error);
 		why = cachefiles_trace_read_seek_error;
 		goto out;
 	}
@@ -425,6 +438,8 @@ static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
 	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO) {
 		if (pos == -ENXIO)
 			goto check_space; /* Unallocated tail */
+		trace_cachefiles_io_error(object, file_inode(file), pos,
+					  cachefiles_trace_seek_error);
 		return pos;
 	}
 	if ((u64)pos >= (u64)*_start + *_len)
@@ -438,8 +453,11 @@ static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
 		return 0; /* Enough space to simply overwrite the whole block */
 
 	pos = vfs_llseek(file, *_start, SEEK_HOLE);
-	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO)
+	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO) {
+		trace_cachefiles_io_error(object, file_inode(file), pos,
+					  cachefiles_trace_seek_error);
 		return pos;
+	}
 	if ((u64)pos >= (u64)*_start + *_len)
 		return 0; /* Fully allocated */
 
@@ -447,6 +465,8 @@ static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
 	ret = vfs_fallocate(file, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
 			    *_start, *_len);
 	if (ret < 0) {
+		trace_cachefiles_io_error(object, file_inode(file), ret,
+					  cachefiles_trace_fallocate_error);
 		cachefiles_io_error_obj(object,
 					"CacheFiles: fallocate failed (%d)\n", ret);
 		ret = -EIO;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 989df918570b..12266c90e5f8 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -122,8 +122,12 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 
 		inode_unlock(d_inode(dir));
 
-		if (ret == -EIO)
-			cachefiles_io_error(cache, "Unlink failed");
+		if (ret < 0) {
+			trace_cachefiles_vfs_error(object, d_inode(dir), ret,
+						   cachefiles_trace_unlink_error);
+			if (ret == -EIO)
+				cachefiles_io_error(cache, "Unlink failed");
+		}
 
 		_leave(" = %d", ret);
 		return ret;
@@ -172,6 +176,9 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 	grave = lookup_one_len(nbuffer, cache->graveyard, strlen(nbuffer));
 	if (IS_ERR(grave)) {
 		unlock_rename(cache->graveyard, dir);
+		trace_cachefiles_vfs_error(object, d_inode(cache->graveyard),
+					   PTR_ERR(grave),
+					   cachefiles_trace_lookup_error);
 
 		if (PTR_ERR(grave) == -ENOMEM) {
 			_leave(" = -ENOMEM");
@@ -224,6 +231,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		};
 		trace_cachefiles_rename(object, rep, grave, why);
 		ret = vfs_rename(&rd);
+		if (ret != 0)
+			trace_cachefiles_vfs_error(object, d_inode(dir),
+						   PTR_ERR(grave),
+						   cachefiles_trace_rename_error);
 		if (ret != 0 && ret != -ENOMEM)
 			cachefiles_io_error(cache,
 					    "Rename failed with error %d", ret);
@@ -249,6 +260,9 @@ static int cachefiles_unlink(struct cachefiles_object *object,
 	ret = security_path_unlink(&path, dentry);
 	if (ret == 0)
 		ret = vfs_unlink(&init_user_ns, d_backing_inode(fan), dentry, NULL);
+	if (ret != 0)
+		trace_cachefiles_vfs_error(object, d_backing_inode(fan), ret,
+					   cachefiles_trace_unlink_error);
 	return ret;
 }
 
@@ -273,6 +287,9 @@ int cachefiles_delete_object(struct cachefiles_object *object,
 	inode_unlock(d_backing_inode(fan));
 	dput(dentry);
 
+	if (ret < 0)
+		trace_cachefiles_vfs_error(object, d_backing_inode(fan), ret,
+					   cachefiles_trace_unlink_error);
 	if (ret < 0 && ret != -ENOENT)
 		cachefiles_io_error(volume->cache, "Unlink failed");
 	return ret;
@@ -326,8 +343,12 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
 	path.dentry = dentry;
 	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
 				   d_backing_inode(dentry), cache->cache_cred);
-	if (IS_ERR(file))
+	if (IS_ERR(file)) {
+		trace_cachefiles_vfs_error(object, d_backing_inode(dentry),
+					   PTR_ERR(file),
+					   cachefiles_trace_open_error);
 		goto error;
+	}
 
 	if (unlikely(!file->f_op->read_iter) ||
 	    unlikely(!file->f_op->write_iter)) {
@@ -430,6 +451,9 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 retry:
 	subdir = lookup_one_len(dirname, dir, strlen(dirname));
 	if (IS_ERR(subdir)) {
+		trace_cachefiles_vfs_error(NULL, d_backing_inode(dir),
+					   PTR_ERR(subdir),
+					   cachefiles_trace_lookup_error);
 		if (PTR_ERR(subdir) == -ENOMEM)
 			goto nomem_d_alloc;
 		goto lookup_error;
@@ -454,8 +478,11 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		if (ret < 0)
 			goto mkdir_error;
 		ret = vfs_mkdir(&init_user_ns, d_inode(dir), subdir, 0700);
-		if (ret < 0)
+		if (ret < 0) {
+			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
+						   cachefiles_trace_mkdir_error);
 			goto mkdir_error;
+		}
 
 		if (unlikely(d_unhashed(subdir))) {
 			dput(subdir);
@@ -610,7 +637,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	 */
 	_debug("victim is cullable");
 
-	ret = cachefiles_remove_object_xattr(cache, victim);
+	ret = cachefiles_remove_object_xattr(cache, NULL, victim);
 	if (ret < 0)
 		goto error_unlock;
 
@@ -693,6 +720,8 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 	path.mnt = cache->mnt,
 	path.dentry = vfs_tmpfile(&init_user_ns, fan, S_IFREG, O_RDWR);
 	if (IS_ERR(path.dentry)) {
+		trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(path.dentry),
+					   cachefiles_trace_tmpfile_error);
 		if (PTR_ERR(path.dentry) == -EIO)
 			cachefiles_io_error_obj(object, "Failed to create tmpfile");
 		file = ERR_CAST(path.dentry);
@@ -711,6 +740,9 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 				       cachefiles_trunc_expand_tmpfile);
 		ret = vfs_truncate(&path, ni_size);
 		if (ret < 0) {
+			trace_cachefiles_vfs_error(
+				object, d_backing_inode(path.dentry), ret,
+				cachefiles_trace_trunc_error);
 			file = ERR_PTR(ret);
 			goto out_dput;
 		}
@@ -718,8 +750,12 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 
 	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
 				   d_backing_inode(path.dentry), cache->cache_cred);
-	if (IS_ERR(file))
+	if (IS_ERR(file)) {
+		trace_cachefiles_vfs_error(object, d_backing_inode(path.dentry),
+					   PTR_ERR(file),
+					   cachefiles_trace_open_error);
 		goto out_dput;
+	}
 	if (unlikely(!file->f_op->read_iter) ||
 	    unlikely(!file->f_op->write_iter)) {
 		fput(file);
@@ -750,6 +786,8 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
 	dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
 	if (IS_ERR(dentry)) {
+		trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(dentry),
+					   cachefiles_trace_lookup_error);
 		_debug("lookup fail %ld", PTR_ERR(dentry));
 		goto out_unlock;
 	}
@@ -761,12 +799,17 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 		}
 
 		ret = cachefiles_unlink(object, fan, dentry, FSCACHE_OBJECT_IS_STALE);
-		if (ret < 0)
+		if (ret < 0) {
+			trace_cachefiles_vfs_error(object, d_inode(fan), ret,
+						   cachefiles_trace_unlink_error);
 			goto out_dput;
+		}
 
 		dput(dentry);
 		dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
 		if (IS_ERR(dentry)) {
+			trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(dentry),
+						   cachefiles_trace_lookup_error);
 			_debug("lookup fail %ld", PTR_ERR(dentry));
 			goto out_unlock;
 		}
@@ -775,6 +818,8 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 	ret = vfs_link(object->file->f_path.dentry, &init_user_ns,
 		       d_inode(fan), dentry, NULL);
 	if (ret < 0) {
+		trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(dentry),
+					   cachefiles_trace_link_error);
 		_debug("link fail %d", ret);
 	} else {
 		trace_cachefiles_link(object, file_inode(object->file));
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 30adca42883b..e0f77329c3ec 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -61,6 +61,8 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
 			   buf, sizeof(struct cachefiles_xattr) + len, 0);
 	if (ret < 0) {
+		trace_cachefiles_vfs_error(object, file_inode(file), ret,
+					   cachefiles_trace_setxattr_error);
 		trace_cachefiles_coherency(object, file_inode(file)->i_ino,
 					   buf->content,
 					   cachefiles_coherency_set_fail);
@@ -99,6 +101,9 @@ int cachefiles_check_auxdata(struct cachefiles_object *object, struct file *file
 
 	xlen = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
 	if (xlen != tlen) {
+		if (xlen < 0)
+			trace_cachefiles_vfs_error(object, file_inode(file), xlen,
+						   cachefiles_trace_getxattr_error);
 		if (xlen == -EIO)
 			cachefiles_io_error_obj(
 				object,
@@ -129,12 +134,15 @@ int cachefiles_check_auxdata(struct cachefiles_object *object, struct file *file
  * remove the object's xattr to mark it stale
  */
 int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
+				   struct cachefiles_object *object,
 				   struct dentry *dentry)
 {
 	int ret;
 
 	ret = vfs_removexattr(&init_user_ns, dentry, cachefiles_xattr_cache);
 	if (ret < 0) {
+		trace_cachefiles_vfs_error(object, d_inode(dentry), ret,
+					   cachefiles_trace_remxattr_error);
 		if (ret == -ENOENT || ret == -ENODATA)
 			ret = 0;
 		else if (ret != -ENOMEM)
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index 11447b20fc83..6a41a36ce581 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -72,6 +72,26 @@ enum cachefiles_prepare_read_trace {
 	cachefiles_trace_read_seek_nxio,
 };
 
+enum cachefiles_error_trace {
+	cachefiles_trace_fallocate_error,
+	cachefiles_trace_getxattr_error,
+	cachefiles_trace_link_error,
+	cachefiles_trace_lookup_error,
+	cachefiles_trace_mkdir_error,
+	cachefiles_trace_notify_change_error,
+	cachefiles_trace_open_error,
+	cachefiles_trace_read_error,
+	cachefiles_trace_remxattr_error,
+	cachefiles_trace_rename_error,
+	cachefiles_trace_seek_error,
+	cachefiles_trace_setxattr_error,
+	cachefiles_trace_statfs_error,
+	cachefiles_trace_tmpfile_error,
+	cachefiles_trace_trunc_error,
+	cachefiles_trace_unlink_error,
+	cachefiles_trace_write_error,
+};
+
 #endif
 
 /*
@@ -126,6 +146,25 @@ enum cachefiles_prepare_read_trace {
 	EM(cachefiles_trace_read_seek_error,	"seek-error")		\
 	E_(cachefiles_trace_read_seek_nxio,	"seek-enxio")
 
+#define cachefiles_error_traces						\
+	EM(cachefiles_trace_fallocate_error,	"fallocate")		\
+	EM(cachefiles_trace_getxattr_error,	"getxattr")		\
+	EM(cachefiles_trace_link_error,		"link")			\
+	EM(cachefiles_trace_lookup_error,	"lookup")		\
+	EM(cachefiles_trace_mkdir_error,	"mkdir")		\
+	EM(cachefiles_trace_notify_change_error, "notify_change")	\
+	EM(cachefiles_trace_open_error,		"open")			\
+	EM(cachefiles_trace_read_error,		"read")			\
+	EM(cachefiles_trace_remxattr_error,	"remxattr")		\
+	EM(cachefiles_trace_rename_error,	"rename")		\
+	EM(cachefiles_trace_seek_error,		"seek")			\
+	EM(cachefiles_trace_setxattr_error,	"setxattr")		\
+	EM(cachefiles_trace_statfs_error,	"statfs")		\
+	EM(cachefiles_trace_tmpfile_error,	"tmpfile")		\
+	EM(cachefiles_trace_trunc_error,	"trunc")		\
+	EM(cachefiles_trace_unlink_error,	"unlink")		\
+	E_(cachefiles_trace_write_error,	"write")
+
 
 /*
  * Export enum symbols via userspace.
@@ -140,6 +179,7 @@ cachefiles_obj_ref_traces;
 cachefiles_coherency_traces;
 cachefiles_trunc_traces;
 cachefiles_prepare_read_traces;
+cachefiles_error_traces;
 
 /*
  * Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -496,6 +536,60 @@ TRACE_EVENT(cachefiles_trunc,
 		      __entry->to)
 	    );
 
+TRACE_EVENT(cachefiles_vfs_error,
+	    TP_PROTO(struct cachefiles_object *obj, struct inode *backer,
+		     int error, enum cachefiles_error_trace where),
+
+	    TP_ARGS(obj, backer, error, where),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
+		    __field(enum cachefiles_error_trace,	where	)
+		    __field(short,				error	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj ? obj->debug_id : 0;
+		    __entry->backer	= backer->i_ino;
+		    __entry->error	= error;
+		    __entry->where	= where;
+			   ),
+
+	    TP_printk("o=%08x b=%08x %s e=%d",
+		      __entry->obj,
+		      __entry->backer,
+		      __print_symbolic(__entry->where, cachefiles_error_traces),
+		      __entry->error)
+	    );
+
+TRACE_EVENT(cachefiles_io_error,
+	    TP_PROTO(struct cachefiles_object *obj, struct inode *backer,
+		     int error, enum cachefiles_error_trace where),
+
+	    TP_ARGS(obj, backer, error, where),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,			obj	)
+		    __field(unsigned int,			backer	)
+		    __field(enum cachefiles_error_trace,	where	)
+		    __field(short,				error	)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->obj	= obj ? obj->debug_id : 0;
+		    __entry->backer	= backer->i_ino;
+		    __entry->error	= error;
+		    __entry->where	= where;
+			   ),
+
+	    TP_printk("o=%08x b=%08x %s e=%d",
+		      __entry->obj,
+		      __entry->backer,
+		      __print_symbolic(__entry->where, cachefiles_error_traces),
+		      __entry->error)
+	    );
+
 #endif /* _TRACE_CACHEFILES_H */
 
 /* This part must be outside protection */



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 66/67] cachefiles: Add error injection support
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (64 preceding siblings ...)
  2021-10-18 15:08 ` [PATCH 65/67] cachefiles: Add tracepoints to log errors from ops on the backing fs David Howells
@ 2021-10-18 15:08 ` David Howells
  2021-10-18 15:09 ` [PATCH 67/67] cifs: Support fscache indexing rewrite (untested) David Howells
                   ` (4 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:08 UTC (permalink / raw)
  To: linux-cachefs
  Cc: dhowells, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Add support for injecting ENOSPC or EIO errors.  This needs to be enabled
by CONFIG_CACHEFILES_ERROR_INJECTION=y.  Once enabled, ENOSPC on things
like write and mkdir can be triggered by:

	echo 1 >/proc/sys/cachefiles/error_injection

and EIO can be triggered on most operations by:

	echo 2 >/proc/sys/cachefiles/error_injection

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/Kconfig        |    8 ++++++
 fs/cachefiles/Makefile       |    2 +
 fs/cachefiles/error_inject.c |   46 ++++++++++++++++++++++++++++++++++
 fs/cachefiles/interface.c    |   21 +++++++++++----
 fs/cachefiles/internal.h     |   39 +++++++++++++++++++++++++++++
 fs/cachefiles/io.c           |   34 ++++++++++++++++++-------
 fs/cachefiles/main.c         |    7 +++++
 fs/cachefiles/namei.c        |   57 +++++++++++++++++++++++++++++++++---------
 fs/cachefiles/xattr.c        |   14 +++++++---
 9 files changed, 197 insertions(+), 31 deletions(-)
 create mode 100644 fs/cachefiles/error_inject.c

diff --git a/fs/cachefiles/Kconfig b/fs/cachefiles/Kconfig
index 6827b40f7ddc..c19515dd253e 100644
--- a/fs/cachefiles/Kconfig
+++ b/fs/cachefiles/Kconfig
@@ -19,3 +19,11 @@ config CACHEFILES_DEBUG
 	  caching on files module.  If this is set, the debugging output may be
 	  enabled by setting bits in /sys/modules/cachefiles/parameter/debug or
 	  by including a debugging specifier in /etc/cachefilesd.conf.
+
+
+config CACHEFILES_ERROR_INJECTION
+	bool "Provide error injection for cachefiles"
+	depends on CACHEFILES && SYSCTL
+	help
+	  This permits error injection to be enabled in cachefiles whilst a
+	  cache is in service.
diff --git a/fs/cachefiles/Makefile b/fs/cachefiles/Makefile
index 9062767331e8..c0df4c42cb09 100644
--- a/fs/cachefiles/Makefile
+++ b/fs/cachefiles/Makefile
@@ -15,4 +15,6 @@ cachefiles-y := \
 	volume.o \
 	xattr.o
 
+cachefiles-$(CONFIG_CACHEFILES_ERROR_INJECTION) += error_inject.o
+
 obj-$(CONFIG_CACHEFILES) := cachefiles.o
diff --git a/fs/cachefiles/error_inject.c b/fs/cachefiles/error_inject.c
new file mode 100644
index 000000000000..58f8aec964e4
--- /dev/null
+++ b/fs/cachefiles/error_inject.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Error injection handling.
+ *
+ * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#include <linux/sysctl.h>
+#include "internal.h"
+
+unsigned int cachefiles_error_injection_state;
+
+static struct ctl_table_header *cachefiles_sysctl;
+static struct ctl_table cachefiles_sysctls[] = {
+	{
+		.procname	= "error_injection",
+		.data		= &cachefiles_error_injection_state,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec,
+	},
+	{}
+};
+
+static struct ctl_table cachefiles_sysctls_root[] = {
+	{
+		.procname	= "cachefiles",
+		.mode		= 0555,
+		.child		= cachefiles_sysctls,
+	},
+	{}
+};
+
+int __init cachefiles_register_error_injection(void)
+{
+	cachefiles_sysctl = register_sysctl_table(cachefiles_sysctls_root);
+	if (!cachefiles_sysctl)
+		return -ENOMEM;
+	return 0;
+
+}
+
+void cachefiles_unregister_error_injection(void)
+{
+	unregister_sysctl_table(cachefiles_sysctl);
+}
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 115be503b23a..3c9710ceae86 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -151,7 +151,9 @@ static bool cachefiles_shorten_object(struct cachefiles_object *object,
 
 	trace_cachefiles_trunc(object, inode, i_size, dio_size,
 			       cachefiles_trunc_shrink);
-	ret = vfs_truncate(&file->f_path, dio_size);
+	ret = cachefiles_inject_remove_error();
+	if (ret == 0)
+		ret = vfs_truncate(&file->f_path, dio_size);
 	if (ret < 0) {
 		trace_cachefiles_io_error(object, file_inode(file), ret,
 					  cachefiles_trace_trunc_error);
@@ -163,8 +165,10 @@ static bool cachefiles_shorten_object(struct cachefiles_object *object,
 	if (new_size < dio_size) {
 		trace_cachefiles_trunc(object, inode, dio_size, new_size,
 				       cachefiles_trunc_dio_adjust);
-		ret = vfs_fallocate(file, FALLOC_FL_ZERO_RANGE,
-				    new_size, dio_size);
+		ret = cachefiles_inject_write_error();
+		if (ret == 0)
+			ret = vfs_fallocate(file, FALLOC_FL_ZERO_RANGE,
+					    new_size, dio_size);
 		if (ret < 0) {
 			trace_cachefiles_io_error(object, file_inode(file), ret,
 						  cachefiles_trace_fallocate_error);
@@ -374,15 +378,20 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
 		_debug("discard tail %llx", oi_size);
 		newattrs.ia_valid = ATTR_SIZE;
 		newattrs.ia_size = oi_size & PAGE_MASK;
-		ret = notify_change(&init_user_ns, file->f_path.dentry,
-				    &newattrs, NULL);
+		ret = cachefiles_inject_remove_error();
+		if (ret == 0)
+			ret = notify_change(&init_user_ns, file->f_path.dentry,
+					    &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
 	}
 
 	newattrs.ia_valid = ATTR_SIZE;
 	newattrs.ia_size = ni_size;
-	ret = notify_change(&init_user_ns, file->f_path.dentry, &newattrs, NULL);
+	ret = cachefiles_inject_write_error();
+	if (ret == 0)
+		ret = notify_change(&init_user_ns, file->f_path.dentry,
+				    &newattrs, NULL);
 
 truncate_failed:
 	inode_unlock(file_inode(file));
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 3dd3e13989d6..096ee6376f2a 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -155,6 +155,45 @@ extern const struct file_operations cachefiles_daemon_fops;
 extern int cachefiles_has_space(struct cachefiles_cache *cache,
 				unsigned fnr, unsigned bnr);
 
+/*
+ * error_inject.c
+ */
+#ifdef CONFIG_CACHEFILES_ERROR_INJECTION
+extern unsigned int cachefiles_error_injection_state;
+extern int cachefiles_register_error_injection(void);
+extern void cachefiles_unregister_error_injection(void);
+
+#else
+#define cachefiles_error_injection_state 0
+
+static inline int cachefiles_register_error_injection(void)
+{
+	return 0;
+}
+
+static inline void cachefiles_unregister_error_injection(void)
+{
+}
+#endif
+
+
+static inline int cachefiles_inject_read_error(void)
+{
+	return cachefiles_error_injection_state & 2 ? -EIO : 0;
+}
+
+static inline int cachefiles_inject_write_error(void)
+{
+	return cachefiles_error_injection_state & 2 ? -EIO :
+		cachefiles_error_injection_state & 1 ? -ENOSPC :
+		0;
+}
+
+static inline int cachefiles_inject_remove_error(void)
+{
+	return cachefiles_error_injection_state & 2 ? -EIO : 0;
+}
+
 /*
  * interface.c
  */
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 136d0ea0a7c9..78e6ef781f73 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -100,7 +100,9 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 	if (read_hole != NETFS_READ_HOLE_IGNORE) {
 		loff_t off = start_pos, off2;
 
-		off2 = vfs_llseek(file, off, SEEK_DATA);
+		off2 = cachefiles_inject_read_error();
+		if (off2 == 0)
+			off2 = vfs_llseek(file, off, SEEK_DATA);
 		if (off2 < 0 && off2 >= (loff_t)-MAX_ERRNO && off2 != -ENXIO) {
 			skipped = 0;
 			ret = off2;
@@ -152,7 +154,9 @@ static int cachefiles_read(struct netfs_cache_resources *cres,
 
 	trace_cachefiles_read(object, file_inode(file), ki->iocb.ki_pos, len - skipped);
 	old_nofs = memalloc_nofs_save();
-	ret = vfs_iocb_iter_read(file, &ki->iocb, iter);
+	ret = cachefiles_inject_read_error();
+	if (ret == 0)
+		ret = vfs_iocb_iter_read(file, &ki->iocb, iter);
 	memalloc_nofs_restore(old_nofs);
 	switch (ret) {
 	case -EIOCBQUEUED:
@@ -273,7 +277,9 @@ static int cachefiles_write(struct netfs_cache_resources *cres,
 
 	trace_cachefiles_write(object, inode, ki->iocb.ki_pos, len);
 	old_nofs = memalloc_nofs_save();
-	ret = vfs_iocb_iter_write(file, &ki->iocb, iter);
+	ret = cachefiles_inject_write_error();
+	if (ret == 0)
+		ret = vfs_iocb_iter_write(file, &ki->iocb, iter);
 	memalloc_nofs_restore(old_nofs);
 	switch (ret) {
 	case -EIOCBQUEUED:
@@ -355,7 +361,9 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 	cache = object->volume->cache;
 	cachefiles_begin_secure(cache, &saved_cred);
 
-	off = vfs_llseek(file, subreq->start, SEEK_DATA);
+	off = cachefiles_inject_read_error();
+	if (off == 0)
+		off = vfs_llseek(file, subreq->start, SEEK_DATA);
 	if (off < 0 && off >= (loff_t)-MAX_ERRNO) {
 		if (off == (loff_t)-ENXIO) {
 			why = cachefiles_trace_read_seek_nxio;
@@ -379,7 +387,9 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque
 		goto download_and_store;
 	}
 
-	to = vfs_llseek(file, subreq->start, SEEK_HOLE);
+	to = cachefiles_inject_read_error();
+	if (to == 0)
+		to = vfs_llseek(file, subreq->start, SEEK_HOLE);
 	if (to < 0 && to >= (loff_t)-MAX_ERRNO) {
 		trace_cachefiles_io_error(object, file_inode(file), to,
 					  cachefiles_trace_seek_error);
@@ -434,7 +444,9 @@ static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
 	if (no_space_allocated_yet)
 		goto check_space;
 
-	pos = vfs_llseek(file, *_start, SEEK_DATA);
+	pos = cachefiles_inject_read_error();
+	if (pos == 0)
+		pos = vfs_llseek(file, *_start, SEEK_DATA);
 	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO) {
 		if (pos == -ENXIO)
 			goto check_space; /* Unallocated tail */
@@ -452,7 +464,9 @@ static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
 	if (cachefiles_has_space(cache, 0, *_len / PAGE_SIZE) == 0)
 		return 0; /* Enough space to simply overwrite the whole block */
 
-	pos = vfs_llseek(file, *_start, SEEK_HOLE);
+	pos = cachefiles_inject_read_error();
+	if (pos == 0)
+		pos = vfs_llseek(file, *_start, SEEK_HOLE);
 	if (pos < 0 && pos >= (loff_t)-MAX_ERRNO) {
 		trace_cachefiles_io_error(object, file_inode(file), pos,
 					  cachefiles_trace_seek_error);
@@ -462,8 +476,10 @@ static int __cachefiles_prepare_write(struct netfs_cache_resources *cres,
 		return 0; /* Fully allocated */
 
 	/* Partially allocated, but insufficient space: cull. */
-	ret = vfs_fallocate(file, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-			    *_start, *_len);
+	pos = cachefiles_inject_remove_error();
+	if (pos == 0)
+		ret = vfs_fallocate(file, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+				    *_start, *_len);
 	if (ret < 0) {
 		trace_cachefiles_io_error(object, file_inode(file), ret,
 					  cachefiles_trace_fallocate_error);
diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c
index 522fda828563..83f8a221a890 100644
--- a/fs/cachefiles/main.c
+++ b/fs/cachefiles/main.c
@@ -46,6 +46,10 @@ static int __init cachefiles_init(void)
 {
 	int ret;
 
+	ret = cachefiles_register_error_injection();
+	if (ret < 0)
+		goto error_einj;
+
 	ret = misc_register(&cachefiles_dev);
 	if (ret < 0)
 		goto error_dev;
@@ -67,6 +71,8 @@ static int __init cachefiles_init(void)
 error_object_jar:
 	misc_deregister(&cachefiles_dev);
 error_dev:
+	cachefiles_unregister_error_injection();
+error_einj:
 	pr_err("failed to register: %d\n", ret);
 	return ret;
 }
@@ -82,6 +88,7 @@ static void __exit cachefiles_exit(void)
 
 	kmem_cache_destroy(cachefiles_object_jar);
 	misc_deregister(&cachefiles_dev);
+	cachefiles_unregister_error_injection();
 }
 
 module_exit(cachefiles_exit);
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 12266c90e5f8..686480120570 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -116,7 +116,9 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 			dget(rep); /* Stop the dentry being negated if it's
 				    * only pinned by a file struct.
 				    */
-			ret = vfs_unlink(&init_user_ns, d_inode(dir), rep, NULL);
+			ret = cachefiles_inject_remove_error();
+			if (ret == 0)
+				ret = vfs_unlink(&init_user_ns, d_inode(dir), rep, NULL);
 			dput(rep);
 		}
 
@@ -230,7 +232,9 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 			.new_dentry	= grave,
 		};
 		trace_cachefiles_rename(object, rep, grave, why);
-		ret = vfs_rename(&rd);
+		ret = cachefiles_inject_read_error();
+		if (ret == 0)
+			ret = vfs_rename(&rd);
 		if (ret != 0)
 			trace_cachefiles_vfs_error(object, d_inode(dir),
 						   PTR_ERR(grave),
@@ -258,6 +262,8 @@ static int cachefiles_unlink(struct cachefiles_object *object,
 
 	trace_cachefiles_unlink(object, dentry, why);
 	ret = security_path_unlink(&path, dentry);
+	if (ret == 0)
+		ret = cachefiles_inject_remove_error();
 	if (ret == 0)
 		ret = vfs_unlink(&init_user_ns, d_backing_inode(fan), dentry, NULL);
 	if (ret != 0)
@@ -400,7 +406,12 @@ bool cachefiles_look_up_object(struct cachefiles_object *object)
 	_enter("OBJ%x,%s,", object->debug_id, object->d_name);
 
 	/* Look up path "cache/vol/fanout/file". */
-	dentry = lookup_positive_unlocked(object->d_name, fan, object->d_name_len);
+	ret = cachefiles_inject_read_error();
+	if (ret == 0)
+		dentry = lookup_positive_unlocked(object->d_name, fan,
+						  object->d_name_len);
+	else
+		dentry = ERR_PTR(ret);
 	trace_cachefiles_lookup(object, dentry);
 	if (IS_ERR(dentry)) {
 		if (dentry == ERR_PTR(-ENOENT))
@@ -449,7 +460,11 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 	inode_lock(d_inode(dir));
 
 retry:
-	subdir = lookup_one_len(dirname, dir, strlen(dirname));
+	ret = cachefiles_inject_read_error();
+	if (ret == 0)
+		subdir = lookup_one_len(dirname, dir, strlen(dirname));
+	else
+		subdir = ERR_PTR(ret);
 	if (IS_ERR(subdir)) {
 		trace_cachefiles_vfs_error(NULL, d_backing_inode(dir),
 					   PTR_ERR(subdir),
@@ -477,7 +492,9 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		ret = security_path_mkdir(&path, subdir, 0700);
 		if (ret < 0)
 			goto mkdir_error;
-		ret = vfs_mkdir(&init_user_ns, d_inode(dir), subdir, 0700);
+		ret = cachefiles_inject_write_error();
+		if (ret == 0)
+			ret = vfs_mkdir(&init_user_ns, d_inode(dir), subdir, 0700);
 		if (ret < 0) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
 						   cachefiles_trace_mkdir_error);
@@ -717,8 +734,12 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 
 	cachefiles_begin_secure(cache, &saved_cred);
 
-	path.mnt = cache->mnt,
-	path.dentry = vfs_tmpfile(&init_user_ns, fan, S_IFREG, O_RDWR);
+	path.mnt = cache->mnt;
+	ret = cachefiles_inject_write_error();
+	if (ret == 0)
+		path.dentry = vfs_tmpfile(&init_user_ns, fan, S_IFREG, O_RDWR);
+	else
+		path.dentry = ERR_PTR(ret);
 	if (IS_ERR(path.dentry)) {
 		trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(path.dentry),
 					   cachefiles_trace_tmpfile_error);
@@ -738,7 +759,9 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
 	if (ni_size > 0) {
 		trace_cachefiles_trunc(object, d_backing_inode(path.dentry), 0, ni_size,
 				       cachefiles_trunc_expand_tmpfile);
-		ret = vfs_truncate(&path, ni_size);
+		ret = cachefiles_inject_write_error();
+		if (ret == 0)
+			ret = vfs_truncate(&path, ni_size);
 		if (ret < 0) {
 			trace_cachefiles_vfs_error(
 				object, d_backing_inode(path.dentry), ret,
@@ -784,7 +807,11 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 	_enter(",%pD", object->file);
 
 	inode_lock_nested(d_inode(fan), I_MUTEX_PARENT);
-	dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+	ret = cachefiles_inject_read_error();
+	if (ret == 0)
+		dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+	else
+		dentry = ERR_PTR(ret);
 	if (IS_ERR(dentry)) {
 		trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(dentry),
 					   cachefiles_trace_lookup_error);
@@ -806,7 +833,11 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 		}
 
 		dput(dentry);
-		dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+		ret = cachefiles_inject_read_error();
+		if (ret == 0)
+			dentry = lookup_one_len(object->d_name, fan, object->d_name_len);
+		else
+			dentry = ERR_PTR(ret);
 		if (IS_ERR(dentry)) {
 			trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(dentry),
 						   cachefiles_trace_lookup_error);
@@ -815,8 +846,10 @@ bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache,
 		}
 	}
 
-	ret = vfs_link(object->file->f_path.dentry, &init_user_ns,
-		       d_inode(fan), dentry, NULL);
+	ret = cachefiles_inject_read_error();
+	if (ret == 0)
+		ret = vfs_link(object->file->f_path.dentry, &init_user_ns,
+			       d_inode(fan), dentry, NULL);
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(object, d_inode(fan), PTR_ERR(dentry),
 					   cachefiles_trace_link_error);
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index e0f77329c3ec..2555b82be7e2 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -58,8 +58,10 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	if (len > 0)
 		memcpy(buf->data, fscache_get_aux(object->cookie), len);
 
-	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
-			   buf, sizeof(struct cachefiles_xattr) + len, 0);
+	ret = cachefiles_inject_write_error();
+	if (ret == 0)
+		ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
+				   buf, sizeof(struct cachefiles_xattr) + len, 0);
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(object, file_inode(file), ret,
 					   cachefiles_trace_setxattr_error);
@@ -99,7 +101,9 @@ int cachefiles_check_auxdata(struct cachefiles_object *object, struct file *file
 	if (!buf)
 		return -ENOMEM;
 
-	xlen = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
+	xlen = cachefiles_inject_read_error();
+	if (xlen == 0)
+		xlen = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
 	if (xlen != tlen) {
 		if (xlen < 0)
 			trace_cachefiles_vfs_error(object, file_inode(file), xlen,
@@ -139,7 +143,9 @@ int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 {
 	int ret;
 
-	ret = vfs_removexattr(&init_user_ns, dentry, cachefiles_xattr_cache);
+	ret = cachefiles_inject_remove_error();
+	if (ret == 0)
+		ret = vfs_removexattr(&init_user_ns, dentry, cachefiles_xattr_cache);
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(object, d_inode(dentry), ret,
 					   cachefiles_trace_remxattr_error);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH 67/67] cifs: Support fscache indexing rewrite (untested)
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (65 preceding siblings ...)
  2021-10-18 15:08 ` [PATCH 66/67] cachefiles: Add error injection support David Howells
@ 2021-10-18 15:09 ` David Howells
  2021-10-19 13:29 ` [Linux-cachefs] [PATCH 00/67] fscache: Rewrite index API and management system Marc Dionne
                   ` (3 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-18 15:09 UTC (permalink / raw)
  To: linux-cachefs
  Cc: Jeff Layton, Steve French, Shyam Prasad N, linux-cifs, dhowells,
	Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Change the cifs filesystem to take account of the changes to fscache's
indexing rewrite and reenable caching in cifs.

The following changes have been made:

 (1) The fscache_netfs struct is no more, and there's no need to register
     the filesystem as a whole.

 (2) The session cookie is now an fscache_volume cookie, allocated with
     fscache_acquire_volume().  That takes three parameters: a string
     representing the "volume" in the index, a string naming the cache to
     use (or NULL) and a u64 that conveys coherency metadata for the
     volume.

     For cifs, I've made it render the volume name string as:

	"cifs,<ipaddress>,<sharename>"

     where the sharename has '/' characters replaced with ';'.

     This probably needs rethinking a bit as the total name could exceed
     the maximum filename component length.

     Further, the coherency data is currently just set to 0.  It needs
     something else doing with it - I wonder if it would suffice simply to
     sum the resource_id, vol_create_time and vol_serial_number or maybe
     hash them.

 (3) The fscache_cookie_def is no more and needed information is passed
     directly to fscache_acquire_cookie().  The cache no longer calls back
     into the filesystem, but rather metadata changes are indicated at
     other times.

     fscache_acquire_cookie() is passed the same keying and coherency
     information as before.

 (4) The functions to set/reset cookies are removed and
     fscache_use_cookie() and fscache_unuse_cookie() are used instead.

     fscache_use_cookie() is passed a flag to indicate if the cookie is
     opened for writing.  fscache_unuse_cookie() is passed updates for the
     metadata if we changed it (ie. if the file was opened for writing).

     These are called when the file is opened or closed.

 (5) cifs_setattr_*() are made to call fscache_resize() to change the size
     of the cache object.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: linux-cifs@vger.kernel.org
---

 fs/cifs/Kconfig    |    2 -
 fs/cifs/Makefile   |    2 -
 fs/cifs/cache.c    |  105 ---------------------------
 fs/cifs/cifsfs.c   |   11 +--
 fs/cifs/cifsglob.h |    5 -
 fs/cifs/connect.c  |    3 -
 fs/cifs/file.c     |   37 ++++++++--
 fs/cifs/fscache.c  |  201 +++++++++++++---------------------------------------
 fs/cifs/fscache.h  |   51 +++++--------
 fs/cifs/inode.c    |   18 +++--
 10 files changed, 117 insertions(+), 318 deletions(-)
 delete mode 100644 fs/cifs/cache.c

diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
index c5477abbcff0..3b7e3b9e4fd2 100644
--- a/fs/cifs/Kconfig
+++ b/fs/cifs/Kconfig
@@ -188,7 +188,7 @@ config CIFS_SMB_DIRECT
 
 config CIFS_FSCACHE
 	bool "Provide CIFS client caching support"
-	depends on CIFS=m && FSCACHE_OLD || CIFS=y && FSCACHE_OLD=y
+	depends on CIFS=m && FSCACHE || CIFS=y && FSCACHE=y
 	help
 	  Makes CIFS FS-Cache capable. Say Y here if you want your CIFS data
 	  to be cached locally on disk through the general filesystem cache
diff --git a/fs/cifs/Makefile b/fs/cifs/Makefile
index 87fcacdf3de7..cc8fdcb35b71 100644
--- a/fs/cifs/Makefile
+++ b/fs/cifs/Makefile
@@ -25,7 +25,7 @@ cifs-$(CONFIG_CIFS_DFS_UPCALL) += cifs_dfs_ref.o dfs_cache.o
 
 cifs-$(CONFIG_CIFS_SWN_UPCALL) += netlink.o cifs_swn.o
 
-cifs-$(CONFIG_CIFS_FSCACHE) += fscache.o cache.o
+cifs-$(CONFIG_CIFS_FSCACHE) += fscache.o
 
 cifs-$(CONFIG_CIFS_SMB_DIRECT) += smbdirect.o
 
diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c
deleted file mode 100644
index 8be57aaedab6..000000000000
--- a/fs/cifs/cache.c
+++ /dev/null
@@ -1,105 +0,0 @@
-// SPDX-License-Identifier: LGPL-2.1
-/*
- *   CIFS filesystem cache index structure definitions
- *
- *   Copyright (c) 2010 Novell, Inc.
- *   Authors(s): Suresh Jayaraman (sjayaraman@suse.de>
- *
- */
-#include "fscache.h"
-#include "cifs_debug.h"
-
-/*
- * CIFS filesystem definition for FS-Cache
- */
-struct fscache_netfs cifs_fscache_netfs = {
-	.name = "cifs",
-	.version = 0,
-};
-
-/*
- * Register CIFS for caching with FS-Cache
- */
-int cifs_fscache_register(void)
-{
-	return fscache_register_netfs(&cifs_fscache_netfs);
-}
-
-/*
- * Unregister CIFS for caching
- */
-void cifs_fscache_unregister(void)
-{
-	fscache_unregister_netfs(&cifs_fscache_netfs);
-}
-
-/*
- * Server object for FS-Cache
- */
-const struct fscache_cookie_def cifs_fscache_server_index_def = {
-	.name = "CIFS.server",
-	.type = FSCACHE_COOKIE_TYPE_INDEX,
-};
-
-static enum
-fscache_checkaux cifs_fscache_super_check_aux(void *cookie_netfs_data,
-					      const void *data,
-					      uint16_t datalen,
-					      loff_t object_size)
-{
-	struct cifs_fscache_super_auxdata auxdata;
-	const struct cifs_tcon *tcon = cookie_netfs_data;
-
-	if (datalen != sizeof(auxdata))
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	memset(&auxdata, 0, sizeof(auxdata));
-	auxdata.resource_id = tcon->resource_id;
-	auxdata.vol_create_time = tcon->vol_create_time;
-	auxdata.vol_serial_number = tcon->vol_serial_number;
-
-	if (memcmp(data, &auxdata, datalen) != 0)
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	return FSCACHE_CHECKAUX_OKAY;
-}
-
-/*
- * Superblock object for FS-Cache
- */
-const struct fscache_cookie_def cifs_fscache_super_index_def = {
-	.name = "CIFS.super",
-	.type = FSCACHE_COOKIE_TYPE_INDEX,
-	.check_aux = cifs_fscache_super_check_aux,
-};
-
-static enum
-fscache_checkaux cifs_fscache_inode_check_aux(void *cookie_netfs_data,
-					      const void *data,
-					      uint16_t datalen,
-					      loff_t object_size)
-{
-	struct cifs_fscache_inode_auxdata auxdata;
-	struct cifsInodeInfo *cifsi = cookie_netfs_data;
-
-	if (datalen != sizeof(auxdata))
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	memset(&auxdata, 0, sizeof(auxdata));
-	auxdata.eof = cifsi->server_eof;
-	auxdata.last_write_time_sec = cifsi->vfs_inode.i_mtime.tv_sec;
-	auxdata.last_change_time_sec = cifsi->vfs_inode.i_ctime.tv_sec;
-	auxdata.last_write_time_nsec = cifsi->vfs_inode.i_mtime.tv_nsec;
-	auxdata.last_change_time_nsec = cifsi->vfs_inode.i_ctime.tv_nsec;
-
-	if (memcmp(data, &auxdata, datalen) != 0)
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	return FSCACHE_CHECKAUX_OKAY;
-}
-
-const struct fscache_cookie_def cifs_fscache_inode_object_def = {
-	.name		= "CIFS.uniqueid",
-	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
-	.check_aux	= cifs_fscache_inode_check_aux,
-};
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 9fa930dfd78d..d44a587a4c32 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -397,6 +397,8 @@ static void
 cifs_evict_inode(struct inode *inode)
 {
 	truncate_inode_pages_final(&inode->i_data);
+	if (inode->i_state & I_PINNING_FSCACHE_WB)
+		cifs_fscache_unuse_inode_cookie(inode, true);
 	clear_inode(inode);
 }
 
@@ -1625,13 +1627,9 @@ init_cifs(void)
 		goto out_destroy_cifsoplockd_wq;
 	}
 
-	rc = cifs_fscache_register();
-	if (rc)
-		goto out_destroy_deferredclose_wq;
-
 	rc = cifs_init_inodecache();
 	if (rc)
-		goto out_unreg_fscache;
+		goto out_destroy_deferredclose_wq;
 
 	rc = cifs_init_mids();
 	if (rc)
@@ -1693,8 +1691,6 @@ init_cifs(void)
 	cifs_destroy_mids();
 out_destroy_inodecache:
 	cifs_destroy_inodecache();
-out_unreg_fscache:
-	cifs_fscache_unregister();
 out_destroy_deferredclose_wq:
 	destroy_workqueue(deferredclose_wq);
 out_destroy_cifsoplockd_wq:
@@ -1730,7 +1726,6 @@ exit_cifs(void)
 	cifs_destroy_request_bufs();
 	cifs_destroy_mids();
 	cifs_destroy_inodecache();
-	cifs_fscache_unregister();
 	destroy_workqueue(deferredclose_wq);
 	destroy_workqueue(cifsoplockd_wq);
 	destroy_workqueue(decrypt_wq);
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index e916470468ea..d749250af775 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -653,9 +653,6 @@ struct TCP_Server_Info {
 	unsigned int total_read; /* total amount of data read in this pass */
 	atomic_t in_send; /* requests trying to send */
 	atomic_t num_waiters;   /* blocked waiting to get in sendrecv */
-#ifdef CONFIG_CIFS_FSCACHE
-	struct fscache_cookie   *fscache; /* client index cache cookie */
-#endif
 #ifdef CONFIG_CIFS_STATS2
 	atomic_t num_cmds[NUMBER_OF_SMB2_COMMANDS]; /* total requests by cmd */
 	atomic_t smb2slowcmd[NUMBER_OF_SMB2_COMMANDS]; /* count resps > 1 sec */
@@ -1084,7 +1081,7 @@ struct cifs_tcon {
 	__u32 max_bytes_copy;
 #ifdef CONFIG_CIFS_FSCACHE
 	u64 resource_id;		/* server resource id */
-	struct fscache_cookie *fscache;	/* cookie for share */
+	struct fscache_volume *fscache;	/* cookie for share */
 #endif
 	struct list_head pending_opens;	/* list of incomplete opens */
 	struct cached_fid crfid; /* Cached root fid */
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index c3b94c1e4591..db13e0f8bc4c 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1331,7 +1331,6 @@ cifs_put_tcp_session(struct TCP_Server_Info *server, int from_reconnect)
 	spin_unlock(&GlobalMid_Lock);
 
 	cifs_crypto_secmech_release(server);
-	cifs_fscache_release_client_cookie(server);
 
 	kfree(server->session_key.response);
 	server->session_key.response = NULL;
@@ -1477,8 +1476,6 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx)
 	list_add(&tcp_ses->tcp_ses_list, &cifs_tcp_ses_list);
 	spin_unlock(&cifs_tcp_ses_lock);
 
-	cifs_fscache_get_client_cookie(tcp_ses);
-
 	/* queue echo request delayed work */
 	queue_delayed_work(cifsiod_wq, &tcp_ses->echo, tcp_ses->echo_interval);
 
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 02894e999c56..04c841354077 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -632,7 +632,18 @@ int cifs_open(struct inode *inode, struct file *file)
 		goto out;
 	}
 
-	cifs_fscache_set_inode_cookie(inode, file);
+
+	fscache_use_cookie(cifs_inode_cookie(file_inode(file)),
+			   file->f_mode & FMODE_WRITE);
+	if (file->f_flags & O_DIRECT &&
+	    (!((file->f_flags & O_ACCMODE) != O_RDONLY) ||
+	     file->f_flags & O_APPEND)) {
+		struct cifs_fscache_inode_auxdata auxdata;
+		cifs_fscache_fill_auxdata(file_inode(file), &auxdata);
+		fscache_invalidate(cifs_inode_cookie(file_inode(file)),
+				   &auxdata, i_size_read(file_inode(file)),
+				   FSCACHE_INVAL_DIO_WRITE);
+	}
 
 	if ((oplock & CIFS_CREATE_ACTION) && !posix_open_ok && tcon->unix_ext) {
 		/*
@@ -876,6 +887,8 @@ int cifs_close(struct inode *inode, struct file *file)
 	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
 	struct cifs_deferred_close *dclose;
 
+	cifs_fscache_unuse_inode_cookie(inode, file->f_mode & FMODE_WRITE);
+
 	if (file->private_data != NULL) {
 		cfile = file->private_data;
 		file->private_data = NULL;
@@ -886,7 +899,6 @@ int cifs_close(struct inode *inode, struct file *file)
 		    dclose) {
 			if (test_and_clear_bit(CIFS_INO_MODIFIED_ATTR, &cinode->flags)) {
 				inode->i_ctime = inode->i_mtime = current_time(inode);
-				cifs_fscache_update_inode_cookie(inode);
 			}
 			spin_lock(&cinode->deferred_lock);
 			cifs_add_deferred_close(cfile, dclose);
@@ -4787,14 +4799,14 @@ static int cifs_release_page(struct page *page, gfp_t gfp)
 			return false;
 		wait_on_page_fscache(page);
 	}
+	fscache_note_page_release(cifs_inode_cookie(page->mapping->host));
 	return true;
 }
 
 static void cifs_invalidate_page(struct page *page, unsigned int offset,
 				 unsigned int length)
 {
-	if (offset == 0 && length == PAGE_SIZE)
-		wait_on_page_fscache(page);
+	wait_on_page_fscache(page);
 }
 
 static int cifs_launder_page(struct page *page)
@@ -4971,6 +4983,19 @@ static void cifs_swap_deactivate(struct file *file)
 	/* do we need to unpin (or unlock) the file */
 }
 
+/*
+ * Mark a page as having been made dirty and thus needing writeback.  We also
+ * need to pin the cache object to write back to.
+ */
+#ifdef CONFIG_CIFS_FSCACHE
+static int cifs_set_page_dirty(struct page *page)
+{
+	return fscache_set_page_dirty(page, cifs_inode_cookie(page->mapping->host));
+}
+#else
+#define cifs_set_page_dirty __set_page_dirty_nobuffers
+#endif
+
 const struct address_space_operations cifs_addr_ops = {
 	.readpage = cifs_readpage,
 	.readpages = cifs_readpages,
@@ -4978,7 +5003,7 @@ const struct address_space_operations cifs_addr_ops = {
 	.writepages = cifs_writepages,
 	.write_begin = cifs_write_begin,
 	.write_end = cifs_write_end,
-	.set_page_dirty = __set_page_dirty_nobuffers,
+	.set_page_dirty = cifs_set_page_dirty,
 	.releasepage = cifs_release_page,
 	.direct_IO = cifs_direct_io,
 	.invalidatepage = cifs_invalidate_page,
@@ -5003,7 +5028,7 @@ const struct address_space_operations cifs_addr_ops_smallbuf = {
 	.writepages = cifs_writepages,
 	.write_begin = cifs_write_begin,
 	.write_end = cifs_write_end,
-	.set_page_dirty = __set_page_dirty_nobuffers,
+	.set_page_dirty = cifs_set_page_dirty,
 	.releasepage = cifs_release_page,
 	.invalidatepage = cifs_invalidate_page,
 	.launder_page = cifs_launder_page,
diff --git a/fs/cifs/fscache.c b/fs/cifs/fscache.c
index d6ff668c268a..fa397e734b5c 100644
--- a/fs/cifs/fscache.c
+++ b/fs/cifs/fscache.c
@@ -12,217 +12,114 @@
 #include "cifs_fs_sb.h"
 #include "cifsproto.h"
 
-/*
- * Key layout of CIFS server cache index object
- */
-struct cifs_server_key {
-	struct {
-		uint16_t	family;		/* address family */
-		__be16		port;		/* IP port */
-	} hdr;
-	union {
-		struct in_addr	ipv4_addr;
-		struct in6_addr	ipv6_addr;
-	};
-} __packed;
-
-/*
- * Get a cookie for a server object keyed by {IPaddress,port,family} tuple
- */
-void cifs_fscache_get_client_cookie(struct TCP_Server_Info *server)
+void cifs_fscache_get_super_cookie(struct cifs_tcon *tcon)
 {
-	const struct sockaddr *sa = (struct sockaddr *) &server->dstaddr;
-	const struct sockaddr_in *addr = (struct sockaddr_in *) sa;
-	const struct sockaddr_in6 *addr6 = (struct sockaddr_in6 *) sa;
-	struct cifs_server_key key;
-	uint16_t key_len = sizeof(key.hdr);
-
-	memset(&key, 0, sizeof(key));
+	struct cifs_fscache_super_auxdata auxdata;
+	struct TCP_Server_Info *server = tcon->ses->server;
+	const struct sockaddr *sa = (struct sockaddr *)&server->dstaddr;
+	size_t slen, i;
+	char *sharename;
+	char *key;
 
-	/*
-	 * Should not be a problem as sin_family/sin6_family overlays
-	 * sa_family field
-	 */
-	key.hdr.family = sa->sa_family;
+	tcon->fscache = NULL;
 	switch (sa->sa_family) {
 	case AF_INET:
-		key.hdr.port = addr->sin_port;
-		key.ipv4_addr = addr->sin_addr;
-		key_len += sizeof(key.ipv4_addr);
-		break;
-
 	case AF_INET6:
-		key.hdr.port = addr6->sin6_port;
-		key.ipv6_addr = addr6->sin6_addr;
-		key_len += sizeof(key.ipv6_addr);
 		break;
-
 	default:
 		cifs_dbg(VFS, "Unknown network family '%d'\n", sa->sa_family);
-		server->fscache = NULL;
 		return;
 	}
 
-	server->fscache =
-		fscache_acquire_cookie(cifs_fscache_netfs.primary_index,
-				       &cifs_fscache_server_index_def,
-				       &key, key_len,
-				       NULL, 0,
-				       server, 0, true);
-	cifs_dbg(FYI, "%s: (0x%p/0x%p)\n",
-		 __func__, server, server->fscache);
-}
-
-void cifs_fscache_release_client_cookie(struct TCP_Server_Info *server)
-{
-	cifs_dbg(FYI, "%s: (0x%p/0x%p)\n",
-		 __func__, server, server->fscache);
-	fscache_relinquish_cookie(server->fscache, NULL, false);
-	server->fscache = NULL;
-}
-
-void cifs_fscache_get_super_cookie(struct cifs_tcon *tcon)
-{
-	struct TCP_Server_Info *server = tcon->ses->server;
-	char *sharename;
-	struct cifs_fscache_super_auxdata auxdata;
-
 	sharename = extract_sharename(tcon->treeName);
 	if (IS_ERR(sharename)) {
 		cifs_dbg(FYI, "%s: couldn't extract sharename\n", __func__);
-		tcon->fscache = NULL;
 		return;
 	}
 
+	slen = strlen(sharename);
+	for (i = 0; i < slen; i++)
+		if (sharename[i] == '/')
+			sharename[i] = ';';
+
+	key = kasprintf(GFP_KERNEL, "cifs,%pISpc,%s", sa, sharename);
+	if (!key)
+		goto out;
+
 	memset(&auxdata, 0, sizeof(auxdata));
 	auxdata.resource_id = tcon->resource_id;
 	auxdata.vol_create_time = tcon->vol_create_time;
 	auxdata.vol_serial_number = tcon->vol_serial_number;
+	// TODO: Do something with the volume coherency data
 
-	tcon->fscache =
-		fscache_acquire_cookie(server->fscache,
-				       &cifs_fscache_super_index_def,
-				       sharename, strlen(sharename),
-				       &auxdata, sizeof(auxdata),
-				       tcon, 0, true);
+	tcon->fscache = fscache_acquire_volume(key,
+					       NULL, /* preferred_cache */
+					       0 /* coherency_data */);
+	cifs_dbg(FYI, "%s: (%s/0x%p)\n", __func__, key, tcon->fscache);
+
+	kfree(key);
+out:
 	kfree(sharename);
-	cifs_dbg(FYI, "%s: (0x%p/0x%p)\n",
-		 __func__, server->fscache, tcon->fscache);
 }
 
 void cifs_fscache_release_super_cookie(struct cifs_tcon *tcon)
 {
 	struct cifs_fscache_super_auxdata auxdata;
 
+	cifs_dbg(FYI, "%s: (0x%p)\n", __func__, tcon->fscache);
+
 	memset(&auxdata, 0, sizeof(auxdata));
 	auxdata.resource_id = tcon->resource_id;
 	auxdata.vol_create_time = tcon->vol_create_time;
 	auxdata.vol_serial_number = tcon->vol_serial_number;
+	// TODO: Do something with the volume coherency data
 
-	cifs_dbg(FYI, "%s: (0x%p)\n", __func__, tcon->fscache);
-	fscache_relinquish_cookie(tcon->fscache, &auxdata, false);
+	fscache_relinquish_volume(tcon->fscache,
+				  0, /* coherency_data */
+				  false);
 	tcon->fscache = NULL;
 }
 
-static void cifs_fscache_acquire_inode_cookie(struct cifsInodeInfo *cifsi,
-					      struct cifs_tcon *tcon)
+void cifs_fscache_get_inode_cookie(struct inode *inode)
 {
+	struct cifsInodeInfo *cifsi = CIFS_I(inode);
+	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+	struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
 	struct cifs_fscache_inode_auxdata auxdata;
 
-	memset(&auxdata, 0, sizeof(auxdata));
-	auxdata.eof = cifsi->server_eof;
-	auxdata.last_write_time_sec = cifsi->vfs_inode.i_mtime.tv_sec;
-	auxdata.last_change_time_sec = cifsi->vfs_inode.i_ctime.tv_sec;
-	auxdata.last_write_time_nsec = cifsi->vfs_inode.i_mtime.tv_nsec;
-	auxdata.last_change_time_nsec = cifsi->vfs_inode.i_ctime.tv_nsec;
+	cifs_fscache_fill_auxdata(&cifsi->vfs_inode, &auxdata);
 
 	cifsi->fscache =
-		fscache_acquire_cookie(tcon->fscache,
-				       &cifs_fscache_inode_object_def,
+		fscache_acquire_cookie(tcon->fscache, 0,
 				       &cifsi->uniqueid, sizeof(cifsi->uniqueid),
 				       &auxdata, sizeof(auxdata),
-				       cifsi, cifsi->vfs_inode.i_size, true);
+				       cifsi->vfs_inode.i_size);
 }
 
-static void cifs_fscache_enable_inode_cookie(struct inode *inode)
+void cifs_fscache_unuse_inode_cookie(struct inode *inode, bool update)
 {
-	struct cifsInodeInfo *cifsi = CIFS_I(inode);
-	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
-	struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
-
-	if (cifsi->fscache)
-		return;
-
-	if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_FSCACHE))
-		return;
-
-	cifs_fscache_acquire_inode_cookie(cifsi, tcon);
-
-	cifs_dbg(FYI, "%s: got FH cookie (0x%p/0x%p)\n",
-		 __func__, tcon->fscache, cifsi->fscache);
+	if (update) {
+		struct cifs_fscache_inode_auxdata auxdata;
+		loff_t i_size = i_size_read(inode);
+
+		cifs_fscache_fill_auxdata(inode, &auxdata);
+		fscache_unuse_cookie(cifs_inode_cookie(inode), &auxdata, &i_size);
+	} else {
+		fscache_unuse_cookie(cifs_inode_cookie(inode), NULL, NULL);
+	}
 }
 
 void cifs_fscache_release_inode_cookie(struct inode *inode)
 {
-	struct cifs_fscache_inode_auxdata auxdata;
 	struct cifsInodeInfo *cifsi = CIFS_I(inode);
 
 	if (cifsi->fscache) {
-		memset(&auxdata, 0, sizeof(auxdata));
-		auxdata.eof = cifsi->server_eof;
-		auxdata.last_write_time_sec = cifsi->vfs_inode.i_mtime.tv_sec;
-		auxdata.last_change_time_sec = cifsi->vfs_inode.i_ctime.tv_sec;
-		auxdata.last_write_time_nsec = cifsi->vfs_inode.i_mtime.tv_nsec;
-		auxdata.last_change_time_nsec = cifsi->vfs_inode.i_ctime.tv_nsec;
-
 		cifs_dbg(FYI, "%s: (0x%p)\n", __func__, cifsi->fscache);
-		/* fscache_relinquish_cookie does not seem to update auxdata */
-		fscache_update_cookie(cifsi->fscache, &auxdata);
-		fscache_relinquish_cookie(cifsi->fscache, &auxdata, false);
+		fscache_relinquish_cookie(cifsi->fscache, false);
 		cifsi->fscache = NULL;
 	}
 }
 
-void cifs_fscache_update_inode_cookie(struct inode *inode)
-{
-	struct cifs_fscache_inode_auxdata auxdata;
-	struct cifsInodeInfo *cifsi = CIFS_I(inode);
-
-	if (cifsi->fscache) {
-		memset(&auxdata, 0, sizeof(auxdata));
-		auxdata.eof = cifsi->server_eof;
-		auxdata.last_write_time_sec = cifsi->vfs_inode.i_mtime.tv_sec;
-		auxdata.last_change_time_sec = cifsi->vfs_inode.i_ctime.tv_sec;
-		auxdata.last_write_time_nsec = cifsi->vfs_inode.i_mtime.tv_nsec;
-		auxdata.last_change_time_nsec = cifsi->vfs_inode.i_ctime.tv_nsec;
-
-		cifs_dbg(FYI, "%s: (0x%p)\n", __func__, cifsi->fscache);
-		fscache_update_cookie(cifsi->fscache, &auxdata);
-	}
-}
-
-void cifs_fscache_set_inode_cookie(struct inode *inode, struct file *filp)
-{
-	cifs_fscache_enable_inode_cookie(inode);
-}
-
-void cifs_fscache_reset_inode_cookie(struct inode *inode)
-{
-	struct cifsInodeInfo *cifsi = CIFS_I(inode);
-	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
-	struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
-	struct fscache_cookie *old = cifsi->fscache;
-
-	if (cifsi->fscache) {
-		/* retire the current fscache cache and get a new one */
-		fscache_relinquish_cookie(cifsi->fscache, NULL, true);
-
-		cifs_fscache_acquire_inode_cookie(cifsi, tcon);
-		cifs_dbg(FYI, "%s: new cookie 0x%p oldcookie 0x%p\n",
-			 __func__, cifsi->fscache, old);
-	}
-}
-
 /*
  * Retrieve a page from FS-Cache
  */
diff --git a/fs/cifs/fscache.h b/fs/cifs/fscache.h
index 081481645b77..886a47a4c409 100644
--- a/fs/cifs/fscache.h
+++ b/fs/cifs/fscache.h
@@ -33,32 +33,31 @@ struct cifs_fscache_inode_auxdata {
 	u64 last_change_time_sec;
 	u32 last_write_time_nsec;
 	u32 last_change_time_nsec;
-	u64 eof;
 };
 
-/*
- * cache.c
- */
-extern struct fscache_netfs cifs_fscache_netfs;
-extern const struct fscache_cookie_def cifs_fscache_server_index_def;
-extern const struct fscache_cookie_def cifs_fscache_super_index_def;
-extern const struct fscache_cookie_def cifs_fscache_inode_object_def;
-
-extern int cifs_fscache_register(void);
-extern void cifs_fscache_unregister(void);
-
 /*
  * fscache.c
  */
-extern void cifs_fscache_get_client_cookie(struct TCP_Server_Info *);
-extern void cifs_fscache_release_client_cookie(struct TCP_Server_Info *);
 extern void cifs_fscache_get_super_cookie(struct cifs_tcon *);
 extern void cifs_fscache_release_super_cookie(struct cifs_tcon *);
 
+extern void cifs_fscache_get_inode_cookie(struct inode *);
 extern void cifs_fscache_release_inode_cookie(struct inode *);
-extern void cifs_fscache_update_inode_cookie(struct inode *inode);
-extern void cifs_fscache_set_inode_cookie(struct inode *, struct file *);
-extern void cifs_fscache_reset_inode_cookie(struct inode *);
+extern void cifs_fscache_unuse_inode_cookie(struct inode *, bool);
+
+static inline
+void cifs_fscache_fill_auxdata(struct inode *inode,
+			       struct cifs_fscache_inode_auxdata *auxdata)
+{
+	struct cifsInodeInfo *cifsi = CIFS_I(inode);
+
+	memset(&auxdata, 0, sizeof(auxdata));
+	auxdata->last_write_time_sec   = cifsi->vfs_inode.i_mtime.tv_sec;
+	auxdata->last_write_time_nsec  = cifsi->vfs_inode.i_mtime.tv_nsec;
+	auxdata->last_change_time_sec  = cifsi->vfs_inode.i_ctime.tv_sec;
+	auxdata->last_change_time_nsec = cifsi->vfs_inode.i_ctime.tv_nsec;
+}
+
 
 extern int __cifs_readpage_from_fscache(struct inode *, struct page *);
 extern void __cifs_readpage_to_fscache(struct inode *, struct page *);
@@ -85,23 +84,13 @@ static inline struct fscache_cookie *cifs_inode_cookie(struct inode *inode)
 }
 
 #else /* CONFIG_CIFS_FSCACHE */
-static inline int cifs_fscache_register(void) { return 0; }
-static inline void cifs_fscache_unregister(void) {}
-
-static inline void
-cifs_fscache_get_client_cookie(struct TCP_Server_Info *server) {}
-static inline void
-cifs_fscache_release_client_cookie(struct TCP_Server_Info *server) {}
 static inline void cifs_fscache_get_super_cookie(struct cifs_tcon *tcon) {}
-static inline void
-cifs_fscache_release_super_cookie(struct cifs_tcon *tcon) {}
+static inline void cifs_fscache_release_super_cookie(struct cifs_tcon *tcon) {}
 
+static inline void cifs_fscache_get_inode_cookie(struct inode *inode) {}
 static inline void cifs_fscache_release_inode_cookie(struct inode *inode) {}
-static inline void cifs_fscache_update_inode_cookie(struct inode *inode) {}
-static inline void cifs_fscache_set_inode_cookie(struct inode *inode,
-						 struct file *filp) {}
-static inline void cifs_fscache_reset_inode_cookie(struct inode *inode) {}
-
+static inline void cifs_fscache_unuse_inode_cookie(struct inode *inode) {}
+static inline struct fscache_cookie *cifs_inode_cookie(struct inode *inode) { return NULL; }
 
 static inline int
 cifs_readpage_from_fscache(struct inode *inode, struct page *page)
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 82848412ad85..384d7b8686b1 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1298,10 +1298,7 @@ cifs_iget(struct super_block *sb, struct cifs_fattr *fattr)
 			inode->i_flags |= S_NOATIME | S_NOCMTIME;
 		if (inode->i_state & I_NEW) {
 			inode->i_ino = hash;
-#ifdef CONFIG_CIFS_FSCACHE
-			/* initialize per-inode cache cookie pointer */
-			CIFS_I(inode)->fscache = NULL;
-#endif
+			cifs_fscache_get_inode_cookie(inode);
 			unlock_new_inode(inode);
 		}
 	}
@@ -2263,6 +2260,8 @@ cifs_dentry_needs_reval(struct dentry *dentry)
 int
 cifs_invalidate_mapping(struct inode *inode)
 {
+	struct cifs_fscache_inode_auxdata auxdata;
+	struct cifsInodeInfo *cifsi = CIFS_I(inode);
 	int rc = 0;
 
 	if (inode->i_mapping && inode->i_mapping->nrpages != 0) {
@@ -2272,7 +2271,8 @@ cifs_invalidate_mapping(struct inode *inode)
 				 __func__, inode);
 	}
 
-	cifs_fscache_reset_inode_cookie(inode);
+	cifs_fscache_fill_auxdata(&cifsi->vfs_inode, &auxdata);
+	fscache_invalidate(cifs_inode_cookie(inode), &auxdata, i_size_read(inode), 0);
 	return rc;
 }
 
@@ -2777,8 +2777,10 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs)
 		goto out;
 
 	if ((attrs->ia_valid & ATTR_SIZE) &&
-	    attrs->ia_size != i_size_read(inode))
+	    attrs->ia_size != i_size_read(inode)) {
 		truncate_setsize(inode, attrs->ia_size);
+		fscache_resize_cookie(cifs_inode_cookie(inode), attrs->ia_size);
+	}
 
 	setattr_copy(&init_user_ns, inode, attrs);
 	mark_inode_dirty(inode);
@@ -2973,8 +2975,10 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs)
 		goto cifs_setattr_exit;
 
 	if ((attrs->ia_valid & ATTR_SIZE) &&
-	    attrs->ia_size != i_size_read(inode))
+	    attrs->ia_size != i_size_read(inode)) {
 		truncate_setsize(inode, attrs->ia_size);
+		fscache_resize_cookie(cifs_inode_cookie(inode), attrs->ia_size);
+	}
 
 	setattr_copy(&init_user_ns, inode, attrs);
 	mark_inode_dirty(inode);



^ permalink raw reply related	[flat|nested] 87+ messages in thread

* Re: [Linux-cachefs] [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (66 preceding siblings ...)
  2021-10-18 15:09 ` [PATCH 67/67] cifs: Support fscache indexing rewrite (untested) David Howells
@ 2021-10-19 13:29 ` Marc Dionne
  2021-10-19 18:08 ` Jeff Layton
                   ` (2 subsequent siblings)
  70 siblings, 0 replies; 87+ messages in thread
From: Marc Dionne @ 2021-10-19 13:29 UTC (permalink / raw)
  To: David Howells
  Cc: linux-cachefs, Latchesar Ionkov, Dominique Martinet, linux-mm,
	linux-afs, Shyam Prasad N, linux-cifs, Matthew Wilcox,
	Trond Myklebust, v9fs-developer, Ilya Dryomov, Kent Overstreet,
	Alexander Viro, ceph-devel, Trond Myklebust, linux-nfs,
	Jeff Layton, Linux Kernel Mailing List, Steve French,
	linux-fsdevel, Omar Sandoval, Linus Torvalds, Anna Schumaker

On Mon, Oct 18, 2021 at 11:50 AM David Howells <dhowells@redhat.com> wrote:
>
>
> Here's a set of patches that rewrites and simplifies the fscache index API
> to remove the complex operation scheduling and object state machine in
> favour of something much smaller and simpler.  It is built on top of the
> set of patches that removes the old API[1].

Testing this series in our afs test framework, saw the oops pasted below.

cachefiles_begin_operation+0x2d maps to cachefiles/io.c:565, where
object is probably NULL (object->file is at offset 0x28).

Marc
===
BUG: kernel NULL pointer dereference, address: 0000000000000028
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 5 PID: 16607 Comm: ar Tainted: G            E
5.15.0-rc5.kafs_testing+ #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.14.0-2.fc34 04/01/2014
RIP: 0010:cachefiles_begin_operation+0x2d/0x80 [cachefiles]
Code: 00 00 55 53 48 83 ec 08 48 8b 47 08 48 83 7f 10 00 48 8b 68 20
74 0c b8 01 00 00 00 48 83 c4 08 5b 5d c3 48 c7 07 a0 12 1b a0 <48> 8b
45 28 48 89 fb 48 85 c0 74 20 48 8d 7d 04 89 74 24 04 e8 3a
RSP: 0018:ffffc90000d33b48 EFLAGS: 00010246
RAX: ffff888014991420 RBX: ffff888100ae9cf0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff888100ae9cf0
RBP: 0000000000000000 R08: 00000000000006b8 R09: ffff88810e98e000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888014991434
R13: 0000000000000002 R14: ffff888014991420 R15: 0000000000000002
FS:  00007f72d0486b80(0000) GS:ffff888139940000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 000000007bac8004 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
 fscache_begin_operation.part.0+0x1e3/0x210 [fscache]
 netfs_write_begin+0x3fb/0x800 [netfs]
 ? __fscache_use_cookie+0x120/0x200 [fscache]
 afs_write_begin+0x58/0x2c0 [kafs]
 ? __vfs_getxattr+0x2a/0x70
 generic_perform_write+0xb1/0x1b0
 ? file_update_time+0xcf/0x120
 __generic_file_write_iter+0x14c/0x1d0
 generic_file_write_iter+0x5d/0xb0
 afs_file_write+0x73/0xa0 [kafs]
 new_sync_write+0x105/0x180
 vfs_write+0x1cb/0x260
 ksys_write+0x4f/0xc0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f72d059a7a7
Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f
1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d
00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007fffc31942b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f72d059a7a7
RDX: 0000000000000008 RSI: 000055fe42367730 RDI: 0000000000000003
RBP: 000055fe42367730 R08: 0000000000000000 R09: 00007f72d066ca00
R10: 000000000000007c R11: 0000000000000246 R12: 0000000000000008

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page
  2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
@ 2021-10-19 17:13   ` Jeff Layton
  2021-10-19 18:28   ` Matthew Wilcox
  2021-10-19 18:48   ` David Howells
  2 siblings, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-19 17:13 UTC (permalink / raw)
  To: David Howells, linux-cachefs
  Cc: Kent Overstreet, Matthew Wilcox (Oracle),
	linux-mm, linux-fsdevel, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Alexander Viro, Omar Sandoval,
	Linus Torvalds, linux-afs, linux-nfs, linux-cifs, ceph-devel,
	v9fs-developer, linux-kernel

On Mon, 2021-10-18 at 15:50 +0100, David Howells wrote:
> Under some circumstances, filemap_read() will allocate sufficient pages to
> read to the end of the file, call readahead/readpages on them and copy the
> data over - and then it will allocate another page at the EOF and call
> readpage on that and then ignore it.  This is unnecessary and a waste of
> time and resources.
> 
> filemap_read() *does* check for this, but only after it has already done
> the allocation and I/O.  Fix this by checking before calling
> filemap_get_pages() also.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
> cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> cc: linux-mm@kvack.org
> cc: linux-fsdevel@vger.kernel.org
> Link: https://lore.kernel.org/r/160588481358.3465195.16552616179674485179.stgit@warthog.procyon.org.uk/
> ---
> 
>  mm/filemap.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index dae481293b5d..c0cdc44c844e 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2625,6 +2625,10 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
>  			iocb->ki_flags |= IOCB_NOWAIT;
>  
> +		isize = i_size_read(inode);
> +		if (unlikely(iocb->ki_pos >= isize))
> +			goto put_pages;
> +
>  		error = filemap_get_pages(iocb, iter, &pvec);
>  		if (error < 0)
>  			break;
> 
> 

I would wager that it's worth checking for this. I imagine read calls
beyond EOF are common enough that it's probably helpful to optimize that
case:

Acked-by: Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
  2021-10-18 14:51 ` [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
@ 2021-10-19 17:22   ` Jeff Layton
  2021-10-20  9:49   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-19 17:22 UTC (permalink / raw)
  To: David Howells, linux-cachefs
  Cc: Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

On Mon, 2021-10-18 at 15:51 +0100, David Howells wrote:
> Use an inode flag, I_PINNING_FSCACHE_WB, to indicate that a cookie is
> pinned in use by that inode for the purposes of writeback.
> 
> Pinning is necessary because the in-use pin from the open file is released
> before the writeback takes place, but if the resources aren't pinned, the
> dirty data can't be written to the cache.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/fs-writeback.c         |    8 ++++++++
>  include/linux/fs.h        |    3 +++
>  include/linux/fscache.h   |    1 +
>  include/linux/writeback.h |    1 +
>  4 files changed, 13 insertions(+)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 81ec192ce067..f3122831c4fe 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1666,6 +1666,13 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  
>  	if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>  		inode->i_state |= I_DIRTY_PAGES;
> +	else if (unlikely(inode->i_state & I_PINNING_FSCACHE_WB)) {
> +		if (!(inode->i_state & I_DIRTY_PAGES)) {
> +			inode->i_state &= ~I_PINNING_FSCACHE_WB;
> +			wbc->unpinned_fscache_wb = true;
> +			dirty |= I_PINNING_FSCACHE_WB; /* Cause write_inode */
> +		}
> +	}

IDGI: how would I_PINNING_FSCACHE_WB get set in the first place? 

>  
>  	spin_unlock(&inode->i_lock);
>  
> @@ -1675,6 +1682,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  		if (ret == 0)
>  			ret = err;
>  	}
> +	wbc->unpinned_fscache_wb = false;
>  	trace_writeback_single_inode(inode, wbc, nr_to_write);
>  	return ret;
>  }
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 197493507744..336739fed3e9 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2420,6 +2420,8 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
>   *			Used to detect that mark_inode_dirty() should not move
>   * 			inode between dirty lists.
>   *
> + * I_PINNING_FSCACHE_WB	Inode is pinning an fscache object for writeback.
> + *
>   * Q: What is the difference between I_WILL_FREE and I_FREEING?
>   */
>  #define I_DIRTY_SYNC		(1 << 0)
> @@ -2442,6 +2444,7 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
>  #define I_CREATING		(1 << 15)
>  #define I_DONTCACHE		(1 << 16)
>  #define I_SYNC_QUEUED		(1 << 17)
> +#define I_PINNING_FSCACHE_WB	(1 << 18)
>  
>  #define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
>  #define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES)
> diff --git a/include/linux/fscache.h b/include/linux/fscache.h
> index 01558d155799..ba4878b56717 100644
> --- a/include/linux/fscache.h
> +++ b/include/linux/fscache.h
> @@ -19,6 +19,7 @@
>  #include <linux/pagemap.h>
>  #include <linux/pagevec.h>
>  #include <linux/list_bl.h>
> +#include <linux/writeback.h>
>  #include <linux/netfs.h>
>  
>  #if defined(CONFIG_FSCACHE) || defined(CONFIG_FSCACHE_MODULE)
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index d1f65adf6a26..2fda288600d3 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -69,6 +69,7 @@ struct writeback_control {
>  	unsigned for_reclaim:1;		/* Invoked from the page allocator */
>  	unsigned range_cyclic:1;	/* range_start is cyclic */
>  	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
> +	unsigned unpinned_fscache_wb:1;	/* Cleared I_PINNING_FSCACHE_WB */
>  
>  	/*
>  	 * When writeback IOs are bounced through async layers, only the
> 
> 

-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite
  2021-10-18 14:51 ` [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite David Howells
@ 2021-10-19 17:50   ` Jeff Layton
  2021-10-20 10:57   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-19 17:50 UTC (permalink / raw)
  To: David Howells, linux-cachefs
  Cc: Dave Wysochanski, Trond Myklebust, Anna Schumaker, linux-nfs,
	Ilya Dryomov, ceph-devel, Steve French, linux-cifs,
	Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	v9fs-developer, Trond Myklebust, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-fsdevel,
	linux-kernel

On Mon, 2021-10-18 at 15:51 +0100, David Howells wrote:
> Temporarily disable the use of fscache by the various Linux network
> filesystems, apart from afs, so that the fscache core can be rewritten.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Dave Wysochanski <dwysocha@redhat.com>
> cc: Trond Myklebust <trond.myklebust@hammerspace.com>
> cc: Anna Schumaker <anna.schumaker@netapp.com>
> cc: linux-nfs@vger.kernel.org
> cc: Jeff Layton <jlayton@kernel.org>
> cc: Ilya Dryomov <idryomov@gmail.com>
> cc: ceph-devel@vger.kernel.org
> cc: Steve French <sfrench@samba.org>
> cc: linux-cifs@vger.kernel.org
> cc: Eric Van Hensbergen <ericvh@gmail.com>
> cc: Latchesar Ionkov <lucho@ionkov.net>
> cc: Dominique Martinet <asmadeus@codewreck.org>
> cc: v9fs-developer@lists.sourceforge.net
> ---
> 
>  fs/9p/Kconfig      |    2 +-
>  fs/ceph/Kconfig    |    2 +-
>  fs/cifs/Kconfig    |    2 +-
>  fs/fscache/Kconfig |    4 ++++
>  fs/nfs/Kconfig     |    2 +-
>  5 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/9p/Kconfig b/fs/9p/Kconfig
> index d7bc93447c85..b11c15c30bac 100644
> --- a/fs/9p/Kconfig
> +++ b/fs/9p/Kconfig
> @@ -14,7 +14,7 @@ config 9P_FS
>  if 9P_FS
>  config 9P_FSCACHE
>  	bool "Enable 9P client caching support"
> -	depends on 9P_FS=m && FSCACHE || 9P_FS=y && FSCACHE=y
> +	depends on 9P_FS=m && FSCACHE_OLD || 9P_FS=y && FSCACHE_OLD=y
>  	help
>  	  Choose Y here to enable persistent, read-only local
>  	  caching support for 9p clients using FS-Cache
> diff --git a/fs/ceph/Kconfig b/fs/ceph/Kconfig
> index 94df854147d3..77ad452337ee 100644
> --- a/fs/ceph/Kconfig
> +++ b/fs/ceph/Kconfig
> @@ -21,7 +21,7 @@ config CEPH_FS
>  if CEPH_FS
>  config CEPH_FSCACHE
>  	bool "Enable Ceph client caching support"
> -	depends on CEPH_FS=m && FSCACHE || CEPH_FS=y && FSCACHE=y
> +	depends on CEPH_FS=m && FSCACHE_OLD || CEPH_FS=y && FSCACHE_OLD=y
>  	help
>  	  Choose Y here to enable persistent, read-only local
>  	  caching support for Ceph clients using FS-Cache
> diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
> index 3b7e3b9e4fd2..c5477abbcff0 100644
> --- a/fs/cifs/Kconfig
> +++ b/fs/cifs/Kconfig
> @@ -188,7 +188,7 @@ config CIFS_SMB_DIRECT
>  
>  config CIFS_FSCACHE
>  	bool "Provide CIFS client caching support"
> -	depends on CIFS=m && FSCACHE || CIFS=y && FSCACHE=y
> +	depends on CIFS=m && FSCACHE_OLD || CIFS=y && FSCACHE_OLD=y
>  	help
>  	  Makes CIFS FS-Cache capable. Say Y here if you want your CIFS data
>  	  to be cached locally on disk through the general filesystem cache
> diff --git a/fs/fscache/Kconfig b/fs/fscache/Kconfig
> index b313a978ae0a..7850de3bdee0 100644
> --- a/fs/fscache/Kconfig
> +++ b/fs/fscache/Kconfig
> @@ -38,3 +38,7 @@ config FSCACHE_DEBUG
>  	  enabled by setting bits in /sys/modules/fscache/parameter/debug.
>  
>  	  See Documentation/filesystems/caching/fscache.rst for more information.
> +
> +config FSCACHE_OLD
> +	bool
> +	default n
> diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
> index 14a72224b657..a8b73c90aa00 100644
> --- a/fs/nfs/Kconfig
> +++ b/fs/nfs/Kconfig
> @@ -170,7 +170,7 @@ config ROOT_NFS
>  
>  config NFS_FSCACHE
>  	bool "Provide NFS client caching support"
> -	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
> +	depends on NFS_FS=m && FSCACHE_OLD || NFS_FS=y && FSCACHE_OLD=y
>  	help
>  	  Say Y here if you want NFS data to be cached locally on disc through
>  	  the general filesystem cache manager
> 
> 

The typical way to do this would be to rebrand the existing FSCACHE
Kconfig symbols into FSCACHE_OLD and then build the new fscache
structure such that it exists in parallel with the old. You'd then just
drop the old infrastructure once all of the fs's are converted to the
new. You could even make them conflict with one another in Kconfig too,
so that only one could be built in during the transition period if
supporting both at runtime is too difficult.

This approach of disabling everything is much more of an all-or-nothing
affair. It may mean less "churn" overall, but it seems less "nice"
because you have an interval of commits where fscache is non-functional.

I'm not necessarily opposed to this approach, but I'd like to better
understand why doing it this way was preferred.
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 07/67] fscache: Remove the netfs data from the cookie
  2021-10-18 14:52 ` [PATCH 07/67] fscache: Remove the netfs data from the cookie David Howells
@ 2021-10-19 18:05   ` Jeff Layton
  0 siblings, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-19 18:05 UTC (permalink / raw)
  To: David Howells, linux-cachefs
  Cc: Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

On Mon, 2021-10-18 at 15:52 +0100, David Howells wrote:
> Remove the netfs data pointer from the cookie so that we don't refer back
> to the netfs state and don't need accessors to manage this.  Keep the
> information we do need (of which there's not actually a lot) in the cookie
> which we can keep hold of if the netfs state goes away.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/afs/cache.c                |   39 --------
>  fs/afs/cell.c                 |    3 -
>  fs/afs/inode.c                |    3 -
>  fs/afs/volume.c               |    4 -

This patch is really "fscache+afs: ", which sort of makes it harder to
follow. It would be nice to be able to follow in the logical footsteps
of how you converted AFS when converting other filesystems. This series
makes that hard to do. Some of the AFS-specific changes are hidden in
fscache patches. It would be cleaner to separate them.

AFS+fscache is not disabled in this part of the series either. If
(hypothetically) an AFS user were to be bisecting kernel revisions and
landed somewhere in the middle of this series, what behavior should they
expect to see? Will it still work?

>  fs/cachefiles/interface.c     |  104 +++------------------
>  fs/cachefiles/internal.h      |   20 +---
>  fs/cachefiles/namei.c         |   10 +-
>  fs/cachefiles/xattr.c         |  202 +++++++----------------------------------
>  fs/fscache/cache.c            |   20 +---
>  fs/fscache/cookie.c           |   33 +++----
>  fs/fscache/fsdef.c            |   37 --------
>  fs/fscache/internal.h         |   29 ++++--
>  fs/fscache/netfs.c            |    3 -
>  fs/fscache/object.c           |   49 ----------
>  include/linux/fscache-cache.h |   32 +++++-
>  include/linux/fscache.h       |   39 +-------
>  16 files changed, 136 insertions(+), 491 deletions(-)
> 
> diff --git a/fs/afs/cache.c b/fs/afs/cache.c
> index 037af93e3aba..9b2de3014dbf 100644
> --- a/fs/afs/cache.c
> +++ b/fs/afs/cache.c
> @@ -8,11 +8,6 @@
>  #include <linux/sched.h>
>  #include "internal.h"
>  
> -static enum fscache_checkaux afs_vnode_cache_check_aux(void *cookie_netfs_data,
> -						       const void *buffer,
> -						       uint16_t buflen,
> -						       loff_t object_size);
> -
>  struct fscache_netfs afs_cache_netfs = {
>  	.name			= "afs",
>  	.version		= 2,
> @@ -31,38 +26,4 @@ struct fscache_cookie_def afs_volume_cache_index_def = {
>  struct fscache_cookie_def afs_vnode_cache_index_def = {
>  	.name		= "AFS.vnode",
>  	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
> -	.check_aux	= afs_vnode_cache_check_aux,
>  };
> -
> -/*
> - * check that the auxiliary data indicates that the entry is still valid
> - */
> -static enum fscache_checkaux afs_vnode_cache_check_aux(void *cookie_netfs_data,
> -						       const void *buffer,
> -						       uint16_t buflen,
> -						       loff_t object_size)
> -{
> -	struct afs_vnode *vnode = cookie_netfs_data;
> -	struct afs_vnode_cache_aux aux;
> -
> -	_enter("{%llx,%x,%llx},%p,%u",
> -	       vnode->fid.vnode, vnode->fid.unique, vnode->status.data_version,
> -	       buffer, buflen);
> -
> -	memcpy(&aux, buffer, sizeof(aux));
> -
> -	/* check the size of the data is what we're expecting */
> -	if (buflen != sizeof(aux)) {
> -		_leave(" = OBSOLETE [len %hx != %zx]", buflen, sizeof(aux));
> -		return FSCACHE_CHECKAUX_OBSOLETE;
> -	}
> -
> -	if (vnode->status.data_version != aux.data_version) {
> -		_leave(" = OBSOLETE [vers %llx != %llx]",
> -		       aux.data_version, vnode->status.data_version);
> -		return FSCACHE_CHECKAUX_OBSOLETE;
> -	}
> -
> -	_leave(" = SUCCESS");
> -	return FSCACHE_CHECKAUX_OKAY;
> -}
> diff --git a/fs/afs/cell.c b/fs/afs/cell.c
> index d88407fb9bc0..416f9bd638a5 100644
> --- a/fs/afs/cell.c
> +++ b/fs/afs/cell.c
> @@ -683,9 +683,10 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell)
>  #ifdef CONFIG_AFS_FSCACHE
>  	cell->cache = fscache_acquire_cookie(afs_cache_netfs.primary_index,
>  					     &afs_cell_cache_index_def,
> +					     NULL,
>  					     cell->name, strlen(cell->name),
>  					     NULL, 0,
> -					     cell, 0, true);
> +					     0, true);
>  #endif
>  	ret = afs_proc_cell_setup(cell);
>  	if (ret < 0)
> diff --git a/fs/afs/inode.c b/fs/afs/inode.c
> index 8fcffea2daf5..3b696ac7c05a 100644
> --- a/fs/afs/inode.c
> +++ b/fs/afs/inode.c
> @@ -432,9 +432,10 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
>  
>  	vnode->cache = fscache_acquire_cookie(vnode->volume->cache,
>  					      &afs_vnode_cache_index_def,
> +					      NULL,
>  					      &key, sizeof(key),
>  					      &aux, sizeof(aux),
> -					      vnode, vnode->status.size, true);
> +					      vnode->status.size, true);
>  #endif
>  }
>  
> diff --git a/fs/afs/volume.c b/fs/afs/volume.c
> index f84194b791d3..9ca246ab1a86 100644
> --- a/fs/afs/volume.c
> +++ b/fs/afs/volume.c
> @@ -273,9 +273,9 @@ void afs_activate_volume(struct afs_volume *volume)
>  #ifdef CONFIG_AFS_FSCACHE
>  	volume->cache = fscache_acquire_cookie(volume->cell->cache,
>  					       &afs_volume_cache_index_def,
> +					       NULL,
>  					       &volume->vid, sizeof(volume->vid),
> -					       NULL, 0,
> -					       volume, 0, true);
> +					       NULL, 0, 0, true);
>  #endif
>  }
>  
> diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
> index 83671488a323..6c48e81deccc 100644
> --- a/fs/cachefiles/interface.c
> +++ b/fs/cachefiles/interface.c
> @@ -7,13 +7,9 @@
>  
>  #include <linux/slab.h>
>  #include <linux/mount.h>
> +#include <linux/xattr.h>
>  #include "internal.h"
>  
> -struct cachefiles_lookup_data {
> -	struct cachefiles_xattr	*auxdata;	/* auxiliary data */
> -	char			*key;		/* key path */
> -};
> -
>  static int cachefiles_attr_changed(struct fscache_object *_object);
>  
>  /*
> @@ -23,11 +19,9 @@ static struct fscache_object *cachefiles_alloc_object(
>  	struct fscache_cache *_cache,
>  	struct fscache_cookie *cookie)
>  {
> -	struct cachefiles_lookup_data *lookup_data;
>  	struct cachefiles_object *object;
>  	struct cachefiles_cache *cache;
> -	struct cachefiles_xattr *auxdata;
> -	unsigned keylen, auxlen;
> +	unsigned keylen;
>  	void *buffer, *p;
>  	char *key;
>  
> @@ -35,10 +29,6 @@ static struct fscache_object *cachefiles_alloc_object(
>  
>  	_enter("{%s},%x,", cache->cache.identifier, cookie->debug_id);
>  
> -	lookup_data = kmalloc(sizeof(*lookup_data), cachefiles_gfp);
> -	if (!lookup_data)
> -		goto nomem_lookup_data;
> -
>  	/* create a new object record and a temporary leaf image */
>  	object = kmem_cache_alloc(cachefiles_object_jar, cachefiles_gfp);
>  	if (!object)
> @@ -62,10 +52,7 @@ static struct fscache_object *cachefiles_alloc_object(
>  		goto nomem_buffer;
>  
>  	keylen = cookie->key_len;
> -	if (keylen <= sizeof(cookie->inline_key))
> -		p = cookie->inline_key;
> -	else
> -		p = cookie->key;
> +	p = fscache_get_key(cookie);
>  	memcpy(buffer + 2, p, keylen);
>  
>  	*(uint16_t *)buffer = keylen;
> @@ -75,28 +62,13 @@ static struct fscache_object *cachefiles_alloc_object(
>  
>  	/* turn the raw key into something that can work with as a filename */
>  	key = cachefiles_cook_key(buffer, keylen + 2, object->type);
> +	kfree(buffer);
>  	if (!key)
>  		goto nomem_key;
>  
> -	/* get hold of the auxiliary data and prepend the object type */
> -	auxdata = buffer;
> -	auxlen = cookie->aux_len;
> -	if (auxlen) {
> -		if (auxlen <= sizeof(cookie->inline_aux))
> -			p = cookie->inline_aux;
> -		else
> -			p = cookie->aux;
> -		memcpy(auxdata->data, p, auxlen);
> -	}
> -
> -	auxdata->len = auxlen + 1;
> -	auxdata->type = cookie->type;
> -
> -	lookup_data->auxdata = auxdata;
> -	lookup_data->key = key;
> -	object->lookup_data = lookup_data;
> +	object->lookup_key = key;
>  
> -	_leave(" = %x [%p]", object->fscache.debug_id, lookup_data);
> +	_leave(" = %x [%s]", object->fscache.debug_id, key);
>  	return &object->fscache;
>  
>  nomem_key:
> @@ -106,8 +78,6 @@ static struct fscache_object *cachefiles_alloc_object(
>  	kmem_cache_free(cachefiles_object_jar, object);
>  	fscache_object_destroyed(&cache->cache);
>  nomem_object:
> -	kfree(lookup_data);
> -nomem_lookup_data:
>  	_leave(" = -ENOMEM");
>  	return ERR_PTR(-ENOMEM);
>  }
> @@ -118,7 +88,6 @@ static struct fscache_object *cachefiles_alloc_object(
>   */
>  static int cachefiles_lookup_object(struct fscache_object *_object)
>  {
> -	struct cachefiles_lookup_data *lookup_data;
>  	struct cachefiles_object *parent, *object;
>  	struct cachefiles_cache *cache;
>  	const struct cred *saved_cred;
> @@ -130,15 +99,12 @@ static int cachefiles_lookup_object(struct fscache_object *_object)
>  	parent = container_of(_object->parent,
>  			      struct cachefiles_object, fscache);
>  	object = container_of(_object, struct cachefiles_object, fscache);
> -	lookup_data = object->lookup_data;
>  
> -	ASSERTCMP(lookup_data, !=, NULL);
> +	ASSERTCMP(object->lookup_key, !=, NULL);
>  
>  	/* look up the key, creating any missing bits */
>  	cachefiles_begin_secure(cache, &saved_cred);
> -	ret = cachefiles_walk_to_object(parent, object,
> -					lookup_data->key,
> -					lookup_data->auxdata);
> +	ret = cachefiles_walk_to_object(parent, object, object->lookup_key);
>  	cachefiles_end_secure(cache, saved_cred);
>  
>  	/* polish off by setting the attributes of non-index files */
> @@ -165,14 +131,10 @@ static void cachefiles_lookup_complete(struct fscache_object *_object)
>  
>  	object = container_of(_object, struct cachefiles_object, fscache);
>  
> -	_enter("{OBJ%x,%p}", object->fscache.debug_id, object->lookup_data);
> +	_enter("{OBJ%x}", object->fscache.debug_id);
>  
> -	if (object->lookup_data) {
> -		kfree(object->lookup_data->key);
> -		kfree(object->lookup_data->auxdata);
> -		kfree(object->lookup_data);
> -		object->lookup_data = NULL;
> -	}
> +	kfree(object->lookup_key);
> +	object->lookup_key = NULL;
>  }
>  
>  /*
> @@ -204,12 +166,8 @@ struct fscache_object *cachefiles_grab_object(struct fscache_object *_object,
>  static void cachefiles_update_object(struct fscache_object *_object)
>  {
>  	struct cachefiles_object *object;
> -	struct cachefiles_xattr *auxdata;
>  	struct cachefiles_cache *cache;
> -	struct fscache_cookie *cookie;
>  	const struct cred *saved_cred;
> -	const void *aux;
> -	unsigned auxlen;
>  
>  	_enter("{OBJ%x}", _object->debug_id);
>  
> @@ -217,40 +175,9 @@ static void cachefiles_update_object(struct fscache_object *_object)
>  	cache = container_of(object->fscache.cache, struct cachefiles_cache,
>  			     cache);
>  
> -	if (!fscache_use_cookie(_object)) {
> -		_leave(" [relinq]");
> -		return;
> -	}
> -
> -	cookie = object->fscache.cookie;
> -	auxlen = cookie->aux_len;
> -
> -	if (!auxlen) {
> -		fscache_unuse_cookie(_object);
> -		_leave(" [no aux]");
> -		return;
> -	}
> -
> -	auxdata = kmalloc(2 + auxlen + 3, cachefiles_gfp);
> -	if (!auxdata) {
> -		fscache_unuse_cookie(_object);
> -		_leave(" [nomem]");
> -		return;
> -	}
> -
> -	aux = (auxlen <= sizeof(cookie->inline_aux)) ?
> -		cookie->inline_aux : cookie->aux;
> -
> -	memcpy(auxdata->data, aux, auxlen);
> -	fscache_unuse_cookie(_object);
> -
> -	auxdata->len = auxlen + 1;
> -	auxdata->type = cookie->type;
> -
>  	cachefiles_begin_secure(cache, &saved_cred);
> -	cachefiles_update_object_xattr(object, auxdata);
> +	cachefiles_set_object_xattr(object, XATTR_REPLACE);
>  	cachefiles_end_secure(cache, saved_cred);
> -	kfree(auxdata);
>  	_leave("");
>  }
>  
> @@ -354,12 +281,7 @@ void cachefiles_put_object(struct fscache_object *_object,
>  		ASSERTCMP(object->fscache.n_ops, ==, 0);
>  		ASSERTCMP(object->fscache.n_children, ==, 0);
>  
> -		if (object->lookup_data) {
> -			kfree(object->lookup_data->key);
> -			kfree(object->lookup_data->auxdata);
> -			kfree(object->lookup_data);
> -			object->lookup_data = NULL;
> -		}
> +		kfree(object->lookup_key);
>  
>  		cache = object->fscache.cache;
>  		fscache_object_destroy(&object->fscache);
> diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
> index de982f4f513f..f6b85c370935 100644
> --- a/fs/cachefiles/internal.h
> +++ b/fs/cachefiles/internal.h
> @@ -34,7 +34,7 @@ extern unsigned cachefiles_debug;
>   */
>  struct cachefiles_object {
>  	struct fscache_object		fscache;	/* fscache handle */
> -	struct cachefiles_lookup_data	*lookup_data;	/* cached lookup data */
> +	char				*lookup_key;	/* key to look up */
>  	struct dentry			*dentry;	/* the file/dir representing this object */
>  	struct dentry			*backer;	/* backing file */
>  	loff_t				i_size;		/* object size */
> @@ -88,15 +88,6 @@ struct cachefiles_cache {
>  	char				*tag;		/* cache binding tag */
>  };
>  
> -/*
> - * auxiliary data xattr buffer
> - */
> -struct cachefiles_xattr {
> -	uint16_t			len;
> -	uint8_t				type;
> -	uint8_t				data[];
> -};
> -
>  #include <trace/events/cachefiles.h>
>  
>  /*
> @@ -145,8 +136,7 @@ extern int cachefiles_delete_object(struct cachefiles_cache *cache,
>  				    struct cachefiles_object *object);
>  extern int cachefiles_walk_to_object(struct cachefiles_object *parent,
>  				     struct cachefiles_object *object,
> -				     const char *key,
> -				     struct cachefiles_xattr *auxdata);
> +				     const char *key);
>  extern struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
>  					       struct dentry *dir,
>  					       const char *name);
> @@ -188,12 +178,8 @@ static inline void cachefiles_end_secure(struct cachefiles_cache *cache,
>   */
>  extern int cachefiles_check_object_type(struct cachefiles_object *object);
>  extern int cachefiles_set_object_xattr(struct cachefiles_object *object,
> -				       struct cachefiles_xattr *auxdata);
> -extern int cachefiles_update_object_xattr(struct cachefiles_object *object,
> -					  struct cachefiles_xattr *auxdata);
> +				       unsigned int xattr_flags);
>  extern int cachefiles_check_auxdata(struct cachefiles_object *object);
> -extern int cachefiles_check_object_xattr(struct cachefiles_object *object,
> -					 struct cachefiles_xattr *auxdata);
>  extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
>  					  struct dentry *dentry);
>  
> diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
> index a9aca5ab5970..dc5e1e48c0a8 100644
> --- a/fs/cachefiles/namei.c
> +++ b/fs/cachefiles/namei.c
> @@ -45,11 +45,10 @@ void __cachefiles_printk_object(struct cachefiles_object *object,
>  	spin_lock(&object->fscache.lock);
>  	cookie = object->fscache.cookie;
>  	if (cookie) {
> -		pr_err("%scookie=%x [pr=%x nd=%p fl=%lx]\n",
> +		pr_err("%scookie=%x [pr=%x fl=%lx]\n",
>  		       prefix,
>  		       cookie->debug_id,
>  		       cookie->parent ? cookie->parent->debug_id : 0,
> -		       cookie->netfs_data,
>  		       cookie->flags);
>  		pr_err("%skey=[%u] '", prefix, cookie->key_len);
>  		k = (cookie->key_len <= sizeof(cookie->inline_key)) ?
> @@ -487,8 +486,7 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
>   */
>  int cachefiles_walk_to_object(struct cachefiles_object *parent,
>  			      struct cachefiles_object *object,
> -			      const char *key,
> -			      struct cachefiles_xattr *auxdata)
> +			      const char *key)
>  {
>  	struct cachefiles_cache *cache;
>  	struct dentry *dir, *next = NULL;
> @@ -636,7 +634,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
>  	if (!object->new) {
>  		_debug("validate '%pd'", next);
>  
> -		ret = cachefiles_check_object_xattr(object, auxdata);
> +		ret = cachefiles_check_auxdata(object);
>  		if (ret == -ESTALE) {
>  			/* delete the object (the deleter drops the directory
>  			 * mutex) */
> @@ -671,7 +669,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent,
>  
>  	if (object->new) {
>  		/* attach data to a newly constructed terminal object */
> -		ret = cachefiles_set_object_xattr(object, auxdata);
> +		ret = cachefiles_set_object_xattr(object, XATTR_CREATE);
>  		if (ret < 0)
>  			goto check_error;
>  	} else {
> diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
> index 9e82de668595..c99952404932 100644
> --- a/fs/cachefiles/xattr.c
> +++ b/fs/cachefiles/xattr.c
> @@ -15,6 +15,11 @@
>  #include <linux/slab.h>
>  #include "internal.h"
>  
> +struct cachefiles_xattr {
> +	uint8_t				type;
> +	uint8_t				data[];
> +} __packed;
> +
>  static const char cachefiles_xattr_cache[] =
>  	XATTR_USER_PREFIX "CacheFiles.cache";
>  
> @@ -98,54 +103,35 @@ int cachefiles_check_object_type(struct cachefiles_object *object)
>   * set the state xattr on a cache file
>   */
>  int cachefiles_set_object_xattr(struct cachefiles_object *object,
> -				struct cachefiles_xattr *auxdata)
> -{
> -	struct dentry *dentry = object->dentry;
> -	int ret;
> -
> -	ASSERT(dentry);
> -
> -	_enter("%p,#%d", object, auxdata->len);
> -
> -	/* attempt to install the cache metadata directly */
> -	_debug("SET #%u", auxdata->len);
> -
> -	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->fscache.cookie->flags);
> -	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
> -			   &auxdata->type, auxdata->len, XATTR_CREATE);
> -	if (ret < 0 && ret != -ENOMEM)
> -		cachefiles_io_error_obj(
> -			object,
> -			"Failed to set xattr with error %d", ret);
> -
> -	_leave(" = %d", ret);
> -	return ret;
> -}
> -
> -/*
> - * update the state xattr on a cache file
> - */
> -int cachefiles_update_object_xattr(struct cachefiles_object *object,
> -				   struct cachefiles_xattr *auxdata)
> +				unsigned int xattr_flags)
>  {
> +	struct cachefiles_xattr *buf;
>  	struct dentry *dentry = object->dentry;
> +	unsigned int len = object->fscache.cookie->aux_len;
>  	int ret;
>  
>  	if (!dentry)
>  		return -ESTALE;
>  
> -	_enter("%x,#%d", object->fscache.debug_id, auxdata->len);
> +	_enter("%x,#%d", object->fscache.debug_id, len);
>  
> -	/* attempt to install the cache metadata directly */
> -	_debug("SET #%u", auxdata->len);
> +	buf = kmalloc(sizeof(struct cachefiles_xattr) + len, GFP_KERNEL);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	buf->type = object->fscache.cookie->def->type;
> +	if (len > 0)
> +		memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len);
>  
>  	clear_bit(FSCACHE_COOKIE_AUX_UPDATED, &object->fscache.cookie->flags);
>  	ret = vfs_setxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
> -			   &auxdata->type, auxdata->len, XATTR_REPLACE);
> +			   buf, sizeof(struct cachefiles_xattr) + len,
> +			   xattr_flags);
> +	kfree(buf);
>  	if (ret < 0 && ret != -ENOMEM)
>  		cachefiles_io_error_obj(
>  			object,
> -			"Failed to update xattr with error %d", ret);
> +			"Failed to set xattr with error %d", ret);
>  
>  	_leave(" = %d", ret);
>  	return ret;
> @@ -156,148 +142,30 @@ int cachefiles_update_object_xattr(struct cachefiles_object *object,
>   */
>  int cachefiles_check_auxdata(struct cachefiles_object *object)
>  {
> -	struct cachefiles_xattr *auxbuf;
> -	enum fscache_checkaux validity;
> -	struct dentry *dentry = object->dentry;
> -	ssize_t xlen;
> -	int ret;
> -
> -	ASSERT(dentry);
> -	ASSERT(d_backing_inode(dentry));
> -	ASSERT(object->fscache.cookie->def->check_aux);
> -
> -	auxbuf = kmalloc(sizeof(struct cachefiles_xattr) + 512, GFP_KERNEL);
> -	if (!auxbuf)
> -		return -ENOMEM;
> -
> -	xlen = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
> -			    &auxbuf->type, 512 + 1);
> -	ret = -ESTALE;
> -	if (xlen < 1 ||
> -	    auxbuf->type != object->fscache.cookie->def->type)
> -		goto error;
> -
> -	xlen--;
> -	validity = fscache_check_aux(&object->fscache, &auxbuf->data, xlen,
> -				     i_size_read(d_backing_inode(dentry)));
> -	if (validity != FSCACHE_CHECKAUX_OKAY)
> -		goto error;
> -
> -	ret = 0;
> -error:
> -	kfree(auxbuf);
> -	return ret;
> -}
> -
> -/*
> - * check the state xattr on a cache file
> - * - return -ESTALE if the object should be deleted
> - */
> -int cachefiles_check_object_xattr(struct cachefiles_object *object,
> -				  struct cachefiles_xattr *auxdata)
> -{
> -	struct cachefiles_xattr *auxbuf;
> +	struct cachefiles_xattr *buf;
>  	struct dentry *dentry = object->dentry;
> -	int ret;
> -
> -	_enter("%p,#%d", object, auxdata->len);
> +	unsigned int len = object->fscache.cookie->aux_len, tlen;
> +	const void *p = fscache_get_aux(object->fscache.cookie);
> +	ssize_t ret;
>  
>  	ASSERT(dentry);
>  	ASSERT(d_backing_inode(dentry));
>  
> -	auxbuf = kmalloc(sizeof(struct cachefiles_xattr) + 512, cachefiles_gfp);
> -	if (!auxbuf) {
> -		_leave(" = -ENOMEM");
> +	tlen = sizeof(struct cachefiles_xattr) + len;
> +	buf = kmalloc(tlen, GFP_KERNEL);
> +	if (!buf)
>  		return -ENOMEM;
> -	}
> -
> -	/* read the current type label */
> -	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache,
> -			   &auxbuf->type, 512 + 1);
> -	if (ret < 0) {
> -		if (ret == -ENODATA)
> -			goto stale; /* no attribute - power went off
> -				     * mid-cull? */
> -
> -		if (ret == -ERANGE)
> -			goto bad_type_length;
>  
> -		cachefiles_io_error_obj(object,
> -					"Can't read xattr on %lu (err %d)",
> -					d_backing_inode(dentry)->i_ino, -ret);
> -		goto error;
> -	}
> -
> -	/* check the on-disk object */
> -	if (ret < 1)
> -		goto bad_type_length;
> -
> -	if (auxbuf->type != auxdata->type)
> -		goto stale;
> -
> -	auxbuf->len = ret;
> -
> -	/* consult the netfs */
> -	if (object->fscache.cookie->def->check_aux) {
> -		enum fscache_checkaux result;
> -		unsigned int dlen;
> -
> -		dlen = auxbuf->len - 1;
> -
> -		_debug("checkaux %s #%u",
> -		       object->fscache.cookie->def->name, dlen);
> -
> -		result = fscache_check_aux(&object->fscache,
> -					   &auxbuf->data, dlen,
> -					   i_size_read(d_backing_inode(dentry)));
> -
> -		switch (result) {
> -			/* entry okay as is */
> -		case FSCACHE_CHECKAUX_OKAY:
> -			goto okay;
> -
> -			/* entry requires update */
> -		case FSCACHE_CHECKAUX_NEEDS_UPDATE:
> -			break;
> -
> -			/* entry requires deletion */
> -		case FSCACHE_CHECKAUX_OBSOLETE:
> -			goto stale;
> -
> -		default:
> -			BUG();
> -		}
> -
> -		/* update the current label */
> -		ret = vfs_setxattr(&init_user_ns, dentry,
> -				   cachefiles_xattr_cache, &auxdata->type,
> -				   auxdata->len, XATTR_REPLACE);
> -		if (ret < 0) {
> -			cachefiles_io_error_obj(object,
> -						"Can't update xattr on %lu"
> -						" (error %d)",
> -						d_backing_inode(dentry)->i_ino, -ret);
> -			goto error;
> -		}
> -	}
> -
> -okay:
> -	ret = 0;
> +	ret = vfs_getxattr(&init_user_ns, dentry, cachefiles_xattr_cache, buf, tlen);
> +	if (ret == tlen &&
> +	    buf->type == object->fscache.cookie->def->type &&
> +	    memcmp(buf->data, p, len) == 0)
> +		ret = 0;
> +	else
> +		ret = -ESTALE;
>  
> -error:
> -	kfree(auxbuf);
> -	_leave(" = %d", ret);
> +	kfree(buf);
>  	return ret;
> -
> -bad_type_length:
> -	pr_err("Cache object %lu xattr length incorrect\n",
> -	       d_backing_inode(dentry)->i_ino);
> -	ret = -EIO;
> -	goto error;
> -
> -stale:
> -	ret = -ESTALE;
> -	goto error;
>  }
>  
>  /*
> diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
> index cfa60c2faf68..efcdb40267d6 100644
> --- a/fs/fscache/cache.c
> +++ b/fs/fscache/cache.c
> @@ -30,6 +30,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
>  	list_for_each_entry(tag, &fscache_cache_tag_list, link) {
>  		if (strcmp(tag->name, name) == 0) {
>  			atomic_inc(&tag->usage);
> +			refcount_inc(&tag->ref);
>  			up_read(&fscache_addremove_sem);
>  			return tag;
>  		}
> @@ -44,6 +45,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
>  		return ERR_PTR(-ENOMEM);
>  
>  	atomic_set(&xtag->usage, 1);
> +	refcount_set(&xtag->ref, 1);
>  	strcpy(xtag->name, name);
>  
>  	/* write lock, search again and add if still not present */
> @@ -52,6 +54,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
>  	list_for_each_entry(tag, &fscache_cache_tag_list, link) {
>  		if (strcmp(tag->name, name) == 0) {
>  			atomic_inc(&tag->usage);
> +			refcount_inc(&tag->ref);
>  			up_write(&fscache_addremove_sem);
>  			kfree(xtag);
>  			return tag;
> @@ -64,7 +67,7 @@ struct fscache_cache_tag *__fscache_lookup_cache_tag(const char *name)
>  }
>  
>  /*
> - * release a reference to a cache tag
> + * Unuse a cache tag
>   */
>  void __fscache_release_cache_tag(struct fscache_cache_tag *tag)
>  {
> @@ -77,8 +80,7 @@ void __fscache_release_cache_tag(struct fscache_cache_tag *tag)
>  			tag = NULL;
>  
>  		up_write(&fscache_addremove_sem);
> -
> -		kfree(tag);
> +		fscache_put_cache_tag(tag);
>  	}
>  }
>  
> @@ -130,20 +132,10 @@ struct fscache_cache *fscache_select_cache_for_object(
>  
>  	spin_unlock(&cookie->lock);
>  
> -	if (!cookie->def->select_cache)
> -		goto no_preference;
> -
> -	/* ask the netfs for its preference */
> -	tag = cookie->def->select_cache(cookie->parent->netfs_data,
> -					cookie->netfs_data);
> +	tag = cookie->preferred_cache;
>  	if (!tag)
>  		goto no_preference;
>  
> -	if (tag == ERR_PTR(-ENOMEM)) {
> -		_leave(" = NULL [nomem tag]");
> -		return NULL;
> -	}
> -
>  	if (!tag->cache) {
>  		_leave(" = NULL [unbacked tag]");
>  		return NULL;
> diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
> index 8a850c3d0775..a772d4b6cacd 100644
> --- a/fs/fscache/cookie.c
> +++ b/fs/fscache/cookie.c
> @@ -43,11 +43,10 @@ static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
>  	       cookie->flags,
>  	       atomic_read(&cookie->n_children),
>  	       atomic_read(&cookie->n_active));
> -	pr_err("%c-cookie d=%p{%s} n=%p\n",
> +	pr_err("%c-cookie d=%p{%s}\n",
>  	       prefix,
>  	       cookie->def,
> -	       cookie->def ? cookie->def->name : "?",
> -	       cookie->netfs_data);
> +	       cookie->def ? cookie->def->name : "?");
>  
>  	o = READ_ONCE(cookie->backing_objects.first);
>  	if (o) {
> @@ -74,6 +73,7 @@ void fscache_free_cookie(struct fscache_cookie *cookie)
>  			kfree(cookie->aux);
>  		if (cookie->key_len > sizeof(cookie->inline_key))
>  			kfree(cookie->key);
> +		fscache_put_cache_tag(cookie->preferred_cache);
>  		kmem_cache_free(fscache_cookie_jar, cookie);
>  	}
>  }
> @@ -138,9 +138,9 @@ static atomic_t fscache_cookie_debug_id = ATOMIC_INIT(1);
>  struct fscache_cookie *fscache_alloc_cookie(
>  	struct fscache_cookie *parent,
>  	const struct fscache_cookie_def *def,
> +	struct fscache_cache_tag *preferred_cache,
>  	const void *index_key, size_t index_key_len,
>  	const void *aux_data, size_t aux_data_len,
> -	void *netfs_data,
>  	loff_t object_size)
>  {
>  	struct fscache_cookie *cookie;
> @@ -175,7 +175,9 @@ struct fscache_cookie *fscache_alloc_cookie(
>  
>  	cookie->def		= def;
>  	cookie->parent		= parent;
> -	cookie->netfs_data	= netfs_data;
> +
> +	cookie->preferred_cache	= fscache_get_cache_tag(preferred_cache);
> +
>  	cookie->flags		= (1 << FSCACHE_COOKIE_NO_DATA_YET);
>  	cookie->type		= def->type;
>  	spin_lock_init(&cookie->lock);
> @@ -240,7 +242,6 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
>   *   - the top level index cookie for each netfs is stored in the fscache_netfs
>   *     struct upon registration
>   * - def points to the definition
> - * - the netfs_data will be passed to the functions pointed to in *def
>   * - all attached caches will be searched to see if they contain this object
>   * - index objects aren't stored on disk until there's a dependent file that
>   *   needs storing
> @@ -252,9 +253,9 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
>  struct fscache_cookie *__fscache_acquire_cookie(
>  	struct fscache_cookie *parent,
>  	const struct fscache_cookie_def *def,
> +	struct fscache_cache_tag *preferred_cache,
>  	const void *index_key, size_t index_key_len,
>  	const void *aux_data, size_t aux_data_len,
> -	void *netfs_data,
>  	loff_t object_size,
>  	bool enable)
>  {
> @@ -262,9 +263,9 @@ struct fscache_cookie *__fscache_acquire_cookie(
>  
>  	BUG_ON(!def);
>  
> -	_enter("{%s},{%s},%p,%u",
> +	_enter("{%s},{%s},%u",
>  	       parent ? (char *) parent->def->name : "<no-parent>",
> -	       def->name, netfs_data, enable);
> +	       def->name, enable);
>  
>  	if (!index_key || !index_key_len || index_key_len > 255 || aux_data_len > 255)
>  		return NULL;
> @@ -288,10 +289,10 @@ struct fscache_cookie *__fscache_acquire_cookie(
>  	BUG_ON(def->type == FSCACHE_COOKIE_TYPE_INDEX &&
>  	       parent->type != FSCACHE_COOKIE_TYPE_INDEX);
>  
> -	candidate = fscache_alloc_cookie(parent, def,
> +	candidate = fscache_alloc_cookie(parent, def, preferred_cache,
>  					 index_key, index_key_len,
>  					 aux_data, aux_data_len,
> -					 netfs_data, object_size);
> +					 object_size);
>  	if (!candidate) {
>  		fscache_stat(&fscache_n_acquires_oom);
>  		_leave(" [ENOMEM]");
> @@ -812,7 +813,6 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
>  	__fscache_disable_cookie(cookie, aux_data, retire);
>  
>  	/* Clear pointers back to the netfs */
> -	cookie->netfs_data	= NULL;
>  	cookie->def		= NULL;
>  
>  	if (cookie->parent) {
> @@ -978,8 +978,8 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
>  
>  	if (v == &fscache_cookies) {
>  		seq_puts(m,
> -			 "COOKIE   PARENT   USAGE CHILD ACT TY FL  DEF              NETFS_DATA\n"
> -			 "======== ======== ===== ===== === == === ================ ==========\n"
> +			 "COOKIE   PARENT   USAGE CHILD ACT TY FL  DEF             \n"
> +			 "======== ======== ===== ===== === == === ================\n"
>  			 );
>  		return 0;
>  	}
> @@ -1001,7 +1001,7 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
>  	}
>  
>  	seq_printf(m,
> -		   "%08x %08x %5u %5u %3u %s %03lx %-16s %px",
> +		   "%08x %08x %5u %5u %3u %s %03lx %-16s",
>  		   cookie->debug_id,
>  		   cookie->parent ? cookie->parent->debug_id : 0,
>  		   refcount_read(&cookie->ref),
> @@ -1009,8 +1009,7 @@ static int fscache_cookies_seq_show(struct seq_file *m, void *v)
>  		   atomic_read(&cookie->n_active),
>  		   type,
>  		   cookie->flags,
> -		   cookie->def->name,
> -		   cookie->netfs_data);
> +		   cookie->def->name);
>  
>  	keylen = cookie->key_len;
>  	auxlen = cookie->aux_len;
> diff --git a/fs/fscache/fsdef.c b/fs/fscache/fsdef.c
> index 0402673c680e..111946ba9ce1 100644
> --- a/fs/fscache/fsdef.c
> +++ b/fs/fscache/fsdef.c
> @@ -9,12 +9,6 @@
>  #include <linux/module.h>
>  #include "internal.h"
>  
> -static
> -enum fscache_checkaux fscache_fsdef_netfs_check_aux(void *cookie_netfs_data,
> -						    const void *data,
> -						    uint16_t datalen,
> -						    loff_t object_size);
> -
>  /*
>   * The root index is owned by FS-Cache itself.
>   *
> @@ -64,35 +58,4 @@ EXPORT_SYMBOL(fscache_fsdef_index);
>  struct fscache_cookie_def fscache_fsdef_netfs_def = {
>  	.name		= "FSDEF.netfs",
>  	.type		= FSCACHE_COOKIE_TYPE_INDEX,
> -	.check_aux	= fscache_fsdef_netfs_check_aux,
>  };
> -
> -/*
> - * check that the index structure version number stored in the auxiliary data
> - * matches the one the netfs gave us
> - */
> -static enum fscache_checkaux fscache_fsdef_netfs_check_aux(
> -	void *cookie_netfs_data,
> -	const void *data,
> -	uint16_t datalen,
> -	loff_t object_size)
> -{
> -	struct fscache_netfs *netfs = cookie_netfs_data;
> -	uint32_t version;
> -
> -	_enter("{%s},,%hu", netfs->name, datalen);
> -
> -	if (datalen != sizeof(version)) {
> -		_leave(" = OBSOLETE [dl=%d v=%zu]", datalen, sizeof(version));
> -		return FSCACHE_CHECKAUX_OBSOLETE;
> -	}
> -
> -	memcpy(&version, data, sizeof(version));
> -	if (version != netfs->version) {
> -		_leave(" = OBSOLETE [ver=%x net=%x]", version, netfs->version);
> -		return FSCACHE_CHECKAUX_OBSOLETE;
> -	}
> -
> -	_leave(" = OKAY");
> -	return FSCACHE_CHECKAUX_OKAY;
> -}
> diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
> index 6eb3f51d7275..29f21fbd5b5d 100644
> --- a/fs/fscache/internal.h
> +++ b/fs/fscache/internal.h
> @@ -24,6 +24,7 @@
>  
>  #define pr_fmt(fmt) "FS-Cache: " fmt
>  
> +#include <linux/slab.h>
>  #include <linux/fscache-cache.h>
>  #include <trace/events/fscache.h>
>  #include <linux/sched.h>
> @@ -41,6 +42,20 @@ extern struct rw_semaphore fscache_addremove_sem;
>  extern struct fscache_cache *fscache_select_cache_for_object(
>  	struct fscache_cookie *);
>  
> +static inline
> +struct fscache_cache_tag *fscache_get_cache_tag(struct fscache_cache_tag *tag)
> +{
> +	if (tag)
> +		refcount_inc(&tag->ref);
> +	return tag;
> +}
> +
> +static inline void fscache_put_cache_tag(struct fscache_cache_tag *tag)
> +{
> +	if (tag && refcount_dec_and_test(&tag->ref))
> +		kfree(tag);
> +}
> +
>  /*
>   * cookie.c
>   */
> @@ -50,9 +65,10 @@ extern const struct seq_operations fscache_cookies_seq_ops;
>  extern void fscache_free_cookie(struct fscache_cookie *);
>  extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
>  						   const struct fscache_cookie_def *,
> +						   struct fscache_cache_tag *,
>  						   const void *, size_t,
>  						   const void *, size_t,
> -						   void *, loff_t);
> +						   loff_t);
>  extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *);
>  extern struct fscache_cookie *fscache_cookie_get(struct fscache_cookie *,
>  						 enum fscache_cookie_trace);
> @@ -270,16 +286,9 @@ static inline void fscache_raise_event(struct fscache_object *object,
>  static inline
>  void fscache_update_aux(struct fscache_cookie *cookie, const void *aux_data)
>  {
> -	void *p;
> -
> -	if (!aux_data)
> -		return;
> -	if (cookie->aux_len <= sizeof(cookie->inline_aux))
> -		p = cookie->inline_aux;
> -	else
> -		p = cookie->aux;
> +	void *p = fscache_get_aux(cookie);
>  
> -	if (memcmp(p, aux_data, cookie->aux_len) != 0) {
> +	if (p && memcmp(p, aux_data, cookie->aux_len) != 0) {
>  		memcpy(p, aux_data, cookie->aux_len);
>  		set_bit(FSCACHE_COOKIE_AUX_UPDATED, &cookie->flags);
>  	}
> diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
> index d6bdb7b5e723..b8db06804876 100644
> --- a/fs/fscache/netfs.c
> +++ b/fs/fscache/netfs.c
> @@ -22,9 +22,10 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
>  	/* allocate a cookie for the primary index */
>  	candidate = fscache_alloc_cookie(&fscache_fsdef_index,
>  					 &fscache_fsdef_netfs_def,
> +					 NULL,
>  					 netfs->name, strlen(netfs->name),
>  					 &netfs->version, sizeof(netfs->version),
> -					 netfs, 0);
> +					 0);
>  	if (!candidate) {
>  		_leave(" = -ENOMEM");
>  		return -ENOMEM;
> diff --git a/fs/fscache/object.c b/fs/fscache/object.c
> index 86ad941726f7..cdcf6720d748 100644
> --- a/fs/fscache/object.c
> +++ b/fs/fscache/object.c
> @@ -901,55 +901,6 @@ static void fscache_dequeue_object(struct fscache_object *object)
>  	_leave("");
>  }
>  
> -/**
> - * fscache_check_aux - Ask the netfs whether an object on disk is still valid
> - * @object: The object to ask about
> - * @data: The auxiliary data for the object
> - * @datalen: The size of the auxiliary data
> - * @object_size: The size of the object according to the server.
> - *
> - * This function consults the netfs about the coherency state of an object.
> - * The caller must be holding a ref on cookie->n_active (held by
> - * fscache_look_up_object() on behalf of the cache backend during object lookup
> - * and creation).
> - */
> -enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
> -					const void *data, uint16_t datalen,
> -					loff_t object_size)
> -{
> -	enum fscache_checkaux result;
> -
> -	if (!object->cookie->def->check_aux) {
> -		fscache_stat(&fscache_n_checkaux_none);
> -		return FSCACHE_CHECKAUX_OKAY;
> -	}
> -
> -	result = object->cookie->def->check_aux(object->cookie->netfs_data,
> -						data, datalen, object_size);
> -	switch (result) {
> -		/* entry okay as is */
> -	case FSCACHE_CHECKAUX_OKAY:
> -		fscache_stat(&fscache_n_checkaux_okay);
> -		break;
> -
> -		/* entry requires update */
> -	case FSCACHE_CHECKAUX_NEEDS_UPDATE:
> -		fscache_stat(&fscache_n_checkaux_update);
> -		break;
> -
> -		/* entry requires deletion */
> -	case FSCACHE_CHECKAUX_OBSOLETE:
> -		fscache_stat(&fscache_n_checkaux_obsolete);
> -		break;
> -
> -	default:
> -		BUG();
> -	}
> -
> -	return result;
> -}
> -EXPORT_SYMBOL(fscache_check_aux);
> -
>  /*
>   * Asynchronously invalidate an object.
>   */
> diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
> index 5e610f9a524c..264b94dad5be 100644
> --- a/include/linux/fscache-cache.h
> +++ b/include/linux/fscache-cache.h
> @@ -45,8 +45,9 @@ struct fscache_cache_tag {
>  	struct fscache_cache	*cache;		/* cache referred to by this tag */
>  	unsigned long		flags;
>  #define FSCACHE_TAG_RESERVED	0		/* T if tag is reserved for a cache */
> -	atomic_t		usage;
> -	char			name[];	/* tag name */
> +	atomic_t		usage;		/* Number of using netfs's */
> +	refcount_t		ref;		/* Reference count on structure */
> +	char			name[];		/* tag name */
>  };
>  
>  /*
> @@ -415,11 +416,6 @@ extern void fscache_io_error(struct fscache_cache *cache);
>  
>  extern bool fscache_object_sleep_till_congested(signed long *timeoutp);
>  
> -extern enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
> -					       const void *data,
> -					       uint16_t datalen,
> -					       loff_t object_size);
> -
>  extern void fscache_object_retrying_stale(struct fscache_object *object);
>  
>  enum fscache_why_object_killed {
> @@ -431,4 +427,26 @@ enum fscache_why_object_killed {
>  extern void fscache_object_mark_killed(struct fscache_object *object,
>  				       enum fscache_why_object_killed why);
>  
> +/*
> + * Find the key on a cookie.
> + */
> +static inline void *fscache_get_key(struct fscache_cookie *cookie)
> +{
> +	if (cookie->key_len <= sizeof(cookie->inline_key))
> +		return cookie->inline_key;
> +	else
> +		return cookie->key;
> +}
> +
> +/*
> + * Find the auxiliary data on a cookie.
> + */
> +static inline void *fscache_get_aux(struct fscache_cookie *cookie)
> +{
> +	if (cookie->aux_len <= sizeof(cookie->inline_aux))
> +		return cookie->inline_aux;
> +	else
> +		return cookie->aux;
> +}
> +
>  #endif /* _LINUX_FSCACHE_CACHE_H */
> diff --git a/include/linux/fscache.h b/include/linux/fscache.h
> index ba4878b56717..cedab654dc79 100644
> --- a/include/linux/fscache.h
> +++ b/include/linux/fscache.h
> @@ -38,13 +38,6 @@ struct fscache_cookie;
>  struct fscache_netfs;
>  struct netfs_read_request;
>  
> -/* result of index entry consultation */
> -enum fscache_checkaux {
> -	FSCACHE_CHECKAUX_OKAY,		/* entry okay as is */
> -	FSCACHE_CHECKAUX_NEEDS_UPDATE,	/* entry requires update */
> -	FSCACHE_CHECKAUX_OBSOLETE,	/* entry requires deletion */
> -};
> -
>  /*
>   * fscache cookie definition
>   */
> @@ -56,26 +49,6 @@ struct fscache_cookie_def {
>  	uint8_t type;
>  #define FSCACHE_COOKIE_TYPE_INDEX	0
>  #define FSCACHE_COOKIE_TYPE_DATAFILE	1
> -
> -	/* select the cache into which to insert an entry in this index
> -	 * - optional
> -	 * - should return a cache identifier or NULL to cause the cache to be
> -	 *   inherited from the parent if possible or the first cache picked
> -	 *   for a non-index file if not
> -	 */
> -	struct fscache_cache_tag *(*select_cache)(
> -		const void *parent_netfs_data,
> -		const void *cookie_netfs_data);
> -
> -	/* consult the netfs about the state of an object
> -	 * - this function can be absent if the index carries no state data
> -	 * - the netfs data from the cookie being used as the target is
> -	 *   presented, as is the auxiliary data and the object size
> -	 */
> -	enum fscache_checkaux (*check_aux)(void *cookie_netfs_data,
> -					   const void *data,
> -					   uint16_t datalen,
> -					   loff_t object_size);
>  };
>  
>  /*
> @@ -105,9 +78,9 @@ struct fscache_cookie {
>  	struct hlist_head		backing_objects; /* object(s) backing this file/index */
>  	const struct fscache_cookie_def	*def;		/* definition */
>  	struct fscache_cookie		*parent;	/* parent of this entry */
> +	struct fscache_cache_tag	*preferred_cache; /* The preferred cache or NULL */
>  	struct hlist_bl_node		hash_link;	/* Link in hash table */
>  	struct list_head		proc_link;	/* Link in proc list */
> -	void				*netfs_data;	/* back pointer to netfs */
>  
>  	unsigned long			flags;
>  #define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */
> @@ -156,9 +129,10 @@ extern void __fscache_release_cache_tag(struct fscache_cache_tag *);
>  extern struct fscache_cookie *__fscache_acquire_cookie(
>  	struct fscache_cookie *,
>  	const struct fscache_cookie_def *,
> +	struct fscache_cache_tag *,
>  	const void *, size_t,
>  	const void *, size_t,
> -	void *, loff_t, bool);
> +	loff_t, bool);
>  extern void __fscache_relinquish_cookie(struct fscache_cookie *, const void *, bool);
>  extern int __fscache_check_consistency(struct fscache_cookie *, const void *);
>  extern void __fscache_update_cookie(struct fscache_cookie *, const void *);
> @@ -252,6 +226,7 @@ void fscache_release_cache_tag(struct fscache_cache_tag *tag)
>   * fscache_acquire_cookie - Acquire a cookie to represent a cache object
>   * @parent: The cookie that's to be the parent of this one
>   * @def: A description of the cache object, including callback operations
> + * @preferred_cache: The cache to use (or NULL)
>   * @index_key: The index key for this cookie
>   * @index_key_len: Size of the index key
>   * @aux_data: The auxiliary data for the cookie (may be NULL)
> @@ -272,19 +247,19 @@ static inline
>  struct fscache_cookie *fscache_acquire_cookie(
>  	struct fscache_cookie *parent,
>  	const struct fscache_cookie_def *def,
> +	struct fscache_cache_tag *preferred_cache,
>  	const void *index_key,
>  	size_t index_key_len,
>  	const void *aux_data,
>  	size_t aux_data_len,
> -	void *netfs_data,
>  	loff_t object_size,
>  	bool enable)
>  {
>  	if (fscache_cookie_valid(parent) && fscache_cookie_enabled(parent))
> -		return __fscache_acquire_cookie(parent, def,
> +		return __fscache_acquire_cookie(parent, def, preferred_cache,
>  						index_key, index_key_len,
>  						aux_data, aux_data_len,
> -						netfs_data, object_size, enable);
> +						object_size, enable);
>  	else
>  		return NULL;
>  }
> 
> 


-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (67 preceding siblings ...)
  2021-10-19 13:29 ` [Linux-cachefs] [PATCH 00/67] fscache: Rewrite index API and management system Marc Dionne
@ 2021-10-19 18:08 ` Jeff Layton
  2021-10-19 19:00 ` David Howells
  2021-10-21 22:20 ` Omar Sandoval
  70 siblings, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-19 18:08 UTC (permalink / raw)
  To: David Howells, linux-cachefs
  Cc: ceph-devel, linux-afs, Anna Schumaker, linux-nfs,
	Kent Overstreet, linux-mm, Matthew Wilcox, linux-fsdevel,
	Dave Wysochanski, Marc Dionne, Trond Myklebust, Shyam Prasad N,
	Eric Van Hensbergen, v9fs-developer, linux-cifs,
	Latchesar Ionkov, Steve French, Al Viro, Dominique Martinet,
	Ilya Dryomov, Trond Myklebust, Omar Sandoval, Linus Torvalds,
	linux-kernel

On Mon, 2021-10-18 at 15:50 +0100, David Howells wrote:
> Here's a set of patches that rewrites and simplifies the fscache index API
> to remove the complex operation scheduling and object state machine in
> favour of something much smaller and simpler.  It is built on top of the
> set of patches that removes the old API[1].
> 
> The operation scheduling API was intended to handle sequencing of cache
> operations, which were all required (where possible) to run asynchronously
> in parallel with the operations being done by the network filesystem, while
> allowing the cache to be brought online and offline and interrupt service
> with invalidation.
> 
> However, with the advent of the tmpfile capacity in the VFS, an opportunity
> arises to do invalidation much more easily, without having to wait for I/O
> that's actually in progress: Cachefiles can simply cut over its file
> pointer for the backing object attached to a cookie and abandon the
> in-progress I/O, dismissing it upon completion.
> 
> Future work there would involve using Omar Sandoval's vfs_link() with
> AT_LINK_REPLACE[2] to allow an extant file to be displaced by a new hard
> link from a tmpfile as currently I have to unlink the old file first.
> 
> These patches can also simplify the object state handling as I/O operations
> to the cache don't all have to be brought to a stop in order to invalidate
> a file.  To that end, and with an eye on to writing a new backing cache
> model in the future, I've taken the opportunity to simplify the indexing
> structure.
> 
> I've separated the index cookie concept from the file cookie concept by
> type now.  The former is now called a "volume cookie" (struct
> fscache_volume) and there is a container of file cookies.  There are then
> just the two levels.  All the index cookieage is collapsed into a single
> volume cookie, and this has a single printable string as a key.  For
> instance, an AFS volume would have a key of something like
> "afs,example.com,1000555", combining the filesystem name, cell name and
> volume ID.  This is freeform, but must not have '/' chars in it.
> 

Given the indexing changes, what sort of behavior should we expect when
upgrading from old-style to new-style indexes? Do they just not match,
and we end up downloading new copies of all the data and the old stale
stuff eventually gets culled?

Ditto for downgrades -- can we expect sane behavior if someone runs an
old kernel on top of an existing fscache that was populated by a new
kernel?

> I've also eliminated all pointers back from fscache into the network
> filesystem.  This required the duplication of a little bit of data in the
> cookie (cookie key, coherency data and file size), but it's not actually
> that much.  This gets rid of problems with making sure we keep netfs data
> structures around so that the cache can access them.
> 
> I have changed afs throughout the patch series, but I also have patches for
> 9p, nfs and cifs.  Jeff Layton is handling ceph support.
> 
> 
> BITS THAT MAY BE CONTROVERSIAL
> ==============================
> 
> There are some bits I've added that may be controversial:
> 
>  (1) I've provided a flag, S_KERNEL_FILE, that cachefiles uses to check if
>      a files is already being used by some other kernel service (e.g. a
>      duplicate cachefiles cache in the same directory) and reject it if it
>      is.  This isn't entirely necessary, but it helps prevent accidental
>      data corruption.
> 
>      I don't want to use S_SWAPFILE as that has other effects, but quite
>      possibly swapon() should set S_KERNEL_FILE too.
> 
>      Note that it doesn't prevent userspace from interfering, though
>      perhaps it should.
> 
>  (2) Cachefiles wants to keep the backing file for a cookie open whilst we
>      might need to write to it from network filesystem writeback.  The
>      problem is that the network filesystem unuses its cookie when its file
>      is closed, and so we have nothing pinning the cachefiles file open and
>      it will get closed automatically after a short time to avoid
>      EMFILE/ENFILE problems.
> 
>      Reopening the cache file, however, is a problem if this is being done
>      due to writeback triggered by exit().  Some filesystems will oops if
>      we try to open a file in that context because they want to access
>      current->fs or suchlike.
> 
>      To get around this, I added the following:
> 
>      (A) An inode flag, I_PINNING_FSCACHE_WB, to be set on a network
>      	 filesystem inode to indicate that we have a usage count on the
>      	 cookie caching that inode.
> 
>      (B) A flag in struct writeback_control, unpinned_fscache_wb, that is
>      	 set when __writeback_single_inode() clears the last dirty page
>      	 from i_pages - at which point it clears I_PINNING_FSCACHE_WB and
>      	 sets this flag.
> 
> 	 This has to be done here so that clearing I_PINNING_FSCACHE_WB can
> 	 be done atomically with the check of PAGECACHE_TAG_DIRTY that
> 	 clears I_DIRTY_PAGES.
> 
>      (C) A function, fscache_set_page_dirty(), which if it is not set, sets
>      	 I_PINNING_FSCACHE_WB and calls fscache_use_cookie() to pin the
>      	 cache resources.
> 
>      (D) A function, fscache_unpin_writeback(), to be called by
>      	 ->write_inode() to unuse the cookie.
> 
>      (E) A function, fscache_clear_inode_writeback(), to be called when the
>      	 inode is evicted, before clear_inode() is called.  This cleans up
>      	 any lingering I_PINNING_FSCACHE_WB.
> 
>      The network filesystem can then use these tools to make sure that
>      fscache_write_to_cache() can write locally modified data to the cache
>      as well as to the server.
> 
>      For the future, I'm working on write helpers for netfs lib that should
>      allow this facility to be removed by keeping track of the dirty
>      regions separately - but that's incomplete at the moment and is also
>      going to be affected by folios, one way or another, since it deals
>      with pages.
> 
> 
> These patches can be found also on:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-rewrite-indexing
> 
> David
> 
> Link: https://lore.kernel.org/r/163363935000.1980952.15279841414072653108.stgit@warthog.procyon.org.uk [1]
> Link: https://lore.kernel.org/r/cover.1580251857.git.osandov@fb.com/ [2]
> 
> ---
> Dave Wysochanski (3):
>       NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie
>       NFS: Convert fscache_enable_cookie and fscache_disable_cookie
>       NFS: Convert fscache invalidation and update aux_data and i_size
> 
> David Howells (63):
>       mm: Stop filemap_read() from grabbing a superfluous page
>       vfs: Provide S_KERNEL_FILE inode flag
>       vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
>       afs: Handle len being extending over page end in write_begin/write_end
>       afs: Fix afs_write_end() to handle len > page size
>       nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite
>       fscache: Remove the netfs data from the cookie
>       fscache: Remove struct fscache_cookie_def
>       fscache: Remove store_limit* from struct fscache_object
>       fscache: Remove fscache_check_consistency()
>       fscache: Remove fscache_attr_changed()
>       fscache: Remove obsolete stats
>       fscache: Remove old I/O tracepoints
>       fscache: Temporarily disable fscache_invalidate()
>       fscache: Disable fscache_begin_operation()
>       fscache: Remove the I/O operation manager
>       fscache: Rename fscache_cookie_{get,put,see}()
>       cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag
>       cachefiles: Don't set an xattr on the root of the cache
>       cachefiles: Remove some redundant checks on unsigned values
>       cachefiles: Prevent inode from going away when burying a dentry
>       cachefiles: Simplify the pathwalk and save the filename for an object
>       cachefiles: trace: Improve the lookup tracepoint
>       cachefiles: Remove separate backer dentry from cachefiles_object
>       cachefiles: Fold fscache_object into cachefiles_object
>       cachefiles: Change to storing file* rather than dentry*
>       cachefiles: trace: Log coherency checks
>       cachefiles: Trace truncations
>       cachefiles: Trace read and write operations
>       cachefiles: Round the cachefile size up to DIO block size
>       cachefiles: Don't use XATTR_ flags with vfs_setxattr()
>       fscache: Replace the object management state machine
>       cachefiles: Trace decisions in cachefiles_prepare_read()
>       cachefiles: Make cachefiles_write_prepare() check for space
>       fscache: Automatically close a file that's been unused for a while
>       fscache: Add stats for the cookie commit LRU
>       fscache: Move fscache_update_cookie() complete inline
>       fscache: Remove more obsolete stats
>       fscache: Note the object size during invalidation
>       vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
>       afs: Render cache cookie key as big endian
>       cachefiles: Use tmpfile/link
>       fscache: Rewrite invalidation
>       cachefiles: Simplify the file lookup/creation/check code
>       fscache: Provide resize operation
>       cachefiles: Put more information in the xattr attached to the cache file
>       fscache: Implement "will_modify" parameter on fscache_use_cookie()
>       fscache: Add support for writing to the cache
>       fscache: Make fscache_clear_page_bits() conditional on cookie
>       fscache: Make fscache_write_to_cache() conditional on cookie
>       afs: Copy local writes to the cache when writing to the server
>       afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr
>       afs: Add O_DIRECT read support
>       afs: Skip truncation on the server of data we haven't written yet
>       afs: Make afs_write_begin() return the THP subpage
>       cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ
>       nfs: Convert to new fscache volume/cookie API
>       9p: Use fscache indexing rewrite and reenable caching
>       9p: Copy local writes to the cache when writing to the server
>       netfs: Display the netfs inode number in the netfs_read tracepoint
>       cachefiles: Add tracepoints to log errors from ops on the backing fs
>       cachefiles: Add error injection support
>       cifs: Support fscache indexing rewrite (untested)
> 
> Jeff Layton (1):
>       fscache: disable cookie when doing an invalidation for DIO write
> 
> 
>  fs/9p/cache.c                     |  184 +----
>  fs/9p/cache.h                     |   23 +-
>  fs/9p/v9fs.c                      |   14 +-
>  fs/9p/v9fs.h                      |   13 +-
>  fs/9p/vfs_addr.c                  |   55 +-
>  fs/9p/vfs_dir.c                   |   13 +-
>  fs/9p/vfs_file.c                  |    7 +-
>  fs/9p/vfs_inode.c                 |   24 +-
>  fs/9p/vfs_inode_dotl.c            |    3 +-
>  fs/9p/vfs_super.c                 |    3 +
>  fs/afs/Makefile                   |    3 -
>  fs/afs/cache.c                    |   68 --
>  fs/afs/cell.c                     |   12 -
>  fs/afs/file.c                     |   83 +-
>  fs/afs/fsclient.c                 |   18 +-
>  fs/afs/inode.c                    |  101 ++-
>  fs/afs/internal.h                 |   36 +-
>  fs/afs/main.c                     |   14 -
>  fs/afs/super.c                    |    1 +
>  fs/afs/volume.c                   |   15 +-
>  fs/afs/write.c                    |  170 +++-
>  fs/afs/yfsclient.c                |   12 +-
>  fs/cachefiles/Kconfig             |    8 +
>  fs/cachefiles/Makefile            |    3 +
>  fs/cachefiles/bind.c              |  186 +++--
>  fs/cachefiles/daemon.c            |   20 +-
>  fs/cachefiles/error_inject.c      |   46 ++
>  fs/cachefiles/interface.c         |  660 +++++++--------
>  fs/cachefiles/internal.h          |  191 +++--
>  fs/cachefiles/io.c                |  310 +++++--
>  fs/cachefiles/key.c               |  203 +++--
>  fs/cachefiles/main.c              |   20 +-
>  fs/cachefiles/namei.c             |  978 ++++++++++------------
>  fs/cachefiles/volume.c            |  128 +++
>  fs/cachefiles/xattr.c             |  367 +++------
>  fs/ceph/Kconfig                   |    2 +-
>  fs/cifs/Makefile                  |    2 +-
>  fs/cifs/cache.c                   |  105 ---
>  fs/cifs/cifsfs.c                  |   11 +-
>  fs/cifs/cifsglob.h                |    5 +-
>  fs/cifs/connect.c                 |    3 -
>  fs/cifs/file.c                    |   37 +-
>  fs/cifs/fscache.c                 |  201 ++---
>  fs/cifs/fscache.h                 |   51 +-
>  fs/cifs/inode.c                   |   18 +-
>  fs/fs-writeback.c                 |    8 +
>  fs/fscache/Kconfig                |    4 +
>  fs/fscache/Makefile               |    6 +-
>  fs/fscache/cache.c                |  541 ++++++-------
>  fs/fscache/cookie.c               | 1262 ++++++++++++++---------------
>  fs/fscache/fsdef.c                |   98 ---
>  fs/fscache/internal.h             |  213 +----
>  fs/fscache/io.c                   |  405 ++++++---
>  fs/fscache/main.c                 |  134 +--
>  fs/fscache/netfs.c                |   74 --
>  fs/fscache/object.c               | 1123 -------------------------
>  fs/fscache/operation.c            |  633 ---------------
>  fs/fscache/page.c                 |   84 --
>  fs/fscache/proc.c                 |   43 +-
>  fs/fscache/stats.c                |  202 ++---
>  fs/fscache/volume.c               |  449 ++++++++++
>  fs/netfs/read_helper.c            |    2 +-
>  fs/nfs/Makefile                   |    2 +-
>  fs/nfs/client.c                   |    4 -
>  fs/nfs/direct.c                   |    2 +
>  fs/nfs/file.c                     |    7 +-
>  fs/nfs/fscache-index.c            |  114 ---
>  fs/nfs/fscache.c                  |  264 ++----
>  fs/nfs/fscache.h                  |   89 +-
>  fs/nfs/inode.c                    |   11 +-
>  fs/nfs/super.c                    |    7 +-
>  fs/nfs/write.c                    |    1 +
>  include/linux/fs.h                |    4 +
>  include/linux/fscache-cache.h     |  463 +++--------
>  include/linux/fscache.h           |  626 +++++++-------
>  include/linux/netfs.h             |    4 +-
>  include/linux/nfs_fs_sb.h         |    9 +-
>  include/linux/writeback.h         |    1 +
>  include/trace/events/cachefiles.h |  483 ++++++++---
>  include/trace/events/fscache.h    |  631 +++++++--------
>  include/trace/events/netfs.h      |    5 +-
>  81 files changed, 5140 insertions(+), 7295 deletions(-)
>  delete mode 100644 fs/afs/cache.c
>  create mode 100644 fs/cachefiles/error_inject.c
>  create mode 100644 fs/cachefiles/volume.c
>  delete mode 100644 fs/cifs/cache.c
>  delete mode 100644 fs/fscache/fsdef.c
>  delete mode 100644 fs/fscache/netfs.c
>  delete mode 100644 fs/fscache/object.c
>  delete mode 100644 fs/fscache/operation.c
>  create mode 100644 fs/fscache/volume.c
>  delete mode 100644 fs/nfs/fscache-index.c
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag
  2021-10-18 14:50 ` [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag David Howells
@ 2021-10-19 18:10   ` Jeff Layton
  2021-10-19 19:02   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-19 18:10 UTC (permalink / raw)
  To: David Howells, linux-cachefs
  Cc: Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

On Mon, 2021-10-18 at 15:50 +0100, David Howells wrote:
> Provide an S_KERNEL_FILE inode flag that a kernel service, e.g. cachefiles,
> can set to ward off other kernel services and drivers (including itself)
> from using files it is actively using.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  include/linux/fs.h |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index e7a633353fd2..197493507744 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2250,6 +2250,7 @@ struct super_operations {
>  #define S_ENCRYPTED	(1 << 14) /* Encrypted file (using fs/crypto/) */
>  #define S_CASEFOLD	(1 << 15) /* Casefolded file */
>  #define S_VERITY	(1 << 16) /* Verity file (using fs/verity/) */
> +#define S_KERNEL_FILE	(1 << 17) /* File is in use by the kernel (eg. fs/cachefiles) */
>  
>  /*
>   * Note that nosuid etc flags are inode-specific: setting some file-system
> 
> 

It'd be better to fold this in with the patch where the first user is
added. That would make it easier to see how you intend to use it.
-- 
Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page
  2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
  2021-10-19 17:13   ` Jeff Layton
@ 2021-10-19 18:28   ` Matthew Wilcox
  2021-10-19 18:48   ` David Howells
  2 siblings, 0 replies; 87+ messages in thread
From: Matthew Wilcox @ 2021-10-19 18:28 UTC (permalink / raw)
  To: David Howells
  Cc: linux-cachefs, Kent Overstreet, linux-mm, linux-fsdevel,
	Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Alexander Viro, Omar Sandoval,
	Linus Torvalds, linux-afs, linux-nfs, linux-cifs, ceph-devel,
	v9fs-developer, linux-kernel

On Mon, Oct 18, 2021 at 03:50:32PM +0100, David Howells wrote:
> @@ -2625,6 +2625,10 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
>  			iocb->ki_flags |= IOCB_NOWAIT;
>  
> +		isize = i_size_read(inode);
> +		if (unlikely(iocb->ki_pos >= isize))
> +			goto put_pages;
> +

Is there a good reason to assign to isize here?  I'd rather not,
because it complicates analysis, and a later change might look at
the isize read here, not realising it was a racy use.  So I'd
rather see:

		if (unlikely(iocb->ki_pos >= i_size_read(inode)))
			goto put_pages;

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page
  2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
  2021-10-19 17:13   ` Jeff Layton
  2021-10-19 18:28   ` Matthew Wilcox
@ 2021-10-19 18:48   ` David Howells
  2021-10-19 20:04     ` Matthew Wilcox
  2 siblings, 1 reply; 87+ messages in thread
From: David Howells @ 2021-10-19 18:48 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: dhowells, linux-cachefs, Kent Overstreet, linux-mm,
	linux-fsdevel, Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Alexander Viro, Omar Sandoval,
	Linus Torvalds, linux-afs, linux-nfs, linux-cifs, ceph-devel,
	v9fs-developer, linux-kernel

Matthew Wilcox <willy@infradead.org> wrote:

> > +		isize = i_size_read(inode);
> > +		if (unlikely(iocb->ki_pos >= isize))
> > +			goto put_pages;
> > +
> 
> Is there a good reason to assign to isize here?  I'd rather not,
> because it complicates analysis, and a later change might look at
> the isize read here, not realising it was a racy use.  So I'd
> rather see:

If we don't set isize, the loop will never end.  Actually, maybe we can just
break out at that point rather than going to put_pages.

David


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (68 preceding siblings ...)
  2021-10-19 18:08 ` Jeff Layton
@ 2021-10-19 19:00 ` David Howells
  2021-10-21 22:20 ` Omar Sandoval
  70 siblings, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-19 19:00 UTC (permalink / raw)
  To: Jeff Layton
  Cc: dhowells, linux-cachefs, ceph-devel, linux-afs, Anna Schumaker,
	linux-nfs, Kent Overstreet, linux-mm, Matthew Wilcox,
	linux-fsdevel, Dave Wysochanski, Marc Dionne, Trond Myklebust,
	Shyam Prasad N, Eric Van Hensbergen, v9fs-developer, linux-cifs,
	Latchesar Ionkov, Steve French, Al Viro, Dominique Martinet,
	Ilya Dryomov, Trond Myklebust, Omar Sandoval, Linus Torvalds,
	linux-kernel

Jeff Layton <jlayton@kernel.org> wrote:

> Given the indexing changes, what sort of behavior should we expect when
> upgrading from old-style to new-style indexes? Do they just not match,
> and we end up downloading new copies of all the data and the old stale
> stuff eventually gets culled?

Correct: they don't match.  The names of the directories and files will be
quite different - and so will the attached xattrs.  However, no filesystems
currently store locally-modified data in the cache, so you shouldn't lose any
data after upgrading.

> Ditto for downgrades -- can we expect sane behavior if someone runs an
> old kernel on top of an existing fscache that was populated by a new
> kernel?

Correct.  With this branch, filesystems now store locally-modified data into
the cache - but they also upload it to the server at the same time.  If
there's a disagreement between what's in the cache and what's on the server
with this branch, the cache is discarded, so simply discarding the cache on a
download shouldn't be a problem.

It's currently operating as a write-through cache, not a write-back cache.
That will change if I get round to implementing disconnected operation, but
it's not there yet.

David


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag
  2021-10-18 14:50 ` [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag David Howells
  2021-10-19 18:10   ` Jeff Layton
@ 2021-10-19 19:02   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-19 19:02 UTC (permalink / raw)
  To: Jeff Layton
  Cc: dhowells, linux-cachefs, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Jeff Layton <jlayton@redhat.com> wrote:

> It'd be better to fold this in with the patch where the first user is
> added. That would make it easier to see how you intend to use it.

Yeah - I didn't put it in there because as I zip backwards and forwards
through the patch stack, applying/deapplying this change triggers a complete
rebuild.

David


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page
  2021-10-19 18:48   ` David Howells
@ 2021-10-19 20:04     ` Matthew Wilcox
  0 siblings, 0 replies; 87+ messages in thread
From: Matthew Wilcox @ 2021-10-19 20:04 UTC (permalink / raw)
  To: David Howells
  Cc: linux-cachefs, Kent Overstreet, linux-mm, linux-fsdevel,
	Trond Myklebust, Anna Schumaker, Steve French,
	Dominique Martinet, Jeff Layton, Alexander Viro, Omar Sandoval,
	Linus Torvalds, linux-afs, linux-nfs, linux-cifs, ceph-devel,
	v9fs-developer, linux-kernel

On Tue, Oct 19, 2021 at 07:48:15PM +0100, David Howells wrote:
> Matthew Wilcox <willy@infradead.org> wrote:
> 
> > > +		isize = i_size_read(inode);
> > > +		if (unlikely(iocb->ki_pos >= isize))
> > > +			goto put_pages;
> > > +
> > 
> > Is there a good reason to assign to isize here?  I'd rather not,
> > because it complicates analysis, and a later change might look at
> > the isize read here, not realising it was a racy use.  So I'd
> > rather see:
> 
> If we don't set isize, the loop will never end.  Actually, maybe we can just
> break out at that point rather than going to put_pages.

Umm, yes, of course.  Sorry.

It makes more sense to just break because we haven't got any pages,
so putting pages that we haven't got seems unnecessary.
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback
  2021-10-18 14:51 ` [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
  2021-10-19 17:22   ` Jeff Layton
@ 2021-10-20  9:49   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-20  9:49 UTC (permalink / raw)
  To: Jeff Layton
  Cc: dhowells, linux-cachefs, Trond Myklebust, Anna Schumaker,
	Steve French, Dominique Martinet, Matthew Wilcox, Alexander Viro,
	Omar Sandoval, Linus Torvalds, linux-afs, linux-nfs, linux-cifs,
	ceph-devel, v9fs-developer, linux-fsdevel, linux-kernel

Jeff Layton <jlayton@redhat.com> wrote:

> IDGI: how would I_PINNING_FSCACHE_WB get set in the first place? 

This is used by a later patch, but because this modifies a very commonly used
header file, it was causing mass rebuilds every time I pushed or popped it.  I
can merge it back in now, I think.

David


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite
  2021-10-18 14:51 ` [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite David Howells
  2021-10-19 17:50   ` Jeff Layton
@ 2021-10-20 10:57   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-20 10:57 UTC (permalink / raw)
  To: Jeff Layton
  Cc: dhowells, linux-cachefs, Dave Wysochanski, Trond Myklebust,
	Anna Schumaker, linux-nfs, Ilya Dryomov, ceph-devel,
	Steve French, linux-cifs, Eric Van Hensbergen, Latchesar Ionkov,
	Dominique Martinet, v9fs-developer, Trond Myklebust,
	Matthew Wilcox, Alexander Viro, Omar Sandoval, Linus Torvalds,
	linux-afs, linux-fsdevel, linux-kernel

Jeff Layton <jlayton@kernel.org> wrote:

> The typical way to do this would be to rebrand the existing FSCACHE
> Kconfig symbols into FSCACHE_OLD and then build the new fscache
> structure such that it exists in parallel with the old.

That, there, is nub of the problem.

You can't have parallel cachefiles drivers: There's a single userspace
interface (/dev/cachefiles) and only one driver can register it.  You would
need to decide at compile time whether you want the converted or the
unconverted network filesystems to be cached.

> You'd then just drop the old infrastructure once all of the fs's are
> converted to the new. You could even make them conflict with one another in
> Kconfig too, so that only one could be built in during the transition period
> if supporting both at runtime is too difficult.
> 
> This approach of disabling everything is much more of an all-or-nothing
> affair. It may mean less "churn" overall, but it seems less "nice"
> because you have an interval of commits where fscache is non-functional.
> 
> I'm not necessarily opposed to this approach, but I'd like to better
> understand why doing it this way was preferred.

I'm trying to avoid adding two parallel drivers, but change in place so that I
can test parts of it as I go along.

David


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
                   ` (69 preceding siblings ...)
  2021-10-19 19:00 ` David Howells
@ 2021-10-21 22:20 ` Omar Sandoval
  2021-10-21 23:15   ` Steve French
  2021-10-22 18:52   ` David Howells
  70 siblings, 2 replies; 87+ messages in thread
From: Omar Sandoval @ 2021-10-21 22:20 UTC (permalink / raw)
  To: David Howells
  Cc: linux-cachefs, ceph-devel, linux-afs, Anna Schumaker, linux-nfs,
	Kent Overstreet, linux-mm, Matthew Wilcox, linux-fsdevel,
	Dave Wysochanski, Marc Dionne, Trond Myklebust, Shyam Prasad N,
	Eric Van Hensbergen, v9fs-developer, linux-cifs,
	Latchesar Ionkov, Jeff Layton, Steve French, Al Viro,
	Dominique Martinet, Ilya Dryomov, Trond Myklebust, Jeff Layton,
	Linus Torvalds, linux-kernel

On Mon, Oct 18, 2021 at 03:50:15PM +0100, David Howells wrote:
> 
> Here's a set of patches that rewrites and simplifies the fscache index API
> to remove the complex operation scheduling and object state machine in
> favour of something much smaller and simpler.  It is built on top of the
> set of patches that removes the old API[1].
> 
> The operation scheduling API was intended to handle sequencing of cache
> operations, which were all required (where possible) to run asynchronously
> in parallel with the operations being done by the network filesystem, while
> allowing the cache to be brought online and offline and interrupt service
> with invalidation.
> 
> However, with the advent of the tmpfile capacity in the VFS, an opportunity
> arises to do invalidation much more easily, without having to wait for I/O
> that's actually in progress: Cachefiles can simply cut over its file
> pointer for the backing object attached to a cookie and abandon the
> in-progress I/O, dismissing it upon completion.
> 
> Future work there would involve using Omar Sandoval's vfs_link() with
> AT_LINK_REPLACE[2] to allow an extant file to be displaced by a new hard
> link from a tmpfile as currently I have to unlink the old file first.

I had forgotten about that. It'd be great to finish that someday, but
given the dead-end of the last discussion [1], we might need to hash it
out the next time we can convene in person.

1:https://lore.kernel.org/linux-fsdevel/364531.1579265357@warthog.procyon.org.uk/ 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-21 22:20 ` Omar Sandoval
@ 2021-10-21 23:15   ` Steve French
  2021-10-21 23:23     ` Dominique Martinet
  2021-10-21 23:43     ` Jeff Layton
  2021-10-22 18:52   ` David Howells
  1 sibling, 2 replies; 87+ messages in thread
From: Steve French @ 2021-10-21 23:15 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: David Howells, linux-cachefs, ceph-devel, linux-afs,
	Anna Schumaker, linux-nfs, Kent Overstreet, linux-mm,
	Matthew Wilcox, linux-fsdevel, Dave Wysochanski, Marc Dionne,
	Trond Myklebust, Shyam Prasad N, Eric Van Hensbergen,
	v9fs-developer, CIFS, Latchesar Ionkov, Jeff Layton,
	Steve French, Al Viro, Dominique Martinet, Ilya Dryomov,
	Trond Myklebust, Jeff Layton, Linus Torvalds, LKML

On Thu, Oct 21, 2021 at 5:21 PM Omar Sandoval <osandov@osandov.com> wrote:
>
> On Mon, Oct 18, 2021 at 03:50:15PM +0100, David Howells wrote:
> However, with the advent of the tmpfile capacity in the VFS, an opportunity
> arises to do invalidation much more easily, without having to wait for I/O
> that's actually in progress: Cachefiles can simply cut over its file
> pointer for the backing object attached to a cookie and abandon the
> in-progress I/O, dismissing it upon completion.

Have changes been made to O_TMPFILE?  It is problematic for network filesystems
because it is not an atomic operation, and would be great if it were possible
to create a tmpfile and open it atomically (at the file system level).

Currently it results in creating a tmpfile (which results in
opencreate then close)
immediately followed by reopening the tmpfile which is somewhat counter to
the whole idea of a tmpfile (ie that it is deleted when closed) since
the syscall results
in two opens ie open(create)/close/open/close


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-21 23:15   ` Steve French
@ 2021-10-21 23:23     ` Dominique Martinet
  2021-10-21 23:43     ` Jeff Layton
  1 sibling, 0 replies; 87+ messages in thread
From: Dominique Martinet @ 2021-10-21 23:23 UTC (permalink / raw)
  To: Steve French
  Cc: Omar Sandoval, David Howells, linux-cachefs, ceph-devel,
	linux-afs, Anna Schumaker, linux-nfs, Kent Overstreet, linux-mm,
	Matthew Wilcox, linux-fsdevel, Dave Wysochanski, Marc Dionne,
	Trond Myklebust, Shyam Prasad N, Eric Van Hensbergen,
	v9fs-developer, CIFS, Latchesar Ionkov, Jeff Layton,
	Steve French, Al Viro, Ilya Dryomov, Trond Myklebust,
	Jeff Layton, Linus Torvalds, LKML

Steve French wrote on Thu, Oct 21, 2021 at 06:15:49PM -0500:
> Have changes been made to O_TMPFILE?  It is problematic for network filesystems
> because it is not an atomic operation, and would be great if it were possible
> to create a tmpfile and open it atomically (at the file system level).
> 
> Currently it results in creating a tmpfile (which results in
> opencreate then close)
> immediately followed by reopening the tmpfile which is somewhat counter to
> the whole idea of a tmpfile (ie that it is deleted when closed) since
> the syscall results
> in two opens ie open(create)/close/open/close

That depends on the filesystem, e.g. 9p open returns the opened fid so
our semantic could be closer to that of a local filesystem (could
because I didn't test recently and don't recall how it was handled, I
think it was fixed as I remember it being a problem at some point...)

The main problem with network filesystem and "open closed files" is:
what should the server do if the client disconnects? Will the client
come back and try to access that file again? Did the client crash
completely and should the file be deleted? The server has no way of
knowing.
It's the same logic as unlinking an open file, leading to all sort of
"silly renames" that are most famous for nfs (.nfsxxxx files that the
servers have been taught to just delete if they haven't been accessed
for long enough...)

I'm not sure we can have a generic solution for that unfortunately...

(9P is "easy" enough in that the linux client does not attempt to
reconnect ever if the connection has been lost, so we can just really
unlink the file and the server will delete it when the client
disconnects... But if we were to implement a reconnect eventually (I
know of an existing port that does) then this solution is no longer
acceptable)

-- 
Dominique Martinet | Asmadeus

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-21 23:15   ` Steve French
  2021-10-21 23:23     ` Dominique Martinet
@ 2021-10-21 23:43     ` Jeff Layton
  1 sibling, 0 replies; 87+ messages in thread
From: Jeff Layton @ 2021-10-21 23:43 UTC (permalink / raw)
  To: Steve French, Omar Sandoval
  Cc: David Howells, linux-cachefs, ceph-devel, linux-afs,
	Anna Schumaker, linux-nfs, Kent Overstreet, linux-mm,
	Matthew Wilcox, linux-fsdevel, Dave Wysochanski, Marc Dionne,
	Trond Myklebust, Shyam Prasad N, Eric Van Hensbergen,
	v9fs-developer, CIFS, Latchesar Ionkov, Steve French, Al Viro,
	Dominique Martinet, Ilya Dryomov, Trond Myklebust,
	Linus Torvalds, LKML

On Thu, 2021-10-21 at 18:15 -0500, Steve French wrote:
> On Thu, Oct 21, 2021 at 5:21 PM Omar Sandoval <osandov@osandov.com> wrote:
> > 
> > On Mon, Oct 18, 2021 at 03:50:15PM +0100, David Howells wrote:
> > However, with the advent of the tmpfile capacity in the VFS, an opportunity
> > arises to do invalidation much more easily, without having to wait for I/O
> > that's actually in progress: Cachefiles can simply cut over its file
> > pointer for the backing object attached to a cookie and abandon the
> > in-progress I/O, dismissing it upon completion.
> 
> Have changes been made to O_TMPFILE?  It is problematic for network filesystems
> because it is not an atomic operation, and would be great if it were possible
> to create a tmpfile and open it atomically (at the file system level).
> 
> Currently it results in creating a tmpfile (which results in
> opencreate then close)
> immediately followed by reopening the tmpfile which is somewhat counter to
> the whole idea of a tmpfile (ie that it is deleted when closed) since
> the syscall results
> in two opens ie open(create)/close/open/close
> 
> 

In this case, O_TMPFILE is being used on the cachefiles backing store,
and that usually isn't deployed on a netfs. That said, Steve does have a
good point...

What happens if you do end up without O_TMPFILE support on the backing
store? Probably just opting to not cache in that case is fine. Does
cachefiles just shut down in that situation?
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/67] fscache: Rewrite index API and management system
  2021-10-21 22:20 ` Omar Sandoval
  2021-10-21 23:15   ` Steve French
@ 2021-10-22 18:52   ` David Howells
  1 sibling, 0 replies; 87+ messages in thread
From: David Howells @ 2021-10-22 18:52 UTC (permalink / raw)
  To: Steve French
  Cc: David Howells, Omar Sandoval, Anna Schumaker, Kent Overstreet,
	Matthew Wilcox, Dave Wysochanski, Marc Dionne, Trond Myklebust,
	Shyam Prasad N, Eric Van Hensbergen, Latchesar Ionkov,
	Jeff Layton, Steve French, Al Viro, Dominique Martinet,
	Ilya Dryomov, Trond Myklebust, Jeff Layton, Linus Torvalds,
	linux-cachefs, v9fs-developer, linux-afs, ceph-devel, CIFS,
	linux-nfs, linux-mm, linux-fsdevel, LKML

Steve French <smfrench@gmail.com> wrote:

> On Thu, Oct 21, 2021 at 5:21 PM Omar Sandoval <osandov@osandov.com> wrote:
> >
> > On Mon, Oct 18, 2021 at 03:50:15PM +0100, David Howells wrote:
> > However, with the advent of the tmpfile capacity in the VFS, an opportunity
> > arises to do invalidation much more easily, without having to wait for I/O
> > that's actually in progress: Cachefiles can simply cut over its file
> > pointer for the backing object attached to a cookie and abandon the
> > in-progress I/O, dismissing it upon completion.
> 
> Have changes been made to O_TMPFILE?  It is problematic for network
> filesystems because it is not an atomic operation, and would be great if it
> were possible to create a tmpfile and open it atomically (at the file system
> level).

In this case, it's nothing to do with the network filesystem that's using the
cache per se.  Cachefiles is using tmpfiles on the backing filesystem, so as
long as that's, say, ext4, xfs or btrfs, it should work fine.  The backing
filesystem also needs to support SEEK_HOLE and SEEK_DATA.

I'm not sure I'd recommend putting your cache on a network filesystem.

David


^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2021-10-22 18:52 UTC | newest]

Thread overview: 87+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-18 14:50 [PATCH 00/67] fscache: Rewrite index API and management system David Howells
2021-10-18 14:50 ` [PATCH 01/67] mm: Stop filemap_read() from grabbing a superfluous page David Howells
2021-10-19 17:13   ` Jeff Layton
2021-10-19 18:28   ` Matthew Wilcox
2021-10-19 18:48   ` David Howells
2021-10-19 20:04     ` Matthew Wilcox
2021-10-18 14:50 ` [PATCH 02/67] vfs: Provide S_KERNEL_FILE inode flag David Howells
2021-10-19 18:10   ` Jeff Layton
2021-10-19 19:02   ` David Howells
2021-10-18 14:51 ` [PATCH 03/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
2021-10-19 17:22   ` Jeff Layton
2021-10-20  9:49   ` David Howells
2021-10-18 14:51 ` [PATCH 04/67] afs: Handle len being extending over page end in write_begin/write_end David Howells
2021-10-18 14:51 ` [PATCH 05/67] afs: Fix afs_write_end() to handle len > page size David Howells
2021-10-18 14:51 ` [PATCH 06/67] nfs, cifs, ceph, 9p: Disable use of fscache prior to its rewrite David Howells
2021-10-19 17:50   ` Jeff Layton
2021-10-20 10:57   ` David Howells
2021-10-18 14:52 ` [PATCH 07/67] fscache: Remove the netfs data from the cookie David Howells
2021-10-19 18:05   ` Jeff Layton
2021-10-18 14:52 ` [PATCH 08/67] fscache: Remove struct fscache_cookie_def David Howells
2021-10-18 14:52 ` [PATCH 09/67] fscache: Remove store_limit* from struct fscache_object David Howells
2021-10-18 14:53 ` [PATCH 10/67] fscache: Remove fscache_check_consistency() David Howells
2021-10-18 14:53 ` [PATCH 11/67] fscache: Remove fscache_attr_changed() David Howells
2021-10-18 14:54 ` [PATCH 12/67] fscache: Remove obsolete stats David Howells
2021-10-18 14:54 ` [PATCH 13/67] fscache: Remove old I/O tracepoints David Howells
2021-10-18 14:54 ` [PATCH 14/67] fscache: Temporarily disable fscache_invalidate() David Howells
2021-10-18 14:54 ` [PATCH 15/67] fscache: Disable fscache_begin_operation() David Howells
2021-10-18 14:54 ` [PATCH 16/67] fscache: Remove the I/O operation manager David Howells
2021-10-18 14:55 ` [PATCH 17/67] fscache: Rename fscache_cookie_{get,put,see}() David Howells
2021-10-18 14:55 ` [PATCH 18/67] cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag David Howells
2021-10-18 14:55 ` [PATCH 19/67] cachefiles: Don't set an xattr on the root of the cache David Howells
2021-10-18 14:55 ` [PATCH 20/67] cachefiles: Remove some redundant checks on unsigned values David Howells
2021-10-18 14:56 ` [PATCH 21/67] cachefiles: Prevent inode from going away when burying a dentry David Howells
2021-10-18 14:56 ` [PATCH 22/67] cachefiles: Simplify the pathwalk and save the filename for an object David Howells
2021-10-18 14:56 ` [PATCH 23/67] cachefiles: trace: Improve the lookup tracepoint David Howells
2021-10-18 14:56 ` [PATCH 24/67] cachefiles: Remove separate backer dentry from cachefiles_object David Howells
2021-10-18 14:57 ` [PATCH 25/67] cachefiles: Fold fscache_object into cachefiles_object David Howells
2021-10-18 14:57 ` [PATCH 26/67] cachefiles: Change to storing file* rather than dentry* David Howells
2021-10-18 14:57 ` [PATCH 27/67] cachefiles: trace: Log coherency checks David Howells
2021-10-18 14:57 ` [PATCH 28/67] cachefiles: Trace truncations David Howells
2021-10-18 14:57 ` [PATCH 29/67] cachefiles: Trace read and write operations David Howells
2021-10-18 14:58 ` [PATCH 30/67] cachefiles: Round the cachefile size up to DIO block size David Howells
2021-10-18 14:58 ` [PATCH 31/67] cachefiles: Don't use XATTR_ flags with vfs_setxattr() David Howells
2021-10-18 14:59 ` [PATCH 32/67] fscache: Replace the object management state machine David Howells
2021-10-18 14:59 ` [PATCH 33/67] cachefiles: Trace decisions in cachefiles_prepare_read() David Howells
2021-10-18 14:59 ` [PATCH 34/67] cachefiles: Make cachefiles_write_prepare() check for space David Howells
2021-10-18 14:59 ` [PATCH 35/67] fscache: Automatically close a file that's been unused for a while David Howells
2021-10-18 15:00 ` [PATCH 36/67] fscache: Add stats for the cookie commit LRU David Howells
2021-10-18 15:00 ` [PATCH 37/67] fscache: Move fscache_update_cookie() complete inline David Howells
2021-10-18 15:00 ` [PATCH 38/67] fscache: Remove more obsolete stats David Howells
2021-10-18 15:00 ` [PATCH 39/67] fscache: Note the object size during invalidation David Howells
2021-10-18 15:00 ` [PATCH 40/67] vfs, fscache: Force ->write_inode() to occur if cookie pinned for writeback David Howells
2021-10-18 15:00 ` [PATCH 41/67] afs: Render cache cookie key as big endian David Howells
2021-10-18 15:01 ` [PATCH 42/67] cachefiles: Use tmpfile/link David Howells
2021-10-18 15:01 ` [PATCH 43/67] fscache: Rewrite invalidation David Howells
2021-10-18 15:01 ` [PATCH 44/67] fscache: disable cookie when doing an invalidation for DIO write David Howells
2021-10-18 15:01 ` [PATCH 45/67] cachefiles: Simplify the file lookup/creation/check code David Howells
2021-10-18 15:01 ` [PATCH 46/67] fscache: Provide resize operation David Howells
2021-10-18 15:02 ` [PATCH 47/67] cachefiles: Put more information in the xattr attached to the cache file David Howells
2021-10-18 15:02 ` [PATCH 48/67] fscache: Implement "will_modify" parameter on fscache_use_cookie() David Howells
2021-10-18 15:02 ` [PATCH 49/67] fscache: Add support for writing to the cache David Howells
2021-10-18 15:03 ` [PATCH 50/67] fscache: Make fscache_clear_page_bits() conditional on cookie David Howells
2021-10-18 15:03 ` [PATCH 51/67] fscache: Make fscache_write_to_cache() " David Howells
2021-10-18 15:03 ` [PATCH 52/67] afs: Copy local writes to the cache when writing to the server David Howells
2021-10-18 15:04 ` [PATCH 53/67] afs: Invoke fscache_resize_cookie() when handling ATTR_SIZE for setattr David Howells
2021-10-18 15:04 ` [PATCH 54/67] afs: Add O_DIRECT read support David Howells
2021-10-18 15:05 ` [PATCH 55/67] afs: Skip truncation on the server of data we haven't written yet David Howells
2021-10-18 15:05 ` [PATCH 56/67] afs: Make afs_write_begin() return the THP subpage David Howells
2021-10-18 15:05 ` [PATCH 57/67] cachefiles, afs: Drive FSCACHE_COOKIE_NO_DATA_TO_READ David Howells
2021-10-18 15:06 ` [PATCH 58/67] NFS: Convert fscache_acquire_cookie and fscache_relinquish_cookie David Howells
2021-10-18 15:06 ` [PATCH 59/67] NFS: Convert fscache_enable_cookie and fscache_disable_cookie David Howells
2021-10-18 15:07 ` [PATCH 60/67] NFS: Convert fscache invalidation and update aux_data and i_size David Howells
2021-10-18 15:08 ` [PATCH 61/67] nfs: Convert to new fscache volume/cookie API David Howells
2021-10-18 15:08 ` [PATCH 62/67] 9p: Use fscache indexing rewrite and reenable caching David Howells
2021-10-18 15:08 ` [PATCH 63/67] 9p: Copy local writes to the cache when writing to the server David Howells
2021-10-18 15:08 ` [PATCH 64/67] netfs: Display the netfs inode number in the netfs_read tracepoint David Howells
2021-10-18 15:08 ` [PATCH 65/67] cachefiles: Add tracepoints to log errors from ops on the backing fs David Howells
2021-10-18 15:08 ` [PATCH 66/67] cachefiles: Add error injection support David Howells
2021-10-18 15:09 ` [PATCH 67/67] cifs: Support fscache indexing rewrite (untested) David Howells
2021-10-19 13:29 ` [Linux-cachefs] [PATCH 00/67] fscache: Rewrite index API and management system Marc Dionne
2021-10-19 18:08 ` Jeff Layton
2021-10-19 19:00 ` David Howells
2021-10-21 22:20 ` Omar Sandoval
2021-10-21 23:15   ` Steve French
2021-10-21 23:23     ` Dominique Martinet
2021-10-21 23:43     ` Jeff Layton
2021-10-22 18:52   ` David Howells

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.