linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/20] afs: Fixes and development
@ 2018-04-05 20:29 David Howells
  2018-04-05 20:29 ` [PATCH 01/20] vfs: Remove the const from dir_context::actor David Howells
                   ` (22 more replies)
  0 siblings, 23 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:29 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel


Hi Linus,

Here are a set of AFS patches, a few fixes, but mostly development.  The fixes
are:

 (1) Fix a bunch of checker warnings.

 (2) Fix overincrement of the usage count on pinned cells.

 (3) Fix directory handling.  We cannot assume that a directory blob isn't
     completely rewritten between version numbers on the server, so we can't
     mix pages from different versions in the local pagecache.  If the
     server's copy of a directory changes, we have to entirely throw out our
     own copy and download afresh.

and the development patches:

 (1) Implement the AFS @sys and @cell filename substitutions.

 (2) Introduce a stats file, /proc/fs/afs/stats.

 (3) Improve validation on dentries by keeping track of the last data version
     we got from the server indicating changes not made locally since the
     server inode was cached.  If we made a local change to a directory with
     no interference from another client, we don't need to invalidate all the
     dentries.

 (4) Locally edit our copy of a directory's contents when we make local
     changes to it (eg. mkdir in it).  This is fine, provided the server
     indicates that the remote copy's data version increased by exactly 1.  It
     doesn't matter if the local data deviates from the remote data in exact
     details, provided we flush the local one if we detect interference.

 (5) Add tracepoints for protocol errors and directory editing.

 (6) Do better aggregation of small disjointed writes (such as ld does) to a
     locally newly created or truncated file when we write that to the server,
     reducing the number of store ops at the expense of writing some of the
     padding back too.

Note that this depends on the fscache-next-20180404 patchset already posted.

The patches are tagged here:

	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
	tags/afs-next-20180405

The patches can be found here also:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=afs-next

David
---
David Howells (20):
      vfs: Remove the const from dir_context::actor
      afs: Fix checker warnings
      afs: Don't over-increment the cell usage count when pinning it
      afs: Prospectively look up extra files when doing a single lookup
      afs: Implement @sys substitution handling
      afs: Implement @cell substitution handling
      afs: Dump bad status record
      afs: Introduce a statistics proc file
      afs: Init inode before accessing cache
      afs: Make it possible to get the data version in readpage
      afs: Rearrange status mapping
      afs: Keep track of invalid-before version for dentry coherency
      afs: Split the dynroot stuff out and give it its own ops tables
      afs: Fix directory handling
      afs: Split the directory content defs into a header
      afs: Adjust the directory XDR structures
      afs: Locally edit directory data for mkdir/create/unlink/...
      afs: Trace protocol errors
      afs: Add stats for data transfer operations
      afs: Do better accretion of small writes on newly created content


 Documentation/filesystems/afs.txt |   28 +
 fs/afs/Makefile                   |    2 
 fs/afs/addr_list.c                |    6 
 fs/afs/afs.h                      |   27 +
 fs/afs/afs_fs.h                   |    2 
 fs/afs/callback.c                 |   29 -
 fs/afs/cell.c                     |   12 
 fs/afs/cmservice.c                |   22 -
 fs/afs/dir.c                      |  936 +++++++++++++++++++++++++------------
 fs/afs/dir_edit.c                 |  505 ++++++++++++++++++++
 fs/afs/dynroot.c                  |  216 +++++++++
 fs/afs/file.c                     |   27 +
 fs/afs/flock.c                    |    2 
 fs/afs/fsclient.c                 |  622 ++++++++++++++++++++-----
 fs/afs/inode.c                    |   69 +--
 fs/afs/internal.h                 |  102 +++-
 fs/afs/main.c                     |   42 ++
 fs/afs/proc.c                     |  306 ++++++++++++
 fs/afs/rotate.c                   |    2 
 fs/afs/rxrpc.c                    |    9 
 fs/afs/security.c                 |   13 -
 fs/afs/server.c                   |   14 -
 fs/afs/super.c                    |   11 
 fs/afs/vlclient.c                 |   28 +
 fs/afs/write.c                    |   41 +-
 fs/afs/xdr_fs.h                   |  103 ++++
 include/linux/fs.h                |    2 
 include/trace/events/afs.h        |  113 ++++
 28 files changed, 2698 insertions(+), 593 deletions(-)
 create mode 100644 fs/afs/dir_edit.c
 create mode 100644 fs/afs/dynroot.c
 create mode 100644 fs/afs/xdr_fs.h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 01/20] vfs: Remove the const from dir_context::actor
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
@ 2018-04-05 20:29 ` David Howells
  2018-04-10 13:49   ` Sasha Levin
  2018-04-05 20:29 ` [PATCH 02/20] afs: Fix checker warnings David Howells
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 37+ messages in thread
From: David Howells @ 2018-04-05 20:29 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Remove the const marking from the actor function pointer in the dir_context
struct.  The const prevents the structure from being used as part of a
kmalloc'd object as it makes the compiler require that the actor member be
set at object initialisation time (or not at all), incuring something like
the following error if you try and set it later:

	fs/afs/dir.c:556:20: error: assignment of read-only member 'actor'

Marking the member const like this adds very little in the way of sanity
checking as the type checking system is likely to provide sufficient - and
if not, the kernel is very likely to oops repeatably in this case.

Fixes: ac6614b76478 ("[readdir] constify ->actor")
Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/fs.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index c6baf767619e..dc7de5f06702 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1665,7 +1665,7 @@ typedef int (*filldir_t)(struct dir_context *, const char *, int, loff_t, u64,
 			 unsigned);
 
 struct dir_context {
-	const filldir_t actor;
+	filldir_t actor;
 	loff_t pos;
 };
 

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 02/20] afs: Fix checker warnings
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
  2018-04-05 20:29 ` [PATCH 01/20] vfs: Remove the const from dir_context::actor David Howells
@ 2018-04-05 20:29 ` David Howells
  2018-04-12  5:38   ` Christoph Hellwig
  2018-04-05 20:29 ` [PATCH 03/20] afs: Don't over-increment the cell usage count when pinning it David Howells
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 37+ messages in thread
From: David Howells @ 2018-04-05 20:29 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Fix warnings raised by checker, including:

 (*) Warnings raised by unequal comparison for the purposes of sorting,
     where the endianness doesn't matter:

fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer

 (*) afs_set_cb_interest() is not actually used and can be removed.

 (*) afs_cell_gc_delay() should be provided with a sysctl.

 (*) afs_cell_destroy() needs to use rcu_access_pointer() to read
     cell->vl_addrs.

 (*) afs_init_fs_cursor() should be static.

 (*) struct afs_vnode::permit_cache needs to be marked __rcu.

 (*) afs_server_rcu() needs to use rcu_access_pointer().

 (*) afs_destroy_server() should use rcu_access_pointer() on
     server->addresses as the server object is no longer accessible.

 (*) afs_find_server() casts __be16/__be32 values to int in order to
     directly compare them for the purpose of finding a match in a list,
     but is should also annotate the cast with __force to avoid checker
     warnings.

 (*) afs_check_permit() accesses vnode->permit_cache outside of the RCU
     readlock, though it doesn't then access the value; the extraneous
     access is deleted.

False positives:

 (*) Conditional locking around the code in xdr_decode_AFSFetchStatus.  This
     can be dealt with in a separate patch.

fs/afs/fsclient.c:148:9: warning: context imbalance in 'xdr_decode_AFSFetchStatus' - different lock contexts for basic block

 (*) Incorrect handling of seq-retry lock context balance:

fs/afs/inode.c:455:38: warning: context imbalance in 'afs_getattr' - different
lock contexts for basic block
fs/afs/server.c:52:17: warning: context imbalance in 'afs_find_server' - different lock contexts for basic block
fs/afs/server.c:128:17: warning: context imbalance in 'afs_find_server_by_uuid' - different lock contexts for basic block

Errors:

 (*) afs_lookup_cell_rcu() needs to break out of the seq-retry loop, not go
     round again if it successfully found the workstation cell.

 (*) Fix UUID decode in afs_deliver_cb_probe_uuid().

 (*) afs_cache_permit() has a missing rcu_read_unlock() before one of the
     jumps to the someone_else_changed_it label.  Move the unlock to after
     the label.

 (*) afs_vl_get_addrs_u() is using ntohl() rather than htonl() when
     encoding to XDR.

 (*) afs_deliver_yfsvl_get_endpoints() is using htonl() rather than ntohl()
     when decoding from XDR.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/addr_list.c |    6 +++---
 fs/afs/callback.c  |   20 --------------------
 fs/afs/cell.c      |    6 +++---
 fs/afs/cmservice.c |    6 +++---
 fs/afs/inode.c     |    2 +-
 fs/afs/internal.h  |    2 +-
 fs/afs/proc.c      |    8 ++++++++
 fs/afs/rotate.c    |    2 +-
 fs/afs/security.c  |   11 +++--------
 fs/afs/server.c    |   14 ++++++++------
 fs/afs/vlclient.c  |    8 ++++----
 11 files changed, 35 insertions(+), 50 deletions(-)

diff --git a/fs/afs/addr_list.c b/fs/afs/addr_list.c
index fd9f28b8a933..3bedfed608a2 100644
--- a/fs/afs/addr_list.c
+++ b/fs/afs/addr_list.c
@@ -243,9 +243,9 @@ void afs_merge_fs_addr4(struct afs_addr_list *alist, __be32 xdr, u16 port)
 		    xport == a->sin6_port)
 			return;
 		if (xdr == a->sin6_addr.s6_addr32[3] &&
-		    xport < a->sin6_port)
+		    (u16 __force)xport < (u16 __force)a->sin6_port)
 			break;
-		if (xdr < a->sin6_addr.s6_addr32[3])
+		if ((u32 __force)xdr < (u32 __force)a->sin6_addr.s6_addr32[3])
 			break;
 	}
 
@@ -280,7 +280,7 @@ void afs_merge_fs_addr6(struct afs_addr_list *alist, __be32 *xdr, u16 port)
 		    xport == a->sin6_port)
 			return;
 		if (diff == 0 &&
-		    xport < a->sin6_port)
+		    (u16 __force)xport < (u16 __force)a->sin6_port)
 			break;
 		if (diff < 0)
 			break;
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index f4291b576054..bf082c719645 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -96,26 +96,6 @@ int afs_register_server_cb_interest(struct afs_vnode *vnode,
 	return 0;
 }
 
-/*
- * Set a vnode's interest on a server.
- */
-void afs_set_cb_interest(struct afs_vnode *vnode, struct afs_cb_interest *cbi)
-{
-	struct afs_cb_interest *old_cbi = NULL;
-
-	if (vnode->cb_interest == cbi)
-		return;
-
-	write_seqlock(&vnode->cb_lock);
-	if (vnode->cb_interest != cbi) {
-		afs_get_cb_interest(cbi);
-		old_cbi = vnode->cb_interest;
-		vnode->cb_interest = cbi;
-	}
-	write_sequnlock(&vnode->cb_lock);
-	afs_put_cb_interest(afs_v2net(vnode), cbi);
-}
-
 /*
  * Remove an interest on a server.
  */
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index 4235a05afc76..69b95faacc5e 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -18,7 +18,7 @@
 #include <keys/rxrpc-type.h>
 #include "internal.h"
 
-unsigned __read_mostly afs_cell_gc_delay = 10;
+static unsigned __read_mostly afs_cell_gc_delay = 10;
 
 static void afs_manage_cell(struct work_struct *);
 
@@ -75,7 +75,7 @@ struct afs_cell *afs_lookup_cell_rcu(struct afs_net *net,
 			cell = rcu_dereference_raw(net->ws_cell);
 			if (cell) {
 				afs_get_cell(cell);
-				continue;
+				break;
 			}
 			ret = -EDESTADDRREQ;
 			continue;
@@ -411,7 +411,7 @@ static void afs_cell_destroy(struct rcu_head *rcu)
 
 	ASSERTCMP(atomic_read(&cell->usage), ==, 0);
 
-	afs_put_addrlist(cell->vl_addrs);
+	afs_put_addrlist(rcu_access_pointer(cell->vl_addrs));
 	key_put(cell->anonymous_key);
 	kfree(cell);
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 41e277f57b20..83ff283979a4 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -500,9 +500,9 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call)
 
 		b = call->buffer;
 		r = call->request;
-		r->time_low			= ntohl(b[0]);
-		r->time_mid			= ntohl(b[1]);
-		r->time_hi_and_version		= ntohl(b[2]);
+		r->time_low			= b[0];
+		r->time_mid			= htons(ntohl(b[1]));
+		r->time_hi_and_version		= htons(ntohl(b[2]));
 		r->clock_seq_hi_and_reserved 	= ntohl(b[3]);
 		r->clock_seq_low		= ntohl(b[4]);
 
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 65c5b1edd338..d5db9dead18c 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -544,7 +544,7 @@ void afs_evict_inode(struct inode *inode)
 	}
 #endif
 
-	afs_put_permits(vnode->permit_cache);
+	afs_put_permits(rcu_access_pointer(vnode->permit_cache));
 	_leave("");
 }
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index a6a1d75eee41..135192b7dc04 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -458,7 +458,7 @@ struct afs_vnode {
 #ifdef CONFIG_AFS_FSCACHE
 	struct fscache_cookie	*cache;		/* caching cookie */
 #endif
-	struct afs_permits	*permit_cache;	/* cache of permits so far obtained */
+	struct afs_permits __rcu *permit_cache;	/* cache of permits so far obtained */
 	struct mutex		io_lock;	/* Lock for serialising I/O on this mutex */
 	struct mutex		validate_lock;	/* lock for validating this vnode */
 	spinlock_t		wb_lock;	/* lock for wb_keys */
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 4508dd54f789..1c95756335b7 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -183,6 +183,7 @@ static int afs_proc_cells_open(struct inode *inode, struct file *file)
  * first item
  */
 static void *afs_proc_cells_start(struct seq_file *m, loff_t *_pos)
+	__acquires(rcu)
 {
 	struct afs_net *net = afs_seq2net(m);
 
@@ -204,6 +205,7 @@ static void *afs_proc_cells_next(struct seq_file *m, void *v, loff_t *pos)
  * clean up after reading from the cells list
  */
 static void afs_proc_cells_stop(struct seq_file *m, void *v)
+	__releases(rcu)
 {
 	rcu_read_unlock();
 }
@@ -413,6 +415,7 @@ static int afs_proc_cell_volumes_open(struct inode *inode, struct file *file)
  * first item
  */
 static void *afs_proc_cell_volumes_start(struct seq_file *m, loff_t *_pos)
+	__acquires(cell->proc_lock)
 {
 	struct afs_cell *cell = m->private;
 
@@ -438,6 +441,7 @@ static void *afs_proc_cell_volumes_next(struct seq_file *p, void *v,
  * clean up after reading from the cells list
  */
 static void afs_proc_cell_volumes_stop(struct seq_file *p, void *v)
+	__releases(cell->proc_lock)
 {
 	struct afs_cell *cell = p->private;
 
@@ -500,6 +504,7 @@ static int afs_proc_cell_vlservers_open(struct inode *inode, struct file *file)
  * first item
  */
 static void *afs_proc_cell_vlservers_start(struct seq_file *m, loff_t *_pos)
+	__acquires(rcu)
 {
 	struct afs_addr_list *alist;
 	struct afs_cell *cell = m->private;
@@ -544,6 +549,7 @@ static void *afs_proc_cell_vlservers_next(struct seq_file *p, void *v,
  * clean up after reading from the cells list
  */
 static void afs_proc_cell_vlservers_stop(struct seq_file *p, void *v)
+	__releases(rcu)
 {
 	rcu_read_unlock();
 }
@@ -580,6 +586,7 @@ static int afs_proc_servers_open(struct inode *inode, struct file *file)
  * first item.
  */
 static void *afs_proc_servers_start(struct seq_file *m, loff_t *_pos)
+	__acquires(rcu)
 {
 	struct afs_net *net = afs_seq2net(m);
 
@@ -601,6 +608,7 @@ static void *afs_proc_servers_next(struct seq_file *m, void *v, loff_t *_pos)
  * clean up after reading from the cells list
  */
 static void afs_proc_servers_stop(struct seq_file *p, void *v)
+	__releases(rcu)
 {
 	rcu_read_unlock();
 }
diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
index ad1328d85526..ac0feac9d746 100644
--- a/fs/afs/rotate.c
+++ b/fs/afs/rotate.c
@@ -21,7 +21,7 @@
 /*
  * Initialise a filesystem server cursor for iterating over FS servers.
  */
-void afs_init_fs_cursor(struct afs_fs_cursor *fc, struct afs_vnode *vnode)
+static void afs_init_fs_cursor(struct afs_fs_cursor *fc, struct afs_vnode *vnode)
 {
 	memset(fc, 0, sizeof(*fc));
 }
diff --git a/fs/afs/security.c b/fs/afs/security.c
index b88b7d45fdaa..bd82c5bf4a6a 100644
--- a/fs/afs/security.c
+++ b/fs/afs/security.c
@@ -178,18 +178,14 @@ void afs_cache_permit(struct afs_vnode *vnode, struct key *key,
 		}
 	}
 
-	if (cb_break != (vnode->cb_break + vnode->cb_interest->server->cb_s_break)) {
-		rcu_read_unlock();
+	if (cb_break != (vnode->cb_break + vnode->cb_interest->server->cb_s_break))
 		goto someone_else_changed_it;
-	}
 
 	/* We need a ref on any permits list we want to copy as we'll have to
 	 * drop the lock to do memory allocation.
 	 */
-	if (permits && !refcount_inc_not_zero(&permits->usage)) {
-		rcu_read_unlock();
+	if (permits && !refcount_inc_not_zero(&permits->usage))
 		goto someone_else_changed_it;
-	}
 
 	rcu_read_unlock();
 
@@ -278,6 +274,7 @@ void afs_cache_permit(struct afs_vnode *vnode, struct key *key,
 	/* Someone else changed the cache under us - don't recheck at this
 	 * time.
 	 */
+	rcu_read_unlock();
 	return;
 }
 
@@ -296,8 +293,6 @@ int afs_check_permit(struct afs_vnode *vnode, struct key *key,
 	_enter("{%x:%u},%x",
 	       vnode->fid.vid, vnode->fid.vnode, key_serial(key));
 
-	permits = vnode->permit_cache;
-
 	/* check the permits to see if we've got one yet */
 	if (key == vnode->volume->cell->anonymous_key) {
 		_debug("anon");
diff --git a/fs/afs/server.c b/fs/afs/server.c
index a43ef77dabae..e23be63998a8 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -59,7 +59,8 @@ struct afs_server *afs_find_server(struct afs_net *net,
 				alist = rcu_dereference(server->addresses);
 				for (i = alist->nr_ipv4; i < alist->nr_addrs; i++) {
 					b = &alist->addrs[i].transport.sin6;
-					diff = (u16)a->sin6_port - (u16)b->sin6_port;
+					diff = ((u16 __force)a->sin6_port -
+						(u16 __force)b->sin6_port);
 					if (diff == 0)
 						diff = memcmp(&a->sin6_addr,
 							      &b->sin6_addr,
@@ -79,10 +80,11 @@ struct afs_server *afs_find_server(struct afs_net *net,
 				alist = rcu_dereference(server->addresses);
 				for (i = 0; i < alist->nr_ipv4; i++) {
 					b = &alist->addrs[i].transport.sin6;
-					diff = (u16)a->sin6_port - (u16)b->sin6_port;
+					diff = ((u16 __force)a->sin6_port -
+						(u16 __force)b->sin6_port);
 					if (diff == 0)
-						diff = ((u32)a->sin6_addr.s6_addr32[3] -
-							(u32)b->sin6_addr.s6_addr32[3]);
+						diff = ((u32 __force)a->sin6_addr.s6_addr32[3] -
+							(u32 __force)b->sin6_addr.s6_addr32[3]);
 					if (diff == 0)
 						goto found;
 					if (diff < 0) {
@@ -381,7 +383,7 @@ static void afs_server_rcu(struct rcu_head *rcu)
 {
 	struct afs_server *server = container_of(rcu, struct afs_server, rcu);
 
-	afs_put_addrlist(server->addresses);
+	afs_put_addrlist(rcu_access_pointer(server->addresses));
 	kfree(server);
 }
 
@@ -390,7 +392,7 @@ static void afs_server_rcu(struct rcu_head *rcu)
  */
 static void afs_destroy_server(struct afs_net *net, struct afs_server *server)
 {
-	struct afs_addr_list *alist = server->addresses;
+	struct afs_addr_list *alist = rcu_access_pointer(server->addresses);
 	struct afs_addr_cursor ac = {
 		.alist	= alist,
 		.addr	= &alist->addrs[0],
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index 5d8562f1ad4a..f9d89795e41b 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -303,7 +303,7 @@ struct afs_addr_list *afs_vl_get_addrs_u(struct afs_net *net,
 	r->uuid.clock_seq_hi_and_reserved 	= htonl(u->clock_seq_hi_and_reserved);
 	r->uuid.clock_seq_low			= htonl(u->clock_seq_low);
 	for (i = 0; i < 6; i++)
-		r->uuid.node[i] = ntohl(u->node[i]);
+		r->uuid.node[i] = htonl(u->node[i]);
 
 	trace_afs_make_vl_call(call);
 	return (struct afs_addr_list *)afs_make_call(ac, call, GFP_KERNEL, false);
@@ -504,7 +504,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 		/* Got either the type of the next entry or the count of
 		 * volEndpoints if no more fsEndpoints.
 		 */
-		call->count2 = htonl(*bp++);
+		call->count2 = ntohl(*bp++);
 
 		call->offset = 0;
 		call->count--;
@@ -531,7 +531,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 			return ret;
 
 		bp = call->buffer;
-		call->count2 = htonl(*bp++);
+		call->count2 = ntohl(*bp++);
 		call->offset = 0;
 		call->unmarshall = 4;
 
@@ -576,7 +576,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 		call->offset = 0;
 		call->count--;
 		if (call->count > 0) {
-			call->count2 = htonl(*bp++);
+			call->count2 = ntohl(*bp++);
 			goto again;
 		}
 

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 03/20] afs: Don't over-increment the cell usage count when pinning it
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
  2018-04-05 20:29 ` [PATCH 01/20] vfs: Remove the const from dir_context::actor David Howells
  2018-04-05 20:29 ` [PATCH 02/20] afs: Fix checker warnings David Howells
@ 2018-04-05 20:29 ` David Howells
  2018-04-10 13:49   ` Sasha Levin
  2018-04-05 20:29 ` [PATCH 04/20] afs: Prospectively look up extra files when doing a single lookup David Howells
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 37+ messages in thread
From: David Howells @ 2018-04-05 20:29 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

AFS cells that are added or set as the workstation cell through /proc are
pinned against removal by setting the AFS_CELL_FL_NO_GC flag on them and
taking a ref.  The ref should be only taken if the flag wasn't already set.

Fix this by making it conditional.

Without this an assertion failure will occur during module removal
indicating that the refcount is too elevated.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cell.c |    4 ++--
 fs/afs/proc.c |    3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index 69b95faacc5e..721425b98b31 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -334,8 +334,8 @@ int afs_cell_init(struct afs_net *net, const char *rootcell)
 		return PTR_ERR(new_root);
 	}
 
-	set_bit(AFS_CELL_FL_NO_GC, &new_root->flags);
-	afs_get_cell(new_root);
+	if (!test_and_set_bit(AFS_CELL_FL_NO_GC, &new_root->flags))
+		afs_get_cell(new_root);
 
 	/* install the new cell */
 	write_seqlock(&net->cells_lock);
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 1c95756335b7..2f04d37eeef0 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -284,7 +284,8 @@ static ssize_t afs_proc_cells_write(struct file *file, const char __user *buf,
 			goto done;
 		}
 
-		set_bit(AFS_CELL_FL_NO_GC, &cell->flags);
+		if (test_and_set_bit(AFS_CELL_FL_NO_GC, &cell->flags))
+			afs_put_cell(net, cell);
 		printk("kAFS: Added new cell '%s'\n", name);
 	} else {
 		goto inval;

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 04/20] afs: Prospectively look up extra files when doing a single lookup
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (2 preceding siblings ...)
  2018-04-05 20:29 ` [PATCH 03/20] afs: Don't over-increment the cell usage count when pinning it David Howells
@ 2018-04-05 20:29 ` David Howells
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:29 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

When afs_lookup() is called, prospectively look up the next 50 uncached
fids also from that same directory and cache the results, rather than just
looking up the one file requested.

This allows us to use the FS.InlineBulkStatus RPC op to increase efficiency
by fetching up to 50 file statuses at a time.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/afs.h               |   21 ++-
 fs/afs/afs_fs.h            |    2 
 fs/afs/callback.c          |    8 +
 fs/afs/cmservice.c         |   12 +-
 fs/afs/dir.c               |  288 ++++++++++++++++++++++++++++++++++++++------
 fs/afs/fsclient.c          |  272 +++++++++++++++++++++++++++++++++++++++++-
 fs/afs/internal.h          |   10 +-
 include/trace/events/afs.h |    2 
 8 files changed, 552 insertions(+), 63 deletions(-)

diff --git a/fs/afs/afs.h b/fs/afs/afs.h
index b94d0edc2b78..a670bca6d3ba 100644
--- a/fs/afs/afs.h
+++ b/fs/afs/afs.h
@@ -67,10 +67,14 @@ typedef enum {
 } afs_callback_type_t;
 
 struct afs_callback {
-	struct afs_fid		fid;		/* file identifier */
-	unsigned		version;	/* callback version */
-	unsigned		expiry;		/* time at which expires */
-	afs_callback_type_t	type;		/* type of callback */
+	unsigned		version;	/* Callback version */
+	unsigned		expiry;		/* Time at which expires */
+	afs_callback_type_t	type;		/* Type of callback */
+};
+
+struct afs_callback_break {
+	struct afs_fid		fid;		/* File identifier */
+	struct afs_callback	cb;		/* Callback details */
 };
 
 #define AFSCBMAX 50	/* maximum callbacks transferred per bulk op */
@@ -123,21 +127,22 @@ typedef u32 afs_access_t;
  * AFS file status information
  */
 struct afs_file_status {
+	u64			size;		/* file size */
+	afs_dataversion_t	data_version;	/* current data version */
+	time_t			mtime_client;	/* last time client changed data */
+	time_t			mtime_server;	/* last time server changed data */
 	unsigned		if_version;	/* interface version */
 #define AFS_FSTATUS_VERSION	1
+	unsigned		abort_code;	/* Abort if bulk-fetching this failed */
 
 	afs_file_type_t		type;		/* file type */
 	unsigned		nlink;		/* link count */
-	u64			size;		/* file size */
-	afs_dataversion_t	data_version;	/* current data version */
 	u32			author;		/* author ID */
 	kuid_t			owner;		/* owner ID */
 	kgid_t			group;		/* group ID */
 	afs_access_t		caller_access;	/* access rights for authenticated caller */
 	afs_access_t		anon_access;	/* access rights for unauthenticated caller */
 	umode_t			mode;		/* UNIX mode */
-	time_t			mtime_client;	/* last time client changed data */
-	time_t			mtime_server;	/* last time server changed data */
 	s32			lock_count;	/* file lock count (0=UNLK -1=WRLCK +ve=#RDLCK */
 };
 
diff --git a/fs/afs/afs_fs.h b/fs/afs/afs_fs.h
index d47b6d01e4c0..ddfa88a7a9c0 100644
--- a/fs/afs/afs_fs.h
+++ b/fs/afs/afs_fs.h
@@ -31,10 +31,12 @@ enum AFS_FS_Operations {
 	FSGETVOLUMEINFO		= 148,	/* AFS Get information about a volume */
 	FSGETVOLUMESTATUS	= 149,	/* AFS Get volume status information */
 	FSGETROOTVOLUME		= 151,	/* AFS Get root volume name */
+	FSBULKSTATUS		= 155,	/* AFS Fetch multiple file statuses */
 	FSSETLOCK		= 156,	/* AFS Request a file lock */
 	FSEXTENDLOCK		= 157,	/* AFS Extend a file lock */
 	FSRELEASELOCK		= 158,	/* AFS Release a file lock */
 	FSLOOKUP		= 161,	/* AFS lookup file in directory */
+	FSINLINEBULKSTATUS	= 65536, /* AFS Fetch multiple file statuses with inline errors */
 	FSFETCHDATA64		= 65537, /* AFS Fetch file data */
 	FSSTOREDATA64		= 65538, /* AFS Store file data */
 	FSGIVEUPALLCALLBACKS	= 65539, /* AFS Give up all outstanding callbacks on a server */
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index bf082c719645..6049ca837498 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -187,7 +187,7 @@ static void afs_break_one_callback(struct afs_server *server,
  * allow the fileserver to break callback promises
  */
 void afs_break_callbacks(struct afs_server *server, size_t count,
-			 struct afs_callback callbacks[])
+			 struct afs_callback_break *callbacks)
 {
 	_enter("%p,%zu,", server, count);
 
@@ -199,9 +199,9 @@ void afs_break_callbacks(struct afs_server *server, size_t count,
 		       callbacks->fid.vid,
 		       callbacks->fid.vnode,
 		       callbacks->fid.unique,
-		       callbacks->version,
-		       callbacks->expiry,
-		       callbacks->type
+		       callbacks->cb.version,
+		       callbacks->cb.expiry,
+		       callbacks->cb.type
 		       );
 		afs_break_one_callback(server, &callbacks->fid);
 	}
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 83ff283979a4..fa07f83e9f29 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -178,8 +178,8 @@ static void SRXAFSCB_CallBack(struct work_struct *work)
  */
 static int afs_deliver_cb_callback(struct afs_call *call)
 {
+	struct afs_callback_break *cb;
 	struct sockaddr_rxrpc srx;
-	struct afs_callback *cb;
 	struct afs_server *server;
 	__be32 *bp;
 	int ret, loop;
@@ -218,7 +218,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
 
 		_debug("unmarshall FID array");
 		call->request = kcalloc(call->count,
-					sizeof(struct afs_callback),
+					sizeof(struct afs_callback_break),
 					GFP_KERNEL);
 		if (!call->request)
 			return -ENOMEM;
@@ -229,7 +229,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
 			cb->fid.vid	= ntohl(*bp++);
 			cb->fid.vnode	= ntohl(*bp++);
 			cb->fid.unique	= ntohl(*bp++);
-			cb->type	= AFSCM_CB_UNTYPED;
+			cb->cb.type	= AFSCM_CB_UNTYPED;
 		}
 
 		call->offset = 0;
@@ -260,9 +260,9 @@ static int afs_deliver_cb_callback(struct afs_call *call)
 		cb = call->request;
 		bp = call->buffer;
 		for (loop = call->count2; loop > 0; loop--, cb++) {
-			cb->version	= ntohl(*bp++);
-			cb->expiry	= ntohl(*bp++);
-			cb->type	= ntohl(*bp++);
+			cb->cb.version	= ntohl(*bp++);
+			cb->cb.expiry	= ntohl(*bp++);
+			cb->cb.type	= ntohl(*bp++);
 		}
 
 		call->offset = 0;
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index ba2b458b36d1..63f7b37e1e71 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -29,8 +29,10 @@ static int afs_readdir(struct file *file, struct dir_context *ctx);
 static int afs_d_revalidate(struct dentry *dentry, unsigned int flags);
 static int afs_d_delete(const struct dentry *dentry);
 static void afs_d_release(struct dentry *dentry);
-static int afs_lookup_filldir(struct dir_context *ctx, const char *name, int nlen,
+static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name, int nlen,
 				  loff_t fpos, u64 ino, unsigned dtype);
+static int afs_lookup_filldir(struct dir_context *ctx, const char *name, int nlen,
+			      loff_t fpos, u64 ino, unsigned dtype);
 static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		      bool excl);
 static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode);
@@ -134,11 +136,22 @@ struct afs_dir_page {
 	union afs_dir_block blocks[PAGE_SIZE / sizeof(union afs_dir_block)];
 };
 
+struct afs_lookup_one_cookie {
+	struct dir_context	ctx;
+	struct qstr		name;
+	bool			found;
+	struct afs_fid		fid;
+};
+
 struct afs_lookup_cookie {
-	struct dir_context ctx;
-	struct afs_fid	fid;
-	struct qstr name;
-	int		found;
+	struct dir_context	ctx;
+	struct qstr		name;
+	bool			found;
+	bool			one_only;
+	unsigned short		nr_fids;
+	struct afs_file_status	*statuses;
+	struct afs_callback	*callbacks;
+	struct afs_fid		fids[50];
 };
 
 /*
@@ -330,7 +343,8 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 		/* found the next entry */
 		if (!dir_emit(ctx, dire->u.name, nlen,
 			      ntohl(dire->u.vnode),
-			      ctx->actor == afs_lookup_filldir ?
+			      (ctx->actor == afs_lookup_filldir ||
+			       ctx->actor == afs_lookup_one_filldir)?
 			      ntohl(dire->u.unique) : DT_UNKNOWN)) {
 			_leave(" = 0 [full]");
 			return 0;
@@ -414,15 +428,15 @@ static int afs_readdir(struct file *file, struct dir_context *ctx)
 }
 
 /*
- * search the directory for a name
+ * Search the directory for a single name
  * - if afs_dir_iterate_block() spots this function, it'll pass the FID
  *   uniquifier through dtype
  */
-static int afs_lookup_filldir(struct dir_context *ctx, const char *name,
-			      int nlen, loff_t fpos, u64 ino, unsigned dtype)
+static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
+				  int nlen, loff_t fpos, u64 ino, unsigned dtype)
 {
-	struct afs_lookup_cookie *cookie =
-		container_of(ctx, struct afs_lookup_cookie, ctx);
+	struct afs_lookup_one_cookie *cookie =
+		container_of(ctx, struct afs_lookup_one_cookie, ctx);
 
 	_enter("{%s,%u},%s,%u,,%llu,%u",
 	       cookie->name.name, cookie->name.len, name, nlen,
@@ -447,15 +461,15 @@ static int afs_lookup_filldir(struct dir_context *ctx, const char *name,
 }
 
 /*
- * do a lookup in a directory
+ * Do a lookup of a single name in a directory
  * - just returns the FID the dentry name maps to if found
  */
-static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
-			 struct afs_fid *fid, struct key *key)
+static int afs_do_lookup_one(struct inode *dir, struct dentry *dentry,
+			     struct afs_fid *fid, struct key *key)
 {
 	struct afs_super_info *as = dir->i_sb->s_fs_info;
-	struct afs_lookup_cookie cookie = {
-		.ctx.actor = afs_lookup_filldir,
+	struct afs_lookup_one_cookie cookie = {
+		.ctx.actor = afs_lookup_one_filldir,
 		.name = dentry->d_name,
 		.fid.vid = as->volume->vid
 	};
@@ -481,6 +495,212 @@ static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	return 0;
 }
 
+/*
+ * search the directory for a name
+ * - if afs_dir_iterate_block() spots this function, it'll pass the FID
+ *   uniquifier through dtype
+ */
+static int afs_lookup_filldir(struct dir_context *ctx, const char *name,
+			      int nlen, loff_t fpos, u64 ino, unsigned dtype)
+{
+	struct afs_lookup_cookie *cookie =
+		container_of(ctx, struct afs_lookup_cookie, ctx);
+	int ret;
+
+	_enter("{%s,%u},%s,%u,,%llu,%u",
+	       cookie->name.name, cookie->name.len, name, nlen,
+	       (unsigned long long) ino, dtype);
+
+	/* insanity checks first */
+	BUILD_BUG_ON(sizeof(union afs_dir_block) != 2048);
+	BUILD_BUG_ON(sizeof(union afs_dirent) != 32);
+
+	if (cookie->found) {
+		if (cookie->nr_fids < 50) {
+			cookie->fids[cookie->nr_fids].vnode	= ino;
+			cookie->fids[cookie->nr_fids].unique	= dtype;
+			cookie->nr_fids++;
+		}
+	} else if (cookie->name.len == nlen &&
+		   memcmp(cookie->name.name, name, nlen) == 0) {
+		cookie->fids[0].vnode	= ino;
+		cookie->fids[0].unique	= dtype;
+		cookie->found = 1;
+		if (cookie->one_only)
+			return -1;
+	}
+
+	ret = cookie->nr_fids >= 50 ? -1 : 0;
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * Do a lookup in a directory.  We make use of bulk lookup to query a slew of
+ * files in one go and create inodes for them.  The inode of the file we were
+ * asked for is returned.
+ */
+static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
+				   struct key *key)
+{
+	struct afs_lookup_cookie *cookie;
+	struct afs_cb_interest *cbi = NULL;
+	struct afs_super_info *as = dir->i_sb->s_fs_info;
+	struct afs_iget_data data;
+	struct afs_fs_cursor fc;
+	struct afs_vnode *dvnode = AFS_FS_I(dir);
+	struct inode *inode = NULL;
+	int ret, i;
+
+	_enter("{%lu},%p{%pd},", dir->i_ino, dentry, dentry);
+
+	cookie = kzalloc(sizeof(struct afs_lookup_cookie), GFP_KERNEL);
+	if (!cookie)
+		return ERR_PTR(-ENOMEM);
+
+	cookie->ctx.actor = afs_lookup_filldir;
+	cookie->name = dentry->d_name;
+	cookie->nr_fids = 1; /* slot 0 is saved for the fid we actually want */
+
+	read_seqlock_excl(&dvnode->cb_lock);
+	if (dvnode->cb_interest &&
+	    dvnode->cb_interest->server &&
+	    test_bit(AFS_SERVER_FL_NO_IBULK, &dvnode->cb_interest->server->flags))
+		cookie->one_only = true;
+	read_sequnlock_excl(&dvnode->cb_lock);
+
+	for (i = 0; i < 50; i++)
+		cookie->fids[i].vid = as->volume->vid;
+
+	/* search the directory */
+	ret = afs_dir_iterate(dir, &cookie->ctx, key);
+	if (ret < 0) {
+		inode = ERR_PTR(ret);
+		goto out;
+	}
+
+	inode = ERR_PTR(-ENOENT);
+	if (!cookie->found)
+		goto out;
+
+	/* Check to see if we already have an inode for the primary fid. */
+	data.volume = dvnode->volume;
+	data.fid = cookie->fids[0];
+	inode = ilookup5(dir->i_sb, cookie->fids[0].vnode, afs_iget5_test, &data);
+	if (inode)
+		goto out;
+	
+	/* Need space for examining all the selected files */
+	inode = ERR_PTR(-ENOMEM);
+	cookie->statuses = kcalloc(cookie->nr_fids, sizeof(struct afs_file_status),
+				   GFP_KERNEL);
+	if (!cookie->statuses)
+		goto out;
+
+	cookie->callbacks = kcalloc(cookie->nr_fids, sizeof(struct afs_callback),
+				    GFP_KERNEL);
+	if (!cookie->callbacks)
+		goto out_s;
+
+	/* Try FS.InlineBulkStatus first.  Abort codes for the individual
+	 * lookups contained therein are stored in the reply without aborting
+	 * the whole operation.
+	 */
+	if (cookie->one_only)
+		goto no_inline_bulk_status;
+
+	inode = ERR_PTR(-ERESTARTSYS);
+	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
+		while (afs_select_fileserver(&fc)) {
+			if (test_bit(AFS_SERVER_FL_NO_IBULK,
+				      &fc.cbi->server->flags)) {
+				fc.ac.abort_code = RX_INVALID_OPERATION;
+				fc.ac.error = -ECONNABORTED;
+				break;
+			}
+			afs_fs_inline_bulk_status(&fc,
+						  afs_v2net(dvnode),
+						  cookie->fids,
+						  cookie->statuses,
+						  cookie->callbacks,
+						  cookie->nr_fids, NULL);
+		}
+
+		if (fc.ac.error == 0)
+			cbi = afs_get_cb_interest(fc.cbi);
+		if (fc.ac.abort_code == RX_INVALID_OPERATION)
+			set_bit(AFS_SERVER_FL_NO_IBULK, &fc.cbi->server->flags);
+		inode = ERR_PTR(afs_end_vnode_operation(&fc));
+	}
+
+	if (!IS_ERR(inode))
+		goto success;
+	if (fc.ac.abort_code != RX_INVALID_OPERATION)
+		goto out_c;
+
+no_inline_bulk_status:
+	/* We could try FS.BulkStatus next, but this aborts the entire op if
+	 * any of the lookups fails - so, for the moment, revert to
+	 * FS.FetchStatus for just the primary fid.
+	 */
+	cookie->nr_fids = 1;
+	inode = ERR_PTR(-ERESTARTSYS);
+	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
+		while (afs_select_fileserver(&fc)) {
+			afs_fs_fetch_status(&fc,
+					    afs_v2net(dvnode),
+					    cookie->fids,
+					    cookie->statuses,
+					    cookie->callbacks,
+					    NULL);
+		}
+
+		if (fc.ac.error == 0)
+			cbi = afs_get_cb_interest(fc.cbi);
+		inode = ERR_PTR(afs_end_vnode_operation(&fc));
+	}
+
+	if (IS_ERR(inode))
+		goto out_c;
+
+	for (i = 0; i < cookie->nr_fids; i++)
+		cookie->statuses[i].abort_code = 0;
+	
+success:
+	/* Turn all the files into inodes and save the first one - which is the
+	 * one we actually want.
+	 */
+	if (cookie->statuses[0].abort_code != 0)
+		inode = ERR_PTR(afs_abort_to_error(cookie->statuses[0].abort_code));
+
+	for (i = 0; i < cookie->nr_fids; i++) {
+		struct inode *ti;
+
+		if (cookie->statuses[i].abort_code != 0)
+			continue;
+
+		ti = afs_iget(dir->i_sb, key, &cookie->fids[i],
+			      &cookie->statuses[i],
+			      &cookie->callbacks[i],
+			      cbi);
+		if (i == 0) {
+			inode = ti;
+		} else {
+			if (!IS_ERR(ti))
+				iput(ti);
+		}
+	}
+
+out_c:
+	afs_put_cb_interest(afs_v2net(dvnode), cbi);
+	kfree(cookie->callbacks);
+out_s:
+	kfree(cookie->statuses);
+out:
+	kfree(cookie);
+	return inode;
+}
+
 /*
  * Probe to see if a cell may exist.  This prevents positive dentries from
  * being created unnecessarily.
@@ -516,8 +736,7 @@ static int afs_probe_cell_name(struct dentry *dentry)
  * Try to auto mount the mountpoint with pseudo directory, if the autocell
  * operation is setted.
  */
-static struct inode *afs_try_auto_mntpt(struct dentry *dentry,
-					struct inode *dir, struct afs_fid *fid)
+static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir)
 {
 	struct afs_vnode *vnode = AFS_FS_I(dir);
 	struct inode *inode;
@@ -539,7 +758,6 @@ static struct inode *afs_try_auto_mntpt(struct dentry *dentry,
 		goto out;
 	}
 
-	*fid = AFS_FS_I(inode)->fid;
 	_leave("= %p", inode);
 	return inode;
 
@@ -554,16 +772,13 @@ static struct inode *afs_try_auto_mntpt(struct dentry *dentry,
 static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 				 unsigned int flags)
 {
-	struct afs_vnode *vnode;
-	struct afs_fid fid;
+	struct afs_vnode *dvnode = AFS_FS_I(dir);
 	struct inode *inode;
 	struct key *key;
 	int ret;
 
-	vnode = AFS_FS_I(dir);
-
 	_enter("{%x:%u},%p{%pd},",
-	       vnode->fid.vid, vnode->fid.vnode, dentry, dentry);
+	       dvnode->fid.vid, dvnode->fid.vnode, dentry, dentry);
 
 	ASSERTCMP(d_inode(dentry), ==, NULL);
 
@@ -572,28 +787,29 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 		return ERR_PTR(-ENAMETOOLONG);
 	}
 
-	if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
+	if (test_bit(AFS_VNODE_DELETED, &dvnode->flags)) {
 		_leave(" = -ESTALE");
 		return ERR_PTR(-ESTALE);
 	}
 
-	key = afs_request_key(vnode->volume->cell);
+	key = afs_request_key(dvnode->volume->cell);
 	if (IS_ERR(key)) {
 		_leave(" = %ld [key]", PTR_ERR(key));
 		return ERR_CAST(key);
 	}
 
-	ret = afs_validate(vnode, key);
+	ret = afs_validate(dvnode, key);
 	if (ret < 0) {
 		key_put(key);
 		_leave(" = %d [val]", ret);
 		return ERR_PTR(ret);
 	}
 
-	ret = afs_do_lookup(dir, dentry, &fid, key);
-	if (ret < 0) {
+	inode = afs_do_lookup(dir, dentry, key);
+	if (IS_ERR(inode)) {
+		ret = PTR_ERR(inode);
 		if (ret == -ENOENT) {
-			inode = afs_try_auto_mntpt(dentry, dir, &fid);
+			inode = afs_try_auto_mntpt(dentry, dir);
 			if (!IS_ERR(inode)) {
 				key_put(key);
 				goto success;
@@ -611,10 +827,9 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 		_leave(" = %d [do]", ret);
 		return ERR_PTR(ret);
 	}
-	dentry->d_fsdata = (void *)(unsigned long) vnode->status.data_version;
+	dentry->d_fsdata = (void *)(unsigned long)dvnode->status.data_version;
 
 	/* instantiate the dentry */
-	inode = afs_iget(dir->i_sb, key, &fid, NULL, NULL, NULL);
 	key_put(key);
 	if (IS_ERR(inode)) {
 		_leave(" = %ld", PTR_ERR(inode));
@@ -623,9 +838,7 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 
 success:
 	d_add(dentry, inode);
-	_leave(" = 0 { vn=%u u=%u } -> { ino=%lu v=%u }",
-	       fid.vnode,
-	       fid.unique,
+	_leave(" = 0 { ino=%lu v=%u }",
 	       d_inode(dentry)->i_ino,
 	       d_inode(dentry)->i_generation);
 
@@ -639,7 +852,6 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr
 					 unsigned int flags)
 {
 	struct afs_vnode *vnode;
-	struct afs_fid fid;
 	struct inode *inode;
 	int ret;
 
@@ -654,7 +866,7 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr
 		return ERR_PTR(-ENAMETOOLONG);
 	}
 
-	inode = afs_try_auto_mntpt(dentry, dir, &fid);
+	inode = afs_try_auto_mntpt(dentry, dir);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
 		if (ret == -ENOENT) {
@@ -736,7 +948,7 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 	_debug("dir modified");
 
 	/* search the directory for this vnode */
-	ret = afs_do_lookup(&dir->vfs_inode, dentry, &fid, key);
+	ret = afs_do_lookup_one(&dir->vfs_inode, dentry, &fid, key);
 	switch (ret) {
 	case 0:
 		/* the filename maps to something */
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 88ec38c2d83c..75554ee98d02 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -94,7 +94,7 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 	data_version |= (u64) ntohl(*bp++) << 32;
 	EXTRACT(status->lock_count);
 	size |= (u64) ntohl(*bp++) << 32;
-	bp++; /* spare 4 */
+	EXTRACT(status->abort_code); /* spare 4 */
 	*_bp = bp;
 
 	if (size != status->size) {
@@ -274,7 +274,7 @@ static void xdr_decode_AFSFetchVolumeStatus(const __be32 **_bp,
 /*
  * deliver reply data to an FS.FetchStatus
  */
-static int afs_deliver_fs_fetch_status(struct afs_call *call)
+static int afs_deliver_fs_fetch_status_vnode(struct afs_call *call)
 {
 	struct afs_vnode *vnode = call->reply[0];
 	const __be32 *bp;
@@ -300,10 +300,10 @@ static int afs_deliver_fs_fetch_status(struct afs_call *call)
 /*
  * FS.FetchStatus operation type
  */
-static const struct afs_call_type afs_RXFSFetchStatus = {
-	.name		= "FS.FetchStatus",
+static const struct afs_call_type afs_RXFSFetchStatus_vnode = {
+	.name		= "FS.FetchStatus(vnode)",
 	.op		= afs_FS_FetchStatus,
-	.deliver	= afs_deliver_fs_fetch_status,
+	.deliver	= afs_deliver_fs_fetch_status_vnode,
 	.destructor	= afs_flat_call_destructor,
 };
 
@@ -320,7 +320,8 @@ int afs_fs_fetch_file_status(struct afs_fs_cursor *fc, struct afs_volsync *volsy
 	_enter(",%x,{%x:%u},,",
 	       key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
 
-	call = afs_alloc_flat_call(net, &afs_RXFSFetchStatus, 16, (21 + 3 + 6) * 4);
+	call = afs_alloc_flat_call(net, &afs_RXFSFetchStatus_vnode,
+				   16, (21 + 3 + 6) * 4);
 	if (!call) {
 		fc->ac.error = -ENOMEM;
 		return -ENOMEM;
@@ -1947,3 +1948,262 @@ int afs_fs_get_capabilities(struct afs_net *net,
 	trace_afs_make_fs_call(call, NULL);
 	return afs_make_call(ac, call, GFP_NOFS, false);
 }
+
+/*
+ * Deliver reply data to an FS.FetchStatus with no vnode.
+ */
+static int afs_deliver_fs_fetch_status(struct afs_call *call)
+{
+	struct afs_file_status *status = call->reply[1];
+	struct afs_callback *callback = call->reply[2];
+	struct afs_volsync *volsync = call->reply[3];
+	struct afs_vnode *vnode = call->reply[0];
+	const __be32 *bp;
+	int ret;
+
+	ret = afs_transfer_reply(call);
+	if (ret < 0)
+		return ret;
+
+	_enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
+
+	/* unmarshall the reply once we've received all of it */
+	bp = call->buffer;
+	xdr_decode_AFSFetchStatus(&bp, status, vnode, NULL);
+	callback[call->count].version	= ntohl(bp[0]);
+	callback[call->count].expiry	= ntohl(bp[1]);
+	callback[call->count].type	= ntohl(bp[2]);
+	if (vnode)
+		xdr_decode_AFSCallBack(call, vnode, &bp);
+	else
+		bp += 3;
+	if (volsync)
+		xdr_decode_AFSVolSync(&bp, volsync);
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.FetchStatus operation type
+ */
+static const struct afs_call_type afs_RXFSFetchStatus = {
+	.name		= "FS.FetchStatus",
+	.op		= afs_FS_FetchStatus,
+	.deliver	= afs_deliver_fs_fetch_status,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * Fetch the status information for a fid without needing a vnode handle.
+ */
+int afs_fs_fetch_status(struct afs_fs_cursor *fc,
+			struct afs_net *net,
+			struct afs_fid *fid,
+			struct afs_file_status *status,
+			struct afs_callback *callback,
+			struct afs_volsync *volsync)
+{
+	struct afs_call *call;
+	__be32 *bp;
+
+	_enter(",%x,{%x:%u},,",
+	       key_serial(fc->key), fid->vid, fid->vnode);
+
+	call = afs_alloc_flat_call(net, &afs_RXFSFetchStatus, 16, (21 + 3 + 6) * 4);
+	if (!call) {
+		fc->ac.error = -ENOMEM;
+		return -ENOMEM;
+	}
+
+	call->key = fc->key;
+	call->reply[0] = NULL; /* vnode for fid[0] */
+	call->reply[1] = status;
+	call->reply[2] = callback;
+	call->reply[3] = volsync;
+
+	/* marshall the parameters */
+	bp = call->request;
+	bp[0] = htonl(FSFETCHSTATUS);
+	bp[1] = htonl(fid->vid);
+	bp[2] = htonl(fid->vnode);
+	bp[3] = htonl(fid->unique);
+
+	call->cb_break = fc->cb_break;
+	afs_use_fs_server(call, fc->cbi);
+	trace_afs_make_fs_call(call, fid);
+	return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
+
+/*
+ * Deliver reply data to an FS.InlineBulkStatus call
+ */
+static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
+{
+	struct afs_file_status *statuses;
+	struct afs_callback *callbacks;
+	struct afs_vnode *vnode = call->reply[0];
+	const __be32 *bp;
+	u32 tmp;
+	int ret;
+
+	_enter("{%u}", call->unmarshall);
+
+	switch (call->unmarshall) {
+	case 0:
+		call->offset = 0;
+		call->unmarshall++;
+
+		/* Extract the file status count and array in two steps */
+	case 1:
+		_debug("extract status count");
+		ret = afs_extract_data(call, &call->tmp, 4, true);
+		if (ret < 0)
+			return ret;
+
+		tmp = ntohl(call->tmp);
+		_debug("status count: %u/%u", tmp, call->count2);
+		if (tmp != call->count2)
+			return -EBADMSG;
+
+		call->count = 0;
+		call->unmarshall++;
+	more_counts:
+		call->offset = 0;
+
+	case 2:
+		_debug("extract status array %u", call->count);
+		ret = afs_extract_data(call, call->buffer, 21 * 4, true);
+		if (ret < 0)
+			return ret;
+
+		bp = call->buffer;
+		statuses = call->reply[1];
+		xdr_decode_AFSFetchStatus(&bp, &statuses[call->count],
+					  call->count == 0 ? vnode : NULL,
+					  NULL);
+
+		call->count++;
+		if (call->count < call->count2)
+			goto more_counts;
+
+		call->count = 0;
+		call->unmarshall++;
+		call->offset = 0;
+
+		/* Extract the callback count and array in two steps */
+	case 3:
+		_debug("extract CB count");
+		ret = afs_extract_data(call, &call->tmp, 4, true);
+		if (ret < 0)
+			return ret;
+
+		tmp = ntohl(call->tmp);
+		_debug("CB count: %u", tmp);
+		if (tmp != call->count2)
+			return -EBADMSG;
+		call->count = 0;
+		call->unmarshall++;
+	more_cbs:
+		call->offset = 0;
+
+	case 4:
+		_debug("extract CB array");
+		ret = afs_extract_data(call, call->buffer, 3 * 4, true);
+		if (ret < 0)
+			return ret;
+
+		_debug("unmarshall CB array");
+		bp = call->buffer;
+		callbacks = call->reply[2];
+		callbacks[call->count].version	= ntohl(bp[0]);
+		callbacks[call->count].expiry	= ntohl(bp[1]);
+		callbacks[call->count].type	= ntohl(bp[2]);
+		statuses = call->reply[1];
+		if (call->count == 0 && vnode && statuses[0].abort_code == 0)
+			xdr_decode_AFSCallBack(call, vnode, &bp);
+		call->count++;
+		if (call->count < call->count2)
+			goto more_cbs;
+
+		call->offset = 0;
+		call->unmarshall++;
+
+	case 5:
+		ret = afs_extract_data(call, call->buffer, 6 * 4, false);
+		if (ret < 0)
+			return ret;
+
+		bp = call->buffer;
+		if (call->reply[3])
+			xdr_decode_AFSVolSync(&bp, call->reply[3]);
+
+		call->offset = 0;
+		call->unmarshall++;
+
+	case 6:
+		break;
+	}
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.InlineBulkStatus operation type
+ */
+static const struct afs_call_type afs_RXFSInlineBulkStatus = {
+	.name		= "FS.InlineBulkStatus",
+	.op		= afs_FS_InlineBulkStatus,
+	.deliver	= afs_deliver_fs_inline_bulk_status,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * Fetch the status information for up to 50 files
+ */
+int afs_fs_inline_bulk_status(struct afs_fs_cursor *fc,
+			      struct afs_net *net,
+			      struct afs_fid *fids,
+			      struct afs_file_status *statuses,
+			      struct afs_callback *callbacks,
+			      unsigned int nr_fids,
+			      struct afs_volsync *volsync)
+{
+	struct afs_call *call;
+	__be32 *bp;
+	int i;
+
+	_enter(",%x,{%x:%u},%u",
+	       key_serial(fc->key), fids[0].vid, fids[1].vnode, nr_fids);
+
+	call = afs_alloc_flat_call(net, &afs_RXFSInlineBulkStatus,
+				   (2 + nr_fids * 3) * 4,
+				   21 * 4);
+	if (!call) {
+		fc->ac.error = -ENOMEM;
+		return -ENOMEM;
+	}
+
+	call->key = fc->key;
+	call->reply[0] = NULL; /* vnode for fid[0] */
+	call->reply[1] = statuses;
+	call->reply[2] = callbacks;
+	call->reply[3] = volsync;
+	call->count2 = nr_fids;
+
+	/* marshall the parameters */
+	bp = call->request;
+	*bp++ = htonl(FSINLINEBULKSTATUS);
+	*bp++ = htonl(nr_fids);
+	for (i = 0; i < nr_fids; i++) {
+		*bp++ = htonl(fids[i].vid);
+		*bp++ = htonl(fids[i].vnode);
+		*bp++ = htonl(fids[i].unique);
+	}
+
+	call->cb_break = fc->cb_break;
+	afs_use_fs_server(call, fc->cbi);
+	trace_afs_make_fs_call(call, &fids[0]);
+	return afs_make_call(&fc->ac, call, GFP_NOFS, false);
+}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 135192b7dc04..55b07e818400 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -363,6 +363,7 @@ struct afs_server {
 #define AFS_SERVER_FL_UPDATING	4
 #define AFS_SERVER_FL_PROBED	5		/* The fileserver has been probed */
 #define AFS_SERVER_FL_PROBING	6		/* Fileserver is being probed */
+#define AFS_SERVER_FL_NO_IBULK	7		/* Fileserver doesn't support FS.InlineBulkStatus */
 	atomic_t		usage;
 	u32			addr_version;	/* Address list version */
 
@@ -611,7 +612,7 @@ extern struct fscache_cookie_def afs_vnode_cache_index_def;
  */
 extern void afs_init_callback_state(struct afs_server *);
 extern void afs_break_callback(struct afs_vnode *);
-extern void afs_break_callbacks(struct afs_server *, size_t,struct afs_callback[]);
+extern void afs_break_callbacks(struct afs_server *, size_t, struct afs_callback_break*);
 
 extern int afs_register_server_cb_interest(struct afs_vnode *, struct afs_server_entry *);
 extern void afs_put_cb_interest(struct afs_net *, struct afs_cb_interest *);
@@ -702,6 +703,13 @@ extern int afs_fs_give_up_all_callbacks(struct afs_net *, struct afs_server *,
 					struct afs_addr_cursor *, struct key *);
 extern int afs_fs_get_capabilities(struct afs_net *, struct afs_server *,
 				   struct afs_addr_cursor *, struct key *);
+extern int afs_fs_inline_bulk_status(struct afs_fs_cursor *, struct afs_net *,
+				     struct afs_fid *, struct afs_file_status *,
+				     struct afs_callback *, unsigned int,
+				     struct afs_volsync *);
+extern int afs_fs_fetch_status(struct afs_fs_cursor *, struct afs_net *,
+			       struct afs_fid *, struct afs_file_status *,
+			       struct afs_callback *, struct afs_volsync *);
 
 /*
  * inode.c
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index 63815f66b274..0419b7e1e968 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -49,6 +49,7 @@ enum afs_fs_operation {
 	afs_FS_ExtendLock		= 157,	/* AFS Extend a file lock */
 	afs_FS_ReleaseLock		= 158,	/* AFS Release a file lock */
 	afs_FS_Lookup			= 161,	/* AFS lookup file in directory */
+	afs_FS_InlineBulkStatus		= 65536, /* AFS Fetch multiple file statuses with errors */
 	afs_FS_FetchData64		= 65537, /* AFS Fetch file data */
 	afs_FS_StoreData64		= 65538, /* AFS Store file data */
 	afs_FS_GiveUpAllCallBacks	= 65539, /* AFS Give up all our callbacks on a server */
@@ -93,6 +94,7 @@ enum afs_vl_operation {
 	EM(afs_FS_ExtendLock,			"FS.ExtendLock") \
 	EM(afs_FS_ReleaseLock,			"FS.ReleaseLock") \
 	EM(afs_FS_Lookup,			"FS.Lookup") \
+	EM(afs_FS_InlineBulkStatus,		"FS.InlineBulkStatus") \
 	EM(afs_FS_FetchData64,			"FS.FetchData64") \
 	EM(afs_FS_StoreData64,			"FS.StoreData64") \
 	EM(afs_FS_GiveUpAllCallBacks,		"FS.GiveUpAllCallBacks") \

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (3 preceding siblings ...)
  2018-04-05 20:29 ` [PATCH 04/20] afs: Prospectively look up extra files when doing a single lookup David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-06  6:37   ` Christoph Hellwig
                     ` (5 more replies)
  2018-04-05 20:30 ` [PATCH 06/20] afs: Implement @cell " David Howells
                   ` (17 subsequent siblings)
  22 siblings, 6 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Implement the AFS feature by which @sys at the end of a pathname component
may be substituted for one of a list of values, typically naming the
operating system.  Up to 16 alternatives may be specified and these are
tried in turn until one works.  Each network namespace has[*] a separate
independent list.

Upon creation of a new network namespace, the list of values is
initialised[*] to a single OpenAFS-compatible string representing arch type
plus "_linux26".  For example, on x86_64, the sysname is "amd64_linux26".

[*] Or will, once network namespace support is finalised in kAFS.

The list may be set by:

	# for i in foo bar linux-x86_64; do echo $i; done >/proc/fs/afs/sysname

for which separate writes to the same fd are amalgamated and applied on
close.  The LF character may be used as a separator to specify multiple
items in the same write() call.

The list may be cleared by:

	# echo >/proc/fs/afs/sysname

and read by:

	# cat /proc/fs/afs/sysname
	foo
	bar
	linux-x86_64

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/filesystems/afs.txt |   28 +++++
 fs/afs/dir.c                      |   78 ++++++++++++++
 fs/afs/internal.h                 |   13 ++
 fs/afs/main.c                     |   42 +++++++
 fs/afs/proc.c                     |  211 +++++++++++++++++++++++++++++++++++++
 5 files changed, 369 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index c5254f6d234d..8c6ea7b41048 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -11,7 +11,7 @@ Contents:
  - Proc filesystem.
  - The cell database.
  - Security.
- - Examples.
+ - The @sys substitution.
 
 
 ========
@@ -230,3 +230,29 @@ If a file is opened with a particular key and then the file descriptor is
 passed to a process that doesn't have that key (perhaps over an AF_UNIX
 socket), then the operations on the file will be made with key that was used to
 open the file.
+
+
+=====================
+THE @SYS SUBSTITUTION
+=====================
+
+The list of up to 16 @sys substitutions for the current network namespace can
+be configured by writing a list to /proc/fs/afs/sysname:
+
+	[root@andromeda ~]# echo foo amd64_linux_26 >/proc/fs/afs/sysname
+
+or cleared entirely by writing an empty list:
+
+	[root@andromeda ~]# echo >/proc/fs/afs/sysname
+
+The current list for current network namespace can be retrieved by:
+
+	[root@andromeda ~]# cat /proc/fs/afs/sysname
+	foo
+	amd64_linux_26
+
+When @sys is being substituted for, each element of the list is tried in the
+order given.
+
+By default, the list will contain one item that conforms to the pattern
+"<arch>_linux_26", amd64 being the name for x86_64.
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 63f7b37e1e71..9bac606725ee 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -589,7 +589,7 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	inode = ilookup5(dir->i_sb, cookie->fids[0].vnode, afs_iget5_test, &data);
 	if (inode)
 		goto out;
-	
+
 	/* Need space for examining all the selected files */
 	inode = ERR_PTR(-ENOMEM);
 	cookie->statuses = kcalloc(cookie->nr_fids, sizeof(struct afs_file_status),
@@ -766,6 +766,75 @@ static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir
 	return ERR_PTR(ret);
 }
 
+/*
+ * Look up an entry in a directory with @sys substitution.
+ */
+static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
+				       struct key *key)
+{
+	struct afs_sysnames *subs;
+	struct afs_net *net = afs_i2net(dir);
+	struct dentry *parent, *ret;
+	char *buf, *p, *name;
+	int len, i;
+
+	_enter("");
+
+	parent = dget_parent(dentry);
+
+	ret = ERR_PTR(-ENOMEM);
+	p = buf = kmalloc(AFSNAMEMAX, GFP_KERNEL);
+	if (!buf)
+		goto out_p;
+	if (dentry->d_name.len > 4) {
+		memcpy(p, dentry->d_name.name, dentry->d_name.len - 4);
+		p += dentry->d_name.len - 4;
+	}
+
+	/* There is an ordered list of substitutes that we have to try */
+	for (i = 0; i < AFS_NR_SYSNAME; i++) {
+		read_lock(&net->sysnames_lock);
+		subs = net->sysnames;
+		if (!subs || i >= subs->nr) {
+			read_unlock(&net->sysnames_lock);
+			goto noent;
+		}
+
+		name = subs->subs[i];
+		if (!name) {
+			read_unlock(&net->sysnames_lock);
+			goto noent;
+		}
+
+		len = dentry->d_name.len - 4 + strlen(name);
+		if (len >= AFSNAMEMAX) {
+			read_unlock(&net->sysnames_lock);
+			ret = ERR_PTR(-ENAMETOOLONG);
+			goto out_b;
+		}
+
+		strcpy(p, name);
+		read_unlock(&net->sysnames_lock);
+
+		ret = lookup_one_len(buf, parent, len);
+		if (IS_ERR(ret) || d_is_positive(ret))
+			goto out_b;
+		dput(ret);
+	}
+
+noent:
+	/* We don't want to d_add() the @sys dentry here as we don't want to
+	 * the cached dentry to hide changes to the sysnames list.
+	 */
+	ret = NULL;
+out_b:
+	kfree(buf);
+out_p:
+	dput(parent);
+	key_put(key);
+	return ret;
+}
+
 /*
  * look up an entry in a directory
  */
@@ -805,6 +874,13 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 		return ERR_PTR(ret);
 	}
 
+	if (dentry->d_name.len >= 4 &&
+	    dentry->d_name.name[dentry->d_name.len - 4] == '@' &&
+	    dentry->d_name.name[dentry->d_name.len - 3] == 's' &&
+	    dentry->d_name.name[dentry->d_name.len - 2] == 'y' &&
+	    dentry->d_name.name[dentry->d_name.len - 1] == 's')
+		return afs_lookup_atsys(dir, dentry, key);
+
 	inode = afs_do_lookup(dir, dentry, key);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 55b07e818400..c584d6e4c2f5 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -198,6 +198,16 @@ static inline struct afs_super_info *AFS_FS_S(struct super_block *sb)
 
 extern struct file_system_type afs_fs_type;
 
+/*
+ * Set of substitutes for @sys.
+ */
+struct afs_sysnames {
+#define AFS_NR_SYSNAME 16
+	char		*subs[AFS_NR_SYSNAME];
+	unsigned short	nr;
+	short		error;
+};
+
 /*
  * AFS network namespace record.
  */
@@ -246,8 +256,11 @@ struct afs_net {
 
 	/* Misc */
 	struct proc_dir_entry	*proc_afs;		/* /proc/net/afs directory */
+	struct afs_sysnames	*sysnames;
+	rwlock_t		sysnames_lock;
 };
 
+extern const char afs_init_sysname[];
 extern struct afs_net __afs_net;// Dummy AFS network namespace; TODO: replace with real netns
 
 enum afs_cell_state {
diff --git a/fs/afs/main.c b/fs/afs/main.c
index 15a02a05ff40..01a55d25d064 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -34,11 +34,42 @@ MODULE_PARM_DESC(rootcell, "root AFS cell name and VL server IP addr list");
 struct workqueue_struct *afs_wq;
 struct afs_net __afs_net;
 
+#if defined(CONFIG_ALPHA)
+const char afs_init_sysname[] = "alpha_linux26";
+#elif defined(CONFIG_X86_64)
+const char afs_init_sysname[] = "amd64_linux26";
+#elif defined(CONFIG_ARM)
+const char afs_init_sysname[] = "arm_linux26";
+#elif defined(CONFIG_ARM64)
+const char afs_init_sysname[] = "aarch64_linux26";
+#elif defined(CONFIG_X86_32)
+const char afs_init_sysname[] = "i386_linux26";
+#elif defined(CONFIG_IA64)
+const char afs_init_sysname[] = "ia64_linux26";
+#elif defined(CONFIG_PPC64)
+const char afs_init_sysname[] = "ppc64_linux26";
+#elif defined(CONFIG_PPC32)
+const char afs_init_sysname[] = "ppc_linux26";
+#elif defined(CONFIG_S390)
+#ifdef CONFIG_64BIT
+const char afs_init_sysname[] = "s390x_linux26";
+#else
+const char afs_init_sysname[] = "s390_linux26";
+#endif
+#elif defined(CONFIG_SPARC64)
+const char afs_init_sysname[] = "sparc64_linux26";
+#elif defined(CONFIG_SPARC32)
+const char afs_init_sysname[] = "sparc_linux26";
+#else
+const char afs_init_sysname[] = "unknown_linux26";
+#endif
+
 /*
  * Initialise an AFS network namespace record.
  */
 static int __net_init afs_net_init(struct afs_net *net)
 {
+	struct afs_sysnames *sysnames;
 	int ret;
 
 	net->live = true;
@@ -67,6 +98,15 @@ static int __net_init afs_net_init(struct afs_net *net)
 	INIT_WORK(&net->fs_manager, afs_manage_servers);
 	timer_setup(&net->fs_timer, afs_servers_timer, 0);
 
+	ret = -ENOMEM;
+	sysnames = kzalloc(sizeof(*sysnames), GFP_KERNEL);
+	if (!sysnames)
+		goto error_sysnames;
+	sysnames->subs[0] = (char *)&afs_init_sysname;
+	sysnames->nr = 1;
+	net->sysnames = sysnames;
+	rwlock_init(&net->sysnames_lock);
+
 	/* Register the /proc stuff */
 	ret = afs_proc_init(net);
 	if (ret < 0)
@@ -92,6 +132,8 @@ static int __net_init afs_net_init(struct afs_net *net)
 	net->live = false;
 	afs_proc_cleanup(net);
 error_proc:
+	kfree(net->sysnames);
+error_sysnames:
 	net->live = false;
 	return ret;
 }
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 2f04d37eeef0..7e9d6b2f9734 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -126,6 +126,32 @@ static const struct file_operations afs_proc_servers_fops = {
 	.release	= seq_release,
 };
 
+static int afs_proc_sysname_open(struct inode *inode, struct file *file);
+static int afs_proc_sysname_release(struct inode *inode, struct file *file);
+static void *afs_proc_sysname_start(struct seq_file *p, loff_t *pos);
+static void *afs_proc_sysname_next(struct seq_file *p, void *v,
+					loff_t *pos);
+static void afs_proc_sysname_stop(struct seq_file *p, void *v);
+static int afs_proc_sysname_show(struct seq_file *m, void *v);
+static ssize_t afs_proc_sysname_write(struct file *file,
+				      const char __user *buf,
+				      size_t size, loff_t *_pos);
+
+static const struct seq_operations afs_proc_sysname_ops = {
+	.start	= afs_proc_sysname_start,
+	.next	= afs_proc_sysname_next,
+	.stop	= afs_proc_sysname_stop,
+	.show	= afs_proc_sysname_show,
+};
+
+static const struct file_operations afs_proc_sysname_fops = {
+	.open		= afs_proc_sysname_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= afs_proc_sysname_release,
+	.write		= afs_proc_sysname_write,
+};
+
 /*
  * initialise the /proc/fs/afs/ directory
  */
@@ -139,7 +165,8 @@ int afs_proc_init(struct afs_net *net)
 
 	if (!proc_create("cells", 0644, net->proc_afs, &afs_proc_cells_fops) ||
 	    !proc_create("rootcell", 0644, net->proc_afs, &afs_proc_rootcell_fops) ||
-	    !proc_create("servers", 0644, net->proc_afs, &afs_proc_servers_fops))
+	    !proc_create("servers", 0644, net->proc_afs, &afs_proc_servers_fops) ||
+	    !proc_create("sysname", 0644, net->proc_afs, &afs_proc_sysname_fops))
 		goto error_tree;
 
 	_leave(" = 0");
@@ -635,3 +662,185 @@ static int afs_proc_servers_show(struct seq_file *m, void *v)
 		   &alist->addrs[alist->index].transport);
 	return 0;
 }
+
+static void afs_sysnames_free(struct afs_sysnames *sysnames)
+{
+	int i;
+
+	if (sysnames) {
+		for (i = 0; i < sysnames->nr; i++)
+			if (sysnames->subs[i] != afs_init_sysname)
+				kfree(sysnames->subs[i]);
+	}
+}
+
+/*
+ * Handle opening of /proc/fs/afs/sysname.  If it is opened for writing, we
+ * assume the caller wants to change the substitution list and we allocate a
+ * buffer to hold the list.
+ */
+static int afs_proc_sysname_open(struct inode *inode, struct file *file)
+{
+	struct afs_sysnames *sysnames;
+	struct seq_file *m;
+	int ret;
+
+	ret = seq_open(file, &afs_proc_sysname_ops);
+	if (ret < 0)
+		return ret;
+
+	if (file->f_mode & FMODE_WRITE) {
+		sysnames = kzalloc(sizeof(*sysnames), GFP_KERNEL);
+		if (!sysnames) {
+			seq_release(inode, file);
+			return -ENOMEM;
+		}
+
+		m = file->private_data;
+		m->private = sysnames;
+	}
+
+	return 0;
+}
+
+/*
+ * Handle writes to /proc/fs/afs/sysname to set the @sys substitution.
+ */
+static ssize_t afs_proc_sysname_write(struct file *file,
+				      const char __user *buf,
+				      size_t size, loff_t *_pos)
+{
+	struct afs_sysnames *sysnames;
+	struct seq_file *m = file->private_data;
+	char *kbuf = NULL, *s, *p, *sub;
+	int ret, len;
+
+	sysnames = m->private;
+	if (!sysnames)
+		return -EINVAL;
+	if (sysnames->error)
+		return sysnames->error;
+
+	if (size >= PAGE_SIZE - 1) {
+		sysnames->error = -EINVAL;
+		return -EINVAL;
+	}
+	if (size == 0)
+		return 0;
+
+	kbuf = memdup_user_nul(buf, size);
+	if (IS_ERR(kbuf))
+		return PTR_ERR(kbuf);
+
+	inode_lock(file_inode(file));
+
+	p = kbuf;
+	while ((s = strsep(&p, " \t\n"))) {
+		len = strlen(s);
+		if (len == 0)
+			continue;
+		if (len >= AFSNAMEMAX) {
+			sysnames->error = -ENAMETOOLONG;
+			ret = -ENAMETOOLONG;
+			goto out;
+		}
+		if (len >= 4 &&
+		    s[len - 4] == '@' &&
+		    s[len - 3] == 's' &&
+		    s[len - 2] == 'y' &&
+		    s[len - 1] == 's') {
+			/* Protect against recursion */
+			sysnames->error = -EINVAL;
+			ret = -EINVAL;
+			goto out;
+		}
+
+		if (sysnames->nr >= AFS_NR_SYSNAME) {
+			sysnames->error = -EFBIG;
+			ret = -EFBIG;
+			goto out;
+		}
+
+		if (strcmp(s, afs_init_sysname) == 0) {
+			sub = (char *)afs_init_sysname;
+		} else {
+			ret = -ENOMEM;
+			sub = kmemdup(s, len + 1, GFP_KERNEL);
+			if (!sub)
+				goto out;
+		}
+
+		sysnames->subs[sysnames->nr] = sub;
+		sysnames->nr++;
+	}
+
+	ret = size;	/* consume everything, always */
+out:
+	inode_unlock(file_inode(file));
+	kfree(kbuf);
+	return ret;
+}
+
+static int afs_proc_sysname_release(struct inode *inode, struct file *file)
+{
+	struct afs_sysnames *sysnames, *kill = NULL;
+	struct seq_file *m = file->private_data;
+	struct afs_net *net = afs_seq2net(m);
+
+	sysnames = m->private;
+	if (sysnames) {
+		kill = sysnames;
+		if (!sysnames->error) {
+			write_lock(&net->sysnames_lock);
+			kill = net->sysnames;
+			net->sysnames = sysnames;
+			write_unlock(&net->sysnames_lock);
+		}
+		afs_sysnames_free(kill);
+	}
+
+	return seq_release(inode, file);
+}
+
+static void *afs_proc_sysname_start(struct seq_file *m, loff_t *pos)
+	__acquires(&net->sysnames_lock)
+{
+	struct afs_net *net = afs_seq2net(m);
+	struct afs_sysnames *names = net->sysnames;
+
+	read_lock(&net->sysnames_lock);
+
+	if (*pos >= names->nr)
+		return NULL;
+	return (void *)(unsigned long)(*pos + 1);
+}
+
+static void *afs_proc_sysname_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct afs_net *net = afs_seq2net(m);
+	struct afs_sysnames *names = net->sysnames;
+
+	*pos += 1;
+	if (*pos >= names->nr)
+		return NULL;
+	return (void *)(unsigned long)(*pos + 1);
+}
+
+static void afs_proc_sysname_stop(struct seq_file *m, void *v)
+	__releases(&net->sysnames_lock)
+{
+	struct afs_net *net = afs_seq2net(m);
+
+	read_unlock(&net->sysnames_lock);
+}
+
+static int afs_proc_sysname_show(struct seq_file *m, void *v)
+{
+	struct afs_net *net = afs_seq2net(m);
+	struct afs_sysnames *sysnames = net->sysnames;
+	unsigned int i = (unsigned long)v - 1;
+
+	if (i < sysnames->nr)
+		seq_printf(m, "%s\n", sysnames->subs[i]);
+	return 0;
+}

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 06/20] afs: Implement @cell substitution handling
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (4 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 07/20] afs: Dump bad status record David Howells
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Implement @cell substitution handling such that if @cell is seen as a name
in a dynamic root mount, then the name of the root cell for that network
namespace will be substituted for @cell during lookup.

The substitution of @cell for the current net namespace is set by writing
the cell name to /proc/fs/afs/rootcell.  The value can be obtained by
reading the file.

For example:

	# mount -t afs none /kafs -o dyn
	# echo grand.central.org >/proc/fs/afs/rootcell
	# ls /kafs/@cell
	archive/  cvs/  doc/  local/  project/  service/  software/  user/  www/
	# cat /proc/fs/afs/rootcell
	grand.central.org

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cell.c |    2 ++
 fs/afs/dir.c  |   60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/afs/proc.c |   35 ++++++++++++++++++++++++++++++++-
 3 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index 721425b98b31..fdf4c36cff79 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -130,6 +130,8 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net,
 		_leave(" = -ENAMETOOLONG");
 		return ERR_PTR(-ENAMETOOLONG);
 	}
+	if (namelen == 5 && memcmp(name, "@cell", 5) == 0)
+		return ERR_PTR(-EINVAL);
 
 	_enter("%*.*s,%s", namelen, namelen, name, vllist);
 
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 9bac606725ee..4f5e5c15ed6f 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -921,6 +921,62 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 	return NULL;
 }
 
+/*
+ * Look up @cell in a dynroot directory.  This is a substitution for the
+ * local cell name for the net namespace.
+ */
+static struct dentry *afs_lookup_atcell(struct dentry *dentry)
+{
+	struct afs_cell *cell;
+	struct afs_net *net = afs_d2net(dentry);
+	struct dentry *parent, *ret;
+	unsigned int seq = 0;
+	char *name;
+	int len;
+
+	if (!net->ws_cell)
+		return ERR_PTR(-ENOENT);
+
+	parent = dget_parent(dentry);
+
+	ret = ERR_PTR(-ENOMEM);
+	name = kmalloc(AFS_MAXCELLNAME + 1, GFP_KERNEL);
+	if (!name)
+		goto out_p;
+
+	rcu_read_lock();
+	do {
+		read_seqbegin_or_lock(&net->cells_lock, &seq);
+		cell = rcu_dereference_raw(net->ws_cell);
+		if (cell) {
+			len = cell->name_len;
+			memcpy(name, cell->name, len + 1);
+		}
+	} while (need_seqretry(&net->cells_lock, seq));
+	done_seqretry(&net->cells_lock, seq);
+	rcu_read_unlock();
+
+	ret = ERR_PTR(-ENOENT);
+	if (!cell)
+		goto out_n;
+
+	ret = lookup_one_len(name, parent, len);
+	if (IS_ERR(ret) || d_is_positive(ret))
+		goto out_n;
+
+	/* We don't want to d_add() the @cell dentry here as we don't want to
+	 * the cached dentry to hide changes to the local cell name.
+	 */
+	dput(ret);
+	ret = NULL;
+
+out_n:
+	kfree(name);
+out_p:
+	dput(parent);
+	return ret;
+}
+
 /*
  * Look up an entry in a dynroot directory.
  */
@@ -942,6 +998,10 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr
 		return ERR_PTR(-ENAMETOOLONG);
 	}
 
+	if (dentry->d_name.len == 5 &&
+	    memcmp(dentry->d_name.name, "@cell", 5) == 0)
+		return afs_lookup_atcell(dentry);
+
 	inode = afs_try_auto_mntpt(dentry, dir);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 7e9d6b2f9734..f28719c8d153 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -334,7 +334,40 @@ static ssize_t afs_proc_cells_write(struct file *file, const char __user *buf,
 static ssize_t afs_proc_rootcell_read(struct file *file, char __user *buf,
 				      size_t size, loff_t *_pos)
 {
-	return 0;
+	struct afs_cell *cell;
+	struct afs_net *net = afs_proc2net(file);
+	unsigned int seq = 0;
+	char name[AFS_MAXCELLNAME + 1];
+	int len;
+
+	if (*_pos > 0)
+		return 0;
+	if (!net->ws_cell)
+		return 0;
+
+	rcu_read_lock();
+	do {
+		read_seqbegin_or_lock(&net->cells_lock, &seq);
+		len = 0;
+		cell = rcu_dereference_raw(net->ws_cell);
+		if (cell) {
+			len = cell->name_len;
+			memcpy(name, cell->name, len);
+		}
+	} while (need_seqretry(&net->cells_lock, seq));
+	done_seqretry(&net->cells_lock, seq);
+	rcu_read_unlock();
+
+	if (!len)
+		return 0;
+
+	name[len++] = '\n';
+	if (len > size)
+		len = size;
+	if (copy_to_user(buf, name, len) != 0)
+		return -EFAULT;
+	*_pos = 1;
+	return len;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 07/20] afs: Dump bad status record
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (5 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 06/20] afs: Implement @cell " David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 08/20] afs: Introduce a statistics proc file David Howells
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Dump an AFS FileStatus record that is detected as invalid.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/fsclient.c |   35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 75554ee98d02..90aa3eb2c893 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -43,6 +43,26 @@ static void xdr_decode_AFSFid(const __be32 **_bp, struct afs_fid *fid)
 	*_bp = bp;
 }
 
+/*
+ * Dump a bad file status record.
+ */
+static void xdr_dump_bad(const __be32 *bp)
+{
+	__be32 x[4];
+	int i;
+
+	pr_notice("AFS XDR: Bad status record\n");
+	for (i = 0; i < 5 * 4 * 4; i += 16) {
+		memcpy(x, bp, 16);
+		bp += 4;
+		pr_notice("%03x: %08x %08x %08x %08x\n",
+			  i, ntohl(x[0]), ntohl(x[1]), ntohl(x[2]), ntohl(x[3]));
+	}
+
+	memcpy(x, bp, 4);
+	pr_notice("0x50: %08x\n", ntohl(x[0]));
+}
+
 /*
  * decode an AFSFetchStatus block
  */
@@ -97,6 +117,20 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 	EXTRACT(status->abort_code); /* spare 4 */
 	*_bp = bp;
 
+	switch (status->type) {
+	case AFS_FTYPE_FILE:
+	case AFS_FTYPE_DIR:
+	case AFS_FTYPE_SYMLINK:
+		break;
+	case AFS_FTYPE_INVALID:
+		if (status->abort_code != 0)
+			goto out;
+		/* Fall through */
+	default:
+		xdr_dump_bad(bp - 2);
+		goto out;
+	}
+
 	if (size != status->size) {
 		status->size = size;
 		changed |= true;
@@ -145,6 +179,7 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 		status->data_version = data_version;
 	}
 
+out:
 	if (vnode)
 		write_sequnlock(&vnode->cb_lock);
 }

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 08/20] afs: Introduce a statistics proc file
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (6 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 07/20] afs: Dump bad status record David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 09/20] afs: Init inode before accessing cache David Howells
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Introduce a proc file that displays a bunch of statistics for the AFS
filesystem in the current network namespace.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c      |    3 +++
 fs/afs/inode.c    |    3 +++
 fs/afs/internal.h |   15 ++++++++++++++-
 fs/afs/proc.c     |   37 +++++++++++++++++++++++++++++++++++++
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 4f5e5c15ed6f..0e4945948a41 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -206,6 +206,7 @@ bool afs_dir_check_page(struct inode *dir, struct page *page)
 	}
 
 checked:
+	afs_stat_v(vnode, n_read_dir);
 	SetPageChecked(page);
 	return true;
 
@@ -881,6 +882,7 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 	    dentry->d_name.name[dentry->d_name.len - 1] == 's')
 		return afs_lookup_atsys(dir, dentry, key);
 
+	afs_stat_v(dvnode, n_lookup);
 	inode = afs_do_lookup(dir, dentry, key);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
@@ -1082,6 +1084,7 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 		goto out_valid; /* the dir contents are unchanged */
 
 	_debug("dir modified");
+	afs_stat_v(dir, n_reval);
 
 	/* search the directory for this vnode */
 	ret = afs_do_lookup_one(&dir->vfs_inode, dentry, &fid, key);
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index d5db9dead18c..f4e62964efcb 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -366,6 +366,9 @@ void afs_zap_data(struct afs_vnode *vnode)
 {
 	_enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
 
+	if (S_ISDIR(vnode->vfs_inode.i_mode))
+		afs_stat_v(vnode, n_inval);
+
 #ifdef CONFIG_AFS_FSCACHE
 	fscache_invalidate(vnode->cache);
 #endif
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index c584d6e4c2f5..ebec83017f9a 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -255,9 +255,15 @@ struct afs_net {
 	struct mutex		lock_manager_mutex;
 
 	/* Misc */
-	struct proc_dir_entry	*proc_afs;		/* /proc/net/afs directory */
+	struct proc_dir_entry	*proc_afs;	/* /proc/net/afs directory */
 	struct afs_sysnames	*sysnames;
 	rwlock_t		sysnames_lock;
+
+	/* Statistics counters */
+	atomic_t		n_lookup;	/* Number of lookups done */
+	atomic_t		n_reval;	/* Number of dentries needing revalidation */
+	atomic_t		n_inval;	/* Number of invalidations by the server */
+	atomic_t		n_read_dir;	/* Number of directory pages read */
 };
 
 extern const char afs_init_sysname[];
@@ -775,6 +781,13 @@ static inline void afs_put_net(struct afs_net *net)
 {
 }
 
+static inline void __afs_stat(atomic_t *s)
+{
+	atomic_inc(s);
+}
+
+#define afs_stat_v(vnode, n) __afs_stat(&afs_v2net(vnode)->n)
+
 /*
  * misc.c
  */
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index f28719c8d153..e1e13a568845 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -152,6 +152,8 @@ static const struct file_operations afs_proc_sysname_fops = {
 	.write		= afs_proc_sysname_write,
 };
 
+static const struct file_operations afs_proc_stats_fops;
+
 /*
  * initialise the /proc/fs/afs/ directory
  */
@@ -166,6 +168,7 @@ int afs_proc_init(struct afs_net *net)
 	if (!proc_create("cells", 0644, net->proc_afs, &afs_proc_cells_fops) ||
 	    !proc_create("rootcell", 0644, net->proc_afs, &afs_proc_rootcell_fops) ||
 	    !proc_create("servers", 0644, net->proc_afs, &afs_proc_servers_fops) ||
+	    !proc_create("stats", 0644, net->proc_afs, &afs_proc_stats_fops) ||
 	    !proc_create("sysname", 0644, net->proc_afs, &afs_proc_sysname_fops))
 		goto error_tree;
 
@@ -877,3 +880,37 @@ static int afs_proc_sysname_show(struct seq_file *m, void *v)
 		seq_printf(m, "%s\n", sysnames->subs[i]);
 	return 0;
 }
+
+/*
+ * Display general per-net namespace statistics
+ */
+static int afs_proc_stats_show(struct seq_file *m, void *v)
+{
+	struct afs_net *net = afs_seq2net(m);
+
+	seq_puts(m, "kAFS statistics\n");
+
+	seq_printf(m, "dir-mgmt: look=%u reval=%u inval=%u\n",
+		   atomic_read(&net->n_lookup),
+		   atomic_read(&net->n_reval),
+		   atomic_read(&net->n_inval));
+
+	seq_printf(m, "dir-data: rdpg=%u\n",
+		   atomic_read(&net->n_read_dir));
+	return 0;
+}
+
+/*
+ * Open "/proc/fs/afs/stats" to allow reading of the stat counters.
+ */
+static int afs_proc_stats_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, afs_proc_stats_show, NULL);
+}
+
+static const struct file_operations afs_proc_stats_fops = {
+	.open		= afs_proc_stats_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release        = single_release,
+};

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 09/20] afs: Init inode before accessing cache
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (7 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 08/20] afs: Introduce a statistics proc file David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 10/20] afs: Make it possible to get the data version in readpage David Howells
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

We no longer parse symlinks when we get the inode to determine if this
symlink is actually a mountpoint as we detect that by examining the mode
instead (symlinks are always 0777 and mountpoints 0644).

Access the cache after mapping the status so that we don't have to manually
set the inode size now.

Note that this may need adjusting if the disconnected operation is
implemented as the file metadata may have to be obtained from the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/inode.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index f4e62964efcb..bcaff40b664d 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -331,15 +331,12 @@ struct inode *afs_iget(struct super_block *sb, struct key *key,
 		vnode->cb_expires_at += ktime_get_real_seconds();
 	}
 
-	/* set up caching before mapping the status, as map-status reads the
-	 * first page of symlinks to see if they're really mountpoints */
-	inode->i_size = vnode->status.size;
-	afs_get_inode_cache(vnode);
-
 	ret = afs_inode_map_status(vnode, key);
 	if (ret < 0)
 		goto bad_inode;
 
+	afs_get_inode_cache(vnode);
+
 	/* success */
 	clear_bit(AFS_VNODE_UNSET, &vnode->flags);
 	inode->i_flags |= S_NOATIME;

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 10/20] afs: Make it possible to get the data version in readpage
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (8 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 09/20] afs: Init inode before accessing cache David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 11/20] afs: Rearrange status mapping David Howells
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Store the data version number indicated by an FS.FetchData op into the read
request structure so that it's accessible by the page reader.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/flock.c    |    2 +
 fs/afs/fsclient.c |   83 +++++++++++++++++++++++++++++++----------------------
 fs/afs/inode.c    |    8 +++--
 fs/afs/internal.h |    8 +++--
 fs/afs/security.c |    2 +
 5 files changed, 59 insertions(+), 44 deletions(-)

diff --git a/fs/afs/flock.c b/fs/afs/flock.c
index c40ba2fe3cbe..7a0e017070ec 100644
--- a/fs/afs/flock.c
+++ b/fs/afs/flock.c
@@ -613,7 +613,7 @@ static int afs_do_getlk(struct file *file, struct file_lock *fl)
 	posix_test_lock(file, fl);
 	if (fl->fl_type == F_UNLCK) {
 		/* no local locks; consult the server */
-		ret = afs_fetch_status(vnode, key);
+		ret = afs_fetch_status(vnode, key, false);
 		if (ret < 0)
 			goto error;
 
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 90aa3eb2c893..015bbbba0858 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -69,9 +69,9 @@ static void xdr_dump_bad(const __be32 *bp)
 static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 				      struct afs_file_status *status,
 				      struct afs_vnode *vnode,
-				      afs_dataversion_t *store_version)
+				      const afs_dataversion_t *expected_version,
+				      afs_dataversion_t *_version)
 {
-	afs_dataversion_t expected_version;
 	const __be32 *bp = *_bp;
 	umode_t mode;
 	u64 data_version, size;
@@ -136,6 +136,8 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 		changed |= true;
 	}
 	status->mode &= S_IALLUGO;
+	if (_version)
+		*_version = data_version;
 
 	_debug("vnode time %lx, %lx",
 	       status->mtime_client, status->mtime_server);
@@ -162,12 +164,8 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 		inode_set_iversion_raw(&vnode->vfs_inode, data_version);
 	}
 
-	expected_version = status->data_version;
-	if (store_version)
-		expected_version = *store_version;
-
-	if (expected_version != data_version) {
-		status->data_version = data_version;
+	status->data_version = data_version;
+	if (expected_version && *expected_version != data_version) {
 		if (vnode && !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
 			_debug("vnode modified %llx on {%x:%u}",
 			       (unsigned long long) data_version,
@@ -175,8 +173,6 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 			set_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
 			set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
 		}
-	} else if (store_version) {
-		status->data_version = data_version;
 	}
 
 out:
@@ -323,7 +319,8 @@ static int afs_deliver_fs_fetch_status_vnode(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				  &call->expected_version, NULL);
 	xdr_decode_AFSCallBack(call, vnode, &bp);
 	if (call->reply[1])
 		xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -345,7 +342,8 @@ static const struct afs_call_type afs_RXFSFetchStatus_vnode = {
 /*
  * fetch the status information for a file
  */
-int afs_fs_fetch_file_status(struct afs_fs_cursor *fc, struct afs_volsync *volsync)
+int afs_fs_fetch_file_status(struct afs_fs_cursor *fc, struct afs_volsync *volsync,
+			     bool new_inode)
 {
 	struct afs_vnode *vnode = fc->vnode;
 	struct afs_call *call;
@@ -365,6 +363,7 @@ int afs_fs_fetch_file_status(struct afs_fs_cursor *fc, struct afs_volsync *volsy
 	call->key = fc->key;
 	call->reply[0] = vnode;
 	call->reply[1] = volsync;
+	call->expected_version = new_inode ? 1 : vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -500,7 +499,9 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
 			return ret;
 
 		bp = call->buffer;
-		xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL);
+		xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+					  &vnode->status.data_version,
+					  &req->new_version);
 		xdr_decode_AFSCallBack(call, vnode, &bp);
 		if (call->reply[1])
 			xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -570,6 +571,7 @@ static int afs_fs_fetch_data64(struct afs_fs_cursor *fc, struct afs_read *req)
 	call->reply[0] = vnode;
 	call->reply[1] = NULL; /* volsync */
 	call->reply[2] = req;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -614,6 +616,7 @@ int afs_fs_fetch_data(struct afs_fs_cursor *fc, struct afs_read *req)
 	call->reply[0] = vnode;
 	call->reply[1] = NULL; /* volsync */
 	call->reply[2] = req;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -649,8 +652,9 @@ static int afs_deliver_fs_create_vnode(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL);
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				  &call->expected_version, NULL);
 	xdr_decode_AFSCallBack_raw(&bp, call->reply[3]);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -708,6 +712,7 @@ int afs_fs_create(struct afs_fs_cursor *fc,
 	call->reply[1] = newfid;
 	call->reply[2] = newstatus;
 	call->reply[3] = newcb;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -751,7 +756,8 @@ static int afs_deliver_fs_remove(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				  &call->expected_version, NULL);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -800,6 +806,7 @@ int afs_fs_remove(struct afs_fs_cursor *fc, const char *name, bool isdir)
 
 	call->key = fc->key;
 	call->reply[0] = vnode;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -837,8 +844,9 @@ static int afs_deliver_fs_link(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL);
-	xdr_decode_AFSFetchStatus(&bp, &dvnode->status, dvnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &dvnode->status, dvnode,
+				  &call->expected_version, NULL);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -880,6 +888,7 @@ int afs_fs_link(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
 	call->key = fc->key;
 	call->reply[0] = dvnode;
 	call->reply[1] = vnode;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -921,8 +930,9 @@ static int afs_deliver_fs_symlink(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL);
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				  &call->expected_version, NULL);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -973,6 +983,7 @@ int afs_fs_symlink(struct afs_fs_cursor *fc,
 	call->reply[0] = vnode;
 	call->reply[1] = newfid;
 	call->reply[2] = newstatus;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1023,10 +1034,11 @@ static int afs_deliver_fs_rename(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &orig_dvnode->status, orig_dvnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &orig_dvnode->status, orig_dvnode,
+				  &call->expected_version, NULL);
 	if (new_dvnode != orig_dvnode)
 		xdr_decode_AFSFetchStatus(&bp, &new_dvnode->status, new_dvnode,
-					  NULL);
+					  &call->expected_version_2, NULL);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1077,6 +1089,8 @@ int afs_fs_rename(struct afs_fs_cursor *fc,
 	call->key = fc->key;
 	call->reply[0] = orig_dvnode;
 	call->reply[1] = new_dvnode;
+	call->expected_version = orig_dvnode->status.data_version;
+	call->expected_version_2 = new_dvnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1126,7 +1140,7 @@ static int afs_deliver_fs_store_data(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->store_version);
+				  &call->expected_version, NULL);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	afs_pages_written_back(vnode, call);
@@ -1183,7 +1197,7 @@ static int afs_fs_store_data64(struct afs_fs_cursor *fc,
 	call->first_offset = offset;
 	call->last_to = to;
 	call->send_pages = true;
-	call->store_version = vnode->status.data_version + 1;
+	call->expected_version = vnode->status.data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1258,7 +1272,7 @@ int afs_fs_store_data(struct afs_fs_cursor *fc, struct address_space *mapping,
 	call->first_offset = offset;
 	call->last_to = to;
 	call->send_pages = true;
-	call->store_version = vnode->status.data_version + 1;
+	call->expected_version = vnode->status.data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1288,7 +1302,6 @@ int afs_fs_store_data(struct afs_fs_cursor *fc, struct address_space *mapping,
  */
 static int afs_deliver_fs_store_status(struct afs_call *call)
 {
-	afs_dataversion_t *store_version;
 	struct afs_vnode *vnode = call->reply[0];
 	const __be32 *bp;
 	int ret;
@@ -1300,12 +1313,9 @@ static int afs_deliver_fs_store_status(struct afs_call *call)
 		return ret;
 
 	/* unmarshall the reply once we've received all of it */
-	store_version = NULL;
-	if (call->operation_ID == FSSTOREDATA)
-		store_version = &call->store_version;
-
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, store_version);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				  &call->expected_version, NULL);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1360,7 +1370,7 @@ static int afs_fs_setattr_size64(struct afs_fs_cursor *fc, struct iattr *attr)
 
 	call->key = fc->key;
 	call->reply[0] = vnode;
-	call->store_version = vnode->status.data_version + 1;
+	call->expected_version = vnode->status.data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1409,7 +1419,7 @@ static int afs_fs_setattr_size(struct afs_fs_cursor *fc, struct iattr *attr)
 
 	call->key = fc->key;
 	call->reply[0] = vnode;
-	call->store_version = vnode->status.data_version + 1;
+	call->expected_version = vnode->status.data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1454,6 +1464,7 @@ int afs_fs_setattr(struct afs_fs_cursor *fc, struct iattr *attr)
 
 	call->key = fc->key;
 	call->reply[0] = vnode;
+	call->expected_version = vnode->status.data_version;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -2004,7 +2015,8 @@ static int afs_deliver_fs_fetch_status(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, status, vnode, NULL);
+	xdr_decode_AFSFetchStatus(&bp, status, vnode,
+				  &call->expected_version, NULL);
 	callback[call->count].version	= ntohl(bp[0]);
 	callback[call->count].expiry	= ntohl(bp[1]);
 	callback[call->count].type	= ntohl(bp[2]);
@@ -2056,6 +2068,7 @@ int afs_fs_fetch_status(struct afs_fs_cursor *fc,
 	call->reply[1] = status;
 	call->reply[2] = callback;
 	call->reply[3] = volsync;
+	call->expected_version = 1; /* vnode->status.data_version */
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -2116,7 +2129,7 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
 		statuses = call->reply[1];
 		xdr_decode_AFSFetchStatus(&bp, &statuses[call->count],
 					  call->count == 0 ? vnode : NULL,
-					  NULL);
+					  NULL, NULL);
 
 		call->count++;
 		if (call->count < call->count2)
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index bcaff40b664d..488abd78dc26 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -105,7 +105,7 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 /*
  * Fetch file status from the volume.
  */
-int afs_fetch_status(struct afs_vnode *vnode, struct key *key)
+int afs_fetch_status(struct afs_vnode *vnode, struct key *key, bool new_inode)
 {
 	struct afs_fs_cursor fc;
 	int ret;
@@ -119,7 +119,7 @@ int afs_fetch_status(struct afs_vnode *vnode, struct key *key)
 	if (afs_begin_vnode_operation(&fc, vnode, key)) {
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = vnode->cb_break + vnode->cb_s_break;
-			afs_fs_fetch_file_status(&fc, NULL);
+			afs_fs_fetch_file_status(&fc, NULL, new_inode);
 		}
 
 		afs_check_for_remote_deletion(&fc, fc.vnode);
@@ -307,7 +307,7 @@ struct inode *afs_iget(struct super_block *sb, struct key *key,
 
 	if (!status) {
 		/* it's a remotely extant inode */
-		ret = afs_fetch_status(vnode, key);
+		ret = afs_fetch_status(vnode, key, true);
 		if (ret < 0)
 			goto bad_inode;
 	} else {
@@ -432,7 +432,7 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
 	 * access */
 	if (!test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
 		_debug("not promised");
-		ret = afs_fetch_status(vnode, key);
+		ret = afs_fetch_status(vnode, key, false);
 		if (ret < 0) {
 			if (ret == -ENOENT) {
 				set_bit(AFS_VNODE_DELETED, &vnode->flags);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index ebec83017f9a..94dc6007b584 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -122,7 +122,8 @@ struct afs_call {
 	u32			operation_ID;	/* operation ID for an incoming call */
 	u32			count;		/* count for use in unmarshalling */
 	__be32			tmp;		/* place to extract temporary data */
-	afs_dataversion_t	store_version;	/* updated version expected from store */
+	afs_dataversion_t	expected_version; /* Updated version expected from store */
+	afs_dataversion_t	expected_version_2; /* 2nd updated version expected from store */
 };
 
 struct afs_call_type {
@@ -173,6 +174,7 @@ struct afs_read {
 	loff_t			len;		/* How much we're asking for */
 	loff_t			actual_len;	/* How much we're actually getting */
 	loff_t			remain;		/* Amount remaining */
+	afs_dataversion_t	new_version;	/* Version number returned by server */
 	atomic_t		usage;
 	unsigned int		index;		/* Which page we're reading into */
 	unsigned int		nr_pages;
@@ -700,7 +702,7 @@ extern int afs_flock(struct file *, int, struct file_lock *);
 /*
  * fsclient.c
  */
-extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_volsync *);
+extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_volsync *, bool);
 extern int afs_fs_give_up_callbacks(struct afs_net *, struct afs_server *);
 extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct afs_read *);
 extern int afs_fs_create(struct afs_fs_cursor *, const char *, umode_t,
@@ -733,7 +735,7 @@ extern int afs_fs_fetch_status(struct afs_fs_cursor *, struct afs_net *,
 /*
  * inode.c
  */
-extern int afs_fetch_status(struct afs_vnode *, struct key *);
+extern int afs_fetch_status(struct afs_vnode *, struct key *, bool);
 extern int afs_iget5_test(struct inode *, void *);
 extern struct inode *afs_iget_pseudo_dir(struct super_block *, bool);
 extern struct inode *afs_iget(struct super_block *, struct key *,
diff --git a/fs/afs/security.c b/fs/afs/security.c
index bd82c5bf4a6a..cea2fff313dc 100644
--- a/fs/afs/security.c
+++ b/fs/afs/security.c
@@ -322,7 +322,7 @@ int afs_check_permit(struct afs_vnode *vnode, struct key *key,
 		 */
 		_debug("no valid permit");
 
-		ret = afs_fetch_status(vnode, key);
+		ret = afs_fetch_status(vnode, key, false);
 		if (ret < 0) {
 			*_access = 0;
 			_leave(" = %d", ret);

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 11/20] afs: Rearrange status mapping
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (9 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 10/20] afs: Make it possible to get the data version in readpage David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 12/20] afs: Keep track of invalid-before version for dentry coherency David Howells
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Rearrange the AFSFetchStatus to inode attribute mapping code in a number of
ways:

 (1) Use an XDR structure rather than a series of incremented pointer
     accesses when decoding an AFSFetchStatus object.  This allows
     out-of-order decode.

 (2) Don't store the if_version value but rather just check it and abort if
     it's not something we can handle.

 (3) Store the owner and group in the status record as raw values rather
     than converting them to kuid/kgid.  Do that when they're mapped into
     i_uid/i_gid.

 (4) Validate the type and abort code up front and abort if they're wrong.

 (5) Split the inode attribute setting out into its own function from the
     XDR decode of an AFSFetchStatus object.  This allows it to be called
     from elsewhere too.

 (6) Differentiate changes to data from changes to metadata.

 (7) Use the split-out attribute mapping function from afs_iget().

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/afs.h      |    6 -
 fs/afs/fsclient.c |  270 ++++++++++++++++++++++++++++++++---------------------
 fs/afs/inode.c    |   23 +----
 fs/afs/internal.h |    6 +
 fs/afs/xdr_fs.h   |   40 ++++++++
 5 files changed, 216 insertions(+), 129 deletions(-)
 create mode 100644 fs/afs/xdr_fs.h

diff --git a/fs/afs/afs.h b/fs/afs/afs.h
index a670bca6d3ba..b4ff1f7ae4ab 100644
--- a/fs/afs/afs.h
+++ b/fs/afs/afs.h
@@ -131,15 +131,13 @@ struct afs_file_status {
 	afs_dataversion_t	data_version;	/* current data version */
 	time_t			mtime_client;	/* last time client changed data */
 	time_t			mtime_server;	/* last time server changed data */
-	unsigned		if_version;	/* interface version */
-#define AFS_FSTATUS_VERSION	1
 	unsigned		abort_code;	/* Abort if bulk-fetching this failed */
 
 	afs_file_type_t		type;		/* file type */
 	unsigned		nlink;		/* link count */
 	u32			author;		/* author ID */
-	kuid_t			owner;		/* owner ID */
-	kgid_t			group;		/* group ID */
+	u32			owner;		/* owner ID */
+	u32			group;		/* group ID */
 	afs_access_t		caller_access;	/* access rights for authenticated caller */
 	afs_access_t		anon_access;	/* access rights for unauthenticated caller */
 	umode_t			mode;		/* UNIX mode */
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 015bbbba0858..ff87fc6bb27f 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -16,6 +16,7 @@
 #include <linux/iversion.h>
 #include "internal.h"
 #include "afs_fs.h"
+#include "xdr_fs.h"
 
 static const struct afs_fid afs_zero_fid;
 
@@ -64,120 +65,160 @@ static void xdr_dump_bad(const __be32 *bp)
 }
 
 /*
- * decode an AFSFetchStatus block
+ * Update the core inode struct from a returned status record.
  */
-static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
-				      struct afs_file_status *status,
-				      struct afs_vnode *vnode,
-				      const afs_dataversion_t *expected_version,
-				      afs_dataversion_t *_version)
+void afs_update_inode_from_status(struct afs_vnode *vnode,
+				  struct afs_file_status *status,
+				  const afs_dataversion_t *expected_version,
+				  u8 flags)
 {
-	const __be32 *bp = *_bp;
+	struct timespec t;
 	umode_t mode;
+
+	t.tv_sec = status->mtime_client;
+	t.tv_nsec = 0;
+	vnode->vfs_inode.i_ctime = t;
+	vnode->vfs_inode.i_mtime = t;
+	vnode->vfs_inode.i_atime = t;
+
+	if (flags & (AFS_VNODE_META_CHANGED | AFS_VNODE_NOT_YET_SET)) {
+		vnode->vfs_inode.i_uid = make_kuid(&init_user_ns, status->owner);
+		vnode->vfs_inode.i_gid = make_kgid(&init_user_ns, status->group);
+		set_nlink(&vnode->vfs_inode, status->nlink);
+
+		mode = vnode->vfs_inode.i_mode;
+		mode &= ~S_IALLUGO;
+		mode |= status->mode;
+		barrier();
+		vnode->vfs_inode.i_mode = mode;
+	}
+
+	if (!(flags & AFS_VNODE_NOT_YET_SET)) {
+		if (expected_version &&
+		    *expected_version != status->data_version) {
+			_debug("vnode modified %llx on {%x:%u} [exp %llx]",
+			       (unsigned long long) status->data_version,
+			       vnode->fid.vid, vnode->fid.vnode,
+			       (unsigned long long) *expected_version);
+			set_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
+			set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
+		}
+	}
+
+	if (flags & (AFS_VNODE_DATA_CHANGED | AFS_VNODE_NOT_YET_SET)) {
+		inode_set_iversion_raw(&vnode->vfs_inode, status->data_version);
+		i_size_write(&vnode->vfs_inode, status->size);
+	}
+}
+
+/*
+ * decode an AFSFetchStatus block
+ */
+static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
+				     struct afs_file_status *status,
+				     struct afs_vnode *vnode,
+				     const afs_dataversion_t *expected_version,
+				     afs_dataversion_t *_version)
+{
+	const struct afs_xdr_AFSFetchStatus *xdr = (const void *)*_bp;
 	u64 data_version, size;
-	bool changed = false;
-	kuid_t owner;
-	kgid_t group;
+	u32 type, abort_code;
+	u8 flags = 0;
+	int ret;
 
 	if (vnode)
 		write_seqlock(&vnode->cb_lock);
 
-#define EXTRACT(DST)				\
-	do {					\
-		u32 x = ntohl(*bp++);		\
-		if (DST != x)			\
-			changed |= true;	\
-		DST = x;			\
-	} while (0)
-
-	status->if_version = ntohl(*bp++);
-	EXTRACT(status->type);
-	EXTRACT(status->nlink);
-	size = ntohl(*bp++);
-	data_version = ntohl(*bp++);
-	EXTRACT(status->author);
-	owner = make_kuid(&init_user_ns, ntohl(*bp++));
-	changed |= !uid_eq(owner, status->owner);
-	status->owner = owner;
-	EXTRACT(status->caller_access); /* call ticket dependent */
-	EXTRACT(status->anon_access);
-	EXTRACT(status->mode);
-	bp++; /* parent.vnode */
-	bp++; /* parent.unique */
-	bp++; /* seg size */
-	status->mtime_client = ntohl(*bp++);
-	status->mtime_server = ntohl(*bp++);
-	group = make_kgid(&init_user_ns, ntohl(*bp++));
-	changed |= !gid_eq(group, status->group);
-	status->group = group;
-	bp++; /* sync counter */
-	data_version |= (u64) ntohl(*bp++) << 32;
-	EXTRACT(status->lock_count);
-	size |= (u64) ntohl(*bp++) << 32;
-	EXTRACT(status->abort_code); /* spare 4 */
-	*_bp = bp;
+	if (xdr->if_version != htonl(AFS_FSTATUS_VERSION)) {
+		pr_warn("Unknown AFSFetchStatus version %u\n", ntohl(xdr->if_version));
+		goto bad;
+	}
 
-	switch (status->type) {
+	type = ntohl(xdr->type);
+	abort_code = ntohl(xdr->abort_code);
+	switch (type) {
 	case AFS_FTYPE_FILE:
 	case AFS_FTYPE_DIR:
 	case AFS_FTYPE_SYMLINK:
+		if (type != status->type &&
+		    vnode &&
+		    !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
+			pr_warning("Vnode %x:%x:%x changed type %u to %u\n",
+				   vnode->fid.vid,
+				   vnode->fid.vnode,
+				   vnode->fid.unique,
+				   status->type, type);
+			goto bad;
+		}
+		status->type = type;
 		break;
 	case AFS_FTYPE_INVALID:
-		if (status->abort_code != 0)
+		if (abort_code != 0) {
+			status->abort_code = abort_code;
 			goto out;
+		}
 		/* Fall through */
 	default:
-		xdr_dump_bad(bp - 2);
-		goto out;
+		goto bad;
 	}
 
+#define EXTRACT_M(FIELD)					\
+	do {							\
+		u32 x = ntohl(xdr->FIELD);			\
+		if (status->FIELD != x) {			\
+			flags |= AFS_VNODE_META_CHANGED;	\
+			status->FIELD = x;			\
+		}						\
+	} while (0)
+
+	EXTRACT_M(nlink);
+	EXTRACT_M(author);
+	EXTRACT_M(owner);
+	EXTRACT_M(caller_access); /* call ticket dependent */
+	EXTRACT_M(anon_access);
+	EXTRACT_M(mode);
+	EXTRACT_M(group);
+
+	status->mtime_client = ntohl(xdr->mtime_client);
+	status->mtime_server = ntohl(xdr->mtime_server);
+	status->lock_count   = ntohl(xdr->lock_count);
+
+	size  = (u64)ntohl(xdr->size_lo);
+	size |= (u64)ntohl(xdr->size_hi) << 32;
 	if (size != status->size) {
 		status->size = size;
-		changed |= true;
+		flags |= AFS_VNODE_DATA_CHANGED;
+	}
+
+	data_version  = (u64)ntohl(xdr->data_version_lo);
+	data_version |= (u64)ntohl(xdr->data_version_hi) << 32;
+	if (data_version != status->data_version) {
+		status->data_version = data_version;
+		flags |= AFS_VNODE_DATA_CHANGED;
 	}
-	status->mode &= S_IALLUGO;
 	if (_version)
 		*_version = data_version;
 
-	_debug("vnode time %lx, %lx",
-	       status->mtime_client, status->mtime_server);
+	*_bp = (const void *)*_bp + sizeof(*xdr);
 
 	if (vnode) {
-		if (changed && !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
-			_debug("vnode changed");
-			i_size_write(&vnode->vfs_inode, size);
-			vnode->vfs_inode.i_uid = status->owner;
-			vnode->vfs_inode.i_gid = status->group;
-			vnode->vfs_inode.i_generation = vnode->fid.unique;
-			set_nlink(&vnode->vfs_inode, status->nlink);
-
-			mode = vnode->vfs_inode.i_mode;
-			mode &= ~S_IALLUGO;
-			mode |= status->mode;
-			barrier();
-			vnode->vfs_inode.i_mode = mode;
-		}
-
-		vnode->vfs_inode.i_ctime.tv_sec	= status->mtime_client;
-		vnode->vfs_inode.i_mtime	= vnode->vfs_inode.i_ctime;
-		vnode->vfs_inode.i_atime	= vnode->vfs_inode.i_ctime;
-		inode_set_iversion_raw(&vnode->vfs_inode, data_version);
+		if (test_bit(AFS_VNODE_UNSET, &vnode->flags))
+			flags |= AFS_VNODE_NOT_YET_SET;
+		afs_update_inode_from_status(vnode, status, expected_version,
+					     flags);
 	}
 
-	status->data_version = data_version;
-	if (expected_version && *expected_version != data_version) {
-		if (vnode && !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
-			_debug("vnode modified %llx on {%x:%u}",
-			       (unsigned long long) data_version,
-			       vnode->fid.vid, vnode->fid.vnode);
-			set_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
-			set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
-		}
-	}
+	ret = 0;
 
 out:
 	if (vnode)
 		write_sequnlock(&vnode->cb_lock);
+	return ret;
+
+bad:
+	xdr_dump_bad(*_bp);
+	ret = -EBADMSG;
+	goto out;
 }
 
 /*
@@ -319,8 +360,9 @@ static int afs_deliver_fs_fetch_status_vnode(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	xdr_decode_AFSCallBack(call, vnode, &bp);
 	if (call->reply[1])
 		xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -499,9 +541,10 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
 			return ret;
 
 		bp = call->buffer;
-		xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-					  &vnode->status.data_version,
-					  &req->new_version);
+		if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+					      &vnode->status.data_version,
+					      &req->new_version) < 0)
+			return -EBADMSG;
 		xdr_decode_AFSCallBack(call, vnode, &bp);
 		if (call->reply[1])
 			xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -652,9 +695,10 @@ static int afs_deliver_fs_create_vnode(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL);
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL) < 0 ||
+	    xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	xdr_decode_AFSCallBack_raw(&bp, call->reply[3]);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -756,8 +800,9 @@ static int afs_deliver_fs_remove(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -844,9 +889,10 @@ static int afs_deliver_fs_link(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL, NULL);
-	xdr_decode_AFSFetchStatus(&bp, &dvnode->status, dvnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL, NULL) < 0 ||
+	    xdr_decode_AFSFetchStatus(&bp, &dvnode->status, dvnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -930,9 +976,10 @@ static int afs_deliver_fs_symlink(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL);
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL) ||
+	    xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1034,11 +1081,13 @@ static int afs_deliver_fs_rename(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &orig_dvnode->status, orig_dvnode,
-				  &call->expected_version, NULL);
-	if (new_dvnode != orig_dvnode)
-		xdr_decode_AFSFetchStatus(&bp, &new_dvnode->status, new_dvnode,
-					  &call->expected_version_2, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, &orig_dvnode->status, orig_dvnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
+	if (new_dvnode != orig_dvnode &&
+	    xdr_decode_AFSFetchStatus(&bp, &new_dvnode->status, new_dvnode,
+				      &call->expected_version_2, NULL) < 0)
+		return -EBADMSG;
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1139,8 +1188,9 @@ static int afs_deliver_fs_store_data(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	afs_pages_written_back(vnode, call);
@@ -1314,8 +1364,9 @@ static int afs_deliver_fs_store_status(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-				  &call->expected_version, NULL);
+	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+				      &call->expected_version, NULL) < 0)
+		return -EBADMSG;
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -2127,9 +2178,10 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
 
 		bp = call->buffer;
 		statuses = call->reply[1];
-		xdr_decode_AFSFetchStatus(&bp, &statuses[call->count],
-					  call->count == 0 ? vnode : NULL,
-					  NULL, NULL);
+		if (xdr_decode_AFSFetchStatus(&bp, &statuses[call->count],
+					      call->count == 0 ? vnode : NULL,
+					      NULL, NULL) < 0)
+			return -EBADMSG;
 
 		call->count++;
 		if (call->count < call->count2)
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 488abd78dc26..2e32d475ec11 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -30,12 +30,11 @@ static const struct inode_operations afs_symlink_inode_operations = {
 };
 
 /*
- * map the AFS file status to the inode member variables
+ * Initialise an inode from the vnode status.
  */
-static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
+static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
 {
 	struct inode *inode = AFS_VNODE_TO_I(vnode);
-	bool changed;
 
 	_debug("FS: ft=%d lk=%d sz=%llu ver=%Lu mod=%hu",
 	       vnode->status.type,
@@ -46,6 +45,9 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 
 	read_seqlock_excl(&vnode->cb_lock);
 
+	afs_update_inode_from_status(vnode, &vnode->status, NULL,
+				     AFS_VNODE_NOT_YET_SET);
+
 	switch (vnode->status.type) {
 	case AFS_FTYPE_FILE:
 		inode->i_mode	= S_IFREG | vnode->status.mode;
@@ -79,24 +81,13 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 		return -EBADMSG;
 	}
 
-	changed = (vnode->status.size != inode->i_size);
-
-	set_nlink(inode, vnode->status.nlink);
-	inode->i_uid		= vnode->status.owner;
-	inode->i_gid            = vnode->status.group;
-	inode->i_size		= vnode->status.size;
-	inode->i_ctime.tv_sec	= vnode->status.mtime_client;
-	inode->i_ctime.tv_nsec	= 0;
-	inode->i_atime		= inode->i_mtime = inode->i_ctime;
 	inode->i_blocks		= 0;
-	inode->i_generation	= vnode->fid.unique;
-	inode_set_iversion_raw(inode, vnode->status.data_version);
 	inode->i_mapping->a_ops	= &afs_fs_aops;
 
 	read_sequnlock_excl(&vnode->cb_lock);
 
 #ifdef CONFIG_AFS_FSCACHE
-	if (changed)
+	if (vnode->status.size > 0)
 		fscache_attr_changed(vnode->cache);
 #endif
 	return 0;
@@ -331,7 +322,7 @@ struct inode *afs_iget(struct super_block *sb, struct key *key,
 		vnode->cb_expires_at += ktime_get_real_seconds();
 	}
 
-	ret = afs_inode_map_status(vnode, key);
+	ret = afs_inode_init_from_status(vnode, key);
 	if (ret < 0)
 		goto bad_inode;
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 94dc6007b584..8eb0b663b86e 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -702,6 +702,12 @@ extern int afs_flock(struct file *, int, struct file_lock *);
 /*
  * fsclient.c
  */
+#define AFS_VNODE_NOT_YET_SET	0x01
+#define AFS_VNODE_META_CHANGED	0x02
+#define AFS_VNODE_DATA_CHANGED	0x04
+extern void afs_update_inode_from_status(struct afs_vnode *, struct afs_file_status *,
+					 const afs_dataversion_t *, u8);
+
 extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_volsync *, bool);
 extern int afs_fs_give_up_callbacks(struct afs_net *, struct afs_server *);
 extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct afs_read *);
diff --git a/fs/afs/xdr_fs.h b/fs/afs/xdr_fs.h
new file mode 100644
index 000000000000..24e23e40c979
--- /dev/null
+++ b/fs/afs/xdr_fs.h
@@ -0,0 +1,40 @@
+/* AFS fileserver XDR types
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef XDR_FS_H
+#define XDR_FS_H
+
+struct afs_xdr_AFSFetchStatus {
+	__be32	if_version;
+#define AFS_FSTATUS_VERSION	1
+	__be32	type;
+	__be32	nlink;
+	__be32	size_lo;
+	__be32	data_version_lo;
+	__be32	author;
+	__be32	owner;
+	__be32	caller_access;
+	__be32	anon_access;
+	__be32	mode;
+	__be32	parent_vnode;
+	__be32	parent_unique;
+	__be32	seg_size;
+	__be32	mtime_client;
+	__be32	mtime_server;
+	__be32	group;
+	__be32	sync_counter;
+	__be32	data_version_hi;
+	__be32	lock_count;
+	__be32	size_hi;
+	__be32	abort_code;
+} __packed;
+
+#endif /* XDR_FS_H */

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 12/20] afs: Keep track of invalid-before version for dentry coherency
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (10 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 11/20] afs: Rearrange status mapping David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:30 ` [PATCH 13/20] afs: Split the dynroot stuff out and give it its own ops tables David Howells
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Each afs dentry is tagged with the version that the parent directory was at
last time it was validated and, currently, if this differs, the directory
is scanned and the dentry is refreshed.

However, this leads to an excessive amount of revalidation on directories
that get modified on the client without conflict with another client.  We
know there's no conflict because the parent directory's data version number
got incremented by exactly 1 on any create, mkdir, unlink, etc., therefore
we can trust the current state of the unaffected dentries when we perform a
local directory modification.

Optimise by keeping track of the last version of the parent directory that
was changed outside of the client in the parent directory's vnode and using
that to validate the dentries rather than the current version.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c      |   20 +++++++++++++++-----
 fs/afs/fsclient.c |    1 +
 fs/afs/inode.c    |    1 +
 fs/afs/internal.h |    1 +
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 0e4945948a41..3c7fcd6ad9b7 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -1035,7 +1035,7 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 	struct dentry *parent;
 	struct inode *inode;
 	struct key *key;
-	void *dir_version;
+	long dir_version, de_version;
 	int ret;
 
 	if (flags & LOOKUP_RCU)
@@ -1079,9 +1079,19 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 		goto out_bad_parent;
 	}
 
-	dir_version = (void *) (unsigned long) dir->status.data_version;
-	if (dentry->d_fsdata == dir_version)
-		goto out_valid; /* the dir contents are unchanged */
+	/* We only need to invalidate a dentry if the server's copy changed
+	 * behind our back.  If we made the change, it's no problem.  Note that
+	 * on a 32-bit system, we only have 32 bits in the dentry to store the
+	 * version.
+	 */
+	dir_version = (long)dir->status.data_version;
+	de_version = (long)dentry->d_fsdata;
+	if (de_version == dir_version)
+		goto out_valid;
+
+	dir_version = (long)dir->invalid_before;
+	if (de_version - dir_version >= 0)
+		goto out_valid;
 
 	_debug("dir modified");
 	afs_stat_v(dir, n_reval);
@@ -1140,7 +1150,7 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 	}
 
 out_valid:
-	dentry->d_fsdata = dir_version;
+	dentry->d_fsdata = (void *)dir_version;
 	dput(parent);
 	key_put(key);
 	_leave(" = 1 [valid]");
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index ff87fc6bb27f..f7570d229dcc 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -100,6 +100,7 @@ void afs_update_inode_from_status(struct afs_vnode *vnode,
 			       (unsigned long long) status->data_version,
 			       vnode->fid.vid, vnode->fid.vnode,
 			       (unsigned long long) *expected_version);
+			vnode->invalid_before = status->data_version;
 			set_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
 			set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
 		}
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 2e32d475ec11..07f450513f3e 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -83,6 +83,7 @@ static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
 
 	inode->i_blocks		= 0;
 	inode->i_mapping->a_ops	= &afs_fs_aops;
+	vnode->invalid_before	= vnode->status.data_version;
 
 	read_sequnlock_excl(&vnode->cb_lock);
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 8eb0b663b86e..967acc096f3d 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -477,6 +477,7 @@ struct afs_vnode {
 	struct afs_volume	*volume;	/* volume on which vnode resides */
 	struct afs_fid		fid;		/* the file identifier for this inode */
 	struct afs_file_status	status;		/* AFS status info for this file */
+	afs_dataversion_t	invalid_before;	/* Child dentries are invalid before this */
 #ifdef CONFIG_AFS_FSCACHE
 	struct fscache_cookie	*cache;		/* caching cookie */
 #endif

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 13/20] afs: Split the dynroot stuff out and give it its own ops tables
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (11 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 12/20] afs: Keep track of invalid-before version for dentry coherency David Howells
@ 2018-04-05 20:30 ` David Howells
  2018-04-05 20:31 ` [PATCH 14/20] afs: Fix directory handling David Howells
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:30 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Split the AFS dynamic root stuff out of the main directory handling file
and into its own file as they share little in common.

The dynamic root code also gets its own dentry and inode ops tables.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/Makefile   |    1 
 fs/afs/dir.c      |  187 ----------------------------------------------
 fs/afs/dynroot.c  |  216 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/afs/internal.h |   12 ++-
 fs/afs/super.c    |   11 ++-
 5 files changed, 235 insertions(+), 192 deletions(-)
 create mode 100644 fs/afs/dynroot.c

diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 45b7fc405fa6..6a055423c192 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -12,6 +12,7 @@ kafs-objs := \
 	cell.o \
 	cmservice.o \
 	dir.o \
+	dynroot.o \
 	file.o \
 	flock.o \
 	fsclient.o \
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 3c7fcd6ad9b7..8f2df235b07f 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -10,25 +10,19 @@
  */
 
 #include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/init.h>
 #include <linux/fs.h>
 #include <linux/namei.h>
 #include <linux/pagemap.h>
 #include <linux/ctype.h>
 #include <linux/sched.h>
-#include <linux/dns_resolver.h>
 #include "internal.h"
 
 static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 				 unsigned int flags);
-static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentry,
-					 unsigned int flags);
 static int afs_dir_open(struct inode *inode, struct file *file);
 static int afs_readdir(struct file *file, struct dir_context *ctx);
 static int afs_d_revalidate(struct dentry *dentry, unsigned int flags);
 static int afs_d_delete(const struct dentry *dentry);
-static void afs_d_release(struct dentry *dentry);
 static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name, int nlen,
 				  loff_t fpos, u64 ino, unsigned dtype);
 static int afs_lookup_filldir(struct dir_context *ctx, const char *name, int nlen,
@@ -69,17 +63,6 @@ const struct inode_operations afs_dir_inode_operations = {
 	.listxattr	= afs_listxattr,
 };
 
-const struct file_operations afs_dynroot_file_operations = {
-	.open		= dcache_dir_open,
-	.release	= dcache_dir_close,
-	.iterate_shared	= dcache_readdir,
-	.llseek		= dcache_dir_lseek,
-};
-
-const struct inode_operations afs_dynroot_inode_operations = {
-	.lookup		= afs_dynroot_lookup,
-};
-
 const struct dentry_operations afs_fs_dentry_operations = {
 	.d_revalidate	= afs_d_revalidate,
 	.d_delete	= afs_d_delete,
@@ -702,71 +685,6 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	return inode;
 }
 
-/*
- * Probe to see if a cell may exist.  This prevents positive dentries from
- * being created unnecessarily.
- */
-static int afs_probe_cell_name(struct dentry *dentry)
-{
-	struct afs_cell *cell;
-	const char *name = dentry->d_name.name;
-	size_t len = dentry->d_name.len;
-	int ret;
-
-	/* Names prefixed with a dot are R/W mounts. */
-	if (name[0] == '.') {
-		if (len == 1)
-			return -EINVAL;
-		name++;
-		len--;
-	}
-
-	cell = afs_lookup_cell_rcu(afs_d2net(dentry), name, len);
-	if (!IS_ERR(cell)) {
-		afs_put_cell(afs_d2net(dentry), cell);
-		return 0;
-	}
-
-	ret = dns_query("afsdb", name, len, "ipv4", NULL, NULL);
-	if (ret == -ENODATA)
-		ret = -EDESTADDRREQ;
-	return ret;
-}
-
-/*
- * Try to auto mount the mountpoint with pseudo directory, if the autocell
- * operation is setted.
- */
-static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir)
-{
-	struct afs_vnode *vnode = AFS_FS_I(dir);
-	struct inode *inode;
-	int ret = -ENOENT;
-
-	_enter("%p{%pd}, {%x:%u}",
-	       dentry, dentry, vnode->fid.vid, vnode->fid.vnode);
-
-	if (!test_bit(AFS_VNODE_AUTOCELL, &vnode->flags))
-		goto out;
-
-	ret = afs_probe_cell_name(dentry);
-	if (ret < 0)
-		goto out;
-
-	inode = afs_iget_pseudo_dir(dir->i_sb, false);
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
-		goto out;
-	}
-
-	_leave("= %p", inode);
-	return inode;
-
-out:
-	_leave("= %d", ret);
-	return ERR_PTR(ret);
-}
-
 /*
  * Look up an entry in a directory with @sys substitution.
  */
@@ -923,105 +841,6 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 	return NULL;
 }
 
-/*
- * Look up @cell in a dynroot directory.  This is a substitution for the
- * local cell name for the net namespace.
- */
-static struct dentry *afs_lookup_atcell(struct dentry *dentry)
-{
-	struct afs_cell *cell;
-	struct afs_net *net = afs_d2net(dentry);
-	struct dentry *parent, *ret;
-	unsigned int seq = 0;
-	char *name;
-	int len;
-
-	if (!net->ws_cell)
-		return ERR_PTR(-ENOENT);
-
-	parent = dget_parent(dentry);
-
-	ret = ERR_PTR(-ENOMEM);
-	name = kmalloc(AFS_MAXCELLNAME + 1, GFP_KERNEL);
-	if (!name)
-		goto out_p;
-
-	rcu_read_lock();
-	do {
-		read_seqbegin_or_lock(&net->cells_lock, &seq);
-		cell = rcu_dereference_raw(net->ws_cell);
-		if (cell) {
-			len = cell->name_len;
-			memcpy(name, cell->name, len + 1);
-		}
-	} while (need_seqretry(&net->cells_lock, seq));
-	done_seqretry(&net->cells_lock, seq);
-	rcu_read_unlock();
-
-	ret = ERR_PTR(-ENOENT);
-	if (!cell)
-		goto out_n;
-
-	ret = lookup_one_len(name, parent, len);
-	if (IS_ERR(ret) || d_is_positive(ret))
-		goto out_n;
-
-	/* We don't want to d_add() the @cell dentry here as we don't want to
-	 * the cached dentry to hide changes to the local cell name.
-	 */
-	dput(ret);
-	ret = NULL;
-
-out_n:
-	kfree(name);
-out_p:
-	dput(parent);
-	return ret;
-}
-
-/*
- * Look up an entry in a dynroot directory.
- */
-static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentry,
-					 unsigned int flags)
-{
-	struct afs_vnode *vnode;
-	struct inode *inode;
-	int ret;
-
-	vnode = AFS_FS_I(dir);
-
-	_enter("%pd", dentry);
-
-	ASSERTCMP(d_inode(dentry), ==, NULL);
-
-	if (dentry->d_name.len >= AFSNAMEMAX) {
-		_leave(" = -ENAMETOOLONG");
-		return ERR_PTR(-ENAMETOOLONG);
-	}
-
-	if (dentry->d_name.len == 5 &&
-	    memcmp(dentry->d_name.name, "@cell", 5) == 0)
-		return afs_lookup_atcell(dentry);
-
-	inode = afs_try_auto_mntpt(dentry, dir);
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
-		if (ret == -ENOENT) {
-			d_add(dentry, NULL);
-			_leave(" = NULL [negative]");
-			return NULL;
-		}
-		_leave(" = %d [do]", ret);
-		return ERR_PTR(ret);
-	}
-
-	d_add(dentry, inode);
-	_leave(" = 0 { ino=%lu v=%u }",
-	       d_inode(dentry)->i_ino, d_inode(dentry)->i_generation);
-	return NULL;
-}
-
 /*
  * check that a dentry lookup hit has found a valid entry
  * - NOTE! the hit can be a negative hit too, so we can't assume we have an
@@ -1029,7 +848,6 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr
  */
 static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 {
-	struct afs_super_info *as = dentry->d_sb->s_fs_info;
 	struct afs_vnode *vnode, *dir;
 	struct afs_fid uninitialized_var(fid);
 	struct dentry *parent;
@@ -1041,9 +859,6 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
 
-	if (as->dyn_root)
-		return 1;
-
 	if (d_really_is_positive(dentry)) {
 		vnode = AFS_FS_I(d_inode(dentry));
 		_enter("{v={%x:%u} n=%pd fl=%lx},",
@@ -1201,7 +1016,7 @@ static int afs_d_delete(const struct dentry *dentry)
 /*
  * handle dentry release
  */
-static void afs_d_release(struct dentry *dentry)
+void afs_d_release(struct dentry *dentry)
 {
 	_enter("%pd", dentry);
 }
diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c
new file mode 100644
index 000000000000..b70380e5d0e3
--- /dev/null
+++ b/fs/afs/dynroot.c
@@ -0,0 +1,216 @@
+/* dir.c: AFS dynamic root handling
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/dns_resolver.h>
+#include "internal.h"
+
+const struct file_operations afs_dynroot_file_operations = {
+	.open		= dcache_dir_open,
+	.release	= dcache_dir_close,
+	.iterate_shared	= dcache_readdir,
+	.llseek		= dcache_dir_lseek,
+};
+
+/*
+ * Probe to see if a cell may exist.  This prevents positive dentries from
+ * being created unnecessarily.
+ */
+static int afs_probe_cell_name(struct dentry *dentry)
+{
+	struct afs_cell *cell;
+	const char *name = dentry->d_name.name;
+	size_t len = dentry->d_name.len;
+	int ret;
+
+	/* Names prefixed with a dot are R/W mounts. */
+	if (name[0] == '.') {
+		if (len == 1)
+			return -EINVAL;
+		name++;
+		len--;
+	}
+
+	cell = afs_lookup_cell_rcu(afs_d2net(dentry), name, len);
+	if (!IS_ERR(cell)) {
+		afs_put_cell(afs_d2net(dentry), cell);
+		return 0;
+	}
+
+	ret = dns_query("afsdb", name, len, "ipv4", NULL, NULL);
+	if (ret == -ENODATA)
+		ret = -EDESTADDRREQ;
+	return ret;
+}
+
+/*
+ * Try to auto mount the mountpoint with pseudo directory, if the autocell
+ * operation is setted.
+ */
+struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir)
+{
+	struct afs_vnode *vnode = AFS_FS_I(dir);
+	struct inode *inode;
+	int ret = -ENOENT;
+
+	_enter("%p{%pd}, {%x:%u}",
+	       dentry, dentry, vnode->fid.vid, vnode->fid.vnode);
+
+	if (!test_bit(AFS_VNODE_AUTOCELL, &vnode->flags))
+		goto out;
+
+	ret = afs_probe_cell_name(dentry);
+	if (ret < 0)
+		goto out;
+
+	inode = afs_iget_pseudo_dir(dir->i_sb, false);
+	if (IS_ERR(inode)) {
+		ret = PTR_ERR(inode);
+		goto out;
+	}
+
+	_leave("= %p", inode);
+	return inode;
+
+out:
+	_leave("= %d", ret);
+	return ERR_PTR(ret);
+}
+
+/*
+ * Look up @cell in a dynroot directory.  This is a substitution for the
+ * local cell name for the net namespace.
+ */
+static struct dentry *afs_lookup_atcell(struct dentry *dentry)
+{
+	struct afs_cell *cell;
+	struct afs_net *net = afs_d2net(dentry);
+	struct dentry *parent, *ret;
+	unsigned int seq = 0;
+	char *name;
+	int len;
+
+	if (!net->ws_cell)
+		return ERR_PTR(-ENOENT);
+
+	parent = dget_parent(dentry);
+
+	ret = ERR_PTR(-ENOMEM);
+	name = kmalloc(AFS_MAXCELLNAME + 1, GFP_KERNEL);
+	if (!name)
+		goto out_p;
+
+	rcu_read_lock();
+	do {
+		read_seqbegin_or_lock(&net->cells_lock, &seq);
+		cell = rcu_dereference_raw(net->ws_cell);
+		if (cell) {
+			len = cell->name_len;
+			memcpy(name, cell->name, len + 1);
+		}
+	} while (need_seqretry(&net->cells_lock, seq));
+	done_seqretry(&net->cells_lock, seq);
+	rcu_read_unlock();
+
+	ret = ERR_PTR(-ENOENT);
+	if (!cell)
+		goto out_n;
+
+	ret = lookup_one_len(name, parent, len);
+	if (IS_ERR(ret) || d_is_positive(ret))
+		goto out_n;
+
+	/* We don't want to d_add() the @cell dentry here as we don't want to
+	 * the cached dentry to hide changes to the local cell name.
+	 */
+	dput(ret);
+	ret = NULL;
+
+out_n:
+	kfree(name);
+out_p:
+	dput(parent);
+	return ret;
+}
+
+/*
+ * Look up an entry in a dynroot directory.
+ */
+static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentry,
+					 unsigned int flags)
+{
+	struct afs_vnode *vnode;
+	struct inode *inode;
+	int ret;
+
+	vnode = AFS_FS_I(dir);
+
+	_enter("%pd", dentry);
+
+	ASSERTCMP(d_inode(dentry), ==, NULL);
+
+	if (dentry->d_name.len >= AFSNAMEMAX) {
+		_leave(" = -ENAMETOOLONG");
+		return ERR_PTR(-ENAMETOOLONG);
+	}
+
+	if (dentry->d_name.len == 5 &&
+	    memcmp(dentry->d_name.name, "@cell", 5) == 0)
+		return afs_lookup_atcell(dentry);
+
+	inode = afs_try_auto_mntpt(dentry, dir);
+	if (IS_ERR(inode)) {
+		ret = PTR_ERR(inode);
+		if (ret == -ENOENT) {
+			d_add(dentry, NULL);
+			_leave(" = NULL [negative]");
+			return NULL;
+		}
+		_leave(" = %d [do]", ret);
+		return ERR_PTR(ret);
+	}
+
+	d_add(dentry, inode);
+	_leave(" = 0 { ino=%lu v=%u }",
+	       d_inode(dentry)->i_ino, d_inode(dentry)->i_generation);
+	return NULL;
+}
+
+const struct inode_operations afs_dynroot_inode_operations = {
+	.lookup		= afs_dynroot_lookup,
+};
+
+/*
+ * Dirs in the dynamic root don't need revalidation.
+ */
+static int afs_dynroot_d_revalidate(struct dentry *dentry, unsigned int flags)
+{
+	return 1;
+}
+
+/*
+ * Allow the VFS to enquire as to whether a dentry should be unhashed (mustn't
+ * sleep)
+ * - called from dput() when d_count is going to 0.
+ * - return 1 to request dentry be unhashed, 0 otherwise
+ */
+static int afs_dynroot_d_delete(const struct dentry *dentry)
+{
+	return d_really_is_positive(dentry);
+}
+
+const struct dentry_operations afs_dynroot_dentry_operations = {
+	.d_revalidate	= afs_dynroot_d_revalidate,
+	.d_delete	= afs_dynroot_d_delete,
+	.d_release	= afs_d_release,
+	.d_automount	= afs_d_automount,
+};
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 967acc096f3d..3bad5b8e38dc 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -669,11 +669,19 @@ extern bool afs_cm_incoming_call(struct afs_call *);
  */
 extern const struct file_operations afs_dir_file_operations;
 extern const struct inode_operations afs_dir_inode_operations;
-extern const struct file_operations afs_dynroot_file_operations;
-extern const struct inode_operations afs_dynroot_inode_operations;
 extern const struct dentry_operations afs_fs_dentry_operations;
 
 extern bool afs_dir_check_page(struct inode *, struct page *);
+extern void afs_d_release(struct dentry *);
+
+/*
+ * dynroot.c
+ */
+extern const struct file_operations afs_dynroot_file_operations;
+extern const struct inode_operations afs_dynroot_inode_operations;
+extern const struct dentry_operations afs_dynroot_dentry_operations;
+
+extern struct inode *afs_try_auto_mntpt(struct dentry *, struct inode *);
 
 /*
  * file.c
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 3623c952b6ff..65081ec3c36e 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -154,7 +154,7 @@ static int afs_show_devname(struct seq_file *m, struct dentry *root)
 		seq_puts(m, "none");
 		return 0;
 	}
-	
+
 	switch (volume->type) {
 	case AFSVL_RWVOL:
 		break;
@@ -269,7 +269,7 @@ static int afs_parse_device_name(struct afs_mount_params *params,
 	int cellnamesz;
 
 	_enter(",%s", name);
-	
+
 	if (!name) {
 		printk(KERN_ERR "kAFS: no volume name specified\n");
 		return -EINVAL;
@@ -418,7 +418,10 @@ static int afs_fill_super(struct super_block *sb,
 	if (!sb->s_root)
 		goto error;
 
-	sb->s_d_op = &afs_fs_dentry_operations;
+	if (params->dyn_root)
+		sb->s_d_op = &afs_dynroot_dentry_operations;
+	else
+		sb->s_d_op = &afs_fs_dentry_operations;
 
 	_leave(" = 0");
 	return 0;
@@ -676,7 +679,7 @@ static int afs_statfs(struct dentry *dentry, struct kstatfs *buf)
 		buf->f_bfree	= 0;
 		return 0;
 	}
-	
+
 	key = afs_request_key(vnode->volume->cell);
 	if (IS_ERR(key))
 		return PTR_ERR(key);

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 14/20] afs: Fix directory handling
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (12 preceding siblings ...)
  2018-04-05 20:30 ` [PATCH 13/20] afs: Split the dynroot stuff out and give it its own ops tables David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-05 20:31 ` [PATCH 15/20] afs: Split the directory content defs into a header David Howells
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

AFS directories are structured blobs that are downloaded just like files
and then parsed by the lookup and readdir code and, as such, are currently
handled in the pagecache like any other file, with the entire directory
content being thrown away each time the directory changes.

However, since the blob is a known structure and since the data version
counter on a directory increases by exactly one for each change committed
to that directory, we can actually edit the directory locally rather than
fetching it from the server after each locally-induced change.

What we can't do, though, is mix data from the server and data from the
client since the server is technically at liberty to rearrange or compress
a directory if it sees fit, provided it updates the data version number
when it does so and breaks the callback (ie. sends a notification).

Further, lookup with lookup-ahead, readdir and, when it arrives, local
editing are likely want to scan the whole of a directory.

So directory handling needs to be improved to maintain the coherency of the
directory blob prior to permitting local directory editing.

To this end:

 (1) If any directory page gets discarded, invalidate and reread the entire
     directory.

 (2) If readpage notes that if when it fetches a single page that the
     version number has changed, the entire directory is flagged for
     invalidation.

 (3) Read as much of the directory in one go as we can.

Note that this removes local caching of directories in fscache for the
moment as we can't pass the pages to fscache_read_or_alloc_pages() since
page->lru is in use by the LRU.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c      |  294 +++++++++++++++++++++++++++++++++++++++++------------
 fs/afs/file.c     |   14 +--
 fs/afs/fsclient.c |   24 +++-
 fs/afs/inode.c    |   26 ++---
 fs/afs/internal.h |   13 +-
 fs/afs/proc.c     |    5 +
 fs/afs/write.c    |    3 -
 7 files changed, 271 insertions(+), 108 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 8f2df235b07f..e989a8fe9a9b 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -1,6 +1,6 @@
 /* dir.c: AFS filesystem directory handling
  *
- * Copyright (C) 2002 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2002, 2018 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  *
  * This program is free software; you can redistribute it and/or
@@ -13,8 +13,10 @@
 #include <linux/fs.h>
 #include <linux/namei.h>
 #include <linux/pagemap.h>
+#include <linux/swap.h>
 #include <linux/ctype.h>
 #include <linux/sched.h>
+#include <linux/task_io_accounting_ops.h>
 #include "internal.h"
 
 static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
@@ -39,6 +41,14 @@ static int afs_symlink(struct inode *dir, struct dentry *dentry,
 static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		      struct inode *new_dir, struct dentry *new_dentry,
 		      unsigned int flags);
+static int afs_dir_releasepage(struct page *page, gfp_t gfp_flags);
+static void afs_dir_invalidatepage(struct page *page, unsigned int offset,
+				   unsigned int length);
+
+static int afs_dir_set_page_dirty(struct page *page)
+{
+	BUG(); /* This should never happen. */
+}
 
 const struct file_operations afs_dir_file_operations = {
 	.open		= afs_dir_open,
@@ -63,6 +73,12 @@ const struct inode_operations afs_dir_inode_operations = {
 	.listxattr	= afs_listxattr,
 };
 
+const struct address_space_operations afs_dir_aops = {
+	.set_page_dirty	= afs_dir_set_page_dirty,
+	.releasepage	= afs_dir_releasepage,
+	.invalidatepage	= afs_dir_invalidatepage,
+};
+
 const struct dentry_operations afs_fs_dentry_operations = {
 	.d_revalidate	= afs_d_revalidate,
 	.d_delete	= afs_d_delete,
@@ -140,32 +156,17 @@ struct afs_lookup_cookie {
 /*
  * check that a directory page is valid
  */
-bool afs_dir_check_page(struct inode *dir, struct page *page)
+static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
+			       loff_t i_size)
 {
 	struct afs_dir_page *dbuf;
-	struct afs_vnode *vnode = AFS_FS_I(dir);
-	loff_t latter, i_size, off;
+	loff_t latter, off;
 	int tmp, qty;
 
-#if 0
-	/* check the page count */
-	qty = desc.size / sizeof(dbuf->blocks[0]);
-	if (qty == 0)
-		goto error;
-
-	if (page->index == 0 && qty != ntohs(dbuf->blocks[0].pagehdr.npages)) {
-		printk("kAFS: %s(%lu): wrong number of dir blocks %d!=%hu\n",
-		       __func__, dir->i_ino, qty,
-		       ntohs(dbuf->blocks[0].pagehdr.npages));
-		goto error;
-	}
-#endif
-
 	/* Determine how many magic numbers there should be in this page, but
 	 * we must take care because the directory may change size under us.
 	 */
 	off = page_offset(page);
-	i_size = i_size_read(dir);
 	if (i_size <= off)
 		goto checked;
 
@@ -181,59 +182,21 @@ bool afs_dir_check_page(struct inode *dir, struct page *page)
 	for (tmp = 0; tmp < qty; tmp++) {
 		if (dbuf->blocks[tmp].pagehdr.magic != AFS_DIR_MAGIC) {
 			printk("kAFS: %s(%lx): bad magic %d/%d is %04hx\n",
-			       __func__, dir->i_ino, tmp, qty,
+			       __func__, dvnode->vfs_inode.i_ino, tmp, qty,
 			       ntohs(dbuf->blocks[tmp].pagehdr.magic));
-			trace_afs_dir_check_failed(vnode, off, i_size);
+			trace_afs_dir_check_failed(dvnode, off, i_size);
 			goto error;
 		}
 	}
 
 checked:
-	afs_stat_v(vnode, n_read_dir);
-	SetPageChecked(page);
+	afs_stat_v(dvnode, n_read_dir);
 	return true;
 
 error:
-	SetPageError(page);
 	return false;
 }
 
-/*
- * discard a page cached in the pagecache
- */
-static inline void afs_dir_put_page(struct page *page)
-{
-	kunmap(page);
-	unlock_page(page);
-	put_page(page);
-}
-
-/*
- * get a page into the pagecache
- */
-static struct page *afs_dir_get_page(struct inode *dir, unsigned long index,
-				     struct key *key)
-{
-	struct page *page;
-	_enter("{%lu},%lu", dir->i_ino, index);
-
-	page = read_cache_page(dir->i_mapping, index, afs_page_filler, key);
-	if (!IS_ERR(page)) {
-		lock_page(page);
-		kmap(page);
-		if (unlikely(!PageChecked(page))) {
-			if (PageError(page))
-				goto fail;
-		}
-	}
-	return page;
-
-fail:
-	afs_dir_put_page(page);
-	_leave(" = -EIO");
-	return ERR_PTR(-EIO);
-}
-
 /*
  * open an AFS directory file
  */
@@ -250,6 +213,147 @@ static int afs_dir_open(struct inode *inode, struct file *file)
 	return afs_open(inode, file);
 }
 
+/*
+ * Read the directory into the pagecache in one go, scrubbing the previous
+ * contents.  The list of pages is returned, pinning them so that they don't
+ * get reclaimed during the iteration.
+ */
+static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
+{
+	struct afs_read *req;
+	loff_t i_size;
+	int nr_pages, nr_inline, i, n;
+	int ret = -ENOMEM;
+
+retry:
+	i_size = i_size_read(&dvnode->vfs_inode);
+	if (i_size < 2048)
+		return ERR_PTR(-EIO);
+	if (i_size > 2048 * 1024)
+		return ERR_PTR(-EFBIG);
+
+	_enter("%llu", i_size);
+
+	/* Get a request record to hold the page list.  We want to hold it
+	 * inline if we can, but we don't want to make an order 1 allocation.
+	 */
+	nr_pages = (i_size + PAGE_SIZE - 1) / PAGE_SIZE;
+	nr_inline = nr_pages;
+	if (nr_inline > (PAGE_SIZE - sizeof(*req)) / sizeof(struct page *))
+		nr_inline = 0;
+
+	req = kzalloc(sizeof(*req) + sizeof(struct page *) * nr_inline,
+		      GFP_KERNEL);
+	if (!req)
+		return ERR_PTR(-ENOMEM);
+
+	refcount_set(&req->usage, 1);
+	req->nr_pages = nr_pages;
+	req->actual_len = i_size; /* May change */
+	req->len = nr_pages * PAGE_SIZE; /* We can ask for more than there is */
+	req->data_version = dvnode->status.data_version; /* May change */
+	if (nr_inline > 0) {
+		req->pages = req->array;
+	} else {
+		req->pages = kcalloc(nr_pages, sizeof(struct page *),
+				     GFP_KERNEL);
+		if (!req->pages)
+			goto error;
+	}
+
+	/* Get a list of all the pages that hold or will hold the directory
+	 * content.  We need to fill in any gaps that we might find where the
+	 * memory reclaimer has been at work.  If there are any gaps, we will
+	 * need to reread the entire directory contents.
+	 */
+	i = 0;
+	do {
+		n = find_get_pages_contig(dvnode->vfs_inode.i_mapping, i,
+					  req->nr_pages - i,
+					  req->pages + i);
+		_debug("find %u at %u/%u", n, i, req->nr_pages);
+		if (n == 0) {
+			gfp_t gfp = dvnode->vfs_inode.i_mapping->gfp_mask;
+
+			if (test_and_clear_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+				afs_stat_v(dvnode, n_inval);
+
+			ret = -ENOMEM;
+			req->pages[i] = __page_cache_alloc(gfp);
+			if (!req->pages[i])
+				goto error;
+			ret = add_to_page_cache_lru(req->pages[i],
+						    dvnode->vfs_inode.i_mapping,
+						    i, gfp);
+			if (ret < 0)
+				goto error;
+
+			set_page_private(req->pages[i], 1);
+			SetPagePrivate(req->pages[i]);
+			unlock_page(req->pages[i]);
+			i++;
+		} else {
+			i += n;
+		}
+	} while (i < req->nr_pages);
+
+	/* If we're going to reload, we need to lock all the pages to prevent
+	 * races.
+	 */
+	if (!test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) {
+		ret = -ERESTARTSYS;
+		for (i = 0; i < req->nr_pages; i++)
+			if (lock_page_killable(req->pages[i]) < 0)
+				goto error_unlock;
+
+		if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+			goto success;
+
+		ret = afs_fetch_data(dvnode, key, req);
+		if (ret < 0)
+			goto error_unlock_all;
+
+		task_io_account_read(PAGE_SIZE * req->nr_pages);
+
+		if (req->len < req->file_size)
+			goto content_has_grown;
+		
+		/* Validate the data we just read. */
+		ret = -EIO;
+		for (i = 0; i < req->nr_pages; i++)
+			if (!afs_dir_check_page(dvnode, req->pages[i],
+						req->actual_len))
+				goto error_unlock_all;
+
+		// TODO: Trim excess pages
+		
+		set_bit(AFS_VNODE_DIR_VALID, &dvnode->flags);
+	}
+
+success:
+	i = req->nr_pages;
+	while (i > 0)
+		unlock_page(req->pages[--i]);
+	return req;
+
+error_unlock_all:
+	i = req->nr_pages;
+error_unlock:
+	while (i > 0)
+		unlock_page(req->pages[--i]);
+error:
+	afs_put_read(req);
+	_leave(" = %d", ret);
+	return ERR_PTR(ret);
+
+content_has_grown:
+	i = req->nr_pages;
+	while (i > 0)
+		unlock_page(req->pages[--i]);
+	afs_put_read(req);
+	goto retry;
+}
+
 /*
  * deal with one block in an AFS directory
  */
@@ -347,8 +451,10 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 			   struct key *key)
 {
+	struct afs_vnode *dvnode = AFS_FS_I(dir);
 	union afs_dir_block *dblock;
 	struct afs_dir_page *dbuf;
+	struct afs_read *req;
 	struct page *page;
 	unsigned blkoff, limit;
 	int ret;
@@ -360,25 +466,32 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 		return -ESTALE;
 	}
 
+	req = afs_read_dir(dvnode, key);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
 	/* round the file position up to the next entry boundary */
 	ctx->pos += sizeof(union afs_dirent) - 1;
 	ctx->pos &= ~(sizeof(union afs_dirent) - 1);
 
 	/* walk through the blocks in sequence */
 	ret = 0;
-	while (ctx->pos < dir->i_size) {
+	while (ctx->pos < req->actual_len) {
 		blkoff = ctx->pos & ~(sizeof(union afs_dir_block) - 1);
 
-		/* fetch the appropriate page from the directory */
-		page = afs_dir_get_page(dir, blkoff / PAGE_SIZE, key);
-		if (IS_ERR(page)) {
-			ret = PTR_ERR(page);
+		/* Fetch the appropriate page from the directory and re-add it
+		 * to the LRU.
+		 */
+		page = req->pages[blkoff / PAGE_SIZE];
+		if (!page) {
+			ret = -EIO;
 			break;
 		}
+		mark_page_accessed(page);
 
 		limit = blkoff & ~(PAGE_SIZE - 1);
 
-		dbuf = page_address(page);
+		dbuf = kmap(page);
 
 		/* deal with the individual blocks stashed on this page */
 		do {
@@ -386,7 +499,7 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 					       sizeof(union afs_dir_block)];
 			ret = afs_dir_iterate_block(ctx, dblock, blkoff);
 			if (ret != 1) {
-				afs_dir_put_page(page);
+				kunmap(page);
 				goto out;
 			}
 
@@ -394,11 +507,12 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 
 		} while (ctx->pos < dir->i_size && blkoff < limit);
 
-		afs_dir_put_page(page);
+		kunmap(page);
 		ret = 0;
 	}
 
 out:
+	afs_put_read(req);
 	_leave(" = %d", ret);
 	return ret;
 }
@@ -1504,3 +1618,47 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	_leave(" = %d", ret);
 	return ret;
 }
+
+/*
+ * Release a directory page and clean up its private state if it's not busy
+ * - return true if the page can now be released, false if not
+ */
+static int afs_dir_releasepage(struct page *page, gfp_t gfp_flags)
+{
+	struct afs_vnode *dvnode = AFS_FS_I(page->mapping->host);
+
+	_enter("{{%x:%u}[%lu]}", dvnode->fid.vid, dvnode->fid.vnode, page->index);
+
+	set_page_private(page, 0);
+	ClearPagePrivate(page);
+
+	/* The directory will need reloading. */
+	if (test_and_clear_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+		afs_stat_v(dvnode, n_relpg);
+	return 1;
+}
+
+/*
+ * invalidate part or all of a page
+ * - release a page and clean up its private data if offset is 0 (indicating
+ *   the entire page)
+ */
+static void afs_dir_invalidatepage(struct page *page, unsigned int offset,
+				   unsigned int length)
+{
+	struct afs_vnode *dvnode = AFS_FS_I(page->mapping->host);
+
+	_enter("{%lu},%u,%u", page->index, offset, length);
+
+	BUG_ON(!PageLocked(page));
+
+	/* The directory will need reloading. */
+	if (test_and_clear_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+		afs_stat_v(dvnode, n_inval);
+
+	/* we clean up only if the entire page is being invalidated */
+	if (offset == 0 && length == PAGE_SIZE) {
+		set_page_private(page, 0);
+		ClearPagePrivate(page);
+	}
+}
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 79e665a35fea..91ff1335fd33 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -187,10 +187,12 @@ void afs_put_read(struct afs_read *req)
 {
 	int i;
 
-	if (atomic_dec_and_test(&req->usage)) {
+	if (refcount_dec_and_test(&req->usage)) {
 		for (i = 0; i < req->nr_pages; i++)
 			if (req->pages[i])
 				put_page(req->pages[i]);
+		if (req->pages != req->array)
+			kfree(req->pages);
 		kfree(req);
 	}
 }
@@ -297,10 +299,11 @@ int afs_page_filler(void *data, struct page *page)
 		 * end of the file, the server will return a short read and the
 		 * unmarshalling code will clear the unfilled space.
 		 */
-		atomic_set(&req->usage, 1);
+		refcount_set(&req->usage, 1);
 		req->pos = (loff_t)page->index << PAGE_SHIFT;
 		req->len = PAGE_SIZE;
 		req->nr_pages = 1;
+		req->pages = req->array;
 		req->pages[0] = page;
 		get_page(page);
 
@@ -309,10 +312,6 @@ int afs_page_filler(void *data, struct page *page)
 		ret = afs_fetch_data(vnode, key, req);
 		afs_put_read(req);
 
-		if (ret >= 0 && S_ISDIR(inode->i_mode) &&
-		    !afs_dir_check_page(inode, page))
-			ret = -EIO;
-
 		if (ret < 0) {
 			if (ret == -ENOENT) {
 				_debug("got NOENT from server"
@@ -447,10 +446,11 @@ static int afs_readpages_one(struct file *file, struct address_space *mapping,
 	if (!req)
 		return -ENOMEM;
 
-	atomic_set(&req->usage, 1);
+	refcount_set(&req->usage, 1);
 	req->page_done = afs_readpages_page_done;
 	req->pos = first->index;
 	req->pos <<= PAGE_SHIFT;
+	req->pages = req->array;
 
 	/* Transfer the pages to the request.  We add them in until one fails
 	 * to add to the LRU and then we stop (as that'll make a hole in the
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index f7570d229dcc..b66ff0dc8a5a 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -101,8 +101,12 @@ void afs_update_inode_from_status(struct afs_vnode *vnode,
 			       vnode->fid.vid, vnode->fid.vnode,
 			       (unsigned long long) *expected_version);
 			vnode->invalid_before = status->data_version;
-			set_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
-			set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
+			if (vnode->status.type == AFS_FTYPE_DIR) {
+				if (test_and_clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
+					afs_stat_v(vnode, n_inval);
+			} else {
+				set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
+			}
 		}
 	}
 
@@ -119,7 +123,7 @@ static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
 				     struct afs_file_status *status,
 				     struct afs_vnode *vnode,
 				     const afs_dataversion_t *expected_version,
-				     afs_dataversion_t *_version)
+				     struct afs_read *read_req)
 {
 	const struct afs_xdr_AFSFetchStatus *xdr = (const void *)*_bp;
 	u64 data_version, size;
@@ -197,8 +201,11 @@ static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
 		status->data_version = data_version;
 		flags |= AFS_VNODE_DATA_CHANGED;
 	}
-	if (_version)
-		*_version = data_version;
+
+	if (read_req) {
+		read_req->data_version = data_version;
+		read_req->file_size = size;
+	}
 
 	*_bp = (const void *)*_bp + sizeof(*xdr);
 
@@ -543,8 +550,7 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
 
 		bp = call->buffer;
 		if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
-					      &vnode->status.data_version,
-					      &req->new_version) < 0)
+					      &vnode->status.data_version, req) < 0)
 			return -EBADMSG;
 		xdr_decode_AFSCallBack(call, vnode, &bp);
 		if (call->reply[1])
@@ -628,7 +634,7 @@ static int afs_fs_fetch_data64(struct afs_fs_cursor *fc, struct afs_read *req)
 	bp[6] = 0;
 	bp[7] = htonl(lower_32_bits(req->len));
 
-	atomic_inc(&req->usage);
+	refcount_inc(&req->usage);
 	call->cb_break = fc->cb_break;
 	afs_use_fs_server(call, fc->cbi);
 	trace_afs_make_fs_call(call, &vnode->fid);
@@ -671,7 +677,7 @@ int afs_fs_fetch_data(struct afs_fs_cursor *fc, struct afs_read *req)
 	bp[4] = htonl(lower_32_bits(req->pos));
 	bp[5] = htonl(lower_32_bits(req->len));
 
-	atomic_inc(&req->usage);
+	refcount_inc(&req->usage);
 	call->cb_break = fc->cb_break;
 	afs_use_fs_server(call, fc->cbi);
 	trace_afs_make_fs_call(call, &vnode->fid);
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 07f450513f3e..69bcfb82dd69 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -53,11 +53,13 @@ static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
 		inode->i_mode	= S_IFREG | vnode->status.mode;
 		inode->i_op	= &afs_file_inode_operations;
 		inode->i_fop	= &afs_file_operations;
+		inode->i_mapping->a_ops	= &afs_fs_aops;
 		break;
 	case AFS_FTYPE_DIR:
 		inode->i_mode	= S_IFDIR | vnode->status.mode;
 		inode->i_op	= &afs_dir_inode_operations;
 		inode->i_fop	= &afs_dir_file_operations;
+		inode->i_mapping->a_ops	= &afs_dir_aops;
 		break;
 	case AFS_FTYPE_SYMLINK:
 		/* Symlinks with a mode of 0644 are actually mountpoints. */
@@ -69,9 +71,11 @@ static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
 			inode->i_mode	= S_IFDIR | 0555;
 			inode->i_op	= &afs_mntpt_inode_operations;
 			inode->i_fop	= &afs_mntpt_file_operations;
+			inode->i_mapping->a_ops	= &afs_fs_aops;
 		} else {
 			inode->i_mode	= S_IFLNK | vnode->status.mode;
 			inode->i_op	= &afs_symlink_inode_operations;
+			inode->i_mapping->a_ops	= &afs_fs_aops;
 		}
 		inode_nohighmem(inode);
 		break;
@@ -82,15 +86,9 @@ static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
 	}
 
 	inode->i_blocks		= 0;
-	inode->i_mapping->a_ops	= &afs_fs_aops;
 	vnode->invalid_before	= vnode->status.data_version;
 
 	read_sequnlock_excl(&vnode->cb_lock);
-
-#ifdef CONFIG_AFS_FSCACHE
-	if (vnode->status.size > 0)
-		fscache_attr_changed(vnode->cache);
-#endif
 	return 0;
 }
 
@@ -247,6 +245,11 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
 	} __packed key;
 	struct afs_vnode_cache_aux aux;
 
+	if (vnode->status.type == AFS_FTYPE_DIR) {
+		vnode->cache = NULL;
+		return;
+	}
+
 	key.vnode_id		= vnode->fid.vnode;
 	key.unique		= vnode->fid.unique;
 	key.vnode_id_ext[0]	= 0;
@@ -338,10 +341,6 @@ struct inode *afs_iget(struct super_block *sb, struct key *key,
 
 	/* failure */
 bad_inode:
-#ifdef CONFIG_AFS_FSCACHE
-	fscache_relinquish_cookie(vnode->cache, NULL, ret == -ENOENT);
-	vnode->cache = NULL;
-#endif
 	iget_failed(inode);
 	_leave(" = %d [bad]", ret);
 	return ERR_PTR(ret);
@@ -355,9 +354,6 @@ void afs_zap_data(struct afs_vnode *vnode)
 {
 	_enter("{%x:%u}", vnode->fid.vid, vnode->fid.vnode);
 
-	if (S_ISDIR(vnode->vfs_inode.i_mode))
-		afs_stat_v(vnode, n_inval);
-
 #ifdef CONFIG_AFS_FSCACHE
 	fscache_invalidate(vnode->cache);
 #endif
@@ -399,7 +395,7 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
 	if (test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
 		if (vnode->cb_s_break != vnode->cb_interest->server->cb_s_break) {
 			vnode->cb_s_break = vnode->cb_interest->server->cb_s_break;
-		} else if (!test_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags) &&
+		} else if (test_bit(AFS_VNODE_DIR_VALID, &vnode->flags) &&
 			   !test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) &&
 			   vnode->cb_expires_at - 10 > now) {
 				valid = true;
@@ -445,8 +441,6 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
 	 * different */
 	if (test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags))
 		afs_zap_data(vnode);
-
-	clear_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
 	mutex_unlock(&vnode->validate_lock);
 valid:
 	_leave(" = 0");
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 3bad5b8e38dc..3a73c87e7ca8 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -174,12 +174,14 @@ struct afs_read {
 	loff_t			len;		/* How much we're asking for */
 	loff_t			actual_len;	/* How much we're actually getting */
 	loff_t			remain;		/* Amount remaining */
-	afs_dataversion_t	new_version;	/* Version number returned by server */
-	atomic_t		usage;
+	loff_t			file_size;	/* File size returned by server */
+	afs_dataversion_t	data_version;	/* Version number returned by server */
+	refcount_t		usage;
 	unsigned int		index;		/* Which page we're reading into */
 	unsigned int		nr_pages;
 	void (*page_done)(struct afs_call *, struct afs_read *);
-	struct page		*pages[];
+	struct page		**pages;
+	struct page		*array[];
 };
 
 /*
@@ -265,6 +267,7 @@ struct afs_net {
 	atomic_t		n_lookup;	/* Number of lookups done */
 	atomic_t		n_reval;	/* Number of dentries needing revalidation */
 	atomic_t		n_inval;	/* Number of invalidations by the server */
+	atomic_t		n_relpg;	/* Number of invalidations by releasepage */
 	atomic_t		n_read_dir;	/* Number of directory pages read */
 };
 
@@ -489,7 +492,7 @@ struct afs_vnode {
 	unsigned long		flags;
 #define AFS_VNODE_CB_PROMISED	0		/* Set if vnode has a callback promise */
 #define AFS_VNODE_UNSET		1		/* set if vnode attributes not yet set */
-#define AFS_VNODE_DIR_MODIFIED	2		/* set if dir vnode's data modified */
+#define AFS_VNODE_DIR_VALID	2		/* Set if dir contents are valid */
 #define AFS_VNODE_ZAP_DATA	3		/* set if vnode's data should be invalidated */
 #define AFS_VNODE_DELETED	4		/* set if vnode deleted on server */
 #define AFS_VNODE_MOUNTPOINT	5		/* set if vnode is a mountpoint symlink */
@@ -669,9 +672,9 @@ extern bool afs_cm_incoming_call(struct afs_call *);
  */
 extern const struct file_operations afs_dir_file_operations;
 extern const struct inode_operations afs_dir_inode_operations;
+extern const struct address_space_operations afs_dir_aops;
 extern const struct dentry_operations afs_fs_dentry_operations;
 
-extern bool afs_dir_check_page(struct inode *, struct page *);
 extern void afs_d_release(struct dentry *);
 
 /*
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index e1e13a568845..090fa8c30ea8 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -890,10 +890,11 @@ static int afs_proc_stats_show(struct seq_file *m, void *v)
 
 	seq_puts(m, "kAFS statistics\n");
 
-	seq_printf(m, "dir-mgmt: look=%u reval=%u inval=%u\n",
+	seq_printf(m, "dir-mgmt: look=%u reval=%u inval=%u relpg=%u\n",
 		   atomic_read(&net->n_lookup),
 		   atomic_read(&net->n_reval),
-		   atomic_read(&net->n_inval));
+		   atomic_read(&net->n_inval),
+		   atomic_read(&net->n_relpg));
 
 	seq_printf(m, "dir-data: rdpg=%u\n",
 		   atomic_read(&net->n_read_dir));
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 9370e2feb999..70a563c14e6f 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -42,10 +42,11 @@ static int afs_fill_page(struct afs_vnode *vnode, struct key *key,
 	if (!req)
 		return -ENOMEM;
 
-	atomic_set(&req->usage, 1);
+	refcount_set(&req->usage, 1);
 	req->pos = pos;
 	req->len = len;
 	req->nr_pages = 1;
+	req->pages = req->array;
 	req->pages[0] = page;
 	get_page(page);
 

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 15/20] afs: Split the directory content defs into a header
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (13 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 14/20] afs: Fix directory handling David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-05 20:31 ` [PATCH 16/20] afs: Adjust the directory XDR structures David Howells
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Split the directory content definitions into a header file so that they can
be used by multiple .c files.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c    |   56 +++----------------------------------------------
 fs/afs/xdr_fs.h |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 52 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index e989a8fe9a9b..31703f93a90b 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -18,6 +18,7 @@
 #include <linux/sched.h>
 #include <linux/task_io_accounting_ops.h>
 #include "internal.h"
+#include "xdr_fs.h"
 
 static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 				 unsigned int flags);
@@ -86,55 +87,6 @@ const struct dentry_operations afs_fs_dentry_operations = {
 	.d_automount	= afs_d_automount,
 };
 
-#define AFS_DIR_HASHTBL_SIZE	128
-#define AFS_DIR_DIRENT_SIZE	32
-#define AFS_DIRENT_PER_BLOCK	64
-
-union afs_dirent {
-	struct {
-		uint8_t		valid;
-		uint8_t		unused[1];
-		__be16		hash_next;
-		__be32		vnode;
-		__be32		unique;
-		uint8_t		name[16];
-		uint8_t		overflow[4];	/* if any char of the name (inc
-						 * NUL) reaches here, consume
-						 * the next dirent too */
-	} u;
-	uint8_t	extended_name[32];
-};
-
-/* AFS directory page header (one at the beginning of every 2048-byte chunk) */
-struct afs_dir_pagehdr {
-	__be16		npages;
-	__be16		magic;
-#define AFS_DIR_MAGIC htons(1234)
-	uint8_t		nentries;
-	uint8_t		bitmap[8];
-	uint8_t		pad[19];
-};
-
-/* directory block layout */
-union afs_dir_block {
-
-	struct afs_dir_pagehdr pagehdr;
-
-	struct {
-		struct afs_dir_pagehdr	pagehdr;
-		uint8_t			alloc_ctrs[128];
-		/* dir hash table */
-		uint16_t		hashtable[AFS_DIR_HASHTBL_SIZE];
-	} hdr;
-
-	union afs_dirent dirents[AFS_DIRENT_PER_BLOCK];
-};
-
-/* layout on a linux VM page */
-struct afs_dir_page {
-	union afs_dir_block blocks[PAGE_SIZE / sizeof(union afs_dir_block)];
-};
-
 struct afs_lookup_one_cookie {
 	struct dir_context	ctx;
 	struct qstr		name;
@@ -371,8 +323,8 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 	curr = (ctx->pos - blkoff) / sizeof(union afs_dirent);
 
 	/* walk through the block, an entry at a time */
-	for (offset = AFS_DIRENT_PER_BLOCK - block->pagehdr.nentries;
-	     offset < AFS_DIRENT_PER_BLOCK;
+	for (offset = (blkoff == 0 ? AFS_DIR_RESV_BLOCKS0 : AFS_DIR_RESV_BLOCKS);
+	     offset < AFS_DIR_SLOTS_PER_BLOCK;
 	     offset = next
 	     ) {
 		next = offset + 1;
@@ -401,7 +353,7 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 
 		/* work out where the next possible entry is */
 		for (tmp = nlen; tmp > 15; tmp -= sizeof(union afs_dirent)) {
-			if (next >= AFS_DIRENT_PER_BLOCK) {
+			if (next >= AFS_DIR_SLOTS_PER_BLOCK) {
 				_debug("ENT[%zu.%u]:"
 				       " %u travelled beyond end dir block"
 				       " (len %u/%zu)",
diff --git a/fs/afs/xdr_fs.h b/fs/afs/xdr_fs.h
index 24e23e40c979..63e87ccbb55b 100644
--- a/fs/afs/xdr_fs.h
+++ b/fs/afs/xdr_fs.h
@@ -37,4 +37,67 @@ struct afs_xdr_AFSFetchStatus {
 	__be32	abort_code;
 } __packed;
 
+#define AFS_DIR_HASHTBL_SIZE	128
+#define AFS_DIR_DIRENT_SIZE	32
+#define AFS_DIR_SLOTS_PER_BLOCK	64
+#define AFS_DIR_BLOCK_SIZE	2048
+#define AFS_DIR_BLOCKS_PER_PAGE	(PAGE_SIZE / AFS_DIR_BLOCK_SIZE)
+#define AFS_DIR_MAX_SLOTS	65536
+#define AFS_DIR_BLOCKS_WITH_CTR	128
+#define AFS_DIR_MAX_BLOCKS	1023
+#define AFS_DIR_RESV_BLOCKS	1
+#define AFS_DIR_RESV_BLOCKS0	13
+
+/*
+ * Directory entry structure.
+ */
+union afs_dirent {
+	struct {
+		uint8_t		valid;
+		uint8_t		unused[1];
+		__be16		hash_next;
+		__be32		vnode;
+		__be32		unique;
+		uint8_t		name[16];
+		uint8_t		overflow[4];	/* if any char of the name (inc
+						 * NUL) reaches here, consume
+						 * the next dirent too */
+	} u;
+	uint8_t	extended_name[32];
+};
+
+/*
+ * Directory page header (one at the beginning of every 2048-byte chunk).
+ */
+struct afs_dir_pagehdr {
+	__be16		npages;
+	__be16		magic;
+#define AFS_DIR_MAGIC htons(1234)
+	uint8_t		reserved;
+	uint8_t		bitmap[8];
+	uint8_t		pad[19];
+};
+
+/*
+ * Directory block layout
+ */
+union afs_dir_block {
+	struct afs_dir_pagehdr	pagehdr;
+
+	struct {
+		struct afs_dir_pagehdr	pagehdr;
+		uint8_t			alloc_ctrs[AFS_DIR_MAX_BLOCKS];
+		__be16			hashtable[AFS_DIR_HASHTBL_SIZE];
+	} hdr;
+
+	union afs_dirent	dirents[AFS_DIR_SLOTS_PER_BLOCK];
+};
+
+/*
+ * Directory layout on a linux VM page.
+ */
+struct afs_dir_page {
+	union afs_dir_block	blocks[AFS_DIR_BLOCKS_PER_PAGE];
+};
+
 #endif /* XDR_FS_H */

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 16/20] afs: Adjust the directory XDR structures
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (14 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 15/20] afs: Split the directory content defs into a header David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-05 20:31 ` [PATCH 17/20] afs: Locally edit directory data for mkdir/create/unlink/ David Howells
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Adjust the AFS directory XDR structures in a number of superficial ways:

 (1) Rename them to all begin afs_xdr_.

 (2) Use u8 instead of uint8_t.

 (3) Mark the structures as __packed so they don't get rearranged by the
     compiler.

 (4) Rename the hdr member of afs_xdr_dir_block to meta.

 (5) Rename the pagehdr member of afs_xdr_dir_block to hdr.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c    |   62 ++++++++++++++++++++++++++++---------------------------
 fs/afs/xdr_fs.h |   44 ++++++++++++++++++++-------------------
 2 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 31703f93a90b..9efaa1f53bd8 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -111,7 +111,7 @@ struct afs_lookup_cookie {
 static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
 			       loff_t i_size)
 {
-	struct afs_dir_page *dbuf;
+	struct afs_xdr_dir_page *dbuf;
 	loff_t latter, off;
 	int tmp, qty;
 
@@ -127,15 +127,15 @@ static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
 		qty = PAGE_SIZE;
 	else
 		qty = latter;
-	qty /= sizeof(union afs_dir_block);
+	qty /= sizeof(union afs_xdr_dir_block);
 
 	/* check them */
 	dbuf = page_address(page);
 	for (tmp = 0; tmp < qty; tmp++) {
-		if (dbuf->blocks[tmp].pagehdr.magic != AFS_DIR_MAGIC) {
+		if (dbuf->blocks[tmp].hdr.magic != AFS_DIR_MAGIC) {
 			printk("kAFS: %s(%lx): bad magic %d/%d is %04hx\n",
 			       __func__, dvnode->vfs_inode.i_ino, tmp, qty,
-			       ntohs(dbuf->blocks[tmp].pagehdr.magic));
+			       ntohs(dbuf->blocks[tmp].hdr.magic));
 			trace_afs_dir_check_failed(dvnode, off, i_size);
 			goto error;
 		}
@@ -156,8 +156,8 @@ static int afs_dir_open(struct inode *inode, struct file *file)
 {
 	_enter("{%lu}", inode->i_ino);
 
-	BUILD_BUG_ON(sizeof(union afs_dir_block) != 2048);
-	BUILD_BUG_ON(sizeof(union afs_dirent) != 32);
+	BUILD_BUG_ON(sizeof(union afs_xdr_dir_block) != 2048);
+	BUILD_BUG_ON(sizeof(union afs_xdr_dirent) != 32);
 
 	if (test_bit(AFS_VNODE_DELETED, &AFS_FS_I(inode)->flags))
 		return -ENOENT;
@@ -310,17 +310,17 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
  * deal with one block in an AFS directory
  */
 static int afs_dir_iterate_block(struct dir_context *ctx,
-				 union afs_dir_block *block,
+				 union afs_xdr_dir_block *block,
 				 unsigned blkoff)
 {
-	union afs_dirent *dire;
+	union afs_xdr_dirent *dire;
 	unsigned offset, next, curr;
 	size_t nlen;
 	int tmp;
 
 	_enter("%u,%x,%p,,",(unsigned)ctx->pos,blkoff,block);
 
-	curr = (ctx->pos - blkoff) / sizeof(union afs_dirent);
+	curr = (ctx->pos - blkoff) / sizeof(union afs_xdr_dirent);
 
 	/* walk through the block, an entry at a time */
 	for (offset = (blkoff == 0 ? AFS_DIR_RESV_BLOCKS0 : AFS_DIR_RESV_BLOCKS);
@@ -330,13 +330,13 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 		next = offset + 1;
 
 		/* skip entries marked unused in the bitmap */
-		if (!(block->pagehdr.bitmap[offset / 8] &
+		if (!(block->hdr.bitmap[offset / 8] &
 		      (1 << (offset % 8)))) {
 			_debug("ENT[%zu.%u]: unused",
-			       blkoff / sizeof(union afs_dir_block), offset);
+			       blkoff / sizeof(union afs_xdr_dir_block), offset);
 			if (offset >= curr)
 				ctx->pos = blkoff +
-					next * sizeof(union afs_dirent);
+					next * sizeof(union afs_xdr_dirent);
 			continue;
 		}
 
@@ -344,34 +344,34 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 		dire = &block->dirents[offset];
 		nlen = strnlen(dire->u.name,
 			       sizeof(*block) -
-			       offset * sizeof(union afs_dirent));
+			       offset * sizeof(union afs_xdr_dirent));
 
 		_debug("ENT[%zu.%u]: %s %zu \"%s\"",
-		       blkoff / sizeof(union afs_dir_block), offset,
+		       blkoff / sizeof(union afs_xdr_dir_block), offset,
 		       (offset < curr ? "skip" : "fill"),
 		       nlen, dire->u.name);
 
 		/* work out where the next possible entry is */
-		for (tmp = nlen; tmp > 15; tmp -= sizeof(union afs_dirent)) {
+		for (tmp = nlen; tmp > 15; tmp -= sizeof(union afs_xdr_dirent)) {
 			if (next >= AFS_DIR_SLOTS_PER_BLOCK) {
 				_debug("ENT[%zu.%u]:"
 				       " %u travelled beyond end dir block"
 				       " (len %u/%zu)",
-				       blkoff / sizeof(union afs_dir_block),
+				       blkoff / sizeof(union afs_xdr_dir_block),
 				       offset, next, tmp, nlen);
 				return -EIO;
 			}
-			if (!(block->pagehdr.bitmap[next / 8] &
+			if (!(block->hdr.bitmap[next / 8] &
 			      (1 << (next % 8)))) {
 				_debug("ENT[%zu.%u]:"
 				       " %u unmarked extension (len %u/%zu)",
-				       blkoff / sizeof(union afs_dir_block),
+				       blkoff / sizeof(union afs_xdr_dir_block),
 				       offset, next, tmp, nlen);
 				return -EIO;
 			}
 
 			_debug("ENT[%zu.%u]: ext %u/%zu",
-			       blkoff / sizeof(union afs_dir_block),
+			       blkoff / sizeof(union afs_xdr_dir_block),
 			       next, tmp, nlen);
 			next++;
 		}
@@ -390,7 +390,7 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
 			return 0;
 		}
 
-		ctx->pos = blkoff + next * sizeof(union afs_dirent);
+		ctx->pos = blkoff + next * sizeof(union afs_xdr_dirent);
 	}
 
 	_leave(" = 1 [more]");
@@ -404,8 +404,8 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 			   struct key *key)
 {
 	struct afs_vnode *dvnode = AFS_FS_I(dir);
-	union afs_dir_block *dblock;
-	struct afs_dir_page *dbuf;
+	struct afs_xdr_dir_page *dbuf;
+	union afs_xdr_dir_block *dblock;
 	struct afs_read *req;
 	struct page *page;
 	unsigned blkoff, limit;
@@ -423,13 +423,13 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 		return PTR_ERR(req);
 
 	/* round the file position up to the next entry boundary */
-	ctx->pos += sizeof(union afs_dirent) - 1;
-	ctx->pos &= ~(sizeof(union afs_dirent) - 1);
+	ctx->pos += sizeof(union afs_xdr_dirent) - 1;
+	ctx->pos &= ~(sizeof(union afs_xdr_dirent) - 1);
 
 	/* walk through the blocks in sequence */
 	ret = 0;
 	while (ctx->pos < req->actual_len) {
-		blkoff = ctx->pos & ~(sizeof(union afs_dir_block) - 1);
+		blkoff = ctx->pos & ~(sizeof(union afs_xdr_dir_block) - 1);
 
 		/* Fetch the appropriate page from the directory and re-add it
 		 * to the LRU.
@@ -448,14 +448,14 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
 		/* deal with the individual blocks stashed on this page */
 		do {
 			dblock = &dbuf->blocks[(blkoff % PAGE_SIZE) /
-					       sizeof(union afs_dir_block)];
+					       sizeof(union afs_xdr_dir_block)];
 			ret = afs_dir_iterate_block(ctx, dblock, blkoff);
 			if (ret != 1) {
 				kunmap(page);
 				goto out;
 			}
 
-			blkoff += sizeof(union afs_dir_block);
+			blkoff += sizeof(union afs_xdr_dir_block);
 
 		} while (ctx->pos < dir->i_size && blkoff < limit);
 
@@ -493,8 +493,8 @@ static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
 	       (unsigned long long) ino, dtype);
 
 	/* insanity checks first */
-	BUILD_BUG_ON(sizeof(union afs_dir_block) != 2048);
-	BUILD_BUG_ON(sizeof(union afs_dirent) != 32);
+	BUILD_BUG_ON(sizeof(union afs_xdr_dir_block) != 2048);
+	BUILD_BUG_ON(sizeof(union afs_xdr_dirent) != 32);
 
 	if (cookie->name.len != nlen ||
 	    memcmp(cookie->name.name, name, nlen) != 0) {
@@ -562,8 +562,8 @@ static int afs_lookup_filldir(struct dir_context *ctx, const char *name,
 	       (unsigned long long) ino, dtype);
 
 	/* insanity checks first */
-	BUILD_BUG_ON(sizeof(union afs_dir_block) != 2048);
-	BUILD_BUG_ON(sizeof(union afs_dirent) != 32);
+	BUILD_BUG_ON(sizeof(union afs_xdr_dir_block) != 2048);
+	BUILD_BUG_ON(sizeof(union afs_xdr_dirent) != 32);
 
 	if (cookie->found) {
 		if (cookie->nr_fids < 50) {
diff --git a/fs/afs/xdr_fs.h b/fs/afs/xdr_fs.h
index 63e87ccbb55b..aa21f3068d52 100644
--- a/fs/afs/xdr_fs.h
+++ b/fs/afs/xdr_fs.h
@@ -51,53 +51,53 @@ struct afs_xdr_AFSFetchStatus {
 /*
  * Directory entry structure.
  */
-union afs_dirent {
+union afs_xdr_dirent {
 	struct {
-		uint8_t		valid;
-		uint8_t		unused[1];
+		u8		valid;
+		u8		unused[1];
 		__be16		hash_next;
 		__be32		vnode;
 		__be32		unique;
-		uint8_t		name[16];
-		uint8_t		overflow[4];	/* if any char of the name (inc
+		u8		name[16];
+		u8		overflow[4];	/* if any char of the name (inc
 						 * NUL) reaches here, consume
 						 * the next dirent too */
 	} u;
-	uint8_t	extended_name[32];
-};
+	u8			extended_name[32];
+} __packed;
 
 /*
- * Directory page header (one at the beginning of every 2048-byte chunk).
+ * Directory block header (one at the beginning of every 2048-byte block).
  */
-struct afs_dir_pagehdr {
+struct afs_xdr_dir_hdr {
 	__be16		npages;
 	__be16		magic;
 #define AFS_DIR_MAGIC htons(1234)
-	uint8_t		reserved;
-	uint8_t		bitmap[8];
-	uint8_t		pad[19];
-};
+	u8		reserved;
+	u8		bitmap[8];
+	u8		pad[19];
+} __packed;
 
 /*
  * Directory block layout
  */
-union afs_dir_block {
-	struct afs_dir_pagehdr	pagehdr;
+union afs_xdr_dir_block {
+	struct afs_xdr_dir_hdr		hdr;
 
 	struct {
-		struct afs_dir_pagehdr	pagehdr;
-		uint8_t			alloc_ctrs[AFS_DIR_MAX_BLOCKS];
+		struct afs_xdr_dir_hdr	hdr;
+		u8			alloc_ctrs[AFS_DIR_MAX_BLOCKS];
 		__be16			hashtable[AFS_DIR_HASHTBL_SIZE];
-	} hdr;
+	} meta;
 
-	union afs_dirent	dirents[AFS_DIR_SLOTS_PER_BLOCK];
-};
+	union afs_xdr_dirent	dirents[AFS_DIR_SLOTS_PER_BLOCK];
+} __packed;
 
 /*
  * Directory layout on a linux VM page.
  */
-struct afs_dir_page {
-	union afs_dir_block	blocks[AFS_DIR_BLOCKS_PER_PAGE];
+struct afs_xdr_dir_page {
+	union afs_xdr_dir_block	blocks[AFS_DIR_BLOCKS_PER_PAGE];
 };
 
 #endif /* XDR_FS_H */

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 17/20] afs: Locally edit directory data for mkdir/create/unlink/...
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (15 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 16/20] afs: Adjust the directory XDR structures David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-05 20:31 ` [PATCH 18/20] afs: Trace protocol errors David Howells
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Locally edit the contents of an AFS directory upon a successful inode
operation that modifies that directory (such as mkdir, create and unlink)
so that we can avoid the current practice of re-downloading the directory
after each change.

This is viable provided that the directory version number we get back from
the modifying RPC op is exactly incremented by 1 from what we had
previously.  The data in the directory contents is in a defined format that
we have to parse locally to perform lookups and readdir, so modifying isn't
a problem.

If the edit fails, we just clear the VALID flag on the directory and it
will be reloaded next time it is needed.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/Makefile            |    1 
 fs/afs/dir.c               |   83 ++++++-
 fs/afs/dir_edit.c          |  505 ++++++++++++++++++++++++++++++++++++++++++++
 fs/afs/fsclient.c          |   35 ++-
 fs/afs/inode.c             |    7 -
 fs/afs/internal.h          |   19 +-
 fs/afs/proc.c              |    4 
 include/trace/events/afs.h |   90 ++++++++
 8 files changed, 715 insertions(+), 29 deletions(-)
 create mode 100644 fs/afs/dir_edit.c

diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 6a055423c192..532acae25453 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -12,6 +12,7 @@ kafs-objs := \
 	cell.o \
 	cmservice.o \
 	dir.o \
+	dir_edit.o \
 	dynroot.o \
 	file.o \
 	flock.o \
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 9efaa1f53bd8..7701e382fa94 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -130,17 +130,26 @@ static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
 	qty /= sizeof(union afs_xdr_dir_block);
 
 	/* check them */
-	dbuf = page_address(page);
+	dbuf = kmap(page);
 	for (tmp = 0; tmp < qty; tmp++) {
 		if (dbuf->blocks[tmp].hdr.magic != AFS_DIR_MAGIC) {
 			printk("kAFS: %s(%lx): bad magic %d/%d is %04hx\n",
 			       __func__, dvnode->vfs_inode.i_ino, tmp, qty,
 			       ntohs(dbuf->blocks[tmp].hdr.magic));
 			trace_afs_dir_check_failed(dvnode, off, i_size);
+			kunmap(page);
 			goto error;
 		}
+
+		/* Make sure each block is NUL terminated so we can reasonably
+		 * use string functions on it.  The filenames in the page
+		 * *should* be NUL-terminated anyway.
+		 */
+		((u8 *)&dbuf->blocks[tmp])[AFS_DIR_BLOCK_SIZE - 1] = 0;
 	}
 
+	kunmap(page);
+
 checked:
 	afs_stat_v(dvnode, n_read_dir);
 	return true;
@@ -1127,6 +1136,7 @@ static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	struct afs_vnode *dvnode = AFS_FS_I(dir);
 	struct afs_fid newfid;
 	struct key *key;
+	u64 data_version = dvnode->status.data_version;
 	int ret;
 
 	mode |= S_IFDIR;
@@ -1144,7 +1154,7 @@ static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-			afs_fs_create(&fc, dentry->d_name.name, mode,
+			afs_fs_create(&fc, dentry->d_name.name, mode, data_version,
 				      &newfid, &newstatus, &newcb);
 		}
 
@@ -1158,6 +1168,11 @@ static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 		goto error_key;
 	}
 
+	if (ret == 0 &&
+	    test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+		afs_edit_dir_add(dvnode, &dentry->d_name, &newfid,
+				 afs_edit_dir_for_create);
+
 	key_put(key);
 	_leave(" = 0");
 	return 0;
@@ -1181,6 +1196,7 @@ static void afs_dir_remove_subdir(struct dentry *dentry)
 		clear_nlink(&vnode->vfs_inode);
 		set_bit(AFS_VNODE_DELETED, &vnode->flags);
 		clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
 	}
 }
 
@@ -1192,6 +1208,7 @@ static int afs_rmdir(struct inode *dir, struct dentry *dentry)
 	struct afs_fs_cursor fc;
 	struct afs_vnode *dvnode = AFS_FS_I(dir);
 	struct key *key;
+	u64 data_version = dvnode->status.data_version;
 	int ret;
 
 	_enter("{%x:%u},{%pd}",
@@ -1207,13 +1224,18 @@ static int afs_rmdir(struct inode *dir, struct dentry *dentry)
 	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-			afs_fs_remove(&fc, dentry->d_name.name, true);
+			afs_fs_remove(&fc, dentry->d_name.name, true,
+				      data_version);
 		}
 
 		afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
 		ret = afs_end_vnode_operation(&fc);
-		if (ret == 0)
+		if (ret == 0) {
 			afs_dir_remove_subdir(dentry);
+			if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+				afs_edit_dir_remove(dvnode, &dentry->d_name,
+						    afs_edit_dir_for_rmdir);
+		}
 	}
 
 	key_put(key);
@@ -1278,6 +1300,7 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
 	struct afs_vnode *dvnode = AFS_FS_I(dir), *vnode;
 	struct key *key;
 	unsigned long d_version = (unsigned long)dentry->d_fsdata;
+	u64 data_version = dvnode->status.data_version;
 	int ret;
 
 	_enter("{%x:%u},{%pd}",
@@ -1304,7 +1327,8 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
 	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-			afs_fs_remove(&fc, dentry->d_name.name, false);
+			afs_fs_remove(&fc, dentry->d_name.name, false,
+				      data_version);
 		}
 
 		afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
@@ -1313,6 +1337,10 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
 			ret = afs_dir_remove_link(
 				dentry, key, d_version,
 				(unsigned long)dvnode->status.data_version);
+		if (ret == 0 &&
+		    test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+			afs_edit_dir_remove(dvnode, &dentry->d_name,
+					    afs_edit_dir_for_unlink);
 	}
 
 error_key:
@@ -1334,6 +1362,7 @@ static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	struct afs_vnode *dvnode = AFS_FS_I(dir);
 	struct afs_fid newfid;
 	struct key *key;
+	u64 data_version = dvnode->status.data_version;
 	int ret;
 
 	mode |= S_IFREG;
@@ -1355,7 +1384,7 @@ static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-			afs_fs_create(&fc, dentry->d_name.name, mode,
+			afs_fs_create(&fc, dentry->d_name.name, mode, data_version,
 				      &newfid, &newstatus, &newcb);
 		}
 
@@ -1369,6 +1398,10 @@ static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		goto error_key;
 	}
 
+	if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+		afs_edit_dir_add(dvnode, &dentry->d_name, &newfid,
+				 afs_edit_dir_for_create);
+
 	key_put(key);
 	_leave(" = 0");
 	return 0;
@@ -1390,10 +1423,12 @@ static int afs_link(struct dentry *from, struct inode *dir,
 	struct afs_fs_cursor fc;
 	struct afs_vnode *dvnode, *vnode;
 	struct key *key;
+	u64 data_version;
 	int ret;
 
 	vnode = AFS_FS_I(d_inode(from));
 	dvnode = AFS_FS_I(dir);
+	data_version = dvnode->status.data_version;
 
 	_enter("{%x:%u},{%x:%u},{%pd}",
 	       vnode->fid.vid, vnode->fid.vnode,
@@ -1420,7 +1455,7 @@ static int afs_link(struct dentry *from, struct inode *dir,
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
 			fc.cb_break_2 = vnode->cb_break + vnode->cb_s_break;
-			afs_fs_link(&fc, vnode, dentry->d_name.name);
+			afs_fs_link(&fc, vnode, dentry->d_name.name, data_version);
 		}
 
 		afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
@@ -1436,6 +1471,10 @@ static int afs_link(struct dentry *from, struct inode *dir,
 		goto error_key;
 	}
 
+	if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+		afs_edit_dir_add(dvnode, &dentry->d_name, &vnode->fid,
+				 afs_edit_dir_for_link);
+
 	key_put(key);
 	_leave(" = 0");
 	return 0;
@@ -1459,6 +1498,7 @@ static int afs_symlink(struct inode *dir, struct dentry *dentry,
 	struct afs_vnode *dvnode = AFS_FS_I(dir);
 	struct afs_fid newfid;
 	struct key *key;
+	u64 data_version = dvnode->status.data_version;
 	int ret;
 
 	_enter("{%x:%u},{%pd},%s",
@@ -1483,7 +1523,8 @@ static int afs_symlink(struct inode *dir, struct dentry *dentry,
 	if (afs_begin_vnode_operation(&fc, dvnode, key)) {
 		while (afs_select_fileserver(&fc)) {
 			fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-			afs_fs_symlink(&fc, dentry->d_name.name, content,
+			afs_fs_symlink(&fc, dentry->d_name.name,
+				       content, data_version,
 				       &newfid, &newstatus);
 		}
 
@@ -1497,6 +1538,10 @@ static int afs_symlink(struct inode *dir, struct dentry *dentry,
 		goto error_key;
 	}
 
+	if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
+		afs_edit_dir_add(dvnode, &dentry->d_name, &newfid,
+				 afs_edit_dir_for_symlink);
+
 	key_put(key);
 	_leave(" = 0");
 	return 0;
@@ -1519,6 +1564,8 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct afs_fs_cursor fc;
 	struct afs_vnode *orig_dvnode, *new_dvnode, *vnode;
 	struct key *key;
+	u64 orig_data_version, new_data_version;
+	bool new_negative = d_is_negative(new_dentry);
 	int ret;
 
 	if (flags)
@@ -1527,6 +1574,8 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	vnode = AFS_FS_I(d_inode(old_dentry));
 	orig_dvnode = AFS_FS_I(old_dir);
 	new_dvnode = AFS_FS_I(new_dir);
+	orig_data_version = orig_dvnode->status.data_version;
+	new_data_version = new_dvnode->status.data_version;
 
 	_enter("{%x:%u},{%x:%u},{%x:%u},{%pd}",
 	       orig_dvnode->fid.vid, orig_dvnode->fid.vnode,
@@ -1552,7 +1601,8 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
 			fc.cb_break = orig_dvnode->cb_break + orig_dvnode->cb_s_break;
 			fc.cb_break_2 = new_dvnode->cb_break + new_dvnode->cb_s_break;
 			afs_fs_rename(&fc, old_dentry->d_name.name,
-				      new_dvnode, new_dentry->d_name.name);
+				      new_dvnode, new_dentry->d_name.name,
+				      orig_data_version, new_data_version);
 		}
 
 		afs_vnode_commit_status(&fc, orig_dvnode, fc.cb_break);
@@ -1564,6 +1614,21 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
 			goto error_key;
 	}
 
+	if (ret == 0) {
+		if (test_bit(AFS_VNODE_DIR_VALID, &orig_dvnode->flags))
+		    afs_edit_dir_remove(orig_dvnode, &old_dentry->d_name,
+					afs_edit_dir_for_rename);
+
+		if (!new_negative &&
+		    test_bit(AFS_VNODE_DIR_VALID, &new_dvnode->flags))
+			afs_edit_dir_remove(new_dvnode, &new_dentry->d_name,
+					    afs_edit_dir_for_rename);
+
+		if (test_bit(AFS_VNODE_DIR_VALID, &new_dvnode->flags))
+			afs_edit_dir_add(new_dvnode, &new_dentry->d_name,
+					 &vnode->fid,  afs_edit_dir_for_rename);
+	}
+
 error_key:
 	key_put(key);
 error:
diff --git a/fs/afs/dir_edit.c b/fs/afs/dir_edit.c
new file mode 100644
index 000000000000..8b400f5aead5
--- /dev/null
+++ b/fs/afs/dir_edit.c
@@ -0,0 +1,505 @@
+/* AFS filesystem directory editing
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/pagemap.h>
+#include <linux/iversion.h>
+#include "internal.h"
+#include "xdr_fs.h"
+
+/*
+ * Find a number of contiguous clear bits in a directory block bitmask.
+ *
+ * There are 64 slots, which means we can load the entire bitmap into a
+ * variable.  The first bit doesn't count as it corresponds to the block header
+ * slot.  nr_slots is between 1 and 9.
+ */
+static int afs_find_contig_bits(union afs_xdr_dir_block *block, unsigned int nr_slots)
+{
+	u64 bitmap;
+	u32 mask;
+	int bit, n;
+
+	bitmap  = (u64)block->hdr.bitmap[0] << 0 * 8;
+	bitmap |= (u64)block->hdr.bitmap[1] << 1 * 8;
+	bitmap |= (u64)block->hdr.bitmap[2] << 2 * 8;
+	bitmap |= (u64)block->hdr.bitmap[3] << 3 * 8;
+	bitmap |= (u64)block->hdr.bitmap[4] << 4 * 8;
+	bitmap |= (u64)block->hdr.bitmap[5] << 5 * 8;
+	bitmap |= (u64)block->hdr.bitmap[6] << 6 * 8;
+	bitmap |= (u64)block->hdr.bitmap[7] << 7 * 8;
+	bitmap >>= 1; /* The first entry is metadata */
+	bit = 1;
+	mask = (1 << nr_slots) - 1;
+
+	do {
+		if (sizeof(unsigned long) == 8)
+			n = ffz(bitmap);
+		else
+			n = ((u32)bitmap) != 0 ?
+				ffz((u32)bitmap) :
+				ffz((u32)(bitmap >> 32)) + 32;
+		bitmap >>= n;
+		bit += n;
+
+		if ((bitmap & mask) == 0) {
+			if (bit > 64 - nr_slots)
+				return -1;
+			return bit;
+		}
+
+		n = __ffs(bitmap);
+		bitmap >>= n;
+		bit += n;
+	} while (bitmap);
+
+	return -1;
+}
+
+/*
+ * Set a number of contiguous bits in the directory block bitmap.
+ */
+static void afs_set_contig_bits(union afs_xdr_dir_block *block,
+				int bit, unsigned int nr_slots)
+{
+	u64 mask, before, after;
+
+	mask = (1 << nr_slots) - 1;
+	mask <<= bit;
+
+	before = *(u64 *)block->hdr.bitmap;
+
+	block->hdr.bitmap[0] |= (u8)(mask >> 0 * 8);
+	block->hdr.bitmap[1] |= (u8)(mask >> 1 * 8);
+	block->hdr.bitmap[2] |= (u8)(mask >> 2 * 8);
+	block->hdr.bitmap[3] |= (u8)(mask >> 3 * 8);
+	block->hdr.bitmap[4] |= (u8)(mask >> 4 * 8);
+	block->hdr.bitmap[5] |= (u8)(mask >> 5 * 8);
+	block->hdr.bitmap[6] |= (u8)(mask >> 6 * 8);
+	block->hdr.bitmap[7] |= (u8)(mask >> 7 * 8);
+
+	after = *(u64 *)block->hdr.bitmap;
+}
+
+/*
+ * Clear a number of contiguous bits in the directory block bitmap.
+ */
+static void afs_clear_contig_bits(union afs_xdr_dir_block *block,
+				  int bit, unsigned int nr_slots)
+{
+	u64 mask, before, after;
+
+	mask = (1 << nr_slots) - 1;
+	mask <<= bit;
+
+	before = *(u64 *)block->hdr.bitmap;
+
+	block->hdr.bitmap[0] &= ~(u8)(mask >> 0 * 8);
+	block->hdr.bitmap[1] &= ~(u8)(mask >> 1 * 8);
+	block->hdr.bitmap[2] &= ~(u8)(mask >> 2 * 8);
+	block->hdr.bitmap[3] &= ~(u8)(mask >> 3 * 8);
+	block->hdr.bitmap[4] &= ~(u8)(mask >> 4 * 8);
+	block->hdr.bitmap[5] &= ~(u8)(mask >> 5 * 8);
+	block->hdr.bitmap[6] &= ~(u8)(mask >> 6 * 8);
+	block->hdr.bitmap[7] &= ~(u8)(mask >> 7 * 8);
+
+	after = *(u64 *)block->hdr.bitmap;
+}
+
+/*
+ * Scan a directory block looking for a dirent of the right name.
+ */
+static int afs_dir_scan_block(union afs_xdr_dir_block *block, struct qstr *name,
+			      unsigned int blocknum)
+{
+	union afs_xdr_dirent *de;
+	u64 bitmap;
+	int d, len, n;
+
+	_enter("");
+
+	bitmap  = (u64)block->hdr.bitmap[0] << 0 * 8;
+	bitmap |= (u64)block->hdr.bitmap[1] << 1 * 8;
+	bitmap |= (u64)block->hdr.bitmap[2] << 2 * 8;
+	bitmap |= (u64)block->hdr.bitmap[3] << 3 * 8;
+	bitmap |= (u64)block->hdr.bitmap[4] << 4 * 8;
+	bitmap |= (u64)block->hdr.bitmap[5] << 5 * 8;
+	bitmap |= (u64)block->hdr.bitmap[6] << 6 * 8;
+	bitmap |= (u64)block->hdr.bitmap[7] << 7 * 8;
+
+	for (d = (blocknum == 0 ? AFS_DIR_RESV_BLOCKS0 : AFS_DIR_RESV_BLOCKS);
+	     d < AFS_DIR_SLOTS_PER_BLOCK;
+	     d++) {
+		if (!((bitmap >> d) & 1))
+			continue;
+		de = &block->dirents[d];
+		if (de->u.valid != 1)
+			continue;
+
+		/* The block was NUL-terminated by afs_dir_check_page(). */
+		len = strlen(de->u.name);
+		if (len == name->len &&
+		    memcmp(de->u.name, name->name, name->len) == 0)
+			return d;
+
+		n = round_up(12 + len + 1 + 4, AFS_DIR_DIRENT_SIZE);
+		n /= AFS_DIR_DIRENT_SIZE;
+		d += n - 1;
+	}
+
+	return -1;
+}
+
+/*
+ * Initialise a new directory block.  Note that block 0 is special and contains
+ * some extra metadata.
+ */
+static void afs_edit_init_block(union afs_xdr_dir_block *meta,
+				union afs_xdr_dir_block *block, int block_num)
+{
+	memset(block, 0, sizeof(*block));
+	block->hdr.npages = htons(1);
+	block->hdr.magic = AFS_DIR_MAGIC;
+	block->hdr.bitmap[0] = 1;
+
+	if (block_num == 0) {
+		block->hdr.bitmap[0] = 0xff;
+		block->hdr.bitmap[1] = 0x1f;
+		memset(block->meta.alloc_ctrs,
+		       AFS_DIR_SLOTS_PER_BLOCK,
+		       sizeof(block->meta.alloc_ctrs));
+		meta->meta.alloc_ctrs[0] =
+			AFS_DIR_SLOTS_PER_BLOCK - AFS_DIR_RESV_BLOCKS0;
+	}
+
+	if (block_num < AFS_DIR_BLOCKS_WITH_CTR)
+		meta->meta.alloc_ctrs[block_num] =
+			AFS_DIR_SLOTS_PER_BLOCK - AFS_DIR_RESV_BLOCKS;
+}
+
+/*
+ * Edit a directory's file data to add a new directory entry.  Doing this after
+ * create, mkdir, symlink, link or rename if the data version number is
+ * incremented by exactly one avoids the need to re-download the entire
+ * directory contents.
+ *
+ * The caller must hold the inode locked.
+ */
+void afs_edit_dir_add(struct afs_vnode *vnode,
+		      struct qstr *name, struct afs_fid *new_fid,
+		      enum afs_edit_dir_reason why)
+{
+	union afs_xdr_dir_block *meta, *block;
+	struct afs_xdr_dir_page *meta_page, *dir_page;
+	union afs_xdr_dirent *de;
+	struct page *page0, *page;
+	unsigned int need_slots, nr_blocks, b;
+	pgoff_t index;
+	loff_t i_size;
+	gfp_t gfp;
+	int slot;
+
+	_enter(",,{%d,%s},", name->len, name->name);
+
+	i_size = i_size_read(&vnode->vfs_inode);
+	if (i_size > AFS_DIR_BLOCK_SIZE * AFS_DIR_MAX_BLOCKS ||
+	    (i_size & (AFS_DIR_BLOCK_SIZE - 1))) {
+		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		return;
+	}
+
+	gfp = vnode->vfs_inode.i_mapping->gfp_mask;
+	page0 = find_or_create_page(vnode->vfs_inode.i_mapping, 0, gfp);
+	if (!page0) {
+		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		_leave(" [fgp]");
+		return;
+	}
+
+	/* Work out how many slots we're going to need. */
+	need_slots = round_up(12 + name->len + 1 + 4, AFS_DIR_DIRENT_SIZE);
+	need_slots /= AFS_DIR_DIRENT_SIZE;
+
+	meta_page = kmap(page0);
+	meta = &meta_page->blocks[0];
+	if (i_size == 0)
+		goto new_directory;
+	nr_blocks = i_size / AFS_DIR_BLOCK_SIZE;
+
+	/* Find a block that has sufficient slots available.  Each VM page
+	 * contains two or more directory blocks.
+	 */
+	for (b = 0; b < nr_blocks + 1; b++) {
+		/* If the directory extended into a new page, then we need to
+		 * tack a new page on the end.
+		 */
+		index = b / AFS_DIR_BLOCKS_PER_PAGE;
+		if (index == 0) {
+			page = page0;
+			dir_page = meta_page;
+		} else {
+			if (nr_blocks >= AFS_DIR_MAX_BLOCKS)
+				goto error;
+			gfp = vnode->vfs_inode.i_mapping->gfp_mask;
+			page = find_or_create_page(vnode->vfs_inode.i_mapping,
+						   index, gfp);
+			if (!page)
+				goto error;
+			if (!PagePrivate(page)) {
+				set_page_private(page, 1);
+				SetPagePrivate(page);
+			}
+			dir_page = kmap(page);
+		}
+
+		/* Abandon the edit if we got a callback break. */
+		if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
+			goto invalidated;
+
+		block = &dir_page->blocks[b % AFS_DIR_BLOCKS_PER_PAGE];
+
+		_debug("block %u: %2u %3u %u",
+		       b,
+		       (b < AFS_DIR_BLOCKS_WITH_CTR) ? meta->meta.alloc_ctrs[b] : 99,
+		       ntohs(block->hdr.npages),
+		       ntohs(block->hdr.magic));
+
+		/* Initialise the block if necessary. */
+		if (b == nr_blocks) {
+			_debug("init %u", b);
+			afs_edit_init_block(meta, block, b);
+			i_size_write(&vnode->vfs_inode, (b + 1) * AFS_DIR_BLOCK_SIZE);
+		}
+
+		/* Only lower dir pages have a counter in the header. */
+		if (b >= AFS_DIR_BLOCKS_WITH_CTR ||
+		    meta->meta.alloc_ctrs[b] >= need_slots) {
+			/* We need to try and find one or more consecutive
+			 * slots to hold the entry.
+			 */
+			slot = afs_find_contig_bits(block, need_slots);
+			if (slot >= 0) {
+				_debug("slot %u", slot);
+				goto found_space;
+			}
+		}
+
+		if (page != page0) {
+			unlock_page(page);
+			kunmap(page);
+			put_page(page);
+		}
+	}
+
+	/* There are no spare slots of sufficient size, yet the operation
+	 * succeeded.  Download the directory again.
+	 */
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_create_nospc, 0, 0, 0, 0, name->name);
+	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	goto out_unmap;
+
+new_directory:
+	afs_edit_init_block(meta, meta, 0);
+	i_size = AFS_DIR_BLOCK_SIZE;
+	i_size_write(&vnode->vfs_inode, i_size);
+	slot = AFS_DIR_RESV_BLOCKS0;
+	page = page0;
+	block = meta;
+	nr_blocks = 1;
+	b = 0;
+
+found_space:
+	/* Set the dirent slot. */
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_create, b, slot,
+			   new_fid->vnode, new_fid->unique, name->name);
+	de = &block->dirents[slot];
+	de->u.valid	= 1;
+	de->u.unused[0]	= 0;
+	de->u.hash_next	= 0; // TODO: Really need to maintain this
+	de->u.vnode	= htonl(new_fid->vnode);
+	de->u.unique	= htonl(new_fid->unique);
+	memcpy(de->u.name, name->name, name->len + 1);
+	de->u.name[name->len] = 0;
+
+	/* Adjust the bitmap. */
+	afs_set_contig_bits(block, slot, need_slots);
+	if (page != page0) {
+		unlock_page(page);
+		kunmap(page);
+		put_page(page);
+	}
+
+	/* Adjust the allocation counter. */
+	if (b < AFS_DIR_BLOCKS_WITH_CTR)
+		meta->meta.alloc_ctrs[b] -= need_slots;
+
+	inode_inc_iversion_raw(&vnode->vfs_inode);
+	afs_stat_v(vnode, n_dir_cr);
+	_debug("Insert %s in %u[%u]", name->name, b, slot);
+
+out_unmap:
+	unlock_page(page0);
+	kunmap(page0);
+	put_page(page0);
+	_leave("");
+	return;
+
+invalidated:
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_create_inval, 0, 0, 0, 0, name->name);
+	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	if (page != page0) {
+		kunmap(page);
+		put_page(page);
+	}
+	goto out_unmap;
+
+error:
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_create_error, 0, 0, 0, 0, name->name);
+	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	goto out_unmap;
+}
+
+/*
+ * Edit a directory's file data to remove a new directory entry.  Doing this
+ * after unlink, rmdir or rename if the data version number is incremented by
+ * exactly one avoids the need to re-download the entire directory contents.
+ *
+ * The caller must hold the inode locked.
+ */
+void afs_edit_dir_remove(struct afs_vnode *vnode,
+			 struct qstr *name, enum afs_edit_dir_reason why)
+{
+	struct afs_xdr_dir_page *meta_page, *dir_page;
+	union afs_xdr_dir_block *meta, *block;
+	union afs_xdr_dirent *de;
+	struct page *page0, *page;
+	unsigned int need_slots, nr_blocks, b;
+	pgoff_t index;
+	loff_t i_size;
+	int slot;
+
+	_enter(",,{%d,%s},", name->len, name->name);
+
+	i_size = i_size_read(&vnode->vfs_inode);
+	if (i_size < AFS_DIR_BLOCK_SIZE ||
+	    i_size > AFS_DIR_BLOCK_SIZE * AFS_DIR_MAX_BLOCKS ||
+	    (i_size & (AFS_DIR_BLOCK_SIZE - 1))) {
+		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		return;
+	}
+	nr_blocks = i_size / AFS_DIR_BLOCK_SIZE;
+
+	page0 = find_lock_page(vnode->vfs_inode.i_mapping, 0);
+	if (!page0) {
+		clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+		_leave(" [fgp]");
+		return;
+	}
+
+	/* Work out how many slots we're going to discard. */
+	need_slots = round_up(12 + name->len + 1 + 4, AFS_DIR_DIRENT_SIZE);
+	need_slots /= AFS_DIR_DIRENT_SIZE;
+
+	meta_page = kmap(page0);
+	meta = &meta_page->blocks[0];
+
+	/* Find a page that has sufficient slots available.  Each VM page
+	 * contains two or more directory blocks.
+	 */
+	for (b = 0; b < nr_blocks; b++) {
+		index = b / AFS_DIR_BLOCKS_PER_PAGE;
+		if (index != 0) {
+			page = find_lock_page(vnode->vfs_inode.i_mapping, index);
+			if (!page)
+				goto error;
+			dir_page = kmap(page);
+		} else {
+			page = page0;
+			dir_page = meta_page;
+		}
+
+		/* Abandon the edit if we got a callback break. */
+		if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
+			goto invalidated;
+
+		block = &dir_page->blocks[b % AFS_DIR_BLOCKS_PER_PAGE];
+
+		if (b > AFS_DIR_BLOCKS_WITH_CTR ||
+		    meta->meta.alloc_ctrs[b] <= AFS_DIR_SLOTS_PER_BLOCK - 1 - need_slots) {
+			slot = afs_dir_scan_block(block, name, b);
+			if (slot >= 0)
+				goto found_dirent;
+		}
+
+		if (page != page0) {
+			unlock_page(page);
+			kunmap(page);
+			put_page(page);
+		}
+	}
+
+	/* Didn't find the dirent to clobber.  Download the directory again. */
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_noent,
+			   0, 0, 0, 0, name->name);
+	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	goto out_unmap;
+
+found_dirent:
+	de = &block->dirents[slot];
+
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete, b, slot,
+			   ntohl(de->u.vnode), ntohl(de->u.unique),
+			   name->name);
+
+	memset(de, 0, sizeof(*de) * need_slots);
+
+	/* Adjust the bitmap. */
+	afs_clear_contig_bits(block, slot, need_slots);
+	if (page != page0) {
+		unlock_page(page);
+		kunmap(page);
+		put_page(page);
+	}
+
+	/* Adjust the allocation counter. */
+	if (b < AFS_DIR_BLOCKS_WITH_CTR)
+		meta->meta.alloc_ctrs[b] += need_slots;
+
+	inode_set_iversion_raw(&vnode->vfs_inode, vnode->status.data_version);
+	afs_stat_v(vnode, n_dir_rm);
+	_debug("Remove %s from %u[%u]", name->name, b, slot);
+
+out_unmap:
+	unlock_page(page0);
+	kunmap(page0);
+	put_page(page0);
+	_leave("");
+	return;
+
+invalidated:
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_inval,
+			   0, 0, 0, 0, name->name);
+	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	if (page != page0) {
+		unlock_page(page);
+		kunmap(page);
+		put_page(page);
+	}
+	goto out_unmap;
+
+error:
+	trace_afs_edit_dir(vnode, why, afs_edit_dir_delete_error,
+			   0, 0, 0, 0, name->name);
+	clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+	goto out_unmap;
+}
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index b66ff0dc8a5a..20d6304a0d3e 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -107,6 +107,13 @@ void afs_update_inode_from_status(struct afs_vnode *vnode,
 			} else {
 				set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
 			}
+		} else if (vnode->status.type == AFS_FTYPE_DIR) {
+			/* Expected directory change is handled elsewhere so
+			 * that we can locally edit the directory and save on a
+			 * download.
+			 */
+			if (test_bit(AFS_VNODE_DIR_VALID, &vnode->flags))
+				flags &= ~AFS_VNODE_DATA_CHANGED;
 		}
 	}
 
@@ -190,10 +197,7 @@ static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
 
 	size  = (u64)ntohl(xdr->size_lo);
 	size |= (u64)ntohl(xdr->size_hi) << 32;
-	if (size != status->size) {
-		status->size = size;
-		flags |= AFS_VNODE_DATA_CHANGED;
-	}
+	status->size = size;
 
 	data_version  = (u64)ntohl(xdr->data_version_lo);
 	data_version |= (u64)ntohl(xdr->data_version_hi) << 32;
@@ -736,6 +740,7 @@ static const struct afs_call_type afs_RXFSMakeDir = {
 int afs_fs_create(struct afs_fs_cursor *fc,
 		  const char *name,
 		  umode_t mode,
+		  u64 current_data_version,
 		  struct afs_fid *newfid,
 		  struct afs_file_status *newstatus,
 		  struct afs_callback *newcb)
@@ -763,7 +768,7 @@ int afs_fs_create(struct afs_fs_cursor *fc,
 	call->reply[1] = newfid;
 	call->reply[2] = newstatus;
 	call->reply[3] = newcb;
-	call->expected_version = vnode->status.data_version;
+	call->expected_version = current_data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -836,7 +841,8 @@ static const struct afs_call_type afs_RXFSRemoveDir = {
 /*
  * remove a file or directory
  */
-int afs_fs_remove(struct afs_fs_cursor *fc, const char *name, bool isdir)
+int afs_fs_remove(struct afs_fs_cursor *fc, const char *name, bool isdir,
+		  u64 current_data_version)
 {
 	struct afs_vnode *vnode = fc->vnode;
 	struct afs_call *call;
@@ -858,7 +864,7 @@ int afs_fs_remove(struct afs_fs_cursor *fc, const char *name, bool isdir)
 
 	call->key = fc->key;
 	call->reply[0] = vnode;
-	call->expected_version = vnode->status.data_version;
+	call->expected_version = current_data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -920,7 +926,7 @@ static const struct afs_call_type afs_RXFSLink = {
  * make a hard link
  */
 int afs_fs_link(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
-		const char *name)
+		const char *name, u64 current_data_version)
 {
 	struct afs_vnode *dvnode = fc->vnode;
 	struct afs_call *call;
@@ -941,7 +947,7 @@ int afs_fs_link(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
 	call->key = fc->key;
 	call->reply[0] = dvnode;
 	call->reply[1] = vnode;
-	call->expected_version = vnode->status.data_version;
+	call->expected_version = current_data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1009,6 +1015,7 @@ static const struct afs_call_type afs_RXFSSymlink = {
 int afs_fs_symlink(struct afs_fs_cursor *fc,
 		   const char *name,
 		   const char *contents,
+		   u64 current_data_version,
 		   struct afs_fid *newfid,
 		   struct afs_file_status *newstatus)
 {
@@ -1037,7 +1044,7 @@ int afs_fs_symlink(struct afs_fs_cursor *fc,
 	call->reply[0] = vnode;
 	call->reply[1] = newfid;
 	call->reply[2] = newstatus;
-	call->expected_version = vnode->status.data_version;
+	call->expected_version = current_data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
@@ -1117,7 +1124,9 @@ static const struct afs_call_type afs_RXFSRename = {
 int afs_fs_rename(struct afs_fs_cursor *fc,
 		  const char *orig_name,
 		  struct afs_vnode *new_dvnode,
-		  const char *new_name)
+		  const char *new_name,
+		  u64 current_orig_data_version,
+		  u64 current_new_data_version)
 {
 	struct afs_vnode *orig_dvnode = fc->vnode;
 	struct afs_call *call;
@@ -1145,8 +1154,8 @@ int afs_fs_rename(struct afs_fs_cursor *fc,
 	call->key = fc->key;
 	call->reply[0] = orig_dvnode;
 	call->reply[1] = new_dvnode;
-	call->expected_version = orig_dvnode->status.data_version;
-	call->expected_version_2 = new_dvnode->status.data_version;
+	call->expected_version = current_orig_data_version + 1;
+	call->expected_version_2 = current_new_data_version + 1;
 
 	/* marshall the parameters */
 	bp = call->request;
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 69bcfb82dd69..3808dcbbac6d 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -395,8 +395,11 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
 	if (test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
 		if (vnode->cb_s_break != vnode->cb_interest->server->cb_s_break) {
 			vnode->cb_s_break = vnode->cb_interest->server->cb_s_break;
-		} else if (test_bit(AFS_VNODE_DIR_VALID, &vnode->flags) &&
-			   !test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) &&
+		} else if (vnode->status.type == AFS_FTYPE_DIR &&
+			   test_bit(AFS_VNODE_DIR_VALID, &vnode->flags) &&
+			   vnode->cb_expires_at - 10 > now) {
+				valid = true;
+		} else if (!test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) &&
 			   vnode->cb_expires_at - 10 > now) {
 				valid = true;
 		}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 3a73c87e7ca8..b4d2fac46fe8 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -269,6 +269,8 @@ struct afs_net {
 	atomic_t		n_inval;	/* Number of invalidations by the server */
 	atomic_t		n_relpg;	/* Number of invalidations by releasepage */
 	atomic_t		n_read_dir;	/* Number of directory pages read */
+	atomic_t		n_dir_cr;	/* Number of directory entry creation edits */
+	atomic_t		n_dir_rm;	/* Number of directory entry removal edits */
 };
 
 extern const char afs_init_sysname[];
@@ -677,6 +679,13 @@ extern const struct dentry_operations afs_fs_dentry_operations;
 
 extern void afs_d_release(struct dentry *);
 
+/*
+ * dir_edit.c
+ */
+extern void afs_edit_dir_add(struct afs_vnode *, struct qstr *, struct afs_fid *,
+			     enum afs_edit_dir_reason);
+extern void afs_edit_dir_remove(struct afs_vnode *, struct qstr *, enum afs_edit_dir_reason);
+
 /*
  * dynroot.c
  */
@@ -723,14 +732,14 @@ extern void afs_update_inode_from_status(struct afs_vnode *, struct afs_file_sta
 extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_volsync *, bool);
 extern int afs_fs_give_up_callbacks(struct afs_net *, struct afs_server *);
 extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct afs_read *);
-extern int afs_fs_create(struct afs_fs_cursor *, const char *, umode_t,
+extern int afs_fs_create(struct afs_fs_cursor *, const char *, umode_t, u64,
 			 struct afs_fid *, struct afs_file_status *, struct afs_callback *);
-extern int afs_fs_remove(struct afs_fs_cursor *, const char *, bool);
-extern int afs_fs_link(struct afs_fs_cursor *, struct afs_vnode *, const char *);
-extern int afs_fs_symlink(struct afs_fs_cursor *, const char *, const char *,
+extern int afs_fs_remove(struct afs_fs_cursor *, const char *, bool, u64);
+extern int afs_fs_link(struct afs_fs_cursor *, struct afs_vnode *, const char *, u64);
+extern int afs_fs_symlink(struct afs_fs_cursor *, const char *, const char *, u64,
 			  struct afs_fid *, struct afs_file_status *);
 extern int afs_fs_rename(struct afs_fs_cursor *, const char *,
-			 struct afs_vnode *, const char *);
+			 struct afs_vnode *, const char *, u64, u64);
 extern int afs_fs_store_data(struct afs_fs_cursor *, struct address_space *,
 			     pgoff_t, pgoff_t, unsigned, unsigned);
 extern int afs_fs_setattr(struct afs_fs_cursor *, struct iattr *);
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 090fa8c30ea8..5ced655d08b3 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -898,6 +898,10 @@ static int afs_proc_stats_show(struct seq_file *m, void *v)
 
 	seq_printf(m, "dir-data: rdpg=%u\n",
 		   atomic_read(&net->n_read_dir));
+
+	seq_printf(m, "dir-edit: cr=%u rm=%u\n",
+		   atomic_read(&net->n_dir_cr),
+		   atomic_read(&net->n_dir_rm));
 	return 0;
 }
 
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index 0419b7e1e968..f46dee0f5ced 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -63,6 +63,27 @@ enum afs_vl_operation {
 	afs_VL_GetCapabilities	= 65537,	/* AFS Get VL server capabilities */
 };
 
+enum afs_edit_dir_op {
+	afs_edit_dir_create,
+	afs_edit_dir_create_error,
+	afs_edit_dir_create_inval,
+	afs_edit_dir_create_nospc,
+	afs_edit_dir_delete,
+	afs_edit_dir_delete_error,
+	afs_edit_dir_delete_inval,
+	afs_edit_dir_delete_noent,
+};
+
+enum afs_edit_dir_reason {
+	afs_edit_dir_for_create,
+	afs_edit_dir_for_link,
+	afs_edit_dir_for_mkdir,
+	afs_edit_dir_for_rename,
+	afs_edit_dir_for_rmdir,
+	afs_edit_dir_for_symlink,
+	afs_edit_dir_for_unlink,
+};
+
 #endif /* end __AFS_DECLARE_TRACE_ENUMS_ONCE_ONLY */
 
 /*
@@ -106,6 +127,25 @@ enum afs_vl_operation {
 	EM(afs_YFSVL_GetEndpoints,		"YFSVL.GetEndpoints") \
 	E_(afs_VL_GetCapabilities,		"VL.GetCapabilities")
 
+#define afs_edit_dir_ops				  \
+	EM(afs_edit_dir_create,			"create") \
+	EM(afs_edit_dir_create_error,		"c_fail") \
+	EM(afs_edit_dir_create_inval,		"c_invl") \
+	EM(afs_edit_dir_create_nospc,		"c_nspc") \
+	EM(afs_edit_dir_delete,			"delete") \
+	EM(afs_edit_dir_delete_error,		"d_err ") \
+	EM(afs_edit_dir_delete_inval,		"d_invl") \
+	E_(afs_edit_dir_delete_noent,		"d_nent")
+
+#define afs_edit_dir_reasons				  \
+	EM(afs_edit_dir_for_create,		"Create") \
+	EM(afs_edit_dir_for_link,		"Link  ") \
+	EM(afs_edit_dir_for_mkdir,		"MkDir ") \
+	EM(afs_edit_dir_for_rename,		"Rename") \
+	EM(afs_edit_dir_for_rmdir,		"RmDir ") \
+	EM(afs_edit_dir_for_symlink,		"Symlnk") \
+	E_(afs_edit_dir_for_unlink,		"Unlink")
+
 
 /*
  * Export enum symbols via userspace.
@@ -118,6 +158,8 @@ enum afs_vl_operation {
 afs_call_traces;
 afs_fs_operations;
 afs_vl_operations;
+afs_edit_dir_ops;
+afs_edit_dir_reasons;
 
 /*
  * Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -464,6 +506,54 @@ TRACE_EVENT(afs_call_state,
 		      __entry->ret, __entry->abort)
 	    );
 
+TRACE_EVENT(afs_edit_dir,
+	    TP_PROTO(struct afs_vnode *dvnode,
+		     enum afs_edit_dir_reason why,
+		     enum afs_edit_dir_op op,
+		     unsigned int block,
+		     unsigned int slot,
+		     unsigned int f_vnode,
+		     unsigned int f_unique,
+		     const char *name),
+
+	    TP_ARGS(dvnode, why, op, block, slot, f_vnode, f_unique, name),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,		vnode		)
+		    __field(unsigned int,		unique		)
+		    __field(enum afs_edit_dir_reason,	why		)
+		    __field(enum afs_edit_dir_op,	op		)
+		    __field(unsigned int,		block		)
+		    __field(unsigned short,		slot		)
+		    __field(unsigned int,		f_vnode		)
+		    __field(unsigned int,		f_unique	)
+		    __array(char,			name, 18	)
+			     ),
+
+	    TP_fast_assign(
+		    int __len = strlen(name);
+		    __len = min(__len, 17);
+		    __entry->vnode	= dvnode->fid.vnode;
+		    __entry->unique	= dvnode->fid.unique;
+		    __entry->why	= why;
+		    __entry->op		= op;
+		    __entry->block	= block;
+		    __entry->slot	= slot;
+		    __entry->f_vnode	= f_vnode;
+		    __entry->f_unique	= f_unique;
+		    memcpy(__entry->name, name, __len);
+		    __entry->name[__len] = 0;
+			   ),
+
+	    TP_printk("d=%x:%x %s %s %u[%u] f=%x:%x %s",
+		      __entry->vnode, __entry->unique,
+		      __print_symbolic(__entry->why, afs_edit_dir_reasons),
+		      __print_symbolic(__entry->op, afs_edit_dir_ops),
+		      __entry->block, __entry->slot,
+		      __entry->f_vnode, __entry->f_unique,
+		      __entry->name)
+	    );
+
 #endif /* _TRACE_AFS_H */
 
 /* This part must be outside protection */

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 18/20] afs: Trace protocol errors
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (16 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 17/20] afs: Locally edit directory data for mkdir/create/unlink/ David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-05 20:31 ` [PATCH 19/20] afs: Add stats for data transfer operations David Howells
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Trace protocol errors detected in afs.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cmservice.c         |    4 +--
 fs/afs/fsclient.c          |   68 +++++++++++++++++++++++---------------------
 fs/afs/inode.c             |    2 +
 fs/afs/internal.h          |    1 +
 fs/afs/rxrpc.c             |    9 ++++++
 fs/afs/vlclient.c          |   20 ++++++-------
 include/trace/events/afs.h |   21 ++++++++++++++
 7 files changed, 79 insertions(+), 46 deletions(-)

diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index fa07f83e9f29..357de908df3a 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -201,7 +201,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
 		call->count = ntohl(call->tmp);
 		_debug("FID count: %u", call->count);
 		if (call->count > AFSCBMAX)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 
 		call->buffer = kmalloc(call->count * 3 * 4, GFP_KERNEL);
 		if (!call->buffer)
@@ -245,7 +245,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
 		call->count2 = ntohl(call->tmp);
 		_debug("CB count: %u", call->count2);
 		if (call->count2 != call->count && call->count2 != 0)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		call->offset = 0;
 		call->unmarshall++;
 
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 20d6304a0d3e..efacdb7c1dee 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -126,7 +126,8 @@ void afs_update_inode_from_status(struct afs_vnode *vnode,
 /*
  * decode an AFSFetchStatus block
  */
-static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
+static int xdr_decode_AFSFetchStatus(struct afs_call *call,
+				     const __be32 **_bp,
 				     struct afs_file_status *status,
 				     struct afs_vnode *vnode,
 				     const afs_dataversion_t *expected_version,
@@ -167,6 +168,7 @@ static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
 	case AFS_FTYPE_INVALID:
 		if (abort_code != 0) {
 			status->abort_code = abort_code;
+			ret = 0;
 			goto out;
 		}
 		/* Fall through */
@@ -229,7 +231,7 @@ static int xdr_decode_AFSFetchStatus(const __be32 **_bp,
 
 bad:
 	xdr_dump_bad(*_bp);
-	ret = -EBADMSG;
+	ret = afs_protocol_error(call, -EBADMSG);
 	goto out;
 }
 
@@ -372,9 +374,9 @@ static int afs_deliver_fs_fetch_status_vnode(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	xdr_decode_AFSCallBack(call, vnode, &bp);
 	if (call->reply[1])
 		xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -553,9 +555,9 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
 			return ret;
 
 		bp = call->buffer;
-		if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+		if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 					      &vnode->status.data_version, req) < 0)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		xdr_decode_AFSCallBack(call, vnode, &bp);
 		if (call->reply[1])
 			xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -706,10 +708,10 @@ static int afs_deliver_fs_create_vnode(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	if (xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL) < 0 ||
-	    xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, call->reply[2], NULL, NULL, NULL) < 0 ||
+	    xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	xdr_decode_AFSCallBack_raw(&bp, call->reply[3]);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -812,9 +814,9 @@ static int afs_deliver_fs_remove(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -902,10 +904,10 @@ static int afs_deliver_fs_link(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode, NULL, NULL) < 0 ||
-	    xdr_decode_AFSFetchStatus(&bp, &dvnode->status, dvnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode, NULL, NULL) < 0 ||
+	    xdr_decode_AFSFetchStatus(call, &bp, &dvnode->status, dvnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -989,10 +991,10 @@ static int afs_deliver_fs_symlink(struct afs_call *call)
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	if (xdr_decode_AFSFetchStatus(&bp, call->reply[2], NULL, NULL, NULL) ||
-	    xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, call->reply[2], NULL, NULL, NULL) ||
+	    xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1095,13 +1097,13 @@ static int afs_deliver_fs_rename(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(&bp, &orig_dvnode->status, orig_dvnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, &orig_dvnode->status, orig_dvnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	if (new_dvnode != orig_dvnode &&
-	    xdr_decode_AFSFetchStatus(&bp, &new_dvnode->status, new_dvnode,
+	    xdr_decode_AFSFetchStatus(call, &bp, &new_dvnode->status, new_dvnode,
 				      &call->expected_version_2, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1204,9 +1206,9 @@ static int afs_deliver_fs_store_data(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	afs_pages_written_back(vnode, call);
@@ -1380,9 +1382,9 @@ static int afs_deliver_fs_store_status(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode,
+	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
 				      &call->expected_version, NULL) < 0)
-		return -EBADMSG;
+		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
 	_leave(" = 0 [done]");
@@ -1585,7 +1587,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
 		call->count = ntohl(call->tmp);
 		_debug("volname length: %u", call->count);
 		if (call->count >= AFSNAMEMAX)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		call->offset = 0;
 		call->unmarshall++;
 
@@ -1632,7 +1634,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
 		call->count = ntohl(call->tmp);
 		_debug("offline msg length: %u", call->count);
 		if (call->count >= AFSNAMEMAX)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		call->offset = 0;
 		call->unmarshall++;
 
@@ -1679,7 +1681,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
 		call->count = ntohl(call->tmp);
 		_debug("motd length: %u", call->count);
 		if (call->count >= AFSNAMEMAX)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		call->offset = 0;
 		call->unmarshall++;
 
@@ -2082,7 +2084,7 @@ static int afs_deliver_fs_fetch_status(struct afs_call *call)
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, status, vnode,
+	xdr_decode_AFSFetchStatus(call, &bp, status, vnode,
 				  &call->expected_version, NULL);
 	callback[call->count].version	= ntohl(bp[0]);
 	callback[call->count].expiry	= ntohl(bp[1]);
@@ -2179,7 +2181,7 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
 		tmp = ntohl(call->tmp);
 		_debug("status count: %u/%u", tmp, call->count2);
 		if (tmp != call->count2)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 
 		call->count = 0;
 		call->unmarshall++;
@@ -2194,10 +2196,10 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
 
 		bp = call->buffer;
 		statuses = call->reply[1];
-		if (xdr_decode_AFSFetchStatus(&bp, &statuses[call->count],
+		if (xdr_decode_AFSFetchStatus(call, &bp, &statuses[call->count],
 					      call->count == 0 ? vnode : NULL,
 					      NULL, NULL) < 0)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 
 		call->count++;
 		if (call->count < call->count2)
@@ -2217,7 +2219,7 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
 		tmp = ntohl(call->tmp);
 		_debug("CB count: %u", tmp);
 		if (tmp != call->count2)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		call->count = 0;
 		call->unmarshall++;
 	more_cbs:
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 3808dcbbac6d..06194cfe9724 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -82,7 +82,7 @@ static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
 	default:
 		printk("kAFS: AFS vnode with undefined type\n");
 		read_sequnlock_excl(&vnode->cb_lock);
-		return -EBADMSG;
+		return afs_protocol_error(NULL, -EBADMSG);
 	}
 
 	inode->i_blocks		= 0;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index b4d2fac46fe8..901c11521662 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -872,6 +872,7 @@ extern void afs_flat_call_destructor(struct afs_call *);
 extern void afs_send_empty_reply(struct afs_call *);
 extern void afs_send_simple_reply(struct afs_call *, const void *, size_t);
 extern int afs_extract_data(struct afs_call *, void *, size_t, bool);
+extern int afs_protocol_error(struct afs_call *, int);
 
 static inline int afs_transfer_reply(struct afs_call *call)
 {
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index f7ae54b6a393..5c6263972ec9 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -926,3 +926,12 @@ int afs_extract_data(struct afs_call *call, void *buf, size_t count,
 	afs_set_call_complete(call, ret, remote_abort);
 	return ret;
 }
+
+/*
+ * Log protocol error production.
+ */
+noinline int afs_protocol_error(struct afs_call *call, int error)
+{
+	trace_afs_protocol_error(call, error, __builtin_return_address(0));
+	return error;
+}
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index f9d89795e41b..1ed7e2fd2f35 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -450,7 +450,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 		call->count2	= ntohl(*bp); /* Type or next count */
 
 		if (call->count > YFS_MAXENDPOINTS)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 
 		alist = afs_alloc_addrlist(call->count, FS_SERVICE, AFS_FS_PORT);
 		if (!alist)
@@ -474,7 +474,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 			size = sizeof(__be32) * (1 + 4 + 1);
 			break;
 		default:
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		}
 
 		size += sizeof(__be32);
@@ -487,18 +487,18 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 		switch (call->count2) {
 		case YFS_ENDPOINT_IPV4:
 			if (ntohl(bp[0]) != sizeof(__be32) * 2)
-				return -EBADMSG;
+				return afs_protocol_error(call, -EBADMSG);
 			afs_merge_fs_addr4(alist, bp[1], ntohl(bp[2]));
 			bp += 3;
 			break;
 		case YFS_ENDPOINT_IPV6:
 			if (ntohl(bp[0]) != sizeof(__be32) * 5)
-				return -EBADMSG;
+				return afs_protocol_error(call, -EBADMSG);
 			afs_merge_fs_addr6(alist, bp + 1, ntohl(bp[5]));
 			bp += 6;
 			break;
 		default:
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		}
 
 		/* Got either the type of the next entry or the count of
@@ -517,7 +517,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 		if (!call->count)
 			goto end;
 		if (call->count > YFS_MAXENDPOINTS)
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 
 		call->unmarshall = 3;
 
@@ -545,7 +545,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 			size = sizeof(__be32) * (1 + 4 + 1);
 			break;
 		default:
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		}
 
 		if (call->count > 1)
@@ -558,16 +558,16 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
 		switch (call->count2) {
 		case YFS_ENDPOINT_IPV4:
 			if (ntohl(bp[0]) != sizeof(__be32) * 2)
-				return -EBADMSG;
+				return afs_protocol_error(call, -EBADMSG);
 			bp += 3;
 			break;
 		case YFS_ENDPOINT_IPV6:
 			if (ntohl(bp[0]) != sizeof(__be32) * 5)
-				return -EBADMSG;
+				return afs_protocol_error(call, -EBADMSG);
 			bp += 6;
 			break;
 		default:
-			return -EBADMSG;
+			return afs_protocol_error(call, -EBADMSG);
 		}
 
 		/* Got either the type of the next entry or the count of
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index f46dee0f5ced..f0820554caa9 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -554,6 +554,27 @@ TRACE_EVENT(afs_edit_dir,
 		      __entry->name)
 	    );
 
+TRACE_EVENT(afs_protocol_error,
+	    TP_PROTO(struct afs_call *call, int error, const void *where),
+
+	    TP_ARGS(call, error, where),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int,	call		)
+		    __field(int,		error		)
+		    __field(const void *,	where		)
+			     ),
+
+	    TP_fast_assign(
+		    __entry->call = call ? call->debug_id : 0;
+		    __entry->error = error;
+		    __entry->where = where;
+			   ),
+
+	    TP_printk("c=%08x r=%d sp=%pSR",
+		      __entry->call, __entry->error, __entry->where)
+	    );
+
 #endif /* _TRACE_AFS_H */
 
 /* This part must be outside protection */

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 19/20] afs: Add stats for data transfer operations
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (17 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 18/20] afs: Trace protocol errors David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-05 20:31 ` [PATCH 20/20] afs: Do better accretion of small writes on newly created content David Howells
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Add statistics to /proc/fs/afs/stats for data transfer RPC operations.  New
lines are added that look like:

	file-rd : n=55794 nb=10252282150
	file-wr : n=9789 nb=3247763645

where n= indicates the number of ops completed and nb= indicates the number
of bytes successfully transferred.  file-rd is the counts for read/fetch
operations and file-wr the counts for write/store operations.

Note that directory and symlink downloading are included in the file-rd
stats at the moment.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/file.c     |    6 ++++++
 fs/afs/internal.h |    4 ++++
 fs/afs/proc.c     |    7 +++++++
 fs/afs/write.c    |    6 ++++++
 4 files changed, 23 insertions(+)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 91ff1335fd33..e5cac1bc3cf0 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -242,6 +242,12 @@ int afs_fetch_data(struct afs_vnode *vnode, struct key *key, struct afs_read *de
 		ret = afs_end_vnode_operation(&fc);
 	}
 
+	if (ret == 0) {
+		afs_stat_v(vnode, n_fetches);
+		atomic_long_add(desc->actual_len,
+				&afs_v2net(vnode)->n_fetch_bytes);
+	}
+
 	_leave(" = %d", ret);
 	return ret;
 }
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 901c11521662..184094810038 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -271,6 +271,10 @@ struct afs_net {
 	atomic_t		n_read_dir;	/* Number of directory pages read */
 	atomic_t		n_dir_cr;	/* Number of directory entry creation edits */
 	atomic_t		n_dir_rm;	/* Number of directory entry removal edits */
+	atomic_t		n_stores;	/* Number of store ops */
+	atomic_long_t		n_store_bytes;	/* Number of bytes stored */
+	atomic_long_t		n_fetch_bytes;	/* Number of bytes fetched */
+	atomic_t		n_fetches;	/* Number of data fetch ops */
 };
 
 extern const char afs_init_sysname[];
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 5ced655d08b3..870b0bad03d0 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -902,6 +902,13 @@ static int afs_proc_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "dir-edit: cr=%u rm=%u\n",
 		   atomic_read(&net->n_dir_cr),
 		   atomic_read(&net->n_dir_rm));
+
+	seq_printf(m, "file-rd : n=%u nb=%lu\n",
+		   atomic_read(&net->n_fetches),
+		   atomic_long_read(&net->n_fetch_bytes));
+	seq_printf(m, "file-wr : n=%u nb=%lu\n",
+		   atomic_read(&net->n_stores),
+		   atomic_long_read(&net->n_store_bytes));
 	return 0;
 }
 
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 70a563c14e6f..eccc16198f68 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -356,6 +356,12 @@ static int afs_store_data(struct address_space *mapping,
 	}
 
 	switch (ret) {
+	case 0:
+		afs_stat_v(vnode, n_stores);
+		atomic_long_add((last * PAGE_SIZE + to) -
+				(first * PAGE_SIZE + offset),
+				&afs_v2net(vnode)->n_store_bytes);
+		break;
 	case -EACCES:
 	case -EPERM:
 	case -ENOKEY:

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 20/20] afs: Do better accretion of small writes on newly created content
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (18 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 19/20] afs: Add stats for data transfer operations David Howells
@ 2018-04-05 20:31 ` David Howells
  2018-04-07 16:50 ` [PATCH 00/20] afs: Fixes and development Linus Torvalds
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-05 20:31 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel

Processes like ld that do lots of small writes that aren't necessarily
contiguous result in a lot of small StoreData operations to the server, the
idea being that if someone else changes the data on the server, we only
write our changes over that and not the space between.  Further, we don't
want to write back empty space if we can avoid it to make it easier for the
server to do sparse files.

However, making lots of tiny RPC ops is a lot less efficient for the server
than one big one because each op requires allocation of resources and the
taking of locks, so we want to compromise a bit.

Reduce the load by the following:

 (1) If a file is just created locally or has just been truncated with
     O_TRUNC locally, allow subsequent writes to the file to be merged with
     intervening space if that space doesn't cross an entire intervening
     page.

 (2) Don't flush the file on ->flush() but rather on ->release() if the
     file was open for writing.

Just linking vmlinux.o, without this patch, looking in /proc/fs/afs/stats:

	file-wr : n=441 nb=513581204

and after the patch:

	file-wr : n=62 nb=513668555

there were 379 fewer StoreData RPC operations at the expense of an extra
87K being written.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/callback.c |    1 +
 fs/afs/dir.c      |    3 +++
 fs/afs/file.c     |    7 ++++++-
 fs/afs/internal.h |    2 +-
 fs/afs/write.c    |   32 +++++++++++++-------------------
 5 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index 6049ca837498..abd9a84f4e88 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -130,6 +130,7 @@ void afs_break_callback(struct afs_vnode *vnode)
 
 	write_seqlock(&vnode->cb_lock);
 
+	clear_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
 	if (test_and_clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
 		vnode->cb_break++;
 		afs_clear_permits(vnode);
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 7701e382fa94..a14ea4280590 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -1105,6 +1105,7 @@ static void afs_vnode_new_inode(struct afs_fs_cursor *fc,
 				struct afs_file_status *newstatus,
 				struct afs_callback *newcb)
 {
+	struct afs_vnode *vnode;
 	struct inode *inode;
 
 	if (fc->ac.error < 0)
@@ -1122,6 +1123,8 @@ static void afs_vnode_new_inode(struct afs_fs_cursor *fc,
 		return;
 	}
 
+	vnode = AFS_FS_I(inode);
+	set_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
 	d_add(new_dentry, inode);
 }
 
diff --git a/fs/afs/file.c b/fs/afs/file.c
index e5cac1bc3cf0..c24c08016dd9 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -30,7 +30,6 @@ static int afs_readpages(struct file *filp, struct address_space *mapping,
 
 const struct file_operations afs_file_operations = {
 	.open		= afs_open,
-	.flush		= afs_flush,
 	.release	= afs_release,
 	.llseek		= generic_file_llseek,
 	.read_iter	= generic_file_read_iter,
@@ -146,6 +145,9 @@ int afs_open(struct inode *inode, struct file *file)
 		if (ret < 0)
 			goto error_af;
 	}
+
+	if (file->f_flags & O_TRUNC)
+		set_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
 	
 	file->private_data = af;
 	_leave(" = 0");
@@ -170,6 +172,9 @@ int afs_release(struct inode *inode, struct file *file)
 
 	_enter("{%x:%u},", vnode->fid.vid, vnode->fid.vnode);
 
+	if ((file->f_mode & FMODE_WRITE))
+		return vfs_fsync(file, 0);
+
 	file->private_data = NULL;
 	if (af->wb)
 		afs_put_wb_key(af->wb);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 184094810038..70dd41e06363 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -504,6 +504,7 @@ struct afs_vnode {
 #define AFS_VNODE_MOUNTPOINT	5		/* set if vnode is a mountpoint symlink */
 #define AFS_VNODE_AUTOCELL	6		/* set if Vnode is an auto mount point */
 #define AFS_VNODE_PSEUDODIR	7 		/* set if Vnode is a pseudo directory */
+#define AFS_VNODE_NEW_CONTENT	8		/* Set if file has new content (create/trunc-0) */
 
 	struct list_head	wb_keys;	/* List of keys available for writeback */
 	struct list_head	pending_locks;	/* locks waiting to be granted */
@@ -1023,7 +1024,6 @@ extern int afs_writepage(struct page *, struct writeback_control *);
 extern int afs_writepages(struct address_space *, struct writeback_control *);
 extern void afs_pages_written_back(struct afs_vnode *, struct afs_call *);
 extern ssize_t afs_file_write(struct kiocb *, struct iov_iter *);
-extern int afs_flush(struct file *, fl_owner_t);
 extern int afs_fsync(struct file *, loff_t, loff_t, int);
 extern int afs_page_mkwrite(struct vm_fault *);
 extern void afs_prune_wb_keys(struct afs_vnode *);
diff --git a/fs/afs/write.c b/fs/afs/write.c
index eccc16198f68..160f6cc26c00 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -125,7 +125,12 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 					     page->index, priv);
 			goto flush_conflicting_write;
 		}
-		if (to < f || from > t)
+		/* If the file is being filled locally, allow inter-write
+		 * spaces to be merged into writes.  If it's not, only write
+		 * back what the user gives us.
+		 */
+		if (!test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags) &&
+		    (to < f || from > t))
 			goto flush_conflicting_write;
 		if (from < f)
 			f = from;
@@ -419,7 +424,8 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
 		trace_afs_page_dirty(vnode, tracepoint_string("WARN"),
 				     primary_page->index, priv);
 
-	if (start >= final_page || to < PAGE_SIZE)
+	if (start >= final_page ||
+	    (to < PAGE_SIZE && !test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags)))
 		goto no_more;
 
 	start++;
@@ -440,9 +446,10 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
 		}
 
 		for (loop = 0; loop < n; loop++) {
-			if (to != PAGE_SIZE)
-				break;
 			page = pages[loop];
+			if (to != PAGE_SIZE &&
+			    !test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags))
+				break;
 			if (page->index > final_page)
 				break;
 			if (!trylock_page(page))
@@ -455,7 +462,8 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
 			priv = page_private(page);
 			f = priv & AFS_PRIV_MAX;
 			t = priv >> AFS_PRIV_SHIFT;
-			if (f != 0) {
+			if (f != 0 &&
+			    !test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags)) {
 				unlock_page(page);
 				break;
 			}
@@ -740,20 +748,6 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	return file_write_and_wait_range(file, start, end);
 }
 
-/*
- * Flush out all outstanding writes on a file opened for writing when it is
- * closed.
- */
-int afs_flush(struct file *file, fl_owner_t id)
-{
-	_enter("");
-
-	if ((file->f_mode & FMODE_WRITE) == 0)
-		return 0;
-
-	return vfs_fsync(file, 0);
-}
-
 /*
  * notification that a previously read-only page is about to become writable
  * - if it returns an error, the caller will deliver a bus error signal

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
@ 2018-04-06  6:37   ` Christoph Hellwig
  2018-04-06  6:51   ` Al Viro
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 37+ messages in thread
From: Christoph Hellwig @ 2018-04-06  6:37 UTC (permalink / raw)
  To: David Howells; +Cc: torvalds, linux-fsdevel, linux-afs, linux-kernel

On Thu, Apr 05, 2018 at 09:30:04PM +0100, David Howells wrote:
> Implement the AFS feature by which @sys at the end of a pathname component
> may be substituted for one of a list of values, typically naming the
> operating system.  Up to 16 alternatives may be specified and these are
> tried in turn until one works.  Each network namespace has[*] a separate
> independent list.
> 
> Upon creation of a new network namespace, the list of values is
> initialised[*] to a single OpenAFS-compatible string representing arch type
> plus "_linux26".  For example, on x86_64, the sysname is "amd64_linux26".
> 
> [*] Or will, once network namespace support is finalised in kAFS.

Good luck making that work with the dcache without creating a total
mess.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
  2018-04-06  6:37   ` Christoph Hellwig
@ 2018-04-06  6:51   ` Al Viro
  2018-04-06  8:08   ` David Howells
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2018-04-06  6:51 UTC (permalink / raw)
  To: David Howells; +Cc: torvalds, linux-fsdevel, linux-afs, linux-kernel

On Thu, Apr 05, 2018 at 09:30:04PM +0100, David Howells wrote:
> +static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
> +				       struct key *key)
> +{

> +		ret = lookup_one_len(buf, parent, len);

Er...  Parent is locked only shared here and lookup_one_len() seriously depends
upon exclusive lock.  As it is, race with lookup of the full name will mess the
things up very badly.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
  2018-04-06  6:37   ` Christoph Hellwig
  2018-04-06  6:51   ` Al Viro
@ 2018-04-06  8:08   ` David Howells
  2018-04-06  8:13   ` David Howells
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-06  8:08 UTC (permalink / raw)
  To: Al Viro; +Cc: dhowells, torvalds, linux-fsdevel, linux-afs, linux-kernel

Al Viro <viro@ZenIV.linux.org.uk> wrote:

> > +static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
> > +				       struct key *key)
> > +{
> 
> > +		ret = lookup_one_len(buf, parent, len);
> 
> Er...  Parent is locked only shared here and lookup_one_len() seriously
> depends upon exclusive lock.  As it is, race with lookup of the full name
> will mess the things up very badly.

How should it be done?  Do I have to use d_alloc_parallel(), analogous to
lookup_slow() without taking the rwsem again?

David

PS: Can you stick a banner comment on d_alloc_parallel() describing it?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
                     ` (2 preceding siblings ...)
  2018-04-06  8:08   ` David Howells
@ 2018-04-06  8:13   ` David Howells
  2018-04-06 17:54     ` Nikolay Borisov
  2018-04-06  9:04   ` David Howells
  2018-04-06 10:32   ` David Howells
  5 siblings, 1 reply; 37+ messages in thread
From: David Howells @ 2018-04-06  8:13 UTC (permalink / raw)
  To: Al Viro; +Cc: dhowells, torvalds, linux-fsdevel, linux-afs, linux-kernel

Al Viro <viro@ZenIV.linux.org.uk> wrote:

> lookup_one_len() seriously depends upon exclusive lock

In the code it says:

	WARN_ON_ONCE(!inode_is_locked(base->d_inode));

which checks i_rwsem, but in the banner comment it says:

	* The caller must hold base->i_mutex.

is one of these wrong?

David

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
                     ` (3 preceding siblings ...)
  2018-04-06  8:13   ` David Howells
@ 2018-04-06  9:04   ` David Howells
  2018-04-06 10:32   ` David Howells
  5 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-06  9:04 UTC (permalink / raw)
  To: Al Viro; +Cc: dhowells, torvalds, linux-fsdevel, linux-afs, linux-kernel

How about the attached changes?

Note that I've put the checks for ".", ".." and names containing '/' in the
functions where the strings for @sys and @cell are set.

David
---
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index a14ea4280590..5e3a0ed2043f 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -760,6 +760,55 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	return inode;
 }
 
+/*
+ * Do a parallel recursive lookup.
+ *
+ * Ideally, we'd call lookup_one_len(), but we can't because we'd need to be
+ * holding i_mutex but we only hold i_rwsem for read.
+ */
+struct dentry *afs_lookup_rec(struct dentry *dir, const char *name, int len)
+{
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
+	struct dentry *dentry, *old;
+	struct inode *inode = dir->d_inode;
+	struct qstr this;
+	int ret;
+
+	this.name = name;
+	this.len = len;
+	this.hash = full_name_hash(dir, name, len);
+
+	if (unlikely(IS_DEADDIR(inode)))
+		return ERR_PTR(-ESTALE);
+
+again:
+	dentry = d_alloc_parallel(dir, &this, &wq);
+	if (IS_ERR(dentry))
+		return ERR_CAST(dentry);
+
+	if (unlikely(!d_in_lookup(dentry))) {
+		ret = dentry->d_op->d_revalidate(dentry, 0);
+		if (unlikely(ret <= 0)) {
+			if (!ret) {
+				d_invalidate(dentry);
+				dput(dentry);
+				goto again;
+			}
+			dput(dentry);
+			dentry = ERR_PTR(ret);
+		}
+	} else {
+		old = inode->i_op->lookup(inode, dentry, 0);
+		d_lookup_done(dentry); /* Clean up wq */
+		if (unlikely(old)) {
+			dput(dentry);
+			dentry = old;
+		}
+	}
+
+	return dentry;
+}
+
 /*
  * Look up an entry in a directory with @sys substitution.
  */
@@ -810,7 +859,7 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
 		strcpy(p, name);
 		read_unlock(&net->sysnames_lock);
 
-		ret = lookup_one_len(buf, parent, len);
+		ret = afs_lookup_rec(parent, buf, len);
 		if (IS_ERR(ret) || d_is_positive(ret))
 			goto out_b;
 		dput(ret);
diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c
index b70380e5d0e3..42d03a80310d 100644
--- a/fs/afs/dynroot.c
+++ b/fs/afs/dynroot.c
@@ -125,7 +125,7 @@ static struct dentry *afs_lookup_atcell(struct dentry *dentry)
 	if (!cell)
 		goto out_n;
 
-	ret = lookup_one_len(name, parent, len);
+	ret = afs_lookup_rec(parent, name, len);
 	if (IS_ERR(ret) || d_is_positive(ret))
 		goto out_n;
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 70dd41e06363..1dbcdafb25a0 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -682,6 +682,7 @@ extern const struct inode_operations afs_dir_inode_operations;
 extern const struct address_space_operations afs_dir_aops;
 extern const struct dentry_operations afs_fs_dentry_operations;
 
+extern struct dentry *afs_lookup_rec(struct dentry *, const char *, int);
 extern void afs_d_release(struct dentry *);
 
 /*
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 870b0bad03d0..47a0ac21ee44 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -393,6 +393,12 @@ static ssize_t afs_proc_rootcell_write(struct file *file,
 	if (IS_ERR(kbuf))
 		return PTR_ERR(kbuf);
 
+	ret = -EINVAL;
+	if (kbuf[0] == '.')
+		goto out;
+	if (memchr(kbuf, '/', size))
+		goto out;
+
 	/* trim to first NL */
 	s = memchr(kbuf, '\n', size);
 	if (s)
@@ -405,6 +411,7 @@ static ssize_t afs_proc_rootcell_write(struct file *file,
 	if (ret >= 0)
 		ret = size;	/* consume everything, always */
 
+out:
 	kfree(kbuf);
 	_leave(" = %d", ret);
 	return ret;
@@ -775,27 +782,28 @@ static ssize_t afs_proc_sysname_write(struct file *file,
 		len = strlen(s);
 		if (len == 0)
 			continue;
-		if (len >= AFSNAMEMAX) {
-			sysnames->error = -ENAMETOOLONG;
-			ret = -ENAMETOOLONG;
-			goto out;
-		}
+		ret = -ENAMETOOLONG;
+		if (len >= AFSNAMEMAX)
+			goto error;
+
 		if (len >= 4 &&
 		    s[len - 4] == '@' &&
 		    s[len - 3] == 's' &&
 		    s[len - 2] == 'y' &&
-		    s[len - 1] == 's') {
+		    s[len - 1] == 's')
 			/* Protect against recursion */
-			sysnames->error = -EINVAL;
-			ret = -EINVAL;
-			goto out;
-		}
+			goto invalid;
+
+		if (s[0] == '.' &&
+		    (len < 2 || (len == 2 && s[1] == '.')))
+			goto invalid;
+
+		if (memchr(s, '/', len))
+			goto invalid;
 
-		if (sysnames->nr >= AFS_NR_SYSNAME) {
-			sysnames->error = -EFBIG;
-			ret = -EFBIG;
+		ret = -EFBIG;
+		if (sysnames->nr >= AFS_NR_SYSNAME)
 			goto out;
-		}
 
 		if (strcmp(s, afs_init_sysname) == 0) {
 			sub = (char *)afs_init_sysname;
@@ -815,6 +823,12 @@ static ssize_t afs_proc_sysname_write(struct file *file,
 	inode_unlock(file_inode(file));
 	kfree(kbuf);
 	return ret;
+
+invalid:
+	ret = -EINVAL;
+error:
+	sysnames->error = ret;
+	goto out;
 }
 
 static int afs_proc_sysname_release(struct inode *inode, struct file *file)

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
                     ` (4 preceding siblings ...)
  2018-04-06  9:04   ` David Howells
@ 2018-04-06 10:32   ` David Howells
  5 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-06 10:32 UTC (permalink / raw)
  To: Al Viro; +Cc: dhowells, torvalds, linux-fsdevel, linux-afs, linux-kernel

Hi Al,

Here's an updated set of changes.

I've improved the sysnames list handling and made it install a list consisting
of a blank name if the names are cleared (I'm told this is what should
happen).

I've also reinstated the "."  and ".." checks in afs_lookup_rec() to catch
substitution creating them by substituting a blank name.

David
---
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index a14ea4280590..45a472f3d368 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -760,6 +760,60 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	return inode;
 }
 
+/*
+ * Do a parallel recursive lookup.
+ *
+ * Ideally, we'd call lookup_one_len(), but we can't because we'd need to be
+ * holding i_mutex but we only hold i_rwsem for read.
+ */
+struct dentry *afs_lookup_rec(struct dentry *dir, const char *name, int len)
+{
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
+	struct dentry *dentry, *old;
+	struct inode *inode = dir->d_inode;
+	struct qstr this;
+	int ret;
+
+	if (len <= 0)
+		return ERR_PTR(-EINVAL);
+	if (name[0] == '.' &&
+	    (len < 2 || (len == 2 && name[1] == '.')))
+		return ERR_PTR(-EINVAL);
+	if (unlikely(IS_DEADDIR(inode)))
+		return ERR_PTR(-ESTALE);
+
+	this.name = name;
+	this.len = len;
+	this.hash = full_name_hash(dir, name, len);
+
+again:
+	dentry = d_alloc_parallel(dir, &this, &wq);
+	if (IS_ERR(dentry))
+		return ERR_CAST(dentry);
+
+	if (unlikely(!d_in_lookup(dentry))) {
+		ret = dentry->d_op->d_revalidate(dentry, 0);
+		if (unlikely(ret <= 0)) {
+			if (!ret) {
+				d_invalidate(dentry);
+				dput(dentry);
+				goto again;
+			}
+			dput(dentry);
+			dentry = ERR_PTR(ret);
+		}
+	} else {
+		old = inode->i_op->lookup(inode, dentry, 0);
+		d_lookup_done(dentry); /* Clean up wq */
+		if (unlikely(old)) {
+			dput(dentry);
+			dentry = old;
+		}
+	}
+
+	return dentry;
+}
+
 /*
  * Look up an entry in a directory with @sys substitution.
  */
@@ -785,43 +839,33 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
 		p += dentry->d_name.len - 4;
 	}
 
-	/* There is an ordered list of substitutes that we have to try */
-	for (i = 0; i < AFS_NR_SYSNAME; i++) {
-		read_lock(&net->sysnames_lock);
-		subs = net->sysnames;
-		if (!subs || i >= subs->nr) {
-			read_unlock(&net->sysnames_lock);
-			goto noent;
-		}
+	/* There is an ordered list of substitutes that we have to try. */
+	read_lock(&net->sysnames_lock);
+	subs = net->sysnames;
+	refcount_inc(&subs->usage);
+	read_unlock(&net->sysnames_lock);
 
+	for (i = 0; i < subs->nr; i++) {
 		name = subs->subs[i];
-		if (!name) {
-			read_unlock(&net->sysnames_lock);
-			goto noent;
-		}
-
 		len = dentry->d_name.len - 4 + strlen(name);
 		if (len >= AFSNAMEMAX) {
-			read_unlock(&net->sysnames_lock);
 			ret = ERR_PTR(-ENAMETOOLONG);
-			goto out_b;
+			goto out_s;
 		}
 
 		strcpy(p, name);
-		read_unlock(&net->sysnames_lock);
-
-		ret = lookup_one_len(buf, parent, len);
+		ret = afs_lookup_rec(parent, buf, len);
 		if (IS_ERR(ret) || d_is_positive(ret))
-			goto out_b;
+			goto out_s;
 		dput(ret);
 	}
 
-noent:
 	/* We don't want to d_add() the @sys dentry here as we don't want to
 	 * the cached dentry to hide changes to the sysnames list.
 	 */
 	ret = NULL;
-out_b:
+out_s:
+	afs_put_sysnames(subs);
 	kfree(buf);
 out_p:
 	dput(parent);
diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c
index b70380e5d0e3..42d03a80310d 100644
--- a/fs/afs/dynroot.c
+++ b/fs/afs/dynroot.c
@@ -125,7 +125,7 @@ static struct dentry *afs_lookup_atcell(struct dentry *dentry)
 	if (!cell)
 		goto out_n;
 
-	ret = lookup_one_len(name, parent, len);
+	ret = afs_lookup_rec(parent, name, len);
 	if (IS_ERR(ret) || d_is_positive(ret))
 		goto out_n;
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 70dd41e06363..6e98acc1b43f 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -207,9 +207,11 @@ extern struct file_system_type afs_fs_type;
  */
 struct afs_sysnames {
 #define AFS_NR_SYSNAME 16
-	char		*subs[AFS_NR_SYSNAME];
-	unsigned short	nr;
-	short		error;
+	char			*subs[AFS_NR_SYSNAME];
+	refcount_t		usage;
+	unsigned short		nr;
+	short			error;
+	char			blank[1];
 };
 
 /*
@@ -682,6 +684,7 @@ extern const struct inode_operations afs_dir_inode_operations;
 extern const struct address_space_operations afs_dir_aops;
 extern const struct dentry_operations afs_fs_dentry_operations;
 
+extern struct dentry *afs_lookup_rec(struct dentry *, const char *, int);
 extern void afs_d_release(struct dentry *);
 
 /*
@@ -849,6 +852,7 @@ extern int __net_init afs_proc_init(struct afs_net *);
 extern void __net_exit afs_proc_cleanup(struct afs_net *);
 extern int afs_proc_cell_setup(struct afs_net *, struct afs_cell *);
 extern void afs_proc_cell_remove(struct afs_net *, struct afs_cell *);
+extern void afs_put_sysnames(struct afs_sysnames *);
 
 /*
  * rotate.c
diff --git a/fs/afs/main.c b/fs/afs/main.c
index 01a55d25d064..d7560168b3bf 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -104,6 +104,7 @@ static int __net_init afs_net_init(struct afs_net *net)
 		goto error_sysnames;
 	sysnames->subs[0] = (char *)&afs_init_sysname;
 	sysnames->nr = 1;
+	refcount_set(&sysnames->usage, 1);
 	net->sysnames = sysnames;
 	rwlock_init(&net->sysnames_lock);
 
@@ -132,7 +133,7 @@ static int __net_init afs_net_init(struct afs_net *net)
 	net->live = false;
 	afs_proc_cleanup(net);
 error_proc:
-	kfree(net->sysnames);
+	afs_put_sysnames(net->sysnames);
 error_sysnames:
 	net->live = false;
 	return ret;
@@ -148,6 +149,7 @@ static void __net_exit afs_net_exit(struct afs_net *net)
 	afs_purge_servers(net);
 	afs_close_socket(net);
 	afs_proc_cleanup(net);
+	afs_put_sysnames(net->sysnames);
 }
 
 /*
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 870b0bad03d0..839a22280606 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -393,6 +393,12 @@ static ssize_t afs_proc_rootcell_write(struct file *file,
 	if (IS_ERR(kbuf))
 		return PTR_ERR(kbuf);
 
+	ret = -EINVAL;
+	if (kbuf[0] == '.')
+		goto out;
+	if (memchr(kbuf, '/', size))
+		goto out;
+
 	/* trim to first NL */
 	s = memchr(kbuf, '\n', size);
 	if (s)
@@ -405,6 +411,7 @@ static ssize_t afs_proc_rootcell_write(struct file *file,
 	if (ret >= 0)
 		ret = size;	/* consume everything, always */
 
+out:
 	kfree(kbuf);
 	_leave(" = %d", ret);
 	return ret;
@@ -699,13 +706,14 @@ static int afs_proc_servers_show(struct seq_file *m, void *v)
 	return 0;
 }
 
-static void afs_sysnames_free(struct afs_sysnames *sysnames)
+void afs_put_sysnames(struct afs_sysnames *sysnames)
 {
 	int i;
 
-	if (sysnames) {
+	if (sysnames && refcount_dec_and_test(&sysnames->usage)) {
 		for (i = 0; i < sysnames->nr; i++)
-			if (sysnames->subs[i] != afs_init_sysname)
+			if (sysnames->subs[i] != afs_init_sysname &&
+			    sysnames->subs[i] != sysnames->blank)
 				kfree(sysnames->subs[i]);
 	}
 }
@@ -732,6 +740,7 @@ static int afs_proc_sysname_open(struct inode *inode, struct file *file)
 			return -ENOMEM;
 		}
 
+		refcount_set(&sysnames->usage, 1);
 		m = file->private_data;
 		m->private = sysnames;
 	}
@@ -775,27 +784,28 @@ static ssize_t afs_proc_sysname_write(struct file *file,
 		len = strlen(s);
 		if (len == 0)
 			continue;
-		if (len >= AFSNAMEMAX) {
-			sysnames->error = -ENAMETOOLONG;
-			ret = -ENAMETOOLONG;
-			goto out;
-		}
+		ret = -ENAMETOOLONG;
+		if (len >= AFSNAMEMAX)
+			goto error;
+
 		if (len >= 4 &&
 		    s[len - 4] == '@' &&
 		    s[len - 3] == 's' &&
 		    s[len - 2] == 'y' &&
-		    s[len - 1] == 's') {
+		    s[len - 1] == 's')
 			/* Protect against recursion */
-			sysnames->error = -EINVAL;
-			ret = -EINVAL;
-			goto out;
-		}
+			goto invalid;
+
+		if (s[0] == '.' &&
+		    (len < 2 || (len == 2 && s[1] == '.')))
+			goto invalid;
+
+		if (memchr(s, '/', len))
+			goto invalid;
 
-		if (sysnames->nr >= AFS_NR_SYSNAME) {
-			sysnames->error = -EFBIG;
-			ret = -EFBIG;
+		ret = -EFBIG;
+		if (sysnames->nr >= AFS_NR_SYSNAME)
 			goto out;
-		}
 
 		if (strcmp(s, afs_init_sysname) == 0) {
 			sub = (char *)afs_init_sysname;
@@ -815,6 +825,12 @@ static ssize_t afs_proc_sysname_write(struct file *file,
 	inode_unlock(file_inode(file));
 	kfree(kbuf);
 	return ret;
+
+invalid:
+	ret = -EINVAL;
+error:
+	sysnames->error = ret;
+	goto out;
 }
 
 static int afs_proc_sysname_release(struct inode *inode, struct file *file)
@@ -825,14 +841,18 @@ static int afs_proc_sysname_release(struct inode *inode, struct file *file)
 
 	sysnames = m->private;
 	if (sysnames) {
-		kill = sysnames;
 		if (!sysnames->error) {
+			kill = sysnames;
+			if (sysnames->nr == 0) {
+				sysnames->subs[0] = sysnames->blank;
+				sysnames->nr++;
+			}
 			write_lock(&net->sysnames_lock);
 			kill = net->sysnames;
 			net->sysnames = sysnames;
 			write_unlock(&net->sysnames_lock);
 		}
-		afs_sysnames_free(kill);
+		afs_put_sysnames(kill);
 	}
 
 	return seq_release(inode, file);

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 05/20] afs: Implement @sys substitution handling
  2018-04-06  8:13   ` David Howells
@ 2018-04-06 17:54     ` Nikolay Borisov
  0 siblings, 0 replies; 37+ messages in thread
From: Nikolay Borisov @ 2018-04-06 17:54 UTC (permalink / raw)
  To: David Howells, Al Viro; +Cc: torvalds, linux-fsdevel, linux-afs, linux-kernel



On  6.04.2018 11:13, David Howells wrote:
> Al Viro <viro@ZenIV.linux.org.uk> wrote:
> 
>> lookup_one_len() seriously depends upon exclusive lock
> 
> In the code it says:
> 
> 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
> 
> which checks i_rwsem, but in the banner comment it says:
> 
> 	* The caller must hold base->i_mutex.
> 
> is one of these wrong?

Before the switch to i_rwsem, inodes had only i_mutex, so in fact the
comment is likely stale.

> 
> David
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/20] afs: Fixes and development
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (19 preceding siblings ...)
  2018-04-05 20:31 ` [PATCH 20/20] afs: Do better accretion of small writes on newly created content David Howells
@ 2018-04-07 16:50 ` Linus Torvalds
  2018-04-07 17:19   ` Al Viro
  2018-04-08  7:36 ` David Howells
  2018-04-11 14:37 ` David Howells
  22 siblings, 1 reply; 37+ messages in thread
From: Linus Torvalds @ 2018-04-07 16:50 UTC (permalink / raw)
  To: David Howells; +Cc: linux-fsdevel, linux-afs, Linux Kernel Mailing List

On Thu, Apr 5, 2018 at 1:29 PM, David Howells <dhowells@redhat.com> wrote:
>>
> Here are a set of AFS patches, a few fixes, but mostly development.  The fixes
> are:

So I pulled this after your updated fscache pull request, and I notice
that these three commits are duplicate (not shared):

      fscache: Attach the index key and aux data to the cookie
      fscache: Pass object size in rather than calling back for it
      fscache: Maintain a catalogue of allocated cookies

and partly as a result I get some trivial conflicts.

Now, the conflicts really do look entirely trivial, and that's not the
problem, but the fact that you *didn't* re-send the AFS pull request
makes me wonder if you perhaps didn't want me to pull it after all?

So I decided to not do the resolution, and instead just verify with
you that you still want this pulled?

No need to rebase, no need to do anything at all, really, except reply
with "yes I want you to pull this" or "no, the fscache pull updates
meant that I want you to do something else, hold off".

Pls advice.

(I may decide later to pull anyway, because I *think* you want me to
pull, but thought to ask in case you're online and answer quickly).

                      Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/20] afs: Fixes and development
  2018-04-07 16:50 ` [PATCH 00/20] afs: Fixes and development Linus Torvalds
@ 2018-04-07 17:19   ` Al Viro
  2018-04-07 18:04     ` Linus Torvalds
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2018-04-07 17:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, linux-fsdevel, linux-afs, Linux Kernel Mailing List

On Sat, Apr 07, 2018 at 09:50:05AM -0700, Linus Torvalds wrote:
> On Thu, Apr 5, 2018 at 1:29 PM, David Howells <dhowells@redhat.com> wrote:
> >>
> > Here are a set of AFS patches, a few fixes, but mostly development.  The fixes
> > are:
> 
> So I pulled this after your updated fscache pull request, and I notice
> that these three commits are duplicate (not shared):
> 
>       fscache: Attach the index key and aux data to the cookie
>       fscache: Pass object size in rather than calling back for it
>       fscache: Maintain a catalogue of allocated cookies
> 
> and partly as a result I get some trivial conflicts.
> 
> Now, the conflicts really do look entirely trivial, and that's not the
> problem, but the fact that you *didn't* re-send the AFS pull request
> makes me wonder if you perhaps didn't want me to pull it after all?
> 
> So I decided to not do the resolution, and instead just verify with
> you that you still want this pulled?
> 
> No need to rebase, no need to do anything at all, really, except reply
> with "yes I want you to pull this" or "no, the fscache pull updates
> meant that I want you to do something else, hold off".
> 
> Pls advice.
> 
> (I may decide later to pull anyway, because I *think* you want me to
> pull, but thought to ask in case you're online and answer quickly).

FWIW, the main problem I see in there is the use of lookup_one_len()
with parent locked only shared.  As it is, that's simply broken -
lookup of full name happening at the same time will bugger the
things badly.

I have a series that lowers requirements to "parent must be held at
least shared" (see vfs.git#work.dcache) and it seems to be working.
With that series the locking problem goes away; however, the use of
dget_parent() around that lookup_one_len() call is pointless -
->lookup() is guaranteed that
	* dentry->d_parent is stable at least until dentry becomes
positive.  Dentry it originally pointed to remains pinned and positive
through the entire call of ->lookup(); 'dir' argument of ->lookup()
is the inode of that dentry.
	* dentry->d_name is stable at least until it becomes positive.
	* dir remains locked at least shared through the entire call
of ->lookup().

All ->lookup() instances rely upon that and there's no need to
play silly buggers with careful grabbing a reference to dentry->d_parent.
That, of course, can be dealt with after merge, but since that commit
has to be at least rebased to avoid bisection hazard... might as well
get rid of dget_parent() there at the same time.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/20] afs: Fixes and development
  2018-04-07 17:19   ` Al Viro
@ 2018-04-07 18:04     ` Linus Torvalds
  0 siblings, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2018-04-07 18:04 UTC (permalink / raw)
  To: Al Viro
  Cc: David Howells, linux-fsdevel, linux-afs, Linux Kernel Mailing List

On Sat, Apr 7, 2018 at 10:19 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> That, of course, can be dealt with after merge, but since that commit
> has to be at least rebased to avoid bisection hazard... might as well
> get rid of dget_parent() there at the same time.

Ok, I will skip the afs updates for now, but am open to pulling a
fixed version the second week of the merge window.

Thanks,
             Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/20] afs: Fixes and development
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (20 preceding siblings ...)
  2018-04-07 16:50 ` [PATCH 00/20] afs: Fixes and development Linus Torvalds
@ 2018-04-08  7:36 ` David Howells
  2018-04-11 14:37 ` David Howells
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-08  7:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, linux-fsdevel, linux-afs, Linux Kernel Mailing List

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Now, the conflicts really do look entirely trivial, and that's not the
> problem, but the fact that you *didn't* re-send the AFS pull request
> makes me wonder if you perhaps didn't want me to pull it after all?

Sorry: Al pointed out that I was incorrectly using lookup_one_len().  I
proposed a fix, but he decided that he wanted to rejig lookup_one_len()
instead, so I was waiting for that.

Al: Can you rebase your vfs/work.dcache branch on top of linus/master (I think
the rest of your branch is pulled already)?  Or should I just merge together
the fscache patches with it and rebase my afs-next branch off of that?

David

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/20] vfs: Remove the const from dir_context::actor
  2018-04-05 20:29 ` [PATCH 01/20] vfs: Remove the const from dir_context::actor David Howells
@ 2018-04-10 13:49   ` Sasha Levin
  0 siblings, 0 replies; 37+ messages in thread
From: Sasha Levin @ 2018-04-10 13:49 UTC (permalink / raw)
  To: Sasha Levin, David Howells, torvalds; +Cc: dhowells, linux-fsdevel, stable

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: ac6614b76478 [readdir] constify ->actor.

The bot has also determined it's probably a bug fixing patch. (score: 8.8479)

The bot has tested the following trees: v4.16.1, v4.15.16, v4.14.33, v4.9.93, v4.4.127.

v4.16.1: Build OK!
v4.15.16: Build OK!
v4.14.33: Build OK!
v4.9.93: Build OK!
v4.4.127: Build OK!

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 03/20] afs: Don't over-increment the cell usage count when pinning it
  2018-04-05 20:29 ` [PATCH 03/20] afs: Don't over-increment the cell usage count when pinning it David Howells
@ 2018-04-10 13:49   ` Sasha Levin
  0 siblings, 0 replies; 37+ messages in thread
From: Sasha Levin @ 2018-04-10 13:49 UTC (permalink / raw)
  To: Sasha Levin, David Howells, torvalds; +Cc: dhowells, linux-fsdevel, stable

Hi,

[This is an automated email]

This commit has been processed by the -stable helper bot and determined
to be a high probability candidate for -stable trees. (score: 5.1905)

The bot has tested the following trees: v4.16.1, v4.15.16, v4.14.33, v4.9.93, v4.4.127.

v4.16.1: Build OK!
v4.15.16: Build OK!
v4.14.33: Failed to apply! Possible dependencies:
    989782dcdc91 ("afs: Overhaul cell database management")

v4.9.93: Failed to apply! Possible dependencies:
    989782dcdc91 ("afs: Overhaul cell database management")

v4.4.127: Failed to apply! Possible dependencies:
    989782dcdc91 ("afs: Overhaul cell database management")


Please let us know if you'd like to have this patch included in a stable tree.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/20] afs: Fixes and development
  2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
                   ` (21 preceding siblings ...)
  2018-04-08  7:36 ` David Howells
@ 2018-04-11 14:37 ` David Howells
  22 siblings, 0 replies; 37+ messages in thread
From: David Howells @ 2018-04-11 14:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, Al Viro, linux-fsdevel, linux-afs, Linux Kernel Mailing List

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Now, the conflicts really do look entirely trivial, and that's not the
> problem, but the fact that you *didn't* re-send the AFS pull request
> makes me wonder if you perhaps didn't want me to pull it after all?

Al's pulled it into his tree.  I'm hoping he's going to send you a pull
request for it.

David

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/20] afs: Fix checker warnings
  2018-04-05 20:29 ` [PATCH 02/20] afs: Fix checker warnings David Howells
@ 2018-04-12  5:38   ` Christoph Hellwig
  2018-04-12  6:06     ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: Christoph Hellwig @ 2018-04-12  5:38 UTC (permalink / raw)
  To: David Howells; +Cc: torvalds, linux-fsdevel, linux-afs, linux-kernel, viro

On Thu, Apr 05, 2018 at 09:29:42PM +0100, David Howells wrote:
> Fix warnings raised by checker, including:
> 
>  (*) Warnings raised by unequal comparison for the purposes of sorting,
>      where the endianness doesn't matter:
> 
> fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
> fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
> fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
> fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
> fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
> fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer

Seriously - just do the endian swap.  In most case it it free anyway
becaue you have load instructions that can byte swap.  Bonus points
for doing the swap on the element iterated over.

__force hacks without a very good reason (and an explanation for the
reason in the code!) are an instance reason to NAK.

>  (*) afs_find_server() casts __be16/__be32 values to int in order to
>      directly compare them for the purpose of finding a match in a list,
>      but is should also annotate the cast with __force to avoid checker
>      warnings.

Same as above.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/20] afs: Fix checker warnings
  2018-04-12  5:38   ` Christoph Hellwig
@ 2018-04-12  6:06     ` Al Viro
  0 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2018-04-12  6:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: David Howells, torvalds, linux-fsdevel, linux-afs, linux-kernel

On Wed, Apr 11, 2018 at 10:38:07PM -0700, Christoph Hellwig wrote:
> On Thu, Apr 05, 2018 at 09:29:42PM +0100, David Howells wrote:
> > Fix warnings raised by checker, including:
> > 
> >  (*) Warnings raised by unequal comparison for the purposes of sorting,
> >      where the endianness doesn't matter:
> > 
> > fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
> > fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
> > fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
> > fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
> > fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
> > fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer
> 
> Seriously - just do the endian swap.  In most case it it free anyway
> becaue you have load instructions that can byte swap.  Bonus points
> for doing the swap on the element iterated over.
> __force hacks without a very good reason (and an explanation for the
> reason in the code!) are an instance reason to NAK.

Nope.  It's an arbitrary comparison for sorting purposes.  There's no reason
for byteswapping anything, and while a comment to the effect that all we
care about is *some* linear order, no matter which one, would be a good idea,
that's about it.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2018-04-12  6:06 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-05 20:29 [PATCH 00/20] afs: Fixes and development David Howells
2018-04-05 20:29 ` [PATCH 01/20] vfs: Remove the const from dir_context::actor David Howells
2018-04-10 13:49   ` Sasha Levin
2018-04-05 20:29 ` [PATCH 02/20] afs: Fix checker warnings David Howells
2018-04-12  5:38   ` Christoph Hellwig
2018-04-12  6:06     ` Al Viro
2018-04-05 20:29 ` [PATCH 03/20] afs: Don't over-increment the cell usage count when pinning it David Howells
2018-04-10 13:49   ` Sasha Levin
2018-04-05 20:29 ` [PATCH 04/20] afs: Prospectively look up extra files when doing a single lookup David Howells
2018-04-05 20:30 ` [PATCH 05/20] afs: Implement @sys substitution handling David Howells
2018-04-06  6:37   ` Christoph Hellwig
2018-04-06  6:51   ` Al Viro
2018-04-06  8:08   ` David Howells
2018-04-06  8:13   ` David Howells
2018-04-06 17:54     ` Nikolay Borisov
2018-04-06  9:04   ` David Howells
2018-04-06 10:32   ` David Howells
2018-04-05 20:30 ` [PATCH 06/20] afs: Implement @cell " David Howells
2018-04-05 20:30 ` [PATCH 07/20] afs: Dump bad status record David Howells
2018-04-05 20:30 ` [PATCH 08/20] afs: Introduce a statistics proc file David Howells
2018-04-05 20:30 ` [PATCH 09/20] afs: Init inode before accessing cache David Howells
2018-04-05 20:30 ` [PATCH 10/20] afs: Make it possible to get the data version in readpage David Howells
2018-04-05 20:30 ` [PATCH 11/20] afs: Rearrange status mapping David Howells
2018-04-05 20:30 ` [PATCH 12/20] afs: Keep track of invalid-before version for dentry coherency David Howells
2018-04-05 20:30 ` [PATCH 13/20] afs: Split the dynroot stuff out and give it its own ops tables David Howells
2018-04-05 20:31 ` [PATCH 14/20] afs: Fix directory handling David Howells
2018-04-05 20:31 ` [PATCH 15/20] afs: Split the directory content defs into a header David Howells
2018-04-05 20:31 ` [PATCH 16/20] afs: Adjust the directory XDR structures David Howells
2018-04-05 20:31 ` [PATCH 17/20] afs: Locally edit directory data for mkdir/create/unlink/ David Howells
2018-04-05 20:31 ` [PATCH 18/20] afs: Trace protocol errors David Howells
2018-04-05 20:31 ` [PATCH 19/20] afs: Add stats for data transfer operations David Howells
2018-04-05 20:31 ` [PATCH 20/20] afs: Do better accretion of small writes on newly created content David Howells
2018-04-07 16:50 ` [PATCH 00/20] afs: Fixes and development Linus Torvalds
2018-04-07 17:19   ` Al Viro
2018-04-07 18:04     ` Linus Torvalds
2018-04-08  7:36 ` David Howells
2018-04-11 14:37 ` David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).