linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] afs: NAT-mitigation and other bits
@ 2020-04-24 15:19 David Howells
  2020-04-24 15:19 ` [PATCH 1/8] afs: Always include dir in bulk status fetch from afs_do_lookup() David Howells
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:19 UTC (permalink / raw)
  To: linux-afs
  Cc: Dave Botsch, Jeffrey Altman, linux-fsdevel, linux-kernel, dhowells


The primary part of this patchset, is intended to help deal with the
effects of using an AFS client that is communicating with a server through
some sort of NAT or firewall, whereby if the just right amount of time
lapses before a third party change is made, the client thinks it still has
a valid callback, but the server's attempt to notify the client of the
change bounces because the NAT/firewall window has closed.  The problem is
that kafs does insufficient probing to maintain the firewall window.

The effect is mitigated in the following ways:

 (1) When an FS.InlineBulkStatus op is sent to the server during a file
     lookup, the FID of the directory being looked up in will now get
     included in the list of vnodes to query.  This will find out more
     quickly if the dir has changed.

 (2) The fileserver is now polled regularly by an independent, timed
     manager rather than only being polled by the rotation algorithm when
     someone does a VFS operation that performs an RPC call.

I have included some other bits in the patchset also:

 (1) Apply uninterruptibility a bit more thoroughly.  There are more places
     that need it yet, but they're harder to fix.

 (2) Use the serverUnique field from the VLDB record to trigger a recheck
     of a fileserver's endpoints rather than doing it on a timed basis
     separately for each fileserver.  This reduces the number of VL RPCs
     performed, albeit it's a minor reduction.

 (3) Note when we have detected the epoch from the fileserver so that the
     code that checks it actually does its stuff.

 (4) Remove some unused bits in the code.

Note that I've spotted some bugs in the fileserver rotation algorithm, but
that's going to need its own rewrite as the structure of it is wrong.

David
---
David Howells (8):
      afs: Always include dir in bulk status fetch from afs_do_lookup()
      afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate
      afs: Use the serverUnique field in the UVLDB record to reduce rpc ops
      afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH
      afs: Remove some unused bits
      afs: Split the usage count on struct afs_server
      afs: Actively poll fileservers to maintain NAT or firewall openings
      afs: Show more information in /proc/net/afs/servers


 fs/afs/cmservice.c         |    8 +
 fs/afs/dir.c               |    9 +
 fs/afs/fs_probe.c          |  280 +++++++++++++++++++++++++++++++++-----------
 fs/afs/fsclient.c          |   24 ++--
 fs/afs/internal.h          |   58 ++++++---
 fs/afs/main.c              |    5 +
 fs/afs/proc.c              |   18 ++-
 fs/afs/rotate.c            |   13 +-
 fs/afs/rxrpc.c             |    2 
 fs/afs/server.c            |  201 +++++++++++++++++++-------------
 fs/afs/server_list.c       |    7 +
 fs/afs/vl_rotate.c         |    4 -
 fs/afs/vlclient.c          |    1 
 fs/afs/volume.c            |   32 +++--
 include/trace/events/afs.h |   22 +++
 15 files changed, 455 insertions(+), 229 deletions(-)



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/8] afs: Always include dir in bulk status fetch from afs_do_lookup()
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
@ 2020-04-24 15:19 ` David Howells
  2020-04-24 15:20 ` [PATCH 2/8] afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate David Howells
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:19 UTC (permalink / raw)
  To: linux-afs
  Cc: Dave Botsch, Jeffrey Altman, linux-fsdevel, linux-kernel, dhowells

When a lookup is done in an AFS directory, the filesystem will speculate
and fetch up to 49 other statuses for files in the same directory and fetch
those as well, turning them into inodes or updating inodes that already
exist.

However, occasionally, a callback break might go missing due to NAT timing
out, but the afs filesystem doesn't then realise that the directory is not
up to date.

Alleviate this by using one of the status slots to check the directory in
which the lookup is being done.

Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
Suggested-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index d1e1caa23c8b..3c486340b220 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -658,7 +658,8 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
 
 	cookie->ctx.actor = afs_lookup_filldir;
 	cookie->name = dentry->d_name;
-	cookie->nr_fids = 1; /* slot 0 is saved for the fid we actually want */
+	cookie->nr_fids = 2; /* slot 0 is saved for the fid we actually want
+			      * and slot 1 for the directory */
 
 	read_seqlock_excl(&dvnode->cb_lock);
 	dcbi = rcu_dereference_protected(dvnode->cb_interest,
@@ -709,7 +710,11 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	if (!cookie->inodes)
 		goto out_s;
 
-	for (i = 1; i < cookie->nr_fids; i++) {
+	cookie->fids[1] = dvnode->fid;
+	cookie->statuses[1].cb_break = afs_calc_vnode_cb_break(dvnode);
+	cookie->inodes[1] = igrab(&dvnode->vfs_inode);
+
+	for (i = 2; i < cookie->nr_fids; i++) {
 		scb = &cookie->statuses[i];
 
 		/* Find any inodes that already exist and get their



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/8] afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
  2020-04-24 15:19 ` [PATCH 1/8] afs: Always include dir in bulk status fetch from afs_do_lookup() David Howells
@ 2020-04-24 15:20 ` David Howells
  2020-04-24 15:20 ` [PATCH 3/8] afs: Use the serverUnique field in the UVLDB record to reduce rpc ops David Howells
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-fsdevel, linux-kernel, dhowells

When an operation is meant to be done uninterruptibly (such as
FS.StoreData), we should not be allowing volume and server record checking
to be interrupted.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/internal.h |    2 +-
 fs/afs/rotate.c   |    6 +++---
 fs/afs/server.c   |    7 ++-----
 fs/afs/volume.c   |    8 +++++---
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index ef732dd4e7ef..15ae9c7f9c00 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1335,7 +1335,7 @@ extern struct afs_volume *afs_create_volume(struct afs_fs_context *);
 extern void afs_activate_volume(struct afs_volume *);
 extern void afs_deactivate_volume(struct afs_volume *);
 extern void afs_put_volume(struct afs_cell *, struct afs_volume *);
-extern int afs_check_volume_status(struct afs_volume *, struct key *);
+extern int afs_check_volume_status(struct afs_volume *, struct afs_fs_cursor *);
 
 /*
  * write.c
diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
index 172ba569cd60..2a3305e42b14 100644
--- a/fs/afs/rotate.c
+++ b/fs/afs/rotate.c
@@ -192,7 +192,7 @@ bool afs_select_fileserver(struct afs_fs_cursor *fc)
 			write_unlock(&vnode->volume->servers_lock);
 
 			set_bit(AFS_VOLUME_NEEDS_UPDATE, &vnode->volume->flags);
-			error = afs_check_volume_status(vnode->volume, fc->key);
+			error = afs_check_volume_status(vnode->volume, fc);
 			if (error < 0)
 				goto failed_set_error;
 
@@ -281,7 +281,7 @@ bool afs_select_fileserver(struct afs_fs_cursor *fc)
 
 			set_bit(AFS_VOLUME_WAIT, &vnode->volume->flags);
 			set_bit(AFS_VOLUME_NEEDS_UPDATE, &vnode->volume->flags);
-			error = afs_check_volume_status(vnode->volume, fc->key);
+			error = afs_check_volume_status(vnode->volume, fc);
 			if (error < 0)
 				goto failed_set_error;
 
@@ -341,7 +341,7 @@ bool afs_select_fileserver(struct afs_fs_cursor *fc)
 	/* See if we need to do an update of the volume record.  Note that the
 	 * volume may have moved or even have been deleted.
 	 */
-	error = afs_check_volume_status(vnode->volume, fc->key);
+	error = afs_check_volume_status(vnode->volume, fc);
 	if (error < 0)
 		goto failed_set_error;
 
diff --git a/fs/afs/server.c b/fs/afs/server.c
index b7f3cb2130ca..11b90ac7ea30 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -594,12 +594,9 @@ bool afs_check_server_record(struct afs_fs_cursor *fc, struct afs_server *server
 	}
 
 	ret = wait_on_bit(&server->flags, AFS_SERVER_FL_UPDATING,
-			  TASK_INTERRUPTIBLE);
+			  (fc->flags & AFS_FS_CURSOR_INTR) ?
+			  TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
 	if (ret == -ERESTARTSYS) {
-		if (!(fc->flags & AFS_FS_CURSOR_INTR) && server->addresses) {
-			_leave(" = t [intr]");
-			return true;
-		}
 		fc->error = ret;
 		_leave(" = f [intr]");
 		return false;
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 92ca5e27573b..4310336b9bb8 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -281,7 +281,7 @@ static int afs_update_volume_status(struct afs_volume *volume, struct key *key)
 /*
  * Make sure the volume record is up to date.
  */
-int afs_check_volume_status(struct afs_volume *volume, struct key *key)
+int afs_check_volume_status(struct afs_volume *volume, struct afs_fs_cursor *fc)
 {
 	time64_t now = ktime_get_real_seconds();
 	int ret, retries = 0;
@@ -299,7 +299,7 @@ int afs_check_volume_status(struct afs_volume *volume, struct key *key)
 	}
 
 	if (!test_and_set_bit_lock(AFS_VOLUME_UPDATING, &volume->flags)) {
-		ret = afs_update_volume_status(volume, key);
+		ret = afs_update_volume_status(volume, fc->key);
 		clear_bit_unlock(AFS_VOLUME_WAIT, &volume->flags);
 		clear_bit_unlock(AFS_VOLUME_UPDATING, &volume->flags);
 		wake_up_bit(&volume->flags, AFS_VOLUME_WAIT);
@@ -312,7 +312,9 @@ int afs_check_volume_status(struct afs_volume *volume, struct key *key)
 		return 0;
 	}
 
-	ret = wait_on_bit(&volume->flags, AFS_VOLUME_WAIT, TASK_INTERRUPTIBLE);
+	ret = wait_on_bit(&volume->flags, AFS_VOLUME_WAIT,
+			  (fc->flags & AFS_FS_CURSOR_INTR) ?
+			  TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
 	if (ret == -ERESTARTSYS) {
 		_leave(" = %d", ret);
 		return ret;



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/8] afs: Use the serverUnique field in the UVLDB record to reduce rpc ops
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
  2020-04-24 15:19 ` [PATCH 1/8] afs: Always include dir in bulk status fetch from afs_do_lookup() David Howells
  2020-04-24 15:20 ` [PATCH 2/8] afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate David Howells
@ 2020-04-24 15:20 ` David Howells
  2020-04-24 15:20 ` [PATCH 4/8] afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH David Howells
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: Jeffrey Altman, linux-fsdevel, linux-kernel, dhowells

The U-version VLDB volume record retrieved by the VL.GetEntryByNameU rpc op
carries a change counter (the serverUnique field) for each fileserver
listed in the record as backing that volume.  This is incremented whenever
the registration details for a fileserver change (such as its address
list).  Note that the same value will be seen in all UVLDB records that
refer to that fileserver.

This should be checked before calling the VL server to re-query the address
list for a fileserver.  If it's the same, there's no point doing the query.

Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/internal.h    |    5 +++--
 fs/afs/server.c      |   26 ++++++++++++++------------
 fs/afs/server_list.c |    3 ++-
 fs/afs/vlclient.c    |    1 +
 4 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 15ae9c7f9c00..2d4e0c4e23a4 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -471,6 +471,7 @@ struct afs_vldb_entry {
 #define AFS_VLDB_QUERY_ERROR	4		/* - VL server returned error */
 
 	uuid_t			fs_server[AFS_NMAXNSERVERS];
+	u32			addr_version[AFS_NMAXNSERVERS]; /* Registration change counters */
 	u8			fs_mask[AFS_NMAXNSERVERS];
 #define AFS_VOL_VTM_RW	0x01 /* R/W version of the volume is available (on this server) */
 #define AFS_VOL_VTM_RO	0x02 /* R/O version of the volume is available (on this server) */
@@ -498,7 +499,6 @@ struct afs_server {
 	struct hlist_node	proc_link;	/* Link in net->fs_proc */
 	struct afs_server	*gc_next;	/* Next server in manager's list */
 	time64_t		put_time;	/* Time at which last put */
-	time64_t		update_at;	/* Time at which to next update the record */
 	unsigned long		flags;
 #define AFS_SERVER_FL_NOT_READY	1		/* The record is not ready for use */
 #define AFS_SERVER_FL_NOT_FOUND	2		/* VL server says no such server */
@@ -511,6 +511,7 @@ struct afs_server {
 #define AFS_SERVER_FL_IS_YFS	9		/* Server is YFS not AFS */
 #define AFS_SERVER_FL_NO_RM2	10		/* Fileserver doesn't support YFS.RemoveFile2 */
 #define AFS_SERVER_FL_HAVE_EPOCH 11		/* ->epoch is valid */
+#define AFS_SERVER_FL_NEEDS_UPDATE 12		/* Fileserver address list is out of date */
 	atomic_t		usage;
 	u32			addr_version;	/* Address list version */
 	u32			cm_epoch;	/* Server RxRPC epoch */
@@ -1243,7 +1244,7 @@ extern spinlock_t afs_server_peer_lock;
 extern struct afs_server *afs_find_server(struct afs_net *,
 					  const struct sockaddr_rxrpc *);
 extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *);
-extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *);
+extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32);
 extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace);
 extern void afs_put_server(struct afs_net *, struct afs_server *, enum afs_server_trace);
 extern void afs_manage_servers(struct work_struct *);
diff --git a/fs/afs/server.c b/fs/afs/server.c
index 11b90ac7ea30..9e50ccde5d37 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -12,7 +12,6 @@
 #include "protocol_yfs.h"
 
 static unsigned afs_server_gc_delay = 10;	/* Server record timeout in seconds */
-static unsigned afs_server_update_delay = 30;	/* Time till VLDB recheck in secs */
 static atomic_t afs_server_debug_id;
 
 static void afs_inc_servers_outstanding(struct afs_net *net)
@@ -218,7 +217,6 @@ static struct afs_server *afs_alloc_server(struct afs_net *net,
 	RCU_INIT_POINTER(server->addresses, alist);
 	server->addr_version = alist->version;
 	server->uuid = *uuid;
-	server->update_at = ktime_get_real_seconds() + afs_server_update_delay;
 	rwlock_init(&server->fs_lock);
 	INIT_HLIST_HEAD(&server->cb_volumes);
 	rwlock_init(&server->cb_break_lock);
@@ -264,7 +262,7 @@ static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell,
  * Get or create a fileserver record.
  */
 struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key,
-				     const uuid_t *uuid)
+				     const uuid_t *uuid, u32 addr_version)
 {
 	struct afs_addr_list *alist;
 	struct afs_server *server, *candidate;
@@ -272,8 +270,11 @@ struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key,
 	_enter("%p,%pU", cell->net, uuid);
 
 	server = afs_find_server_by_uuid(cell->net, uuid);
-	if (server)
+	if (server) {
+		if (server->addr_version != addr_version)
+			set_bit(AFS_SERVER_FL_NEEDS_UPDATE, &server->flags);
 		return server;
+	}
 
 	alist = afs_vl_lookup_addrs(cell, key, uuid);
 	if (IS_ERR(alist))
@@ -558,7 +559,6 @@ static noinline bool afs_update_server_record(struct afs_fs_cursor *fc, struct a
 		write_unlock(&server->fs_lock);
 	}
 
-	server->update_at = ktime_get_real_seconds() + afs_server_update_delay;
 	afs_put_addrlist(discard);
 	_leave(" = t");
 	return true;
@@ -569,8 +569,6 @@ static noinline bool afs_update_server_record(struct afs_fs_cursor *fc, struct a
  */
 bool afs_check_server_record(struct afs_fs_cursor *fc, struct afs_server *server)
 {
-	time64_t now = ktime_get_real_seconds();
-	long diff;
 	bool success;
 	int ret, retries = 0;
 
@@ -579,13 +577,16 @@ bool afs_check_server_record(struct afs_fs_cursor *fc, struct afs_server *server
 	ASSERT(server);
 
 retry:
-	diff = READ_ONCE(server->update_at) - now;
-	if (diff > 0) {
-		_leave(" = t [not now %ld]", diff);
-		return true;
-	}
+	if (test_bit(AFS_SERVER_FL_UPDATING, &server->flags))
+		goto wait;
+	if (test_bit(AFS_SERVER_FL_NEEDS_UPDATE, &server->flags))
+		goto update;
+	_leave(" = t [good]");
+	return true;
 
+update:
 	if (!test_and_set_bit_lock(AFS_SERVER_FL_UPDATING, &server->flags)) {
+		clear_bit(AFS_SERVER_FL_NEEDS_UPDATE, &server->flags);
 		success = afs_update_server_record(fc, server);
 		clear_bit_unlock(AFS_SERVER_FL_UPDATING, &server->flags);
 		wake_up_bit(&server->flags, AFS_SERVER_FL_UPDATING);
@@ -593,6 +594,7 @@ bool afs_check_server_record(struct afs_fs_cursor *fc, struct afs_server *server
 		return success;
 	}
 
+wait:
 	ret = wait_on_bit(&server->flags, AFS_SERVER_FL_UPDATING,
 			  (fc->flags & AFS_FS_CURSOR_INTR) ?
 			  TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c
index 888d91d195d9..f567732df5cc 100644
--- a/fs/afs/server_list.c
+++ b/fs/afs/server_list.c
@@ -51,7 +51,8 @@ struct afs_server_list *afs_alloc_server_list(struct afs_cell *cell,
 		if (!(vldb->fs_mask[i] & type_mask))
 			continue;
 
-		server = afs_lookup_server(cell, key, &vldb->fs_server[i]);
+		server = afs_lookup_server(cell, key, &vldb->fs_server[i],
+					   vldb->addr_version[i]);
 		if (IS_ERR(server)) {
 			ret = PTR_ERR(server);
 			if (ret == -ENOENT ||
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index 516e9a3bb5b4..972dc5512f33 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -82,6 +82,7 @@ static int afs_deliver_vl_get_entry_by_name_u(struct afs_call *call)
 		for (j = 0; j < 6; j++)
 			uuid->node[j] = (u8)ntohl(xdr->node[j]);
 
+		entry->addr_version[n] = ntohl(uvldb->serverUnique[i]);
 		entry->nr_servers++;
 	}
 



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/8] afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
                   ` (2 preceding siblings ...)
  2020-04-24 15:20 ` [PATCH 3/8] afs: Use the serverUnique field in the UVLDB record to reduce rpc ops David Howells
@ 2020-04-24 15:20 ` David Howells
  2020-04-24 15:20 ` [PATCH 5/8] afs: Remove some unused bits David Howells
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-fsdevel, linux-kernel, dhowells

AFS keeps track of the epoch value from the rxrpc protocol to note (a) when
a fileserver appears to have restarted and (b) when different endpoints of
a fileserver do not appear to be associated with the same fileserver
(ie. all probes back from a fileserver from all of its interfaces should
carry the same epoch).

However, the AFS_SERVER_FL_HAVE_EPOCH flag that indicates that we've
received the server's epoch is never set, though it is used.

Fix this to set the flag when we first receive an epoch value from a probe
sent to the filesystem client from the fileserver.

Fixes: 3bf0fb6f33dd ("afs: Probe multiple fileservers simultaneously")
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cmservice.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 6765949b3aab..380ad5ace7cf 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -169,7 +169,7 @@ static int afs_record_cm_probe(struct afs_call *call, struct afs_server *server)
 
 	spin_lock(&server->probe_lock);
 
-	if (!test_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags)) {
+	if (!test_and_set_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags)) {
 		server->cm_epoch = call->epoch;
 		server->probe.cm_epoch = call->epoch;
 		goto out;



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/8] afs: Remove some unused bits
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
                   ` (3 preceding siblings ...)
  2020-04-24 15:20 ` [PATCH 4/8] afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH David Howells
@ 2020-04-24 15:20 ` David Howells
  2020-04-24 15:20 ` [PATCH 6/8] afs: Split the usage count on struct afs_server David Howells
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-fsdevel, linux-kernel, dhowells

Remove three bits:

 (1) afs_server::no_epoch is neither set nor used.

 (2) afs_server::have_result is set and a wakeup is applied to it, but
     nothing looks at it or waits on it.

 (3) afs_vl_dump_edestaddrreq() prints afs_addr_list::probed, but nothing
     sets it for VL servers.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/fs_probe.c  |    5 +----
 fs/afs/internal.h  |    2 --
 fs/afs/vl_rotate.c |    4 ++--
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
index e1b9ed679045..a587767b6ae1 100644
--- a/fs/afs/fs_probe.c
+++ b/fs/afs/fs_probe.c
@@ -117,11 +117,8 @@ void afs_fileserver_probe_result(struct afs_call *call)
 	       (unsigned int)rtt, ret);
 
 	have_result |= afs_fs_probe_done(server);
-	if (have_result) {
-		server->probe.have_result = true;
-		wake_up_var(&server->probe.have_result);
+	if (have_result)
 		wake_up_all(&server->probe_wq);
-	}
 }
 
 /*
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 2d4e0c4e23a4..ee17c868ad2c 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -534,12 +534,10 @@ struct afs_server {
 		u32		abort_code;
 		u32		cm_epoch;
 		short		error;
-		bool		have_result;
 		bool		responded:1;
 		bool		is_yfs:1;
 		bool		not_yfs:1;
 		bool		local_failure:1;
-		bool		no_epoch:1;
 		bool		cm_probed:1;
 		bool		said_rebooted:1;
 		bool		said_inconsistent:1;
diff --git a/fs/afs/vl_rotate.c b/fs/afs/vl_rotate.c
index 9a5ce9687779..72eacc14e6e1 100644
--- a/fs/afs/vl_rotate.c
+++ b/fs/afs/vl_rotate.c
@@ -302,8 +302,8 @@ static void afs_vl_dump_edestaddrreq(const struct afs_vl_cursor *vc)
 				pr_notice("VC:  - nr=%u/%u/%u pf=%u\n",
 					  a->nr_ipv4, a->nr_addrs, a->max_addrs,
 					  a->preferred);
-				pr_notice("VC:  - pr=%lx R=%lx F=%lx\n",
-					  a->probed, a->responded, a->failed);
+				pr_notice("VC:  - R=%lx F=%lx\n",
+					  a->responded, a->failed);
 				if (a == vc->ac.alist)
 					pr_notice("VC:  - current\n");
 			}



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 6/8] afs: Split the usage count on struct afs_server
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
                   ` (4 preceding siblings ...)
  2020-04-24 15:20 ` [PATCH 5/8] afs: Remove some unused bits David Howells
@ 2020-04-24 15:20 ` David Howells
  2020-04-24 15:20 ` [PATCH 7/8] afs: Actively poll fileservers to maintain NAT or firewall openings David Howells
  2020-04-24 15:20 ` [PATCH 8/8] afs: Show more information in /proc/net/afs/servers David Howells
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-fsdevel, linux-kernel, dhowells

Split the usage count on the afs_server struct to have an active count that
registers who's actually using it separately from the reference count on
the object.

This allows a future patch to dispatch polling probes without advancing the
"unuse" time into the future each time we emit a probe, which would
otherwise prevent unused server records from expiring.

Included in this:

 (1) The latter part of afs_destroy_server() in which the RCU destruction
     of afs_server objects is invoked and the outstanding server count is
     decremented is split out into __afs_put_server().

 (2) afs_put_server() now calls __afs_put_server() rather then setting the
     management timer.

 (3) The calls begun by afs_fs_give_up_all_callbacks() and
     afs_fs_get_capabilities() can now take a ref on the server record, so
     afs_destroy_server() can just drop its ref and needn't wait for the
     completion of these calls.  They'll put the ref when they're done.

 (4) Because of (3), afs_fs_probe_done() no longer needs to wake up
     afs_destroy_server() with server->probe_outstanding.

 (5) afs_gc_servers can be simplified.  It only needs to check if
     server->active is 0 rather than playing games with the refcount.

 (6) afs_manage_servers() can propose a server for gc if usage == 0 rather
     than if ref == 1.  The gc is effected by (5).

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cmservice.c         |    4 +
 fs/afs/fs_probe.c          |    1 
 fs/afs/fsclient.c          |    5 +
 fs/afs/internal.h          |    8 ++
 fs/afs/proc.c              |    9 +--
 fs/afs/rxrpc.c             |    2 -
 fs/afs/server.c            |  151 +++++++++++++++++++++++++++++---------------
 fs/afs/server_list.c       |    4 +
 include/trace/events/afs.h |   18 +++--
 9 files changed, 131 insertions(+), 71 deletions(-)

diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 380ad5ace7cf..7dcbca3bf828 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -268,7 +268,9 @@ static void SRXAFSCB_CallBack(struct work_struct *work)
 	 * to maintain cache coherency.
 	 */
 	if (call->server) {
-		trace_afs_server(call->server, atomic_read(&call->server->usage),
+		trace_afs_server(call->server,
+				 atomic_read(&call->server->ref),
+				 atomic_read(&call->server->active),
 				 afs_server_trace_callback);
 		afs_break_callbacks(call->server, call->count, call->request);
 	}
diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
index a587767b6ae1..d458b0adfce5 100644
--- a/fs/afs/fs_probe.c
+++ b/fs/afs/fs_probe.c
@@ -16,7 +16,6 @@ static bool afs_fs_probe_done(struct afs_server *server)
 	if (!atomic_dec_and_test(&server->probe_outstanding))
 		return false;
 
-	wake_up_var(&server->probe_outstanding);
 	clear_bit_unlock(AFS_SERVER_FL_PROBING, &server->flags);
 	wake_up_bit(&server->flags, AFS_SERVER_FL_PROBING);
 	return true;
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 68fc46634346..7384fb545eb4 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -1842,7 +1842,7 @@ int afs_fs_give_up_all_callbacks(struct afs_net *net,
 	bp = call->request;
 	*bp++ = htonl(FSGIVEUPALLCALLBACKS);
 
-	/* Can't take a ref on server */
+	call->server = afs_use_server(server, afs_server_trace_give_up_cb);
 	afs_make_call(ac, call, GFP_NOFS);
 	return afs_wait_for_call_to_complete(call, ac);
 }
@@ -1924,7 +1924,7 @@ struct afs_call *afs_fs_get_capabilities(struct afs_net *net,
 		return ERR_PTR(-ENOMEM);
 
 	call->key = key;
-	call->server = afs_get_server(server, afs_server_trace_get_caps);
+	call->server = afs_use_server(server, afs_server_trace_get_caps);
 	call->server_index = server_index;
 	call->upgrade = true;
 	call->async = true;
@@ -1934,7 +1934,6 @@ struct afs_call *afs_fs_get_capabilities(struct afs_net *net,
 	bp = call->request;
 	*bp++ = htonl(FSGETCAPABILITIES);
 
-	/* Can't take a ref on server */
 	trace_afs_make_fs_call(call, NULL);
 	afs_make_call(ac, call, GFP_NOFS);
 	return call;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index ee17c868ad2c..cb70e1c234cc 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -498,7 +498,7 @@ struct afs_server {
 	struct hlist_node	addr6_link;	/* Link in net->fs_addresses6 */
 	struct hlist_node	proc_link;	/* Link in net->fs_proc */
 	struct afs_server	*gc_next;	/* Next server in manager's list */
-	time64_t		put_time;	/* Time at which last put */
+	time64_t		unuse_time;	/* Time at which last unused */
 	unsigned long		flags;
 #define AFS_SERVER_FL_NOT_READY	1		/* The record is not ready for use */
 #define AFS_SERVER_FL_NOT_FOUND	2		/* VL server says no such server */
@@ -512,7 +512,8 @@ struct afs_server {
 #define AFS_SERVER_FL_NO_RM2	10		/* Fileserver doesn't support YFS.RemoveFile2 */
 #define AFS_SERVER_FL_HAVE_EPOCH 11		/* ->epoch is valid */
 #define AFS_SERVER_FL_NEEDS_UPDATE 12		/* Fileserver address list is out of date */
-	atomic_t		usage;
+	atomic_t		ref;		/* Object refcount */
+	atomic_t		active;		/* Active user count */
 	u32			addr_version;	/* Address list version */
 	u32			cm_epoch;	/* Server RxRPC epoch */
 	unsigned int		debug_id;	/* Debugging ID for traces */
@@ -1244,6 +1245,9 @@ extern struct afs_server *afs_find_server(struct afs_net *,
 extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *);
 extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32);
 extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace);
+extern struct afs_server *afs_use_server(struct afs_server *, enum afs_server_trace);
+extern void afs_unuse_server(struct afs_net *, struct afs_server *, enum afs_server_trace);
+extern void afs_unuse_server_notime(struct afs_net *, struct afs_server *, enum afs_server_trace);
 extern void afs_put_server(struct afs_net *, struct afs_server *, enum afs_server_trace);
 extern void afs_manage_servers(struct work_struct *);
 extern void afs_servers_timer(struct timer_list *);
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 468e1713bce1..9bce7898cd7d 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -378,19 +378,20 @@ static int afs_proc_servers_show(struct seq_file *m, void *v)
 	int i;
 
 	if (v == SEQ_START_TOKEN) {
-		seq_puts(m, "UUID                                 USE ADDR\n");
+		seq_puts(m, "UUID                                 REF ACT ADDR\n");
 		return 0;
 	}
 
 	server = list_entry(v, struct afs_server, proc_link);
 	alist = rcu_dereference(server->addresses);
-	seq_printf(m, "%pU %3d %pISpc%s\n",
+	seq_printf(m, "%pU %3d %3d %pISpc%s\n",
 		   &server->uuid,
-		   atomic_read(&server->usage),
+		   atomic_read(&server->ref),
+		   atomic_read(&server->active),
 		   &alist->addrs[0].transport,
 		   alist->preferred == 0 ? "*" : "");
 	for (i = 1; i < alist->nr_addrs; i++)
-		seq_printf(m, "                                         %pISpc%s\n",
+		seq_printf(m, "                                             %pISpc%s\n",
 			   &alist->addrs[i].transport,
 			   alist->preferred == i ? "*" : "");
 	return 0;
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 1ecc67da6c1a..ab2962fff1fb 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -183,7 +183,7 @@ void afs_put_call(struct afs_call *call)
 		if (call->type->destructor)
 			call->type->destructor(call);
 
-		afs_put_server(call->net, call->server, afs_server_trace_put_call);
+		afs_unuse_server_notime(call->net, call->server, afs_server_trace_put_call);
 		afs_put_cb_interest(call->net, call->cbi);
 		afs_put_addrlist(call->alist);
 		kfree(call->request);
diff --git a/fs/afs/server.c b/fs/afs/server.c
index 9e50ccde5d37..4969a681f8f5 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -25,6 +25,10 @@ static void afs_dec_servers_outstanding(struct afs_net *net)
 		wake_up_var(&net->servers_outstanding);
 }
 
+static struct afs_server *afs_maybe_use_server(struct afs_server *,
+					       enum afs_server_trace);
+static void __afs_put_server(struct afs_net *, struct afs_server *);
+
 /*
  * Find a server by one of its addresses.
  */
@@ -40,7 +44,7 @@ struct afs_server *afs_find_server(struct afs_net *net,
 
 	do {
 		if (server)
-			afs_put_server(net, server, afs_server_trace_put_find_rsq);
+			afs_unuse_server_notime(net, server, afs_server_trace_put_find_rsq);
 		server = NULL;
 		read_seqbegin_or_lock(&net->fs_addr_lock, &seq);
 
@@ -78,9 +82,9 @@ struct afs_server *afs_find_server(struct afs_net *net,
 		}
 
 		server = NULL;
+		continue;
 	found:
-		if (server && !atomic_inc_not_zero(&server->usage))
-			server = NULL;
+		server = afs_maybe_use_server(server, afs_server_trace_get_by_addr);
 
 	} while (need_seqretry(&net->fs_addr_lock, seq));
 
@@ -91,7 +95,7 @@ struct afs_server *afs_find_server(struct afs_net *net,
 }
 
 /*
- * Look up a server by its UUID
+ * Look up a server by its UUID and mark it active.
  */
 struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uuid)
 {
@@ -107,7 +111,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu
 		 * changes.
 		 */
 		if (server)
-			afs_put_server(net, server, afs_server_trace_put_uuid_rsq);
+			afs_unuse_server(net, server, afs_server_trace_put_uuid_rsq);
 		server = NULL;
 
 		read_seqbegin_or_lock(&net->fs_lock, &seq);
@@ -122,7 +126,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu
 			} else if (diff > 0) {
 				p = p->rb_right;
 			} else {
-				afs_get_server(server, afs_server_trace_get_by_uuid);
+				afs_use_server(server, afs_server_trace_get_by_uuid);
 				break;
 			}
 
@@ -198,7 +202,7 @@ static struct afs_server *afs_install_server(struct afs_net *net,
 }
 
 /*
- * allocate a new server record
+ * Allocate a new server record and mark it active.
  */
 static struct afs_server *afs_alloc_server(struct afs_net *net,
 					   const uuid_t *uuid,
@@ -212,7 +216,8 @@ static struct afs_server *afs_alloc_server(struct afs_net *net,
 	if (!server)
 		goto enomem;
 
-	atomic_set(&server->usage, 1);
+	atomic_set(&server->ref, 1);
+	atomic_set(&server->active, 1);
 	server->debug_id = atomic_inc_return(&afs_server_debug_id);
 	RCU_INIT_POINTER(server->addresses, alist);
 	server->addr_version = alist->version;
@@ -224,7 +229,7 @@ static struct afs_server *afs_alloc_server(struct afs_net *net,
 	spin_lock_init(&server->probe_lock);
 
 	afs_inc_servers_outstanding(net);
-	trace_afs_server(server, 1, afs_server_trace_alloc);
+	trace_afs_server(server, 1, 1, afs_server_trace_alloc);
 	_leave(" = %p", server);
 	return server;
 
@@ -292,7 +297,6 @@ struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key,
 		kfree(candidate);
 	}
 
-	_leave(" = %p{%d}", server, atomic_read(&server->usage));
 	return server;
 }
 
@@ -328,9 +332,38 @@ void afs_servers_timer(struct timer_list *timer)
 struct afs_server *afs_get_server(struct afs_server *server,
 				  enum afs_server_trace reason)
 {
-	unsigned int u = atomic_inc_return(&server->usage);
+	unsigned int u = atomic_inc_return(&server->ref);
+
+	trace_afs_server(server, u, atomic_read(&server->active), reason);
+	return server;
+}
+
+/*
+ * Try to get a reference on a server object.
+ */
+static struct afs_server *afs_maybe_use_server(struct afs_server *server,
+					       enum afs_server_trace reason)
+{
+	unsigned int r = atomic_fetch_add_unless(&server->ref, 1, 0);
+	unsigned int a;
+
+	if (r == 0)
+		return NULL;
+
+	a = atomic_inc_return(&server->active);
+	trace_afs_server(server, r, a, reason);
+	return server;
+}
+
+/*
+ * Get an active count on a server object.
+ */
+struct afs_server *afs_use_server(struct afs_server *server, enum afs_server_trace reason)
+{
+	unsigned int r = atomic_inc_return(&server->ref);
+	unsigned int a = atomic_inc_return(&server->active);
 
-	trace_afs_server(server, u, reason);
+	trace_afs_server(server, r, a, reason);
 	return server;
 }
 
@@ -345,28 +378,56 @@ void afs_put_server(struct afs_net *net, struct afs_server *server,
 	if (!server)
 		return;
 
-	server->put_time = ktime_get_real_seconds();
-
-	usage = atomic_dec_return(&server->usage);
+	usage = atomic_dec_return(&server->ref);
+	trace_afs_server(server, usage, atomic_read(&server->active), reason);
+	if (unlikely(usage == 0))
+		__afs_put_server(net, server);
+}
 
-	trace_afs_server(server, usage, reason);
+/*
+ * Drop an active count on a server object without updating the last-unused
+ * time.
+ */
+void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server,
+			     enum afs_server_trace reason)
+{
+	if (server) {
+		unsigned int active = atomic_dec_return(&server->active);
 
-	if (likely(usage > 0))
-		return;
+		if (active == 0)
+			afs_set_server_timer(net, afs_server_gc_delay);
+		afs_put_server(net, server, reason);
+	}
+}
 
-	afs_set_server_timer(net, afs_server_gc_delay);
+/*
+ * Drop an active count on a server object.
+ */
+void afs_unuse_server(struct afs_net *net, struct afs_server *server,
+		      enum afs_server_trace reason)
+{
+	if (server) {
+		server->unuse_time = ktime_get_real_seconds();
+		afs_unuse_server_notime(net, server, reason);
+	}
 }
 
 static void afs_server_rcu(struct rcu_head *rcu)
 {
 	struct afs_server *server = container_of(rcu, struct afs_server, rcu);
 
-	trace_afs_server(server, atomic_read(&server->usage),
-			 afs_server_trace_free);
+	trace_afs_server(server, atomic_read(&server->ref),
+			 atomic_read(&server->active), afs_server_trace_free);
 	afs_put_addrlist(rcu_access_pointer(server->addresses));
 	kfree(server);
 }
 
+static void __afs_put_server(struct afs_net *net, struct afs_server *server)
+{
+	call_rcu(&server->rcu, afs_server_rcu);
+	afs_dec_servers_outstanding(net);
+}
+
 /*
  * destroy a dead server
  */
@@ -379,19 +440,10 @@ static void afs_destroy_server(struct afs_net *net, struct afs_server *server)
 		.error	= 0,
 	};
 
-	trace_afs_server(server, atomic_read(&server->usage),
-			 afs_server_trace_give_up_cb);
-
 	if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags))
 		afs_fs_give_up_all_callbacks(net, server, &ac, NULL);
 
-	wait_var_event(&server->probe_outstanding,
-		       atomic_read(&server->probe_outstanding) == 0);
-
-	trace_afs_server(server, atomic_read(&server->usage),
-			 afs_server_trace_destroy);
-	call_rcu(&server->rcu, afs_server_rcu);
-	afs_dec_servers_outstanding(net);
+	afs_put_server(net, server, afs_server_trace_destroy);
 }
 
 /*
@@ -400,31 +452,28 @@ static void afs_destroy_server(struct afs_net *net, struct afs_server *server)
 static void afs_gc_servers(struct afs_net *net, struct afs_server *gc_list)
 {
 	struct afs_server *server;
-	bool deleted;
-	int usage;
+	int active;
 
 	while ((server = gc_list)) {
 		gc_list = server->gc_next;
 
 		write_seqlock(&net->fs_lock);
-		usage = 1;
-		deleted = atomic_try_cmpxchg(&server->usage, &usage, 0);
-		trace_afs_server(server, usage, afs_server_trace_gc);
-		if (deleted) {
+
+		active = atomic_read(&server->active);
+		if (active == 0) {
+			trace_afs_server(server, atomic_read(&server->ref),
+					 active, afs_server_trace_gc);
 			rb_erase(&server->uuid_rb, &net->fs_servers);
 			hlist_del_rcu(&server->proc_link);
-		}
-		write_sequnlock(&net->fs_lock);
-
-		if (deleted) {
-			write_seqlock(&net->fs_addr_lock);
 			if (!hlist_unhashed(&server->addr4_link))
 				hlist_del_rcu(&server->addr4_link);
 			if (!hlist_unhashed(&server->addr6_link))
 				hlist_del_rcu(&server->addr6_link);
-			write_sequnlock(&net->fs_addr_lock);
-			afs_destroy_server(net, server);
 		}
+		write_sequnlock(&net->fs_lock);
+
+		if (active == 0)
+			afs_destroy_server(net, server);
 	}
 }
 
@@ -453,15 +502,14 @@ void afs_manage_servers(struct work_struct *work)
 	for (cursor = rb_first(&net->fs_servers); cursor; cursor = rb_next(cursor)) {
 		struct afs_server *server =
 			rb_entry(cursor, struct afs_server, uuid_rb);
-		int usage = atomic_read(&server->usage);
+		int active = atomic_read(&server->active);
 
-		_debug("manage %pU %u", &server->uuid, usage);
+		_debug("manage %pU %u", &server->uuid, active);
 
-		ASSERTCMP(usage, >=, 1);
-		ASSERTIFCMP(purging, usage, ==, 1);
+		ASSERTIFCMP(purging, active, ==, 0);
 
-		if (usage == 1) {
-			time64_t expire_at = server->put_time;
+		if (active == 0) {
+			time64_t expire_at = server->unuse_time;
 
 			if (!test_bit(AFS_SERVER_FL_VL_FAIL, &server->flags) &&
 			    !test_bit(AFS_SERVER_FL_NOT_FOUND, &server->flags))
@@ -532,7 +580,8 @@ static noinline bool afs_update_server_record(struct afs_fs_cursor *fc, struct a
 
 	_enter("");
 
-	trace_afs_server(server, atomic_read(&server->usage), afs_server_trace_update);
+	trace_afs_server(server, atomic_read(&server->ref), atomic_read(&server->active),
+			 afs_server_trace_update);
 
 	alist = afs_vl_lookup_addrs(fc->vnode->volume->cell, fc->key,
 				    &server->uuid);
diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c
index f567732df5cc..b77e50f62459 100644
--- a/fs/afs/server_list.c
+++ b/fs/afs/server_list.c
@@ -16,8 +16,8 @@ void afs_put_serverlist(struct afs_net *net, struct afs_server_list *slist)
 	if (slist && refcount_dec_and_test(&slist->usage)) {
 		for (i = 0; i < slist->nr_servers; i++) {
 			afs_put_cb_interest(net, slist->servers[i].cb_interest);
-			afs_put_server(net, slist->servers[i].server,
-				       afs_server_trace_put_slist);
+			afs_unuse_server(net, slist->servers[i].server,
+					 afs_server_trace_put_slist);
 		}
 		kfree(slist);
 	}
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index c612cabbc378..f9691f69b2d6 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -33,6 +33,7 @@ enum afs_server_trace {
 	afs_server_trace_destroy,
 	afs_server_trace_free,
 	afs_server_trace_gc,
+	afs_server_trace_get_by_addr,
 	afs_server_trace_get_by_uuid,
 	afs_server_trace_get_caps,
 	afs_server_trace_get_install,
@@ -241,6 +242,7 @@ enum afs_cb_break_reason {
 	EM(afs_server_trace_destroy,		"DESTROY  ") \
 	EM(afs_server_trace_free,		"FREE     ") \
 	EM(afs_server_trace_gc,			"GC       ") \
+	EM(afs_server_trace_get_by_addr,	"GET addr ") \
 	EM(afs_server_trace_get_by_uuid,	"GET uuid ") \
 	EM(afs_server_trace_get_caps,		"GET caps ") \
 	EM(afs_server_trace_get_install,	"GET inst ") \
@@ -1271,26 +1273,30 @@ TRACE_EVENT(afs_cb_miss,
 	    );
 
 TRACE_EVENT(afs_server,
-	    TP_PROTO(struct afs_server *server, int usage, enum afs_server_trace reason),
+	    TP_PROTO(struct afs_server *server, int ref, int active,
+		     enum afs_server_trace reason),
 
-	    TP_ARGS(server, usage, reason),
+	    TP_ARGS(server, ref, active, reason),
 
 	    TP_STRUCT__entry(
 		    __field(unsigned int,		server		)
-		    __field(int,			usage		)
+		    __field(int,			ref		)
+		    __field(int,			active		)
 		    __field(int,			reason		)
 			     ),
 
 	    TP_fast_assign(
 		    __entry->server = server->debug_id;
-		    __entry->usage = usage;
+		    __entry->ref = ref;
+		    __entry->active = active;
 		    __entry->reason = reason;
 			   ),
 
-	    TP_printk("s=%08x %s u=%d",
+	    TP_printk("s=%08x %s u=%d a=%d",
 		      __entry->server,
 		      __print_symbolic(__entry->reason, afs_server_traces),
-		      __entry->usage)
+		      __entry->ref,
+		      __entry->active)
 	    );
 
 #endif /* _TRACE_AFS_H */



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 7/8] afs: Actively poll fileservers to maintain NAT or firewall openings
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
                   ` (5 preceding siblings ...)
  2020-04-24 15:20 ` [PATCH 6/8] afs: Split the usage count on struct afs_server David Howells
@ 2020-04-24 15:20 ` David Howells
  2020-04-24 15:20 ` [PATCH 8/8] afs: Show more information in /proc/net/afs/servers David Howells
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: Dave Botsch, linux-fsdevel, linux-kernel, dhowells

When an AFS client accesses a file, it receives a limited-duration callback
promise that the server will notify it if another client changes a file.
This callback duration can be a few hours in length.

If a client mounts a volume and then an application prevents it from being
unmounted, say by chdir'ing into it, but then does nothing for some time,
the rxrpc_peer record will expire and rxrpc-level keepalive will cease.

If there is NAT or a firewall between the client and the server, the route
back for the server may close after a comparatively short duration, meaning
that attempts by the server to notify the client may then bounce.

The client, however, may (so far as it knows) still have a valid unexpired
promise and will then rely on its cached data and will not see changes made
on the server by a third party until it incidentally rechecks the status or
the promise needs renewal.

To deal with this, the client needs to regularly probe the server.  This
has two effects: firstly, it keeps a route open back for the server, and
secondly, it causes the server to disgorge any notifications that got
queued up because they couldn't be sent.

Fix this by adding a mechanism to emit regular probes.

Two levels of probing are made available: Under normal circumstances the
'slow' queue will be used for a fileserver - this just probes the preferred
address once every 5 mins or so; however, if server fails to respond to any
probes, the server will shift to the 'fast' queue from which all its
interfaces will be probed every 30s.  When it finally responds, the record
will switch back to the slow queue.

Further notes:

 (1) Probing is now no longer driven from the fileserver rotation
     algorithm.

 (2) Probes are dispatched to all interfaces on a fileserver when that an
     afs_server object is set up to record it.

 (3) The afs_server object is removed from the probe queues when we start
     to probe it.  afs_is_probing_server() returns true if it's not listed
     - ie. it's undergoing probing.

 (4) The afs_server object is added back on to the probe queue when the
     final outstanding probe completes, but the probed_at time is set when
     we're about to launch a probe so that it's not dependent on the probe
     duration.

 (5) The timer and the work item added for this must be handed a count on
     net->servers_outstanding, which they hand on or release.  This makes
     sure that network namespace cleanup waits for them.

Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/cmservice.c         |    2 
 fs/afs/fs_probe.c          |  276 +++++++++++++++++++++++++++++++++-----------
 fs/afs/fsclient.c          |   21 ++-
 fs/afs/internal.h          |   41 +++++--
 fs/afs/main.c              |    5 +
 fs/afs/rotate.c            |    7 -
 fs/afs/server.c            |   19 +--
 fs/afs/volume.c            |   24 ++--
 include/trace/events/afs.h |    4 +
 9 files changed, 280 insertions(+), 119 deletions(-)

diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 7dcbca3bf828..7ae88958051f 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -157,7 +157,7 @@ static int afs_record_cm_probe(struct afs_call *call, struct afs_server *server)
 	_enter("");
 
 	if (test_bit(AFS_SERVER_FL_HAVE_EPOCH, &server->flags) &&
-	    !test_bit(AFS_SERVER_FL_PROBING, &server->flags)) {
+	    !afs_is_probing_server(server)) {
 		if (server->cm_epoch == call->epoch)
 			return 0;
 
diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
index d458b0adfce5..d98de60b5cd7 100644
--- a/fs/afs/fs_probe.c
+++ b/fs/afs/fs_probe.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /* AFS fileserver probing
  *
- * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2018, 2020 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  */
 
@@ -11,14 +11,83 @@
 #include "internal.h"
 #include "protocol_yfs.h"
 
-static bool afs_fs_probe_done(struct afs_server *server)
+static unsigned int afs_fs_probe_fast_poll_interval = 30 * HZ;
+static unsigned int afs_fs_probe_slow_poll_interval = 5 * 60 * HZ;
+
+/*
+ * Start the probe polling timer.  We have to supply it with an inc on the
+ * outstanding server count.
+ */
+static void afs_schedule_fs_probe(struct afs_net *net,
+				  struct afs_server *server, bool fast)
+{
+	unsigned long atj;
+
+	if (!net->live)
+		return;
+
+	atj = server->probed_at;
+	atj += fast ? afs_fs_probe_fast_poll_interval : afs_fs_probe_slow_poll_interval;
+
+	afs_inc_servers_outstanding(net);
+	if (timer_reduce(&net->fs_probe_timer, atj))
+		afs_dec_servers_outstanding(net);
+}
+
+/*
+ * Handle the completion of a set of probes.
+ */
+static void afs_finished_fs_probe(struct afs_net *net, struct afs_server *server)
+{
+	bool responded = server->probe.responded;
+
+	write_seqlock(&net->fs_lock);
+	if (responded)
+		list_add_tail(&server->probe_link, &net->fs_probe_slow);
+	else
+		list_add_tail(&server->probe_link, &net->fs_probe_fast);
+	write_sequnlock(&net->fs_lock);
+
+	afs_schedule_fs_probe(net, server, responded);
+}
+
+/*
+ * Handle the completion of a probe.
+ */
+static void afs_done_one_fs_probe(struct afs_net *net, struct afs_server *server)
+{
+	_enter("");
+
+	if (atomic_dec_and_test(&server->probe_outstanding))
+		afs_finished_fs_probe(net, server);
+
+	wake_up_all(&server->probe_wq);
+}
+
+/*
+ * Handle inability to send a probe due to ENOMEM when trying to allocate a
+ * call struct.
+ */
+static void afs_fs_probe_not_done(struct afs_net *net,
+				  struct afs_server *server,
+				  struct afs_addr_cursor *ac)
 {
-	if (!atomic_dec_and_test(&server->probe_outstanding))
-		return false;
+	struct afs_addr_list *alist = ac->alist;
+	unsigned int index = ac->index;
+
+	_enter("");
 
-	clear_bit_unlock(AFS_SERVER_FL_PROBING, &server->flags);
-	wake_up_bit(&server->flags, AFS_SERVER_FL_PROBING);
-	return true;
+	trace_afs_io_error(0, -ENOMEM, afs_io_error_fs_probe_fail);
+	spin_lock(&server->probe_lock);
+
+	server->probe.local_failure = true;
+	if (server->probe.error == 0)
+		server->probe.error = -ENOMEM;
+
+	set_bit(index, &alist->failed);
+
+	spin_unlock(&server->probe_lock);
+	return afs_done_one_fs_probe(net, server);
 }
 
 /*
@@ -29,10 +98,8 @@ void afs_fileserver_probe_result(struct afs_call *call)
 {
 	struct afs_addr_list *alist = call->alist;
 	struct afs_server *server = call->server;
-	unsigned int server_index = call->server_index;
 	unsigned int index = call->addr_ix;
 	unsigned int rtt = UINT_MAX;
-	bool have_result = false;
 	u64 _rtt;
 	int ret = call->error;
 
@@ -52,8 +119,9 @@ void afs_fileserver_probe_result(struct afs_call *call)
 		goto responded;
 	case -ENOMEM:
 	case -ENONET:
+		clear_bit(index, &alist->responded);
 		server->probe.local_failure = true;
-		afs_io_error(call, afs_io_error_fs_probe_fail);
+		trace_afs_io_error(call->debug_id, ret, afs_io_error_fs_probe_fail);
 		goto out;
 	case -ECONNRESET: /* Responded, but call expired. */
 	case -ERFKILL:
@@ -72,12 +140,11 @@ void afs_fileserver_probe_result(struct afs_call *call)
 		     server->probe.error == -ETIMEDOUT ||
 		     server->probe.error == -ETIME))
 			server->probe.error = ret;
-		afs_io_error(call, afs_io_error_fs_probe_fail);
+		trace_afs_io_error(call->debug_id, ret, afs_io_error_fs_probe_fail);
 		goto out;
 	}
 
 responded:
-	set_bit(index, &alist->responded);
 	clear_bit(index, &alist->failed);
 
 	if (call->service_id == YFS_FS_SERVICE) {
@@ -102,39 +169,31 @@ void afs_fileserver_probe_result(struct afs_call *call)
 	if (rtt < server->probe.rtt) {
 		server->probe.rtt = rtt;
 		alist->preferred = index;
-		have_result = true;
 	}
 
 	smp_wmb(); /* Set rtt before responded. */
 	server->probe.responded = true;
-	set_bit(AFS_SERVER_FL_PROBED, &server->flags);
+	set_bit(index, &alist->responded);
 out:
 	spin_unlock(&server->probe_lock);
 
-	_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
-	       server_index, index, &alist->addrs[index].transport,
+	_debug("probe %pU [%u] %pISpc rtt=%u ret=%d",
+	       &server->uuid, index, &alist->addrs[index].transport,
 	       (unsigned int)rtt, ret);
 
-	have_result |= afs_fs_probe_done(server);
-	if (have_result)
-		wake_up_all(&server->probe_wq);
+	return afs_done_one_fs_probe(call->net, server);
 }
 
 /*
- * Probe all of a fileserver's addresses to find out the best route and to
- * query its capabilities.
+ * Probe one or all of a fileserver's addresses to find out the best route and
+ * to query its capabilities.
  */
-static int afs_do_probe_fileserver(struct afs_net *net,
-				   struct afs_server *server,
-				   struct key *key,
-				   unsigned int server_index,
-				   struct afs_error *_e)
+void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server,
+			     struct key *key, bool all)
 {
 	struct afs_addr_cursor ac = {
 		.index = 0,
 	};
-	struct afs_call *call;
-	bool in_progress = false;
 
 	_enter("%pU", &server->uuid);
 
@@ -144,50 +203,25 @@ static int afs_do_probe_fileserver(struct afs_net *net,
 	afs_get_addrlist(ac.alist);
 	read_unlock(&server->fs_lock);
 
-	atomic_set(&server->probe_outstanding, ac.alist->nr_addrs);
+	server->probed_at = jiffies;
+	atomic_set(&server->probe_outstanding, all ? ac.alist->nr_addrs : 1);
 	memset(&server->probe, 0, sizeof(server->probe));
 	server->probe.rtt = UINT_MAX;
 
-	for (ac.index = 0; ac.index < ac.alist->nr_addrs; ac.index++) {
-		call = afs_fs_get_capabilities(net, server, &ac, key, server_index);
-		if (!IS_ERR(call)) {
-			afs_put_call(call);
-			in_progress = true;
-		} else {
-			afs_prioritise_error(_e, PTR_ERR(call), ac.abort_code);
-		}
-	}
-
-	if (!in_progress)
-		afs_fs_probe_done(server);
-	afs_put_addrlist(ac.alist);
-	return in_progress;
-}
+	ac.index = ac.alist->preferred;
+	if (ac.index < 0 || ac.index >= ac.alist->nr_addrs)
+		all = true;
 
-/*
- * Send off probes to all unprobed servers.
- */
-int afs_probe_fileservers(struct afs_net *net, struct key *key,
-			  struct afs_server_list *list)
-{
-	struct afs_server *server;
-	struct afs_error e;
-	bool in_progress = false;
-	int i;
-
-	e.error = 0;
-	e.responded = false;
-	for (i = 0; i < list->nr_servers; i++) {
-		server = list->servers[i].server;
-		if (test_bit(AFS_SERVER_FL_PROBED, &server->flags))
-			continue;
-
-		if (!test_and_set_bit_lock(AFS_SERVER_FL_PROBING, &server->flags) &&
-		    afs_do_probe_fileserver(net, server, key, i, &e))
-			in_progress = true;
+	if (all) {
+		for (ac.index = 0; ac.index < ac.alist->nr_addrs; ac.index++)
+			if (!afs_fs_get_capabilities(net, server, &ac, key))
+				afs_fs_probe_not_done(net, server, &ac);
+	} else {
+		if (!afs_fs_get_capabilities(net, server, &ac, key))
+			afs_fs_probe_not_done(net, server, &ac);
 	}
 
-	return in_progress ? 0 : e.error;
+	afs_put_addrlist(ac.alist);
 }
 
 /*
@@ -207,7 +241,7 @@ int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
 	for (i = 0; i < slist->nr_servers; i++) {
 		if (test_bit(i, &untried)) {
 			server = slist->servers[i].server;
-			if (!test_bit(AFS_SERVER_FL_PROBING, &server->flags))
+			if (!atomic_read(&server->probe_outstanding))
 				__clear_bit(i, &untried);
 			if (server->probe.responded)
 				have_responders = true;
@@ -237,7 +271,7 @@ int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
 				server = slist->servers[i].server;
 				if (server->probe.responded)
 					goto stop;
-				if (test_bit(AFS_SERVER_FL_PROBING, &server->flags))
+				if (atomic_read(&server->probe_outstanding))
 					still_probing = true;
 			}
 		}
@@ -272,3 +306,109 @@ int afs_wait_for_fs_probes(struct afs_server_list *slist, unsigned long untried)
 		slist->preferred = pref;
 	return 0;
 }
+
+/*
+ * Probe timer.  We have an increment on fs_outstanding that we need to pass
+ * along to the work item.
+ */
+void afs_fs_probe_timer(struct timer_list *timer)
+{
+	struct afs_net *net = container_of(timer, struct afs_net, fs_probe_timer);
+
+	if (!queue_work(afs_wq, &net->fs_prober))
+		afs_dec_servers_outstanding(net);
+}
+
+/*
+ * Dispatch a probe to a server.
+ */
+static void afs_dispatch_fs_probe(struct afs_net *net, struct afs_server *server, bool all)
+	__releases(&net->fs_lock)
+{
+	struct key *key = NULL;
+
+	/* We remove it from the queues here - it will be added back to
+	 * one of the queues on the completion of the probe.
+	 */
+	list_del_init(&server->probe_link);
+
+	afs_get_server(server, afs_server_trace_get_probe);
+	write_sequnlock(&net->fs_lock);
+
+	afs_fs_probe_fileserver(net, server, key, all);
+	afs_put_server(net, server, afs_server_trace_put_probe);
+}
+
+/*
+ * Probe dispatcher to regularly dispatch probes to keep NAT alive.
+ */
+void afs_fs_probe_dispatcher(struct work_struct *work)
+{
+	struct afs_net *net = container_of(work, struct afs_net, fs_prober);
+	struct afs_server *fast, *slow, *server;
+	unsigned long nowj, timer_at, poll_at;
+	bool first_pass = true, set_timer = false;
+
+	if (!net->live)
+		return;
+
+	_enter("");
+
+	if (list_empty(&net->fs_probe_fast) && list_empty(&net->fs_probe_slow)) {
+		_leave(" [none]");
+		return;
+	}
+
+again:
+	write_seqlock(&net->fs_lock);
+
+	fast = slow = server = NULL;
+	nowj = jiffies;
+	timer_at = nowj + MAX_JIFFY_OFFSET;
+
+	if (!list_empty(&net->fs_probe_fast)) {
+		fast = list_first_entry(&net->fs_probe_fast, struct afs_server, probe_link);
+		poll_at = fast->probed_at + afs_fs_probe_fast_poll_interval;
+		if (time_before(nowj, poll_at)) {
+			timer_at = poll_at;
+			set_timer = true;
+			fast = NULL;
+		}
+	}
+
+	if (!list_empty(&net->fs_probe_slow)) {
+		slow = list_first_entry(&net->fs_probe_slow, struct afs_server, probe_link);
+		poll_at = slow->probed_at + afs_fs_probe_slow_poll_interval;
+		if (time_before(nowj, poll_at)) {
+			if (time_before(poll_at, timer_at))
+			    timer_at = poll_at;
+			set_timer = true;
+			slow = NULL;
+		}
+	}
+
+	server = fast ?: slow;
+	if (server)
+		_debug("probe %pU", &server->uuid);
+
+	if (server && (first_pass || !need_resched())) {
+		afs_dispatch_fs_probe(net, server, server == fast);
+		first_pass = false;
+		goto again;
+	}
+
+	write_sequnlock(&net->fs_lock);
+
+	if (server) {
+		if (!queue_work(afs_wq, &net->fs_prober))
+			afs_dec_servers_outstanding(net);
+		_leave(" [requeue]");
+	} else if (set_timer) {
+		if (timer_reduce(&net->fs_probe_timer, timer_at))
+			afs_dec_servers_outstanding(net);
+		_leave(" [timer]");
+	} else {
+		afs_dec_servers_outstanding(net);
+		_leave(" [quiesce]");
+	}
+}
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 7384fb545eb4..a3e3a16b32d6 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -1905,14 +1905,13 @@ static const struct afs_call_type afs_RXFSGetCapabilities = {
 };
 
 /*
- * Probe a fileserver for the capabilities that it supports.  This can
- * return up to 196 words.
- */
-struct afs_call *afs_fs_get_capabilities(struct afs_net *net,
-					 struct afs_server *server,
-					 struct afs_addr_cursor *ac,
-					 struct key *key,
-					 unsigned int server_index)
+ * Probe a fileserver for the capabilities that it supports.  This RPC can
+ * reply with up to 196 words.  The operation is asynchronous and if we managed
+ * to allocate a call, true is returned the result is delivered through the
+ * ->done() - otherwise we return false to indicate we didn't even try.
+ */
+bool afs_fs_get_capabilities(struct afs_net *net, struct afs_server *server,
+			     struct afs_addr_cursor *ac, struct key *key)
 {
 	struct afs_call *call;
 	__be32 *bp;
@@ -1921,11 +1920,10 @@ struct afs_call *afs_fs_get_capabilities(struct afs_net *net,
 
 	call = afs_alloc_flat_call(net, &afs_RXFSGetCapabilities, 1 * 4, 16 * 4);
 	if (!call)
-		return ERR_PTR(-ENOMEM);
+		return false;
 
 	call->key = key;
 	call->server = afs_use_server(server, afs_server_trace_get_caps);
-	call->server_index = server_index;
 	call->upgrade = true;
 	call->async = true;
 	call->max_lifespan = AFS_PROBE_MAX_LIFESPAN;
@@ -1936,7 +1934,8 @@ struct afs_call *afs_fs_get_capabilities(struct afs_net *net,
 
 	trace_afs_make_fs_call(call, NULL);
 	afs_make_call(ac, call, GFP_NOFS);
-	return call;
+	afs_put_call(call);
+	return true;
 }
 
 /*
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index cb70e1c234cc..61320a632e15 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -90,7 +90,6 @@ struct afs_addr_list {
 	unsigned char		nr_ipv4;	/* Number of IPv4 addresses */
 	enum dns_record_source	source:8;
 	enum dns_lookup_status	status:8;
-	unsigned long		probed;		/* Mask of servers that have been probed */
 	unsigned long		failed;		/* Mask of addrs that failed locally/ICMP */
 	unsigned long		responded;	/* Mask of addrs that responded */
 	struct sockaddr_rxrpc	addrs[];
@@ -299,9 +298,10 @@ struct afs_net {
 	 * cell, but in practice, people create aliases and subsets and there's
 	 * no easy way to distinguish them.
 	 */
-	seqlock_t		fs_lock;	/* For fs_servers */
+	seqlock_t		fs_lock;	/* For fs_servers, fs_probe_*, fs_proc */
 	struct rb_root		fs_servers;	/* afs_server (by server UUID or address) */
-	struct list_head	fs_updates;	/* afs_server (by update_at) */
+	struct list_head	fs_probe_fast;	/* List of afs_server to probe at 30s intervals */
+	struct list_head	fs_probe_slow;	/* List of afs_server to probe at 5m intervals */
 	struct hlist_head	fs_proc;	/* procfs servers list */
 
 	struct hlist_head	fs_addresses4;	/* afs_server (by lowest IPv4 addr) */
@@ -310,6 +310,9 @@ struct afs_net {
 
 	struct work_struct	fs_manager;
 	struct timer_list	fs_timer;
+
+	struct work_struct	fs_prober;
+	struct timer_list	fs_probe_timer;
 	atomic_t		servers_outstanding;
 
 	/* File locking renewal management */
@@ -493,7 +496,8 @@ struct afs_server {
 	};
 
 	struct afs_addr_list	__rcu *addresses;
-	struct rb_node		uuid_rb;	/* Link in net->servers */
+	struct rb_node		uuid_rb;	/* Link in net->fs_servers */
+	struct list_head	probe_link;	/* Link in net->fs_probe_list */
 	struct hlist_node	addr4_link;	/* Link in net->fs_addresses4 */
 	struct hlist_node	addr6_link;	/* Link in net->fs_addresses6 */
 	struct hlist_node	proc_link;	/* Link in net->fs_proc */
@@ -504,8 +508,6 @@ struct afs_server {
 #define AFS_SERVER_FL_NOT_FOUND	2		/* VL server says no such server */
 #define AFS_SERVER_FL_VL_FAIL	3		/* Failed to access VL server */
 #define AFS_SERVER_FL_UPDATING	4
-#define AFS_SERVER_FL_PROBED	5		/* The fileserver has been probed */
-#define AFS_SERVER_FL_PROBING	6		/* Fileserver is being probed */
 #define AFS_SERVER_FL_NO_IBULK	7		/* Fileserver doesn't support FS.InlineBulkStatus */
 #define AFS_SERVER_FL_MAY_HAVE_CB 8		/* May have callbacks on this fileserver */
 #define AFS_SERVER_FL_IS_YFS	9		/* Server is YFS not AFS */
@@ -527,6 +529,7 @@ struct afs_server {
 	rwlock_t		cb_break_lock;	/* Volume finding lock */
 
 	/* Probe state */
+	unsigned long		probed_at;	/* Time last probe was dispatched (jiffies) */
 	wait_queue_head_t	probe_wq;
 	atomic_t		probe_outstanding;
 	spinlock_t		probe_lock;
@@ -956,7 +959,6 @@ extern int afs_flock(struct file *, int, struct file_lock *);
  */
 extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_status_cb *,
 				    struct afs_volsync *);
-extern int afs_fs_give_up_callbacks(struct afs_net *, struct afs_server *);
 extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct afs_status_cb *, struct afs_read *);
 extern int afs_fs_create(struct afs_fs_cursor *, const char *, umode_t,
 			 struct afs_status_cb *, struct afs_fid *, struct afs_status_cb *);
@@ -978,9 +980,8 @@ extern int afs_fs_extend_lock(struct afs_fs_cursor *, struct afs_status_cb *);
 extern int afs_fs_release_lock(struct afs_fs_cursor *, struct afs_status_cb *);
 extern int afs_fs_give_up_all_callbacks(struct afs_net *, struct afs_server *,
 					struct afs_addr_cursor *, struct key *);
-extern struct afs_call *afs_fs_get_capabilities(struct afs_net *, struct afs_server *,
-						struct afs_addr_cursor *, struct key *,
-						unsigned int);
+extern bool afs_fs_get_capabilities(struct afs_net *, struct afs_server *,
+				    struct afs_addr_cursor *, struct key *);
 extern int afs_fs_inline_bulk_status(struct afs_fs_cursor *, struct afs_net *,
 				     struct afs_fid *, struct afs_status_cb *,
 				     unsigned int, struct afs_volsync *);
@@ -1001,8 +1002,9 @@ extern int afs_fs_store_acl(struct afs_fs_cursor *, const struct afs_acl *,
  * fs_probe.c
  */
 extern void afs_fileserver_probe_result(struct afs_call *);
-extern int afs_probe_fileservers(struct afs_net *, struct key *, struct afs_server_list *);
+extern void afs_fs_probe_fileserver(struct afs_net *, struct afs_server *, struct key *, bool);
 extern int afs_wait_for_fs_probes(struct afs_server_list *, unsigned long);
+extern void afs_fs_probe_dispatcher(struct work_struct *);
 
 /*
  * inode.c
@@ -1251,9 +1253,26 @@ extern void afs_unuse_server_notime(struct afs_net *, struct afs_server *, enum
 extern void afs_put_server(struct afs_net *, struct afs_server *, enum afs_server_trace);
 extern void afs_manage_servers(struct work_struct *);
 extern void afs_servers_timer(struct timer_list *);
+extern void afs_fs_probe_timer(struct timer_list *);
 extern void __net_exit afs_purge_servers(struct afs_net *);
 extern bool afs_check_server_record(struct afs_fs_cursor *, struct afs_server *);
 
+static inline void afs_inc_servers_outstanding(struct afs_net *net)
+{
+	atomic_inc(&net->servers_outstanding);
+}
+
+static inline void afs_dec_servers_outstanding(struct afs_net *net)
+{
+	if (atomic_dec_and_test(&net->servers_outstanding))
+		wake_up_var(&net->servers_outstanding);
+}
+
+static inline bool afs_is_probing_server(struct afs_server *server)
+{
+	return list_empty(&server->probe_link);
+}
+
 /*
  * server_list.c
  */
diff --git a/fs/afs/main.c b/fs/afs/main.c
index c9c45d7078bd..56b52f8dbf15 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -87,7 +87,8 @@ static int __net_init afs_net_init(struct net *net_ns)
 
 	seqlock_init(&net->fs_lock);
 	net->fs_servers = RB_ROOT;
-	INIT_LIST_HEAD(&net->fs_updates);
+	INIT_LIST_HEAD(&net->fs_probe_fast);
+	INIT_LIST_HEAD(&net->fs_probe_slow);
 	INIT_HLIST_HEAD(&net->fs_proc);
 
 	INIT_HLIST_HEAD(&net->fs_addresses4);
@@ -96,6 +97,8 @@ static int __net_init afs_net_init(struct net *net_ns)
 
 	INIT_WORK(&net->fs_manager, afs_manage_servers);
 	timer_setup(&net->fs_timer, afs_servers_timer, 0);
+	INIT_WORK(&net->fs_prober, afs_fs_probe_dispatcher);
+	timer_setup(&net->fs_probe_timer, afs_fs_probe_timer, 0);
 
 	ret = -ENOMEM;
 	sysnames = kzalloc(sizeof(*sysnames), GFP_KERNEL);
diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
index 2a3305e42b14..46b68da89faa 100644
--- a/fs/afs/rotate.c
+++ b/fs/afs/rotate.c
@@ -349,9 +349,6 @@ bool afs_select_fileserver(struct afs_fs_cursor *fc)
 		goto failed;
 
 	_debug("__ VOL %llx __", vnode->volume->vid);
-	error = afs_probe_fileservers(afs_v2net(vnode), fc->key, fc->server_list);
-	if (error < 0)
-		goto failed_set_error;
 
 pick_server:
 	_debug("pick [%lx]", fc->untried);
@@ -596,8 +593,8 @@ static void afs_dump_edestaddrreq(const struct afs_fs_cursor *fc)
 					  a->version,
 					  a->nr_ipv4, a->nr_addrs, a->max_addrs,
 					  a->preferred);
-				pr_notice("FC:  - pr=%lx R=%lx F=%lx\n",
-					  a->probed, a->responded, a->failed);
+				pr_notice("FC:  - R=%lx F=%lx\n",
+					  a->responded, a->failed);
 				if (a == fc->ac.alist)
 					pr_notice("FC:  - current\n");
 			}
diff --git a/fs/afs/server.c b/fs/afs/server.c
index 4969a681f8f5..3f707b5ecb62 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -14,17 +14,6 @@
 static unsigned afs_server_gc_delay = 10;	/* Server record timeout in seconds */
 static atomic_t afs_server_debug_id;
 
-static void afs_inc_servers_outstanding(struct afs_net *net)
-{
-	atomic_inc(&net->servers_outstanding);
-}
-
-static void afs_dec_servers_outstanding(struct afs_net *net)
-{
-	if (atomic_dec_and_test(&net->servers_outstanding))
-		wake_up_var(&net->servers_outstanding);
-}
-
 static struct afs_server *afs_maybe_use_server(struct afs_server *,
 					       enum afs_server_trace);
 static void __afs_put_server(struct afs_net *, struct afs_server *);
@@ -226,6 +215,7 @@ static struct afs_server *afs_alloc_server(struct afs_net *net,
 	INIT_HLIST_HEAD(&server->cb_volumes);
 	rwlock_init(&server->cb_break_lock);
 	init_waitqueue_head(&server->probe_wq);
+	INIT_LIST_HEAD(&server->probe_link);
 	spin_lock_init(&server->probe_lock);
 
 	afs_inc_servers_outstanding(net);
@@ -295,6 +285,12 @@ struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key,
 	if (server != candidate) {
 		afs_put_addrlist(alist);
 		kfree(candidate);
+	} else {
+		/* Immediately dispatch an asynchronous probe to each interface
+		 * on the fileserver.  This will make sure the repeat-probing
+		 * service is started.
+		 */
+		afs_fs_probe_fileserver(cell->net, server, key, true);
 	}
 
 	return server;
@@ -464,6 +460,7 @@ static void afs_gc_servers(struct afs_net *net, struct afs_server *gc_list)
 			trace_afs_server(server, atomic_read(&server->ref),
 					 active, afs_server_trace_gc);
 			rb_erase(&server->uuid_rb, &net->fs_servers);
+			list_del(&server->probe_link);
 			hlist_del_rcu(&server->proc_link);
 			if (!hlist_unhashed(&server->addr4_link))
 				hlist_del_rcu(&server->addr4_link);
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 4310336b9bb8..249000195f8a 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -266,7 +266,6 @@ static int afs_update_volume_status(struct afs_volume *volume, struct key *key)
 	}
 
 	volume->update_at = ktime_get_real_seconds() + afs_volume_record_life;
-	clear_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags);
 	write_unlock(&volume->servers_lock);
 	ret = 0;
 
@@ -283,23 +282,25 @@ static int afs_update_volume_status(struct afs_volume *volume, struct key *key)
  */
 int afs_check_volume_status(struct afs_volume *volume, struct afs_fs_cursor *fc)
 {
-	time64_t now = ktime_get_real_seconds();
 	int ret, retries = 0;
 
 	_enter("");
 
-	if (volume->update_at <= now)
-		set_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags);
-
 retry:
-	if (!test_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags) &&
-	    !test_bit(AFS_VOLUME_WAIT, &volume->flags)) {
-		_leave(" = 0");
-		return 0;
-	}
-
+	if (test_bit(AFS_VOLUME_WAIT, &volume->flags))
+		goto wait;
+	if (volume->update_at <= ktime_get_real_seconds() ||
+	    test_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags))
+		goto update;
+	_leave(" = 0");
+	return 0;
+
+update:
 	if (!test_and_set_bit_lock(AFS_VOLUME_UPDATING, &volume->flags)) {
+		clear_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags);
 		ret = afs_update_volume_status(volume, fc->key);
+		if (ret < 0)
+			set_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags);
 		clear_bit_unlock(AFS_VOLUME_WAIT, &volume->flags);
 		clear_bit_unlock(AFS_VOLUME_UPDATING, &volume->flags);
 		wake_up_bit(&volume->flags, AFS_VOLUME_WAIT);
@@ -307,6 +308,7 @@ int afs_check_volume_status(struct afs_volume *volume, struct afs_fs_cursor *fc)
 		return ret;
 	}
 
+wait:
 	if (!test_bit(AFS_VOLUME_WAIT, &volume->flags)) {
 		_leave(" = 0 [no wait]");
 		return 0;
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index f9691f69b2d6..19a07fbf35df 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -38,10 +38,12 @@ enum afs_server_trace {
 	afs_server_trace_get_caps,
 	afs_server_trace_get_install,
 	afs_server_trace_get_new_cbi,
+	afs_server_trace_get_probe,
 	afs_server_trace_give_up_cb,
 	afs_server_trace_put_call,
 	afs_server_trace_put_cbi,
 	afs_server_trace_put_find_rsq,
+	afs_server_trace_put_probe,
 	afs_server_trace_put_slist,
 	afs_server_trace_put_slist_isort,
 	afs_server_trace_put_uuid_rsq,
@@ -247,10 +249,12 @@ enum afs_cb_break_reason {
 	EM(afs_server_trace_get_caps,		"GET caps ") \
 	EM(afs_server_trace_get_install,	"GET inst ") \
 	EM(afs_server_trace_get_new_cbi,	"GET cbi  ") \
+	EM(afs_server_trace_get_probe,		"GET probe") \
 	EM(afs_server_trace_give_up_cb,		"giveup-cb") \
 	EM(afs_server_trace_put_call,		"PUT call ") \
 	EM(afs_server_trace_put_cbi,		"PUT cbi  ") \
 	EM(afs_server_trace_put_find_rsq,	"PUT f-rsq") \
+	EM(afs_server_trace_put_probe,		"PUT probe") \
 	EM(afs_server_trace_put_slist,		"PUT slist") \
 	EM(afs_server_trace_put_slist_isort,	"PUT isort") \
 	EM(afs_server_trace_put_uuid_rsq,	"PUT u-req") \



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 8/8] afs: Show more information in /proc/net/afs/servers
  2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
                   ` (6 preceding siblings ...)
  2020-04-24 15:20 ` [PATCH 7/8] afs: Actively poll fileservers to maintain NAT or firewall openings David Howells
@ 2020-04-24 15:20 ` David Howells
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2020-04-24 15:20 UTC (permalink / raw)
  To: linux-afs; +Cc: linux-fsdevel, linux-kernel, dhowells

Show more information in /proc/net/afs/servers to make it easier to see
what's going on with the server probing.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/proc.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 9bce7898cd7d..1d21465a4108 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -378,21 +378,22 @@ static int afs_proc_servers_show(struct seq_file *m, void *v)
 	int i;
 
 	if (v == SEQ_START_TOKEN) {
-		seq_puts(m, "UUID                                 REF ACT ADDR\n");
+		seq_puts(m, "UUID                                 REF ACT\n");
 		return 0;
 	}
 
 	server = list_entry(v, struct afs_server, proc_link);
 	alist = rcu_dereference(server->addresses);
-	seq_printf(m, "%pU %3d %3d %pISpc%s\n",
+	seq_printf(m, "%pU %3d %3d\n",
 		   &server->uuid,
 		   atomic_read(&server->ref),
-		   atomic_read(&server->active),
-		   &alist->addrs[0].transport,
-		   alist->preferred == 0 ? "*" : "");
-	for (i = 1; i < alist->nr_addrs; i++)
-		seq_printf(m, "                                             %pISpc%s\n",
-			   &alist->addrs[i].transport,
+		   atomic_read(&server->active));
+	seq_printf(m, "  - ALIST v=%u osp=%u r=%lx f=%lx\n",
+		   alist->version, atomic_read(&server->probe_outstanding),
+		   alist->responded, alist->failed);
+	for (i = 0; i < alist->nr_addrs; i++)
+		seq_printf(m, "    [%x] %pISpc%s\n",
+			   i, &alist->addrs[i].transport,
 			   alist->preferred == i ? "*" : "");
 	return 0;
 }



^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-04-24 15:20 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-24 15:19 [PATCH 0/8] afs: NAT-mitigation and other bits David Howells
2020-04-24 15:19 ` [PATCH 1/8] afs: Always include dir in bulk status fetch from afs_do_lookup() David Howells
2020-04-24 15:20 ` [PATCH 2/8] afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate David Howells
2020-04-24 15:20 ` [PATCH 3/8] afs: Use the serverUnique field in the UVLDB record to reduce rpc ops David Howells
2020-04-24 15:20 ` [PATCH 4/8] afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH David Howells
2020-04-24 15:20 ` [PATCH 5/8] afs: Remove some unused bits David Howells
2020-04-24 15:20 ` [PATCH 6/8] afs: Split the usage count on struct afs_server David Howells
2020-04-24 15:20 ` [PATCH 7/8] afs: Actively poll fileservers to maintain NAT or firewall openings David Howells
2020-04-24 15:20 ` [PATCH 8/8] afs: Show more information in /proc/net/afs/servers David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).